aboutsummaryrefslogtreecommitdiffstats
path: root/security
AgeCommit message (Collapse)AuthorFilesLines
2020-01-04Merge tag 'apparmor-pr-2020-01-04' of ↵Linus Torvalds5-47/+55
git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor Pull apparmor fixes from John Johansen: - performance regression: only get a label reference if the fast path check fails - fix aa_xattrs_match() may sleep while holding a RCU lock - fix bind mounts aborting with -ENOMEM * tag 'apparmor-pr-2020-01-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: apparmor: fix aa_xattrs_match() may sleep while holding a RCU lock apparmor: only get a label reference if the fast path check fails apparmor: fix bind mounts aborting with -ENOMEM
2020-01-04apparmor: fix aa_xattrs_match() may sleep while holding a RCU lockJohn Johansen3-42/+46
aa_xattrs_match() is unfortunately calling vfs_getxattr_alloc() from a context protected by an rcu_read_lock. This can not be done as vfs_getxattr_alloc() may sleep regardles of the gfp_t value being passed to it. Fix this by breaking the rcu_read_lock on the policy search when the xattr match feature is requested and restarting the search if a policy changes occur. Fixes: 8e51f9087f40 ("apparmor: Add support for attaching profiles via xattr, presence and value") Reported-by: Jia-Ju Bai <baijiaju1990@gmail.com> Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: John Johansen <john.johansen@canonical.com>
2020-01-02apparmor: only get a label reference if the fast path check failsJohn Johansen1-4/+8
The common fast path check can be done under rcu_read_lock() and doesn't need a reference count on the label. Only take a reference count if entering the slow path. Fixes reported hackbench regression - sha1 79e178a57dae ("Merge tag 'apparmor-pr-2019-12-03' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor") hackbench -l (256000/#grp) -g #grp 128 groups 19.679 ±0.90% - previous sha1 01d1dff64662 ("Merge tag 's390-5.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux") hackbench -l (256000/#grp) -g #grp 128 groups 3.1689 ±3.04% Reported-by: Vincent Guittot <vincent.guittot@linaro.org> Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Fixes: bce4e7e9c45e ("apparmor: reduce rcu_read_lock scope for aa_file_perm mediation") Signed-off-by: John Johansen <john.johansen@canonical.com>
2020-01-02apparmor: fix bind mounts aborting with -ENOMEMPatrick Steinhardt1-1/+1
With commit df323337e507 ("apparmor: Use a memory pool instead per-CPU caches, 2019-05-03"), AppArmor code was converted to use memory pools. In that conversion, a bug snuck into the code that polices bind mounts that causes all bind mounts to fail with -ENOMEM, as we erroneously error out if `aa_get_buffer` returns a pointer instead of erroring out when it does _not_ return a valid pointer. Fix the issue by correctly checking for valid pointers returned by `aa_get_buffer` to fix bind mounts with AppArmor. Fixes: df323337e507 ("apparmor: Use a memory pool instead per-CPU caches") Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: John Johansen <john.johansen@canonical.com>
2019-12-31Merge tag 'tomoyo-fixes-for-5.5' of ↵Linus Torvalds5-44/+27
git://git.osdn.net/gitroot/tomoyo/tomoyo-test1 Pull tomoyo fixes from Tetsuo Handa: "Two bug fixes: - Suppress RCU warning at list_for_each_entry_rcu() - Don't use fancy names on sockets" * tag 'tomoyo-fixes-for-5.5' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1: tomoyo: Suppress RCU warning at list_for_each_entry_rcu(). tomoyo: Don't use nifty names on sockets.
2019-12-18Merge tag 'tpmdd-next-20191219' of git://git.infradead.org/users/jjs/linux-tpmddLinus Torvalds5-12/+4
Pull tpm fixes from Jarkko Sakkinen: "Bunch of fixes for rc3" * tag 'tpmdd-next-20191219' of git://git.infradead.org/users/jjs/linux-tpmdd: tpm/tpm_ftpm_tee: add shutdown call back tpm: selftest: cleanup after unseal with wrong auth/policy test tpm: selftest: add test covering async mode tpm: fix invalid locking in NONBLOCKING mode security: keys: trusted: fix lost handle flush tpm_tis: reserve chip for duration of tpm_tis_core_init KEYS: asymmetric: return ENOMEM if akcipher_request_alloc() fails KEYS: remove CONFIG_KEYS_COMPAT
2019-12-17security: keys: trusted: fix lost handle flushJames Bottomley1-0/+1
The original code, before it was moved into security/keys/trusted-keys had a flush after the blob unseal. Without that flush, the volatile handles increase in the TPM until it becomes unusable and the system either has to be rebooted or the TPM volatile area manually flushed. Fix by adding back the lost flush, which we now have to export because of the relocation of the trusted key code may cause the consumer to be modular. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Fixes: 2e19e10131a0 ("KEYS: trusted: Move TPM2 trusted keys code") Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
2019-12-16tomoyo: Suppress RCU warning at list_for_each_entry_rcu().Tetsuo Handa4-13/+26
John Garry has reported that allmodconfig kernel on arm64 causes flood of "RCU-list traversed in non-reader section!!" warning. I don't know what change caused this warning, but this warning is safe because TOMOYO uses SRCU lock instead. Let's suppress this warning by explicitly telling that the caller is holding SRCU lock. Reported-and-tested-by: John Garry <john.garry@huawei.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2019-12-12KEYS: remove CONFIG_KEYS_COMPATEric Biggers4-12/+3
KEYS_COMPAT now always takes the value of COMPAT && KEYS. But the security/keys/ directory is only compiled if KEYS is enabled, so in practice KEYS_COMPAT is the same as COMPAT. Therefore, remove the unnecessary KEYS_COMPAT and just use COMPAT directly. (Also remove an outdated comment from compat.c.) Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
2019-12-11tomoyo: Don't use nifty names on sockets.Tetsuo Handa1-31/+1
syzbot is reporting that use of SOCKET_I()->sk from open() can result in use after free problem [1], for socket's inode is still reachable via /proc/pid/fd/n despite destruction of SOCKET_I()->sk already completed. At first I thought that this race condition applies to only open/getattr permission checks. But James Morris has pointed out that there are more permission checks where this race condition applies to. Thus, get rid of tomoyo_get_socket_name() instead of conditionally bypassing permission checks on sockets. As a side effect of this patch, "socket:[family=\$:type=\$:protocol=\$]" in the policy files has to be rewritten to "socket:[\$]". [1] https://syzkaller.appspot.com/bug?id=73d590010454403d55164cca23bd0565b1eb3b74 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot <syzbot+0341f6a4d729d4e0acf1@syzkaller.appspotmail.com> Reported-by: James Morris <jmorris@namei.org>
2019-12-09treewide: Use sizeof_field() macroPankaj Bharadiya1-2/+2
Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except at places where these are defined. Later patches will remove the unused definition of FIELD_SIZEOF(). This patch is generated using following script: EXCLUDE_FILES="include/linux/stddef.h|include/linux/kernel.h" git grep -l -e "\bFIELD_SIZEOF\b" | while read file; do if [[ "$file" =~ $EXCLUDE_FILES ]]; then continue fi sed -i -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' $file; done Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: David Miller <davem@davemloft.net> # for net
2019-12-03Merge tag 'apparmor-pr-2019-12-03' of ↵Linus Torvalds15-165/+526
git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor Pull apparmor updates from John Johansen: "Features: - increase left match history buffer size to provide improved conflict resolution in overlapping execution rules. - switch buffer allocation to use a memory pool and GFP_KERNEL where possible. - add compression of policy blobs to reduce memory usage. Cleanups: - fix spelling mistake "immutible" -> "immutable" Bug fixes: - fix unsigned len comparison in update_for_len macro - fix sparse warning for type-casting of current->real_cred" * tag 'apparmor-pr-2019-12-03' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: apparmor: make it so work buffers can be allocated from atomic context apparmor: reduce rcu_read_lock scope for aa_file_perm mediation apparmor: fix wrong buffer allocation in aa_new_mount apparmor: fix unsigned len comparison with less than zero apparmor: increase left match history buffer size apparmor: Switch to GFP_KERNEL where possible apparmor: Use a memory pool instead per-CPU caches apparmor: Force type-casting of current->real_cred apparmor: fix spelling mistake "immutible" -> "immutable" apparmor: fix blob compression when ns is forced on a policy load apparmor: fix missing ZLIB defines apparmor: fix blob compression build failure on ppc apparmor: Initial implementation of raw policy blob compression
2019-12-01Merge tag 'y2038-cleanups-5.5' of ↵Linus Torvalds1-7/+3
git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground Pull y2038 cleanups from Arnd Bergmann: "y2038 syscall implementation cleanups This is a series of cleanups for the y2038 work, mostly intended for namespace cleaning: the kernel defines the traditional time_t, timeval and timespec types that often lead to y2038-unsafe code. Even though the unsafe usage is mostly gone from the kernel, having the types and associated functions around means that we can still grow new users, and that we may be missing conversions to safe types that actually matter. There are still a number of driver specific patches needed to get the last users of these types removed, those have been submitted to the respective maintainers" Link: https://lore.kernel.org/lkml/20191108210236.1296047-1-arnd@arndb.de/ * tag 'y2038-cleanups-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground: (26 commits) y2038: alarm: fix half-second cut-off y2038: ipc: fix x32 ABI breakage y2038: fix typo in powerpc vdso "LOPART" y2038: allow disabling time32 system calls y2038: itimer: change implementation to timespec64 y2038: move itimer reset into itimer.c y2038: use compat_{get,set}_itimer on alpha y2038: itimer: compat handling to itimer.c y2038: time: avoid timespec usage in settimeofday() y2038: timerfd: Use timespec64 internally y2038: elfcore: Use __kernel_old_timeval for process times y2038: make ns_to_compat_timeval use __kernel_old_timeval y2038: socket: use __kernel_old_timespec instead of timespec y2038: socket: remove timespec reference in timestamping y2038: syscalls: change remaining timeval to __kernel_old_timeval y2038: rusage: use __kernel_old_timeval y2038: uapi: change __kernel_time_t to __kernel_old_time_t y2038: stat: avoid 'time_t' in 'struct stat' y2038: ipc: remove __kernel_time_t reference from headers y2038: vdso: powerpc: avoid timespec references ...
2019-11-30Merge tag 'selinux-pr-20191126' of ↵Linus Torvalds9-5/+74
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: "Only three SELinux patches for v5.5: - Remove the size limit on SELinux policies, the limitation was a lingering vestige and no longer necessary. - Allow file labeling before the policy is loaded. This should ease some of the burden when the policy is initially loaded (no need to relabel files), but it should also help enable some new system concepts which dynamically create the root filesystem in the initrd. - Add support for the "greatest lower bound" policy construct which is defined as the intersection of the MLS range of two SELinux labels" * tag 'selinux-pr-20191126' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: default_range glblub implementation selinux: allow labeling before policy is loaded selinux: remove load size limit
2019-11-30Merge tag 'powerpc-5.5-1' of ↵Linus Torvalds12-97/+328
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights: - Infrastructure for secure boot on some bare metal Power9 machines. The firmware support is still in development, so the code here won't actually activate secure boot on any existing systems. - A change to xmon (our crash handler / pseudo-debugger) to restrict it to read-only mode when the kernel is lockdown'ed, otherwise it's trivial to drop into xmon and modify kernel data, such as the lockdown state. - Support for KASLR on 32-bit BookE machines (Freescale / NXP). - Fixes for our flush_icache_range() and __kernel_sync_dicache() (VDSO) to work with memory ranges >4GB. - Some reworks of the pseries CMM (Cooperative Memory Management) driver to make it behave more like other balloon drivers and enable some cleanups of generic mm code. - A series of fixes to our hardware breakpoint support to properly handle unaligned watchpoint addresses. Plus a bunch of other smaller improvements, fixes and cleanups. Thanks to: Alastair D'Silva, Andrew Donnellan, Aneesh Kumar K.V, Anthony Steinhauser, Cédric Le Goater, Chris Packham, Chris Smart, Christophe Leroy, Christopher M. Riedl, Christoph Hellwig, Claudio Carvalho, Daniel Axtens, David Hildenbrand, Deb McLemore, Diana Craciun, Eric Richter, Geert Uytterhoeven, Greg Kroah-Hartman, Greg Kurz, Gustavo L. F. Walbon, Hari Bathini, Harish, Jason Yan, Krzysztof Kozlowski, Leonardo Bras, Mathieu Malaterre, Mauro S. M. Rodrigues, Michal Suchanek, Mimi Zohar, Nathan Chancellor, Nathan Lynch, Nayna Jain, Nick Desaulniers, Oliver O'Halloran, Qian Cai, Rasmus Villemoes, Ravi Bangoria, Sam Bobroff, Santosh Sivaraj, Scott Wood, Thomas Huth, Tyrel Datwyler, Vaibhav Jain, Valentin Longchamp, YueHaibing" * tag 'powerpc-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (144 commits) powerpc/fixmap: fix crash with HIGHMEM x86/efi: remove unused variables powerpc: Define arch_is_kernel_initmem_freed() for lockdep powerpc/prom_init: Use -ffreestanding to avoid a reference to bcmp powerpc: Avoid clang warnings around setjmp and longjmp powerpc: Don't add -mabi= flags when building with Clang powerpc: Fix Kconfig indentation powerpc/fixmap: don't clear fixmap area in paging_init() selftests/powerpc: spectre_v2 test must be built 64-bit powerpc/powernv: Disable native PCIe port management powerpc/kexec: Move kexec files into a dedicated subdir. powerpc/32: Split kexec low level code out of misc_32.S powerpc/sysdev: drop simple gpio powerpc/83xx: map IMMR with a BAT. powerpc/32s: automatically allocate BAT in setbat() powerpc/ioremap: warn on early use of ioremap() powerpc: Add support for GENERIC_EARLY_IOREMAP powerpc/fixmap: Use __fix_to_virt() instead of fix_to_virt() powerpc/8xx: use the fixmapped IMMR in cpm_reset() powerpc/8xx: add __init to cpm1 init functions ...
2019-11-30Merge tag 'notifications-pipe-prep-20191115' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull pipe rework from David Howells: "This is my set of preparatory patches for building a general notification queue on top of pipes. It makes a number of significant changes: - It removes the nr_exclusive argument from __wake_up_sync_key() as this is always 1. This prepares for the next step: - Adds wake_up_interruptible_sync_poll_locked() so that poll can be woken up from a function that's holding the poll waitqueue spinlock. - Change the pipe buffer ring to be managed in terms of unbounded head and tail indices rather than bounded index and length. This means that reading the pipe only needs to modify one index, not two. - A selection of helper functions are provided to query the state of the pipe buffer, plus a couple to apply updates to the pipe indices. - The pipe ring is allowed to have kernel-reserved slots. This allows many notification messages to be spliced in by the kernel without allowing userspace to pin too many pages if it writes to the same pipe. - Advance the head and tail indices inside the pipe waitqueue lock and use wake_up_interruptible_sync_poll_locked() to poke poll without having to take the lock twice. - Rearrange pipe_write() to preallocate the buffer it is going to write into and then drop the spinlock. This allows kernel notifications to then be added the ring whilst it is filling the buffer it allocated. The read side is stalled because the pipe mutex is still held. - Don't wake up readers on a pipe if there was already data in it when we added more. - Don't wake up writers on a pipe if the ring wasn't full before we removed a buffer" * tag 'notifications-pipe-prep-20191115' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: pipe: Remove sync on wake_ups pipe: Increase the writer-wakeup threshold to reduce context-switch count pipe: Check for ring full inside of the spinlock in pipe_write() pipe: Remove redundant wakeup from pipe_write() pipe: Rearrange sequence in pipe_write() to preallocate slot pipe: Conditionalise wakeup in pipe_read() pipe: Advance tail pointer inside of wait spinlock in pipe_read() pipe: Allow pipes to have kernel-reserved slots pipe: Use head and tail pointers for the ring, not cursor and length Add wake_up_interruptible_sync_poll_locked() Remove the nr_exclusive argument from __wake_up_sync_key() pipe: Reduce #inclusion of pipe_fs_i.h
2019-11-29x86/efi: remove unused variablesYueHaibing1-5/+0
commit ad723674d675 ("x86/efi: move common keyring handler functions to new file") leave this unused. Fixes: ad723674d675 ("x86/efi: move common keyring handler functions to new file") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Link: https://lore.kernel.org/r/20191115130830.13320-1-yuehaibing@huawei.com
2019-11-27Merge tag 'drm-next-2019-11-27' of git://anongit.freedesktop.org/drm/drmLinus Torvalds1-2/+13
Pull drm updates from Dave Airlie: "Lots of stuff in here, though it hasn't been too insane this merge apart from dealing with the security fun. uapi: - export different colorspace properties on DP vs HDMI - new fourcc for ARM 16x16 block format - syncobj: allow querying last submitted timeline value - DRM_FORMAT_BIG_ENDIAN defined as unsigned core: - allow using gem vma manager in ttm - connector/encoder/bridge doc fixes - allow more than 3 encoders for a connector - displayport mst suspend/resume reprobing support - vram lazy unmapping, uniform vram mm and gem vram - edid cleanups + AVI informframe bar info - displayport helpers - dpcd parser added dp_cec: - Allow a connector to be associated with a cec device ttm: - pipelining with no_gpu_wait fix - always keep BOs on the LRU sched: - allow free_job routine to sleep i915: - Block userptr from mappable GTT - i915 perf uapi versioning - OA stream dynamic reconfiguration - make context persistence optional - introduce DRM_I915_UNSTABLE Kconfig - add fake lmem testing under unstable - BT.2020 support for DP MSA - struct mutex elimination - Tigerlake display/PLL/power management improvements - Jasper Lake PCH support - refactor PMU for multiple GPUs - Icelake firmware update - Split out vga + switcheroo code amdgpu: - implement dma-buf import/export without helpers - vega20 RAS enablement - DC i2c over aux fixes - renoir GPU reset - DC HDCP support - BACO support for CI/VI asics - MSI-X support - Arcturus EEPROM support - Arcturus VCN encode support - VCN dynamic powergating on RV/RV2 amdkfd: - add navi12/14/renoir support to kfd radeon: - SI dpm fix ported from amdgpu - fix bad DMA on ppc platforms gma500: - memory leak fixes qxl: - convert to new gem mmap exynos: - build warning fix komeda: - add aclk sysfs attribute v3d: - userspace cleanup uapi change i810: - fix for underflow in dispatch ioctls ast: - refactor show_cursor mgag200: - refactor show_cursor arcgpu: - encoder finding improvements mediatek: - mipi_tx, dsi and partial crtc support for MT8183 SoC - rotation support meson: - add suspend/resume support omap: - misc refactors tegra: - DisplayPort support for Tegra 210, 186 and 194. - IOMMU-backed DMA API fixes panfrost: - fix lockdep issue - simplify devfreq integration rcar-du: - R8A774B1 SoC support - fixes for H2 ES2.0 sun4i: - vcc-dsi regulator support virtio-gpu: - vmexit vs spinlock fix - move to gem shmem helpers - handle large command buffers with cma" * tag 'drm-next-2019-11-27' of git://anongit.freedesktop.org/drm/drm: (1855 commits) drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10 drm/amdgpu: initialize vm_inv_eng0_sem for gfxhub and mmhub drm/amd/amdgpu/sriov skip RLCG s/r list for arcturus VF. drm/amd/amdgpu/sriov temporarily skip ras,dtm,hdcp for arcturus VF drm/amdgpu/gfx10: re-init clear state buffer after gpu reset merge fix for "ftrace: Rework event_create_dir()" drm/amdgpu: Update Arcturus golden registers drm/amdgpu/gfx10: fix out-of-bound mqd_backup array access drm/amdgpu/gfx10: explicitly wait for cp idle after halt/unhalt Revert "drm/amd/display: enable S/G for RAVEN chip" drm/amdgpu: disable gfxoff on original raven drm/amdgpu: remove experimental flag for Navi14 drm/amdgpu: disable gfxoff when using register read interface drm/amdgpu/powerplay: properly set PP_GFXOFF_MASK (v2) drm/amdgpu: fix bad DMA from INTERRUPT_CNTL2 drm/radeon: fix bad DMA from INTERRUPT_CNTL2 drm/amd/display: Fix debugfs on MST connectors drm/amdgpu/nv: add asic func for fetching vbios from rom directly drm/amdgpu: put flush_delayed_work at first drm/amdgpu/vcn2.5: fix the enc loop with hw fini ...
2019-11-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds1-2/+2
Pull networking fixes from David Miller: "This is mostly to fix the iwlwifi regression: 1) Flush GRO state properly in iwlwifi driver, from Alexander Lobakin. 2) Validate TIPC link name with properly length macro, from John Rutherford. 3) Fix completion init and device query timeouts in ibmvnic, from Thomas Falcon. 4) Fix SKB size calculation for netlink messages in psample, from Nikolay Aleksandrov. 5) Similar kind of fix for OVS flow dumps, from Paolo Abeni. 6) Handle queue allocation failure unwind properly in gve driver, we could try to release pages we didn't allocate. From Jeroen de Borst. 7) Serialize TX queue SKB list accesses properly in mscc ocelot driver. From Yangbo Lu" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: net: usb: aqc111: Use the correct style for SPDX License Identifier net: phy: Use the correct style for SPDX License Identifier net: wireless: intel: iwlwifi: fix GRO_NORMAL packet stalling net: mscc: ocelot: use skb queue instead of skbs list net: mscc: ocelot: avoid incorrect consuming in skbs list gve: Fix the queue page list allocated pages count net: inet_is_local_reserved_port() port arg should be unsigned short openvswitch: fix flow command message size net: phy: dp83869: Fix return paths to return proper values net: psample: fix skb_over_panic net: usbnet: Fix -Wcast-function-type net: hso: Fix -Wcast-function-type net: port < inet_prot_sock(net) --> inet_port_requires_bind_service(net, port) ibmvnic: Serialize device queries ibmvnic: Bound waits for device queries ibmvnic: Terminate waiting device threads after loss of service ibmvnic: Fix completion structure initialization net-sctp: replace some sock_net(sk) with just 'net' net: Fix a documentation bug wrt. ip_unprivileged_port_start tipc: fix link name length check
2019-11-26Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The main changes in this cycle were: - Dynamic tick (nohz) updates, perhaps most notably changes to force the tick on when needed due to lengthy in-kernel execution on CPUs on which RCU is waiting. - Linux-kernel memory consistency model updates. - Replace rcu_swap_protected() with rcu_prepace_pointer(). - Torture-test updates. - Documentation updates. - Miscellaneous fixes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits) security/safesetid: Replace rcu_swap_protected() with rcu_replace_pointer() net/sched: Replace rcu_swap_protected() with rcu_replace_pointer() net/netfilter: Replace rcu_swap_protected() with rcu_replace_pointer() net/core: Replace rcu_swap_protected() with rcu_replace_pointer() bpf/cgroup: Replace rcu_swap_protected() with rcu_replace_pointer() fs/afs: Replace rcu_swap_protected() with rcu_replace_pointer() drivers/scsi: Replace rcu_swap_protected() with rcu_replace_pointer() drm/i915: Replace rcu_swap_protected() with rcu_replace_pointer() x86/kvm/pmu: Replace rcu_swap_protected() with rcu_replace_pointer() rcu: Upgrade rcu_swap_protected() to rcu_replace_pointer() rcu: Suppress levelspread uninitialized messages rcu: Fix uninitialized variable in nocb_gp_wait() rcu: Update descriptions for rcu_future_grace_period tracepoint rcu: Update descriptions for rcu_nocb_wake tracepoint rcu: Remove obsolete descriptions for rcu_barrier tracepoint rcu: Ensure that ->rcu_urgent_qs is set before resched IPI workqueue: Convert for_each_wq to use built-in list check rcu: Several rcu_segcblist functions can be static rcu: Remove unused function hlist_bl_del_init_rcu() Documentation: Rename rcu_node_context_switch() to rcu_note_context_switch() ...
2019-11-26Merge branch 'perf-core-for-linus' of ↵Linus Torvalds4-1/+103
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "The main kernel side changes in this cycle were: - Various Intel-PT updates and optimizations (Alexander Shishkin) - Prohibit kprobes on Xen/KVM emulate prefixes (Masami Hiramatsu) - Add support for LSM and SELinux checks to control access to the perf syscall (Joel Fernandes) - Misc other changes, optimizations, fixes and cleanups - see the shortlog for details. There were numerous tooling changes as well - 254 non-merge commits. Here are the main changes - too many to list in detail: - Enhancements to core tooling infrastructure, perf.data, libperf, libtraceevent, event parsing, vendor events, Intel PT, callchains, BPF support and instruction decoding. - There were updates to the following tools: perf annotate perf diff perf inject perf kvm perf list perf maps perf parse perf probe perf record perf report perf script perf stat perf test perf trace - And a lot of other changes: please see the shortlog and Git log for more details" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (279 commits) perf parse: Fix potential memory leak when handling tracepoint errors perf probe: Fix spelling mistake "addrees" -> "address" libtraceevent: Fix memory leakage in copy_filter_type libtraceevent: Fix header installation perf intel-bts: Does not support AUX area sampling perf intel-pt: Add support for decoding AUX area samples perf intel-pt: Add support for recording AUX area samples perf pmu: When using default config, record which bits of config were changed by the user perf auxtrace: Add support for queuing AUX area samples perf session: Add facility to peek at all events perf auxtrace: Add support for dumping AUX area samples perf inject: Cut AUX area samples perf record: Add aux-sample-size config term perf record: Add support for AUX area sampling perf auxtrace: Add support for AUX area sample recording perf auxtrace: Move perf_evsel__find_pmu() perf record: Add a function to test for kernel support for AUX area sampling perf tools: Add kernel AUX area sampling definitions perf/core: Make the mlock accounting simple again perf report: Jump to symbol source view from total cycles view ...
2019-11-26net: port < inet_prot_sock(net) --> inet_port_requires_bind_service(net, port)Maciej Żenczykowski1-2/+2
Note that the sysctl write accessor functions guarantee that: net->ipv4.sysctl_ip_prot_sock <= net->ipv4.ip_local_ports.range[0] invariant is maintained, and as such the max() in selinux hooks is actually spurious. ie. even though if (snum < max(inet_prot_sock(sock_net(sk)), low) || snum > high) { per logic is the same as if ((snum < inet_prot_sock(sock_net(sk)) && snum < low) || snum > high) { it is actually functionally equivalent to: if (snum < low || snum > high) { which is equivalent to: if (snum < inet_prot_sock(sock_net(sk)) || snum < low || snum > high) { even though the first clause is spurious. But we want to hold on to it in case we ever want to change what what inet_port_requires_bind_service() means (for example by changing it from a, by default, [0..1024) range to some sort of set). Test: builds, git 'grep inet_prot_sock' finds no other references Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds1-1/+3
Pull networking updates from David Miller: "Another merge window, another pull full of stuff: 1) Support alternative names for network devices, from Jiri Pirko. 2) Introduce per-netns netdev notifiers, also from Jiri Pirko. 3) Support MSG_PEEK in vsock/virtio, from Matias Ezequiel Vara Larsen. 4) Allow compiling out the TLS TOE code, from Jakub Kicinski. 5) Add several new tracepoints to the kTLS code, also from Jakub. 6) Support set channels ethtool callback in ena driver, from Sameeh Jubran. 7) New SCTP events SCTP_ADDR_ADDED, SCTP_ADDR_REMOVED, SCTP_ADDR_MADE_PRIM, and SCTP_SEND_FAILED_EVENT. From Xin Long. 8) Add XDP support to mvneta driver, from Lorenzo Bianconi. 9) Lots of netfilter hw offload fixes, cleanups and enhancements, from Pablo Neira Ayuso. 10) PTP support for aquantia chips, from Egor Pomozov. 11) Add UDP segmentation offload support to igb, ixgbe, and i40e. From Josh Hunt. 12) Add smart nagle to tipc, from Jon Maloy. 13) Support L2 field rewrite by TC offloads in bnxt_en, from Venkat Duvvuru. 14) Add a flow mask cache to OVS, from Tonghao Zhang. 15) Add XDP support to ice driver, from Maciej Fijalkowski. 16) Add AF_XDP support to ice driver, from Krzysztof Kazimierczak. 17) Support UDP GSO offload in atlantic driver, from Igor Russkikh. 18) Support it in stmmac driver too, from Jose Abreu. 19) Support TIPC encryption and auth, from Tuong Lien. 20) Introduce BPF trampolines, from Alexei Starovoitov. 21) Make page_pool API more numa friendly, from Saeed Mahameed. 22) Introduce route hints to ipv4 and ipv6, from Paolo Abeni. 23) Add UDP segmentation offload to cxgb4, Rahul Lakkireddy" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1857 commits) libbpf: Fix usage of u32 in userspace code mm: Implement no-MMU variant of vmalloc_user_node_flags slip: Fix use-after-free Read in slip_open net: dsa: sja1105: fix sja1105_parse_rgmii_delays() macvlan: schedule bc_work even if error enetc: add support Credit Based Shaper(CBS) for hardware offload net: phy: add helpers phy_(un)lock_mdio_bus mdio_bus: don't use managed reset-controller ax88179_178a: add ethtool_op_get_ts_info() mlxsw: spectrum_router: Fix use of uninitialized adjacency index mlxsw: spectrum_router: After underlay moves, demote conflicting tunnels bpf: Simplify __bpf_arch_text_poke poke type handling bpf: Introduce BPF_TRACE_x helper for the tracing tests bpf: Add bpf_jit_blinding_enabled for !CONFIG_BPF_JIT bpf, testing: Add various tail call test cases bpf, x86: Emit patchable direct jump as tail call bpf: Constant map key tracking for prog array pokes bpf: Add poke dependency tracking for prog array maps bpf: Add initial poke descriptor table for jit images bpf: Move owner type, jited info into array auxiliary data ...
2019-11-22apparmor: make it so work buffers can be allocated from atomic contextJohn Johansen6-38/+62
In some situations AppArmor needs to be able to use its work buffers from atomic context. Add the ability to specify when in atomic context and hold a set of work buffers in reserve for atomic context to reduce the chance that a large work buffer allocation will need to be done. Fixes: df323337e507 ("apparmor: Use a memory pool instead per-CPU caches") Signed-off-by: John Johansen <john.johansen@canonical.com>
2019-11-22apparmor: reduce rcu_read_lock scope for aa_file_perm mediationJohn Johansen1-3/+3
Now that the buffers allocation has changed and no longer needs the full mediation under an rcu_read_lock, reduce the rcu_read_lock scope to only where it is necessary. Fixes: df323337e507 ("apparmor: Use a memory pool instead per-CPU caches") Signed-off-by: John Johansen <john.johansen@canonical.com>
2019-11-22apparmor: fix wrong buffer allocation in aa_new_mountJohn Johansen1-4/+4
Fix the following trace caused by the dev_path buffer not being allocated. [ 641.044262] AppArmor WARN match_mnt: ((devpath && !devbuffer)): [ 641.044284] WARNING: CPU: 1 PID: 30709 at ../security/apparmor/mount.c:385 match_mnt+0x133/0x180 [ 641.044286] Modules linked in: snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_hda_codec snd_hda_core qxl ttm snd_hwdep snd_pcm drm_kms_helper snd_seq_midi snd_seq_midi_event drm snd_rawmidi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel iptable_mangle aesni_intel aes_x86_64 xt_tcpudp crypto_simd snd_seq cryptd bridge stp llc iptable_filter glue_helper snd_seq_device snd_timer joydev input_leds snd serio_raw fb_sys_fops 9pnet_virtio 9pnet syscopyarea sysfillrect soundcore sysimgblt qemu_fw_cfg mac_hid sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 8139too psmouse 8139cp i2c_piix4 pata_acpi mii floppy [ 641.044318] CPU: 1 PID: 30709 Comm: mount Tainted: G D W 5.1.0-rc4+ #223 [ 641.044320] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 641.044323] RIP: 0010:match_mnt+0x133/0x180 [ 641.044325] Code: 41 5d 41 5e 41 5f c3 48 8b 4c 24 18 eb b1 48 c7 c6 08 84 26 83 48 c7 c7 f0 56 54 83 4c 89 54 24 08 48 89 14 24 e8 7d d3 bb ff <0f> 0b 4c 8b 54 24 08 48 8b 14 24 e9 25 ff ff ff 48 c7 c6 08 84 26 [ 641.044327] RSP: 0018:ffffa9b34ac97d08 EFLAGS: 00010282 [ 641.044329] RAX: 0000000000000000 RBX: ffff9a86725a8558 RCX: 0000000000000000 [ 641.044331] RDX: 0000000000000002 RSI: 0000000000000001 RDI: 0000000000000246 [ 641.044333] RBP: ffffa9b34ac97db0 R08: 0000000000000000 R09: 0000000000000000 [ 641.044334] R10: 0000000000000000 R11: 00000000000077f5 R12: 0000000000000000 [ 641.044336] R13: ffffa9b34ac97e98 R14: ffff9a865e000008 R15: ffff9a86c4cf42b8 [ 641.044338] FS: 00007fab73969740(0000) GS:ffff9a86fbb00000(0000) knlGS:0000000000000000 [ 641.044340] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 641.044342] CR2: 000055f90bc62035 CR3: 00000000aab5f006 CR4: 00000000003606e0 [ 641.044346] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 641.044348] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 641.044349] Call Trace: [ 641.044355] aa_new_mount+0x119/0x2c0 [ 641.044363] apparmor_sb_mount+0xd4/0x430 [ 641.044367] security_sb_mount+0x46/0x70 [ 641.044372] do_mount+0xbb/0xeb0 [ 641.044377] ? memdup_user+0x4b/0x70 [ 641.044380] ksys_mount+0x7e/0xd0 [ 641.044384] __x64_sys_mount+0x21/0x30 [ 641.044388] do_syscall_64+0x5a/0x1a0 [ 641.044392] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 641.044394] RIP: 0033:0x7fab73a8790a [ 641.044397] Code: 48 8b 0d 89 85 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 56 85 0c 00 f7 d8 64 89 01 48 [ 641.044399] RSP: 002b:00007ffe0ffe4238 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5 [ 641.044401] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fab73a8790a [ 641.044429] RDX: 000055f90bc6203b RSI: 00007ffe0ffe57b1 RDI: 00007ffe0ffe57a5 [ 641.044431] RBP: 00007ffe0ffe4250 R08: 0000000000000000 R09: 00007fab73b51d80 [ 641.044433] R10: 00000000c0ed0004 R11: 0000000000000206 R12: 000055f90bc610b0 [ 641.044434] R13: 00007ffe0ffe4330 R14: 0000000000000000 R15: 0000000000000000 [ 641.044457] irq event stamp: 0 [ 641.044460] hardirqs last enabled at (0): [<0000000000000000>] (null) [ 641.044463] hardirqs last disabled at (0): [<ffffffff82290114>] copy_process.part.30+0x734/0x23f0 [ 641.044467] softirqs last enabled at (0): [<ffffffff82290114>] copy_process.part.30+0x734/0x23f0 [ 641.044469] softirqs last disabled at (0): [<0000000000000000>] (null) [ 641.044470] ---[ end trace c0d54bdacf6af6b2 ]--- Fixes: df323337e507 ("apparmor: Use a memory pool instead per-CPU caches") Signed-off-by: John Johansen <john.johansen@canonical.com>
2019-11-22apparmor: fix unsigned len comparison with less than zeroColin Ian King1-5/+7
The sanity check in macro update_for_len checks to see if len is less than zero, however, len is a size_t so it can never be less than zero, so this sanity check is a no-op. Fix this by making len a ssize_t so the comparison will work and add ulen that is a size_t copy of len so that the min() macro won't throw warnings about comparing different types. Addresses-Coverity: ("Macro compares unsigned to 0") Fixes: f1bd904175e8 ("apparmor: add the base fns() for domain labels") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
2019-11-15y2038: move itimer reset into itimer.cArnd Bergmann1-7/+3
Preparing for a change to the itimer internals, stop using the do_setitimer() symbol and instead use a new higher-level interface. The do_getitimer()/do_setitimer functions can now be made static, allowing the compiler to potentially produce better object code. Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-11-14Merge v5.4-rc7 into drm-nextDave Airlie1-0/+1
We have the i915 security fixes to backmerge, but first let's clear the decks for other drivers to avoid a bigger mess. Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-11-13Merge branch 'topic/secureboot' into nextMichael Ellerman11-92/+326
Merge the secureboot support, as well as the IMA changes needed to support it. From Nayna's cover letter: In order to verify the OS kernel on PowerNV systems, secure boot requires X.509 certificates trusted by the platform. These are stored in secure variables controlled by OPAL, called OPAL secure variables. In order to enable users to manage the keys, the secure variables need to be exposed to userspace. OPAL provides the runtime services for the kernel to be able to access the secure variables. This patchset defines the kernel interface for the OPAL APIs. These APIs are used by the hooks, which load these variables to the keyring and expose them to the userspace for reading/writing. Overall, this patchset adds the following support: * expose secure variables to the kernel via OPAL Runtime API interface * expose secure variables to the userspace via kernel sysfs interface * load kernel verification and revocation keys to .platform and .blacklist keyring respectively. The secure variables can be read/written using simple linux utilities cat/hexdump. For example: Path to the secure variables is: /sys/firmware/secvar/vars Each secure variable is listed as directory. $ ls -l total 0 drwxr-xr-x. 2 root root 0 Aug 20 21:20 db drwxr-xr-x. 2 root root 0 Aug 20 21:20 KEK drwxr-xr-x. 2 root root 0 Aug 20 21:20 PK The attributes of each of the secure variables are (for example: PK): $ ls -l total 0 -r--r--r--. 1 root root 4096 Oct 1 15:10 data -r--r--r--. 1 root root 65536 Oct 1 15:10 size --w-------. 1 root root 4096 Oct 1 15:12 update The "data" is used to read the existing variable value using hexdump. The data is stored in ESL format. The "update" is used to write a new value using cat. The update is to be submitted as AUTH file.
2019-11-12KEYS: trusted: Remove set but not used variable 'keyhndl'zhengbin1-2/+0
Fixes gcc '-Wunused-but-set-variable' warning: security/keys/trusted-keys/trusted_tpm1.c: In function tpm_unseal: security/keys/trusted-keys/trusted_tpm1.c:588:11: warning: variable keyhndl set but not used [-Wunused-but-set-variable] Fixes: 00aa975bd031 ("KEYS: trusted: Create trusted keys subsystem") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
2019-11-12KEYS: trusted: Move TPM2 trusted keys codeSumit Garg3-2/+317
Move TPM2 trusted keys code to trusted keys subsystem. The reason being it's better to consolidate all the trusted keys code to a single location so that it can be maintained sanely. Also, utilize existing tpm_send() exported API which wraps the internal tpm_transmit_cmd() API. Suggested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Sumit Garg <sumit.garg@linaro.org> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
2019-11-12KEYS: trusted: Create trusted keys subsystemSumit Garg3-2/+9
Move existing code to trusted keys subsystem. Also, rename files with "tpm" as suffix which provides the underlying implementation. Suggested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Sumit Garg <sumit.garg@linaro.org> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
2019-11-12KEYS: Use common tpm_buf for trusted and asymmetric keysSumit Garg1-55/+43
Switch to utilize common heap based tpm_buf code for TPM based trusted and asymmetric keys rather than using stack based tpm1_buf code. Also, remove tpm1_buf code. Suggested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Sumit Garg <sumit.garg@linaro.org> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
2019-11-12tpm: Move tpm_buf code to include/linux/Sumit Garg1-6/+6
Move tpm_buf code to common include/linux/tpm.h header so that it can be reused via other subsystems like trusted keys etc. Also rename trusted keys and asymmetric keys usage of TPM 1.x buffer implementation to tpm1_buf to avoid any compilation errors. Suggested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Sumit Garg <sumit.garg@linaro.org> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
2019-11-13powerpc: Load firmware trusted keys/hashes into kernel keyringNayna Jain3-1/+108
The keys used to verify the Host OS kernel are managed by firmware as secure variables. This patch loads the verification keys into the .platform keyring and revocation hashes into .blacklist keyring. This enables verification and loading of the kernels signed by the boot time keys which are trusted by firmware. Signed-off-by: Nayna Jain <nayna@linux.ibm.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Eric Richter <erichte@linux.ibm.com> [mpe: Search by compatible in load_powerpc_certs(), not using format] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1573441836-3632-5-git-send-email-nayna@linux.ibm.com
2019-11-13x86/efi: move common keyring handler functions to new fileNayna Jain4-67/+115
The handlers to add the keys to the .platform keyring and blacklisted hashes to the .blacklist keyring is common for both the uefi and powerpc mechanisms of loading the keys/hashes from the firmware. This patch moves the common code from load_uefi.c to keyring_handler.c Signed-off-by: Nayna Jain <nayna@linux.ibm.com> Acked-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Eric Richter <erichte@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1573441836-3632-4-git-send-email-nayna@linux.ibm.com
2019-11-12ima: Check against blacklisted hashes for files with modsigNayna Jain5-6/+60
Asymmetric private keys are used to sign multiple files. The kernel currently supports checking against blacklisted keys. However, if the public key is blacklisted, any file signed by the blacklisted key will automatically fail signature verification. Blacklisting the public key is not fine enough granularity, as we might want to only blacklist a particular file. This patch adds support for checking against the blacklisted hash of the file, without the appended signature, based on the IMA policy. It defines a new policy option "appraise_flag=check_blacklist". In addition to the blacklisted binary hashes stored in the firmware "dbx" variable, the Linux kernel may be configured to load blacklisted binary hashes onto the .blacklist keyring as well. The following example shows how to blacklist a specific kernel module hash. $ sha256sum kernel/kheaders.ko 77fa889b35a05338ec52e51591c1b89d4c8d1c99a21251d7c22b1a8642a6bad3 kernel/kheaders.ko $ grep BLACKLIST .config CONFIG_SYSTEM_BLACKLIST_KEYRING=y CONFIG_SYSTEM_BLACKLIST_HASH_LIST="blacklist-hash-list" $ cat certs/blacklist-hash-list "bin:77fa889b35a05338ec52e51591c1b89d4c8d1c99a21251d7c22b1a8642a6bad3" Update the IMA custom measurement and appraisal policy rules (/etc/ima-policy): measure func=MODULE_CHECK template=ima-modsig appraise func=MODULE_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig After building, installing, and rebooting the kernel: 545660333 ---lswrv 0 0 \_ blacklist: bin:77fa889b35a05338ec52e51591c1b89d4c8d1c99a21251d7c22b1a8642a6bad3 measure func=MODULE_CHECK template=ima-modsig appraise func=MODULE_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig modprobe: ERROR: could not insert 'kheaders': Permission denied 10 0c9834db5a0182c1fb0cdc5d3adcf11a11fd83dd ima-sig sha256:3bc6ed4f0b4d6e31bc1dbc9ef844605abc7afdc6d81a57d77a1ec9407997c40 2 /usr/lib/modules/5.4.0-rc3+/kernel/kernel/kheaders.ko 10 82aad2bcc3fa8ed94762356b5c14838f3bcfa6a0 ima-modsig sha256:3bc6ed4f0b4d6e31bc1dbc9ef844605abc7afdc6d81a57d77a1ec9407997c40 2 /usr/lib/modules/5.4.0rc3+/kernel/kernel/kheaders.ko sha256:77fa889b3 5a05338ec52e51591c1b89d4c8d1c99a21251d7c22b1a8642a6bad3 3082029a06092a864886f70d010702a082028b30820287020101310d300b0609608648 016503040201300b06092a864886f70d01070131820264.... 10 25b72217cc1152b44b134ce2cd68f12dfb71acb3 ima-buf sha256:8b58427fedcf8f4b20bc8dc007f2e232bf7285d7b93a66476321f9c2a3aa132 b blacklisted-hash 77fa889b35a05338ec52e51591c1b89d4c8d1c99a21251d7c22b1a8642a6bad3 Signed-off-by: Nayna Jain <nayna@linux.ibm.com> [zohar@linux.ibm.com: updated patch description] Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1572492694-6520-8-git-send-email-zohar@linux.ibm.com
2019-11-12ima: Make process_buffer_measurement() genericNayna Jain2-18/+43
process_buffer_measurement() is limited to measuring the kexec boot command line. This patch makes process_buffer_measurement() more generic, allowing it to measure other types of buffer data (e.g. blacklisted binary hashes or key hashes). process_buffer_measurement() may be called directly from an IMA hook or as an auxiliary measurement record. In both cases the buffer measurement is based on policy. This patch modifies the function to conditionally retrieve the policy defined PCR and template for the IMA hook case. Signed-off-by: Nayna Jain <nayna@linux.ibm.com> [zohar@linux.ibm.com: added comment in process_buffer_measurement()] Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1572492694-6520-6-git-send-email-zohar@linux.ibm.com
2019-11-11Merge tag 'v5.4-rc7' into perf/core, to pick up fixesIngo Molnar1-0/+1
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-0/+1
The only slightly tricky merge conflict was the netdevsim because the mutex locking fix overlapped a lot of driver reload reorganization. The rest were (relatively) trivial in nature. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-31efi/efi_test: Lock down /dev/efi_test and require CAP_SYS_ADMINJavier Martinez Canillas1-0/+1
The driver exposes EFI runtime services to user-space through an IOCTL interface, calling the EFI services function pointers directly without using the efivar API. Disallow access to the /dev/efi_test character device when the kernel is locked down to prevent arbitrary user-space to call EFI runtime services. Also require CAP_SYS_ADMIN to open the chardev to prevent unprivileged users to call the EFI runtime services, instead of just relying on the chardev file mode bits for this. The main user of this driver is the fwts [0] tool that already checks if the effective user ID is 0 and fails otherwise. So this change shouldn't cause any regression to this tool. [0]: https://wiki.ubuntu.com/FirmwareTestSuite/Reference/uefivarinfo Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Laszlo Ersek <lersek@redhat.com> Acked-by: Matthew Garrett <mjg59@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: https://lkml.kernel.org/r/20191029173755.27149-7-ardb@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-31Merge branch 'for-mingo' of ↵Ingo Molnar1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU and LKMM changes from Paul E. McKenney: - Documentation updates. - Miscellaneous fixes. - Dynamic tick (nohz) updates, perhaps most notably changes to force the tick on when needed due to lengthy in-kernel execution on CPUs on which RCU is waiting. - Replace rcu_swap_protected() with rcu_prepace_pointer(). - Torture-test updates. - Linux-kernel memory consistency model updates. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-30security/safesetid: Replace rcu_swap_protected() with rcu_replace_pointer()Paul E. McKenney1-2/+2
This commit replaces the use of rcu_swap_protected() with the more intuitively appealing rcu_replace_pointer() as a step towards removing rcu_swap_protected(). Link: https://lore.kernel.org/lkml/CAHk-=wiAsJLw1egFEE=Z7-GGtM6wcvtyytXZA1+BHqta4gg6Hw@mail.gmail.com/ Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Reported-by: kbuild test robot <lkp@intel.com> [ paulmck: From rcu_replace() to rcu_replace_pointer() per Ingo Molnar. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Micah Morton <mortonm@chromium.org> Cc: James Morris <jmorris@namei.org> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: <linux-security-module@vger.kernel.org>
2019-10-28powerpc/xmon: Restrict when kernel is locked downChristopher M. Riedl1-0/+2
Xmon should be either fully or partially disabled depending on the kernel lockdown state. Put xmon into read-only mode for lockdown=integrity and prevent user entry into xmon when lockdown=confidentiality. Xmon checks the lockdown state on every attempted entry: (1) during early xmon'ing (2) when triggered via sysrq (3) when toggled via debugfs (4) when triggered via a previously enabled breakpoint The following lockdown state transitions are handled: (1) lockdown=none -> lockdown=integrity set xmon read-only mode (2) lockdown=none -> lockdown=confidentiality clear all breakpoints, set xmon read-only mode, prevent user re-entry into xmon (3) lockdown=integrity -> lockdown=confidentiality clear all breakpoints, set xmon read-only mode, prevent user re-entry into xmon Suggested-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Christopher M. Riedl <cmr@informatik.wtf> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190907061124.1947-3-cmr@informatik.wtf
2019-10-26Merge tag 'drm-next-5.5-2019-10-09' of ↵Dave Airlie1-2/+13
git://people.freedesktop.org/~agd5f/linux into drm-next drm-next-5.5-2019-10-09: amdgpu: - Additional RAS enablement for vega20 - RAS page retirement and bad page storage in EEPROM - No GPU reset with unrecoverable RAS errors - Reserve vram for page tables rather than trying to evict - Fix issues with GPU reset and xgmi hives - DC i2c over aux fixes - Direct submission for clears, PTE/PDE updates - Improvements to help support recoverable GPU page faults - Silence harmless SAD block messages - Clean up code for creating a bo at a fixed location - Initial DC HDCP support - Lots of documentation fixes - GPU reset for renoir - Add IH clockgating support for soc15 asics - Powerplay improvements - DC MST cleanups - Add support for MSI-X - Misc cleanups and bug fixes amdkfd: - Query KFD device info by asic type rather than pci ids - Add navi14 support - Add renoir support - Add navi12 support - gfx10 trap handler improvements - pasid cleanups - Check against device cgroup ttm: - Return -EBUSY with pipelining with no_gpu_wait radeon: - Silence harmless SAD block messages device_cgroup: - Export devcgroup_check_permission Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191010041713.3412-1-alexander.deucher@amd.com
2019-10-23pipe: Reduce #inclusion of pipe_fs_i.hDavid Howells1-1/+0
Remove some #inclusions of linux/pipe_fs_i.h that don't seem to be necessary any more. Signed-off-by: David Howells <dhowells@redhat.com>
2019-10-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-1/+8
Several cases of overlapping changes which were for the most part trivially resolvable. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-17perf_event: Add support for LSM and SELinux checksJoel Fernandes (Google)4-1/+103
In current mainline, the degree of access to perf_event_open(2) system call depends on the perf_event_paranoid sysctl. This has a number of limitations: 1. The sysctl is only a single value. Many types of accesses are controlled based on the single value thus making the control very limited and coarse grained. 2. The sysctl is global, so if the sysctl is changed, then that means all processes get access to perf_event_open(2) opening the door to security issues. This patch adds LSM and SELinux access checking which will be used in Android to access perf_event_open(2) for the purposes of attaching BPF programs to tracepoints, perf profiling and other operations from userspace. These operations are intended for production systems. 5 new LSM hooks are added: 1. perf_event_open: This controls access during the perf_event_open(2) syscall itself. The hook is called from all the places that the perf_event_paranoid sysctl is checked to keep it consistent with the systctl. The hook gets passed a 'type' argument which controls CPU, kernel and tracepoint accesses (in this context, CPU, kernel and tracepoint have the same semantics as the perf_event_paranoid sysctl). Additionally, I added an 'open' type which is similar to perf_event_paranoid sysctl == 3 patch carried in Android and several other distros but was rejected in mainline [1] in 2016. 2. perf_event_alloc: This allocates a new security object for the event which stores the current SID within the event. It will be useful when the perf event's FD is passed through IPC to another process which may try to read the FD. Appropriate security checks will limit access. 3. perf_event_free: Called when the event is closed. 4. perf_event_read: Called from the read(2) and mmap(2) syscalls for the event. 5. perf_event_write: Called from the ioctl(2) syscalls for the event. [1] https://lwn.net/Articles/696240/ Since Peter had suggest LSM hooks in 2016 [1], I am adding his Suggested-by tag below. To use this patch, we set the perf_event_paranoid sysctl to -1 and then apply selinux checking as appropriate (default deny everything, and then add policy rules to give access to domains that need it). In the future we can remove the perf_event_paranoid sysctl altogether. Suggested-by: Peter Zijlstra <peterz@infradead.org> Co-developed-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: James Morris <jmorris@namei.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: rostedt@goodmis.org Cc: Yonghong Song <yhs@fb.com> Cc: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: jeffv@google.com Cc: Jiri Olsa <jolsa@redhat.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: primiano@google.com Cc: Song Liu <songliubraving@fb.com> Cc: rsavitski@google.com Cc: Namhyung Kim <namhyung@kernel.org> Cc: Matthew Garrett <matthewgarrett@google.com> Link: https://lkml.kernel.org/r/20191014170308.70668-1-joel@joelfernandes.org
2019-10-08Merge tag 'selinux-pr-20191007' of ↵Linus Torvalds1-1/+8
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinuxfix from Paul Moore: "One patch to ensure we don't copy bad memory up into userspace" * tag 'selinux-pr-20191007' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: fix context string corruption in convert_context()
2019-10-07selinux: default_range glblub implementationJoshua Brindle7-1/+62
A policy developer can now specify glblub as a default_range default and the computed transition will be the intersection of the mls range of the two contexts. The glb (greatest lower bound) lub (lowest upper bound) of a range is calculated as the greater of the low sensitivities and the lower of the high sensitivities and the and of each category bitmap. This can be used by MLS solution developers to compute a context that satisfies, for example, the range of a network interface and the range of a user logging in. Some examples are: User Permitted Range | Network Device Label | Computed Label ---------------------|----------------------|---------------- s0-s1:c0.c12 | s0 | s0 s0-s1:c0.c12 | s0-s1:c0.c1023 | s0-s1:c0.c12 s0-s4:c0.c512 | s1-s1:c0.c1023 | s1-s1:c0.c512 s0-s15:c0,c2 | s4-s6:c0.c128 | s4-s6:c0,c2 s0-s4 | s2-s6 | s2-s4 s0-s4 | s5-s8 | INVALID s5-s8 | s0-s4 | INVALID Signed-off-by: Joshua Brindle <joshua.brindle@crunchydata.com> [PM: subject lines and checkpatch.pl fixes] Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-10-07device_cgroup: Export devcgroup_check_permissionHarish Kasiviswanathan1-2/+13
For AMD compute (amdkfd) driver. All AMD compute devices are exported via single device node /dev/kfd. As a result devices cannot be controlled individually using device cgroup. AMD compute devices will rely on its graphics counterpart that exposes /dev/dri/renderN node for each device. For each task (based on its cgroup), KFD driver will check if /dev/dri/renderN node is accessible before exposing it. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-3/+0
2019-10-05integrity: remove pointless subdir-$(CONFIG_...)Masahiro Yamada1-2/+0
The ima/ and evm/ sub-directories contain built-in objects, so obj-$(CONFIG_...) is the correct way to descend into them. subdir-$(CONFIG_...) is redundant. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-10-05integrity: remove unneeded, broken attempt to add -fshort-wcharMasahiro Yamada1-1/+0
I guess commit 15ea0e1e3e18 ("efi: Import certificates from UEFI Secure Boot") attempted to add -fshort-wchar for building load_uefi.o, but it has never worked as intended. load_uefi.o is created in the platform_certs/ sub-directory. If you really want to add -fshort-wchar, the correct code is: $(obj)/platform_certs/load_uefi.o: KBUILD_CFLAGS += -fshort-wchar But, you do not need to fix it. Commit 8c97023cf051 ("Kbuild: use -fshort-wchar globally") had already added -fshort-wchar globally. This code was unneeded in the first place. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-10-03selinux: fix context string corruption in convert_context()Ondrej Mosnacek1-1/+8
string_to_context_struct() may garble the context string, so we need to copy back the contents again from the old context struct to avoid storing the corrupted context. Since string_to_context_struct() tokenizes (and therefore truncates) the context string and we are later potentially copying it with kstrdup(), this may eventually cause pieces of uninitialized kernel memory to be disclosed to userspace (when copying to userspace based on the stored length and not the null character). How to reproduce on Fedora and similar: # dnf install -y memcached # systemctl start memcached # semodule -d memcached # load_policy # load_policy # systemctl stop memcached # ausearch -m AVC type=AVC msg=audit(1570090572.648:313): avc: denied { signal } for pid=1 comm="systemd" scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=process permissive=0 trawcon=73797374656D5F75007400000000000070BE6E847296FFFF726F6D000096FFFF76 Cc: stable@vger.kernel.org Reported-by: Milos Malik <mmalik@redhat.com> Fixes: ee1a84fdfeed ("selinux: overhaul sidtab to fix bug and improve performance") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-10-01net: rtnetlink: add linkprop commands to add and delete alternative ifnamesJiri Pirko1-1/+3
Add two commands to add and delete list of link properties. Implement the first property type along - alternative ifnames. Each net device can have multiple alternative names. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-01selinux: allow labeling before policy is loadedJonathan Lebon1-0/+12
Currently, the SELinux LSM prevents one from setting the `security.selinux` xattr on an inode without a policy first being loaded. However, this restriction is problematic: it makes it impossible to have newly created files with the correct label before actually loading the policy. This is relevant in distributions like Fedora, where the policy is loaded by systemd shortly after pivoting out of the initrd. In such instances, all files created prior to pivoting will be unlabeled. One then has to relabel them after pivoting, an operation which inherently races with other processes trying to access those same files. Going further, there are use cases for creating the entire root filesystem on first boot from the initrd (e.g. Container Linux supports this today[1], and we'd like to support it in Fedora CoreOS as well[2]). One can imagine doing this in two ways: at the block device level (e.g. laying down a disk image), or at the filesystem level. In the former, labeling can simply be part of the image. But even in the latter scenario, one still really wants to be able to set the right labels when populating the new filesystem. This patch enables this by changing behaviour in the following two ways: 1. allow `setxattr` if we're not initialized 2. don't try to set the in-core inode SID if we're not initialized; instead leave it as `LABEL_INVALID` so that revalidation may be attempted at a later time Note the first hunk of this patch is mostly the same as a previously discussed one[3], though it was part of a larger series which wasn't accepted. [1] https://coreos.com/os/docs/latest/root-filesystem-placement.html [2] https://github.com/coreos/fedora-coreos-tracker/issues/94 [3] https://www.spinics.net/lists/linux-initramfs/msg04593.html Co-developed-by: Victor Kamensky <kamensky@cisco.com> Signed-off-by: Victor Kamensky <kamensky@cisco.com> Signed-off-by: Jonathan Lebon <jlebon@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-10-01selinux: remove load size limitzhanglin1-4/+0
Load size was limited to 64MB, this was legacy limitation due to vmalloc() which was removed a while ago. Signed-off-by: zhanglin <zhang.lin16@zte.com.cn> [PM: removed comments in the description about 'real world use cases'] Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-09-28Merge branch 'next-lockdown' of ↵Linus Torvalds10-16/+350
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull kernel lockdown mode from James Morris: "This is the latest iteration of the kernel lockdown patchset, from Matthew Garrett, David Howells and others. From the original description: This patchset introduces an optional kernel lockdown feature, intended to strengthen the boundary between UID 0 and the kernel. When enabled, various pieces of kernel functionality are restricted. Applications that rely on low-level access to either hardware or the kernel may cease working as a result - therefore this should not be enabled without appropriate evaluation beforehand. The majority of mainstream distributions have been carrying variants of this patchset for many years now, so there's value in providing a doesn't meet every distribution requirement, but gets us much closer to not requiring external patches. There are two major changes since this was last proposed for mainline: - Separating lockdown from EFI secure boot. Background discussion is covered here: https://lwn.net/Articles/751061/ - Implementation as an LSM, with a default stackable lockdown LSM module. This allows the lockdown feature to be policy-driven, rather than encoding an implicit policy within the mechanism. The new locked_down LSM hook is provided to allow LSMs to make a policy decision around whether kernel functionality that would allow tampering with or examining the runtime state of the kernel should be permitted. The included lockdown LSM provides an implementation with a simple policy intended for general purpose use. This policy provides a coarse level of granularity, controllable via the kernel command line: lockdown={integrity|confidentiality} Enable the kernel lockdown feature. If set to integrity, kernel features that allow userland to modify the running kernel are disabled. If set to confidentiality, kernel features that allow userland to extract confidential information from the kernel are also disabled. This may also be controlled via /sys/kernel/security/lockdown and overriden by kernel configuration. New or existing LSMs may implement finer-grained controls of the lockdown features. Refer to the lockdown_reason documentation in include/linux/security.h for details. The lockdown feature has had signficant design feedback and review across many subsystems. This code has been in linux-next for some weeks, with a few fixes applied along the way. Stephen Rothwell noted that commit 9d1f8be5cf42 ("bpf: Restrict bpf when kernel lockdown is in confidentiality mode") is missing a Signed-off-by from its author. Matthew responded that he is providing this under category (c) of the DCO" * 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (31 commits) kexec: Fix file verification on S390 security: constify some arrays in lockdown LSM lockdown: Print current->comm in restriction messages efi: Restrict efivar_ssdt_load when the kernel is locked down tracefs: Restrict tracefs when the kernel is locked down debugfs: Restrict debugfs when the kernel is locked down kexec: Allow kexec_file() with appropriate IMA policy when locked down lockdown: Lock down perf when in confidentiality mode bpf: Restrict bpf when kernel lockdown is in confidentiality mode lockdown: Lock down tracing and perf kprobes when in confidentiality mode lockdown: Lock down /proc/kcore x86/mmiotrace: Lock down the testmmiotrace module lockdown: Lock down module params that specify hardware parameters (eg. ioport) lockdown: Lock down TIOCSSERIAL lockdown: Prohibit PCMCIA CIS storage when the kernel is locked down acpi: Disable ACPI table override if the kernel is locked down acpi: Ignore acpi_rsdp kernel param when the kernel has been locked down ACPI: Limit access to custom_method when the kernel is locked down x86/msr: Restrict MSR access when the kernel is locked down x86: Lock down IO port access when the kernel is locked down ...
2019-09-27Merge branch 'next-integrity' of ↵Linus Torvalds15-105/+627
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity updates from Mimi Zohar: "The major feature in this time is IMA support for measuring and appraising appended file signatures. In addition are a couple of bug fixes and code cleanup to use struct_size(). In addition to the PE/COFF and IMA xattr signatures, the kexec kernel image may be signed with an appended signature, using the same scripts/sign-file tool that is used to sign kernel modules. Similarly, the initramfs may contain an appended signature. This contained a lot of refactoring of the existing appended signature verification code, so that IMA could retain the existing framework of calculating the file hash once, storing it in the IMA measurement list and extending the TPM, verifying the file's integrity based on a file hash or signature (eg. xattrs), and adding an audit record containing the file hash, all based on policy. (The IMA support for appended signatures patch set was posted and reviewed 11 times.) The support for appended signature paves the way for adding other signature verification methods, such as fs-verity, based on a single system-wide policy. The file hash used for verifying the signature and the signature, itself, can be included in the IMA measurement list" * 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: ima: ima_api: Use struct_size() in kzalloc() ima: use struct_size() in kzalloc() sefltest/ima: support appended signatures (modsig) ima: Fix use after free in ima_read_modsig() MODSIGN: make new include file self contained ima: fix freeing ongoing ahash_request ima: always return negative code for error ima: Store the measurement again when appraising a modsig ima: Define ima-modsig template ima: Collect modsig ima: Implement support for module-style appended signatures ima: Factor xattr_verify() out of ima_appraise_measurement() ima: Add modsig appraise_type option for module-style appended signatures integrity: Select CONFIG_KEYS instead of depending on it PKCS#7: Introduce pkcs7_get_digest() PKCS#7: Refactor verify_pkcs7_signature() MODSIGN: Export module signature definitions ima: initialize the "template" field with the default template
2019-09-25KEYS: trusted: correctly initialize digests and fix locking issueRoberto Sassu1-0/+5
Commit 0b6cf6b97b7e ("tpm: pass an array of tpm_extend_digest structures to tpm_pcr_extend()") modifies tpm_pcr_extend() to accept a digest for each PCR bank. After modification, tpm_pcr_extend() expects that digests are passed in the same order as the algorithms set in chip->allocated_banks. This patch fixes two issues introduced in the last iterations of the patch set: missing initialization of the TPM algorithm ID in the tpm_digest structures passed to tpm_pcr_extend() by the trusted key module, and unreleased locks in the TPM driver due to returning from tpm_pcr_extend() without calling tpm_put_ops(). Cc: stable@vger.kernel.org Fixes: 0b6cf6b97b7e ("tpm: pass an array of tpm_extend_digest structures to tpm_pcr_extend()") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Suggested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
2019-09-23Merge tag 'smack-for-5.4-rc1' of git://github.com/cschaufler/smack-nextLinus Torvalds2-23/+23
Pull smack updates from Casey Schaufler: "Four patches for v5.4. Nothing is major. All but one are in response to mechanically detected potential issues. The remaining patch cleans up kernel-doc notations" * tag 'smack-for-5.4-rc1' of git://github.com/cschaufler/smack-next: smack: use GFP_NOFS while holding inode_smack::smk_lock security: smack: Fix possible null-pointer dereferences in smack_socket_sock_rcv_skb() smack: fix some kernel-doc notations Smack: Don't ignore other bprm->unsafe flags if LSM_UNSAFE_PTRACE is set
2019-09-23Merge tag 'safesetid-bugfix-5.4' of git://github.com/micah-morton/linuxLinus Torvalds1-1/+2
Pull SafeSetID fix from Micah Morton: "Jann Horn sent some patches to fix some bugs in SafeSetID for 5.3. After he had done his testing there were a couple small code tweaks that went in and caused this bug. From what I can see SafeSetID is broken in 5.3 and crashes the kernel every time during initialization if you try to use it. I came across this bug when backporting Jann's changes for 5.3 to older kernels (4.14 and 4.19). I've tested on a Chrome OS device with those kernels and verified that this change fixes things. It doesn't seem super useful to have this bake in linux-next, since it is completely broken in 5.3 and nobody noticed" * tag 'safesetid-bugfix-5.4' of git://github.com/micah-morton/linux: LSM: SafeSetID: Stop releasing uninitialized ruleset
2019-09-23Merge tag 'selinux-pr-20190917' of ↵Linus Torvalds12-296/+346
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: - Add LSM hooks, and SELinux access control hooks, for dnotify, fanotify, and inotify watches. This has been discussed with both the LSM and fs/notify folks and everybody is good with these new hooks. - The LSM stacking changes missed a few calls to current_security() in the SELinux code; we fix those and remove current_security() for good. - Improve our network object labeling cache so that we always return the object's label, even when under memory pressure. Previously we would return an error if we couldn't allocate a new cache entry, now we always return the label even if we can't create a new cache entry for it. - Convert the sidtab atomic_t counter to a normal u32 with READ/WRITE_ONCE() and memory barrier protection. - A few patches to policydb.c to clean things up (remove forward declarations, long lines, bad variable names, etc) * tag 'selinux-pr-20190917' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: lsm: remove current_security() selinux: fix residual uses of current_security() for the SELinux blob selinux: avoid atomic_t usage in sidtab fanotify, inotify, dnotify, security: add security hook for fs notifications selinux: always return a secid from the network caches if we find one selinux: policydb - rename type_val_to_struct_array selinux: policydb - fix some checkpatch.pl warnings selinux: shuffle around policydb.c to get rid of forward declarations
2019-09-17LSM: SafeSetID: Stop releasing uninitialized rulesetMicah Morton1-1/+2
The first time a rule set is configured for SafeSetID, we shouldn't be trying to release the previously configured ruleset, since there isn't one. Currently, the pointer that would point to a previously configured ruleset is uninitialized on first rule set configuration, leading to a crash when we try to call release_ruleset with that pointer. Acked-by: Jann Horn <jannh@google.com> Signed-off-by: Micah Morton <mortonm@chromium.org>
2019-09-10security: constify some arrays in lockdown LSMMatthew Garrett1-2/+2
No reason for these not to be const. Signed-off-by: Matthew Garrett <mjg59@google.com> Suggested-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2019-09-05keys: Fix missing null pointer check in request_key_auth_describe()Hillf Danton1-0/+6
If a request_key authentication token key gets revoked, there's a window in which request_key_auth_describe() can see it with a NULL payload - but it makes no check for this and something like the following oops may occur: BUG: Kernel NULL pointer dereference at 0x00000038 Faulting instruction address: 0xc0000000004ddf30 Oops: Kernel access of bad area, sig: 11 [#1] ... NIP [...] request_key_auth_describe+0x90/0xd0 LR [...] request_key_auth_describe+0x54/0xd0 Call Trace: [...] request_key_auth_describe+0x54/0xd0 (unreliable) [...] proc_keys_show+0x308/0x4c0 [...] seq_read+0x3d0/0x540 [...] proc_reg_read+0x90/0x110 [...] __vfs_read+0x3c/0x70 [...] vfs_read+0xb4/0x1b0 [...] ksys_read+0x7c/0x130 [...] system_call+0x5c/0x70 Fix this by checking for a NULL pointer when describing such a key. Also make the read routine check for a NULL pointer to be on the safe side. [DH: Modified to not take already-held rcu lock and modified to also check in the read routine] Fixes: 04c567d9313e ("[PATCH] Keys: Fix race between two instantiators of a key") Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Signed-off-by: Hillf Danton <hdanton@sina.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-04selinux: fix residual uses of current_security() for the SELinux blobStephen Smalley2-11/+11
We need to use selinux_cred() to fetch the SELinux cred blob instead of directly using current->security or current_security(). There were a couple of lingering uses of current_security() in the SELinux code that were apparently missed during the earlier conversions. IIUC, this would only manifest as a bug if multiple security modules including SELinux are enabled and SELinux is not first in the lsm order. After this change, there appear to be no other users of current_security() in-tree; perhaps we should remove it altogether. Fixes: bbd3662a8348 ("Infrastructure management of the cred security blob") Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-09-04smack: use GFP_NOFS while holding inode_smack::smk_lockEric Biggers2-4/+4
inode_smack::smk_lock is taken during smack_d_instantiate(), which is called during a filesystem transaction when creating a file on ext4. Therefore to avoid a deadlock, all code that takes this lock must use GFP_NOFS, to prevent memory reclaim from waiting for the filesystem transaction to complete. Reported-by: syzbot+0eefc1e06a77d327a056@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2019-09-04security: smack: Fix possible null-pointer dereferences in ↵Jia-Ju Bai1-0/+2
smack_socket_sock_rcv_skb() In smack_socket_sock_rcv_skb(), there is an if statement on line 3920 to check whether skb is NULL: if (skb && skb->secmark != 0) This check indicates skb can be NULL in some cases. But on lines 3931 and 3932, skb is used: ad.a.u.net->netif = skb->skb_iif; ipv6_skb_to_auditdata(skb, &ad.a, NULL); Thus, possible null-pointer dereferences may occur when skb is NULL. To fix these possible bugs, an if statement is added to check skb. These bugs are found by a static analysis tool STCheck written by us. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2019-09-04smack: fix some kernel-doc notationsluanshi1-18/+15
Fix/add kernel-doc notation and fix typos in security/smack/. Signed-off-by: Liguang Zhang <zhangliguang@linux.alibaba.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2019-09-04Smack: Don't ignore other bprm->unsafe flags if LSM_UNSAFE_PTRACE is setJann Horn1-1/+2
There is a logic bug in the current smack_bprm_set_creds(): If LSM_UNSAFE_PTRACE is set, but the ptrace state is deemed to be acceptable (e.g. because the ptracer detached in the meantime), the other ->unsafe flags aren't checked. As far as I can tell, this means that something like the following could work (but I haven't tested it): - task A: create task B with fork() - task B: set NO_NEW_PRIVS - task B: install a seccomp filter that makes open() return 0 under some conditions - task B: replace fd 0 with a malicious library - task A: attach to task B with PTRACE_ATTACH - task B: execve() a file with an SMACK64EXEC extended attribute - task A: while task B is still in the middle of execve(), exit (which destroys the ptrace relationship) Make sure that if any flags other than LSM_UNSAFE_PTRACE are set in bprm->unsafe, we reject the execve(). Cc: stable@vger.kernel.org Fixes: 5663884caab1 ("Smack: unify all ptrace accesses in the smack") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2019-08-30keys: ensure that ->match_free() is called in request_key_and_link()Eric Biggers1-1/+1
If check_cached_key() returns a non-NULL value, we still need to call key_type::match_free() to undo key_type::match_preparse(). Fixes: 7743c48e54ee ("keys: Cache result of request_key*() temporarily in task_struct") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-08-29ima: ima_api: Use struct_size() in kzalloc()Gustavo A. R. Silva1-2/+2
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct ima_template_entry { ... struct ima_field_data template_data[0]; /* template related data */ }; instance = kzalloc(sizeof(struct ima_template_entry) + count * sizeof(struct ima_field_data), GFP_NOFS); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = kzalloc(struct_size(instance, entry, count), GFP_NOFS); This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-08-29ima: use struct_size() in kzalloc()Gustavo A. R. Silva1-3/+2
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; struct boo entry[]; }; instance = kzalloc(sizeof(struct foo) + count * sizeof(struct boo), GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL); This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-08-28ima: Fix use after free in ima_read_modsig()Thiago Jung Bauermann1-1/+2
If we can't parse the PKCS7 in the appended modsig, we will free the modsig structure and then access one of its members to determine the error value. Fixes: 39b07096364a ("ima: Implement support for module-style appended signatures") Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Julia Lawall <julia.lawall@lip6.fr> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Reviewed-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-08-27selinux: avoid atomic_t usage in sidtabOndrej Mosnacek2-32/+35
As noted in Documentation/atomic_t.txt, if we don't need the RMW atomic operations, we should only use READ_ONCE()/WRITE_ONCE() + smp_rmb()/smp_wmb() where necessary (or the combined variants smp_load_acquire()/smp_store_release()). This patch converts the sidtab code to use regular u32 for the counter and reverse lookup cache and use the appropriate operations instead of atomic_get()/atomic_set(). Note that when reading/updating the reverse lookup cache we don't need memory barriers as it doesn't need to be consistent or accurate. We can now also replace some atomic ops with regular loads (when under spinlock) and stores (for conversion target fields that are always accessed under the master table's spinlock). We can now also bump SIDTAB_MAX to U32_MAX as we can use the full u32 range again. Suggested-by: Jann Horn <jannh@google.com> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Reviewed-by: Jann Horn <jannh@google.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-08-19lockdown: Print current->comm in restriction messagesMatthew Garrett1-2/+6
Print the content of current->comm in messages generated by lockdown to indicate a restriction that was hit. This makes it a bit easier to find out what caused the message. The message now patterned something like: Lockdown: <comm>: <what> is restricted; see man kernel_lockdown.7 Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19tracefs: Restrict tracefs when the kernel is locked downMatthew Garrett1-0/+1
Tracefs may release more information about the kernel than desirable, so restrict it when the kernel is locked down in confidentiality mode by preventing open(). (Fixed by Ben Hutchings to avoid a null dereference in default_file_open()) Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19debugfs: Restrict debugfs when the kernel is locked downDavid Howells1-0/+1
Disallow opening of debugfs files that might be used to muck around when the kernel is locked down as various drivers give raw access to hardware through debugfs. Given the effort of auditing all 2000 or so files and manually fixing each one as necessary, I've chosen to apply a heuristic instead. The following changes are made: (1) chmod and chown are disallowed on debugfs objects (though the root dir can be modified by mount and remount, but I'm not worried about that). (2) When the kernel is locked down, only files with the following criteria are permitted to be opened: - The file must have mode 00444 - The file must not have ioctl methods - The file must not have mmap (3) When the kernel is locked down, files may only be opened for reading. Normal device interaction should be done through configfs, sysfs or a miscdev, not debugfs. Note that this makes it unnecessary to specifically lock down show_dsts(), show_devs() and show_call() in the asus-wmi driver. I would actually prefer to lock down all files by default and have the the files unlocked by the creator. This is tricky to manage correctly, though, as there are 19 creation functions and ~1600 call sites (some of them in loops scanning tables). Signed-off-by: David Howells <dhowells@redhat.com> cc: Andy Shevchenko <andy.shevchenko@gmail.com> cc: acpi4asus-user@lists.sourceforge.net cc: platform-driver-x86@vger.kernel.org cc: Matthew Garrett <mjg59@srcf.ucam.org> cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Matthew Garrett <matthewgarrett@google.com> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19kexec: Allow kexec_file() with appropriate IMA policy when locked downMatthew Garrett3-1/+53
Systems in lockdown mode should block the kexec of untrusted kernels. For x86 and ARM we can ensure that a kernel is trustworthy by validating a PE signature, but this isn't possible on other architectures. On those platforms we can use IMA digital signatures instead. Add a function to determine whether IMA has or will verify signatures for a given event type, and if so permit kexec_file() even if the kernel is otherwise locked down. This is restricted to cases where CONFIG_INTEGRITY_TRUSTED_KEYRING is set in order to prevent an attacker from loading additional keys at runtime. Signed-off-by: Matthew Garrett <mjg59@google.com> Acked-by: Mimi Zohar <zohar@linux.ibm.com> Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com> Cc: linux-integrity@vger.kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19lockdown: Lock down perf when in confidentiality modeDavid Howells1-0/+1
Disallow the use of certain perf facilities that might allow userspace to access kernel data. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19bpf: Restrict bpf when kernel lockdown is in confidentiality modeDavid Howells1-0/+1
bpf_read() and bpf_read_str() could potentially be abused to (eg) allow private keys in kernel memory to be leaked. Disable them if the kernel has been locked down in confidentiality mode. Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> cc: netdev@vger.kernel.org cc: Chun-Yi Lee <jlee@suse.com> cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19lockdown: Lock down tracing and perf kprobes when in confidentiality modeDavid Howells1-0/+1
Disallow the creation of perf and ftrace kprobes when the kernel is locked down in confidentiality mode by preventing their registration. This prevents kprobes from being used to access kernel memory to steal crypto data, but continues to allow the use of kprobes from signed modules. Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: davem@davemloft.net Cc: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19lockdown: Lock down /proc/kcoreDavid Howells1-0/+1
Disallow access to /proc/kcore when the kernel is locked down to prevent access to cryptographic data. This is limited to lockdown confidentiality mode and is still permitted in integrity mode. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19x86/mmiotrace: Lock down the testmmiotrace moduleDavid Howells1-0/+1
The testmmiotrace module shouldn't be permitted when the kernel is locked down as it can be used to arbitrarily read and write MMIO space. This is a runtime check rather than buildtime in order to allow configurations where the same kernel may be run in both locked down or permissive modes depending on local policy. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Howells <dhowells@redhat.com Signed-off-by: Matthew Garrett <mjg59@google.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Kees Cook <keescook@chromium.org> cc: Thomas Gleixner <tglx@linutronix.de> cc: Steven Rostedt <rostedt@goodmis.org> cc: Ingo Molnar <mingo@kernel.org> cc: "H. Peter Anvin" <hpa@zytor.com> cc: x86@kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19lockdown: Lock down module params that specify hardware parameters (eg. ioport)David Howells1-0/+1
Provided an annotation for module parameters that specify hardware parameters (such as io ports, iomem addresses, irqs, dma channels, fixed dma buffers and other types). Suggested-by: Alan Cox <gnomes@lxorguk.ukuu.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Jessica Yu <jeyu@kernel.org> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19lockdown: Lock down TIOCSSERIALDavid Howells1-0/+1
Lock down TIOCSSERIAL as that can be used to change the ioport and irq settings on a serial port. This only appears to be an issue for the serial drivers that use the core serial code. All other drivers seem to either ignore attempts to change port/irq or give an error. Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> cc: Jiri Slaby <jslaby@suse.com> Cc: linux-serial@vger.kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19lockdown: Prohibit PCMCIA CIS storage when the kernel is locked downDavid Howells1-0/+1
Prohibit replacement of the PCMCIA Card Information Structure when the kernel is locked down. Suggested-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19ACPI: Limit access to custom_method when the kernel is locked downMatthew Garrett1-0/+1
custom_method effectively allows arbitrary access to system memory, making it possible for an attacker to circumvent restrictions on module loading. Disable it if the kernel is locked down. Signed-off-by: Matthew Garrett <mjg59@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> cc: linux-acpi@vger.kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19x86/msr: Restrict MSR access when the kernel is locked downMatthew Garrett1-0/+1
Writing to MSRs should not be allowed if the kernel is locked down, since it could lead to execution of arbitrary code in kernel mode. Based on a patch by Kees Cook. Signed-off-by: Matthew Garrett <mjg59@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> cc: x86@kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19x86: Lock down IO port access when the kernel is locked downMatthew Garrett1-0/+1
IO port access would permit users to gain access to PCI configuration registers, which in turn (on a lot of hardware) give access to MMIO register space. This would potentially permit root to trigger arbitrary DMA, so lock it down by default. This also implicitly locks down the KDADDIO, KDDELIO, KDENABIO and KDDISABIO console ioctls. Signed-off-by: Matthew Garrett <mjg59@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> cc: x86@kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19PCI: Lock down BAR access when the kernel is locked downMatthew Garrett1-0/+1
Any hardware that can potentially generate DMA has to be locked down in order to avoid it being possible for an attacker to modify kernel code, allowing them to circumvent disabled module loading or module signing. Default to paranoid - in future we can potentially relax this for sufficiently IOMMU-isolated devices. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> cc: linux-pci@vger.kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19hibernate: Disable when the kernel is locked downJosh Boyer1-0/+1
There is currently no way to verify the resume image when returning from hibernate. This might compromise the signed modules trust model, so until we can work with signed hibernate images we disable it when the kernel is locked down. Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: rjw@rjwysocki.net Cc: pavel@ucw.cz cc: linux-pm@vger.kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19kexec_file: split KEXEC_VERIFY_SIG into KEXEC_SIG and KEXEC_SIG_FORCEJiri Bohac2-2/+2
This is a preparatory patch for kexec_file_load() lockdown. A locked down kernel needs to prevent unsigned kernel images from being loaded with kexec_file_load(). Currently, the only way to force the signature verification is compiling with KEXEC_VERIFY_SIG. This prevents loading usigned images even when the kernel is not locked down at runtime. This patch splits KEXEC_VERIFY_SIG into KEXEC_SIG and KEXEC_SIG_FORCE. Analogous to the MODULE_SIG and MODULE_SIG_FORCE for modules, KEXEC_SIG turns on the signature verification but allows unsigned images to be loaded. KEXEC_SIG_FORCE disallows images without a valid signature. Signed-off-by: Jiri Bohac <jbohac@suse.cz> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> cc: kexec@lists.infradead.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19kexec_load: Disable at runtime if the kernel is locked downMatthew Garrett1-0/+1
The kexec_load() syscall permits the loading and execution of arbitrary code in ring 0, which is something that lock-down is meant to prevent. It makes sense to disable kexec_load() in this situation. This does not affect kexec_file_load() syscall which can check for a signature on the image to be booted. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Acked-by: Dave Young <dyoung@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> cc: kexec@lists.infradead.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19lockdown: Restrict /dev/{mem,kmem,port} when the kernel is locked downMatthew Garrett1-0/+1
Allowing users to read and write to core kernel memory makes it possible for the kernel to be subverted, avoiding module loading restrictions, and also to steal cryptographic information. Disallow /dev/mem and /dev/kmem from being opened this when the kernel has been locked down to prevent this. Also disallow /dev/port from being opened to prevent raw ioport access and thus DMA from being used to accomplish the same thing. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: x86@kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19lockdown: Enforce module signatures if the kernel is locked downDavid Howells2-0/+2
If the kernel is locked down, require that all modules have valid signatures that we can verify. I have adjusted the errors generated: (1) If there's no signature (ENODATA) or we can't check it (ENOPKG, ENOKEY), then: (a) If signatures are enforced then EKEYREJECTED is returned. (b) If there's no signature or we can't check it, but the kernel is locked down then EPERM is returned (this is then consistent with other lockdown cases). (2) If the signature is unparseable (EBADMSG, EINVAL), the signature fails the check (EKEYREJECTED) or a system error occurs (eg. ENOMEM), we return the error we got. Note that the X.509 code doesn't check for key expiry as the RTC might not be valid or might not have been transferred to the kernel's clock yet. [Modified by Matthew Garrett to remove the IMA integration. This will be replaced with integration with the IMA architecture policy patchset.] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <matthewgarrett@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Jessica Yu <jeyu@kernel.org> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19security: Add a static lockdown policy LSMMatthew Garrett5-5/+224
While existing LSMs can be extended to handle lockdown policy, distributions generally want to be able to apply a straightforward static policy. This patch adds a simple LSM that can be configured to reject either integrity or all lockdown queries, and can be configured at runtime (through securityfs), boot time (via a kernel parameter) or build time (via a kconfig option). Based on initial code by David Howells. Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19security: Add a "locked down" LSM hookMatthew Garrett1-0/+6
Add a mechanism to allow LSMs to make a policy decision around whether kernel functionality that would allow tampering with or examining the runtime state of the kernel should be permitted. Signed-off-by: Matthew Garrett <mjg59@google.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19security: Support early LSMsMatthew Garrett1-8/+42
The lockdown module is intended to allow for kernels to be locked down early in boot - sufficiently early that we don't have the ability to kmalloc() yet. Add support for early initialisation of some LSMs, and then add them to the list of names when we do full initialisation later. Early LSMs are initialised in link order and cannot be overridden via boot parameters, and cannot make use of kmalloc() (since the allocator isn't initialised yet). (Fixed by Stephen Rothwell to include a stub to fix builds when !CONFIG_SECURITY) Signed-off-by: Matthew Garrett <mjg59@google.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-13KEYS: trusted: allow module init if TPM is inactive or deactivatedRoberto Sassu1-13/+0
Commit c78719203fc6 ("KEYS: trusted: allow trusted.ko to initialize w/o a TPM") allows the trusted module to be loaded even if a TPM is not found, to avoid module dependency problems. However, trusted module initialization can still fail if the TPM is inactive or deactivated. tpm_get_random() returns an error. This patch removes the call to tpm_get_random() and instead extends the PCR specified by the user with zeros. The security of this alternative is equivalent to the previous one, as either option prevents with a PCR update unsealing and misuse of sealed data by a user space process. Even if a PCR is extended with zeros, instead of random data, it is still computationally infeasible to find a value as input for a new PCR extend operation, to obtain again the PCR value that would allow unsealing. Cc: stable@vger.kernel.org Fixes: 240730437deb ("KEYS: trusted: explicitly use tpm_chip structure...") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: Tyler Hicks <tyhicks@canonical.com> Suggested-by: Mimi Zohar <zohar@linux.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
2019-08-12fanotify, inotify, dnotify, security: add security hook for fs notificationsAaron Goidel3-2/+56
As of now, setting watches on filesystem objects has, at most, applied a check for read access to the inode, and in the case of fanotify, requires CAP_SYS_ADMIN. No specific security hook or permission check has been provided to control the setting of watches. Using any of inotify, dnotify, or fanotify, it is possible to observe, not only write-like operations, but even read access to a file. Modeling the watch as being merely a read from the file is insufficient for the needs of SELinux. This is due to the fact that read access should not necessarily imply access to information about when another process reads from a file. Furthermore, fanotify watches grant more power to an application in the form of permission events. While notification events are solely, unidirectional (i.e. they only pass information to the receiving application), permission events are blocking. Permission events make a request to the receiving application which will then reply with a decision as to whether or not that action may be completed. This causes the issue of the watching application having the ability to exercise control over the triggering process. Without drawing a distinction within the permission check, the ability to read would imply the greater ability to control an application. Additionally, mount and superblock watches apply to all files within the same mount or superblock. Read access to one file should not necessarily imply the ability to watch all files accessed within a given mount or superblock. In order to solve these issues, a new LSM hook is implemented and has been placed within the system calls for marking filesystem objects with inotify, fanotify, and dnotify watches. These calls to the hook are placed at the point at which the target path has been resolved and are provided with the path struct, the mask of requested notification events, and the type of object on which the mark is being set (inode, superblock, or mount). The mask and obj_type have already been translated into common FS_* values shared by the entirety of the fs notification infrastructure. The path struct is passed rather than just the inode so that the mount is available, particularly for mount watches. This also allows for use of the hook by pathname-based security modules. However, since the hook is intended for use even by inode based security modules, it is not placed under the CONFIG_SECURITY_PATH conditional. Otherwise, the inode-based security modules would need to enable all of the path hooks, even though they do not use any of them. This only provides a hook at the point of setting a watch, and presumes that permission to set a particular watch implies the ability to receive all notification about that object which match the mask. This is all that is required for SELinux. If other security modules require additional hooks or infrastructure to control delivery of notification, these can be added by them. It does not make sense for us to propose hooks for which we have no implementation. The understanding that all notifications received by the requesting application are all strictly of a type for which the application has been granted permission shows that this implementation is sufficient in its coverage. Security modules wishing to provide complete control over fanotify must also implement a security_file_open hook that validates that the access requested by the watching application is authorized. Fanotify has the issue that it returns a file descriptor with the file mode specified during fanotify_init() to the watching process on event. This is already covered by the LSM security_file_open hook if the security module implements checking of the requested file mode there. Otherwise, a watching process can obtain escalated access to a file for which it has not been authorized. The selinux_path_notify hook implementation works by adding five new file permissions: watch, watch_mount, watch_sb, watch_reads, and watch_with_perm (descriptions about which will follow), and one new filesystem permission: watch (which is applied to superblock checks). The hook then decides which subset of these permissions must be held by the requesting application based on the contents of the provided mask and the obj_type. The selinux_file_open hook already checks the requested file mode and therefore ensures that a watching process cannot escalate its access through fanotify. The watch, watch_mount, and watch_sb permissions are the baseline permissions for setting a watch on an object and each are a requirement for any watch to be set on a file, mount, or superblock respectively. It should be noted that having either of the other two permissions (watch_reads and watch_with_perm) does not imply the watch, watch_mount, or watch_sb permission. Superblock watches further require the filesystem watch permission to the superblock. As there is no labeled object in view for mounts, there is no specific check for mount watches beyond watch_mount to the inode. Such a check could be added in the future, if a suitable labeled object existed representing the mount. The watch_reads permission is required to receive notifications from read-exclusive events on filesystem objects. These events include accessing a file for the purpose of reading and closing a file which has been opened read-only. This distinction has been drawn in order to provide a direct indication in the policy for this otherwise not obvious capability. Read access to a file should not necessarily imply the ability to observe read events on a file. Finally, watch_with_perm only applies to fanotify masks since it is the only way to set a mask which allows for the blocking, permission event. This permission is needed for any watch which is of this type. Though fanotify requires CAP_SYS_ADMIN, this is insufficient as it gives implicit trust to root, which we do not do, and does not support least privilege. Signed-off-by: Aaron Goidel <acgoide@tycho.nsa.gov> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-08-05ima: fix freeing ongoing ahash_requestSascha Hauer1-0/+5
integrity_kernel_read() can fail in which case we forward to call ahash_request_free() on a currently running request. We have to wait for its completion before we can free the request. This was observed by interrupting a "find / -type f -xdev -print0 | xargs -0 cat 1>/dev/null" with ctrl-c on an IMA enabled filesystem. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-08-05ima: always return negative code for errorSascha Hauer1-1/+4
integrity_kernel_read() returns the number of bytes read. If this is a short read then this positive value is returned from ima_calc_file_hash_atfm(). Currently this is only indirectly called from ima_calc_file_hash() and this function only tests for the return value being zero or nonzero and also doesn't forward the return value. Nevertheless there's no point in returning a positive value as an error, so translate a short read into -EINVAL. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-08-05ima: Store the measurement again when appraising a modsigThiago Jung Bauermann4-7/+47
If the IMA template contains the "modsig" or "d-modsig" field, then the modsig should be added to the measurement list when the file is appraised. And that is what normally happens, but if a measurement rule caused a file containing a modsig to be measured before a different rule causes it to be appraised, the resulting measurement entry will not contain the modsig because it is only fetched during appraisal. When the appraisal rule triggers, it won't store a new measurement containing the modsig because the file was already measured. We need to detect that situation and store an additional measurement with the modsig. This is done by adding an IMA_MEASURE action flag if we read a modsig and the IMA template contains a modsig field. Suggested-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-08-05ima: Define ima-modsig templateThiago Jung Bauermann8-6/+156
Define new "d-modsig" template field which holds the digest that is expected to match the one contained in the modsig, and also new "modsig" template field which holds the appended file signature. Add a new "ima-modsig" defined template descriptor with the new fields as well as the ones from the "ima-sig" descriptor. Change ima_store_measurement() to accept a struct modsig * argument so that it can be passed along to the templates via struct ima_event_data. Suggested-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-08-05ima: Collect modsigThiago Jung Bauermann5-5/+60
Obtain the modsig and calculate its corresponding hash in ima_collect_measurement(). Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-08-05ima: Implement support for module-style appended signaturesThiago Jung Bauermann8-23/+209
Implement the appraise_type=imasig|modsig option, allowing IMA to read and verify modsig signatures. In case a file has both an xattr signature and an appended modsig, IMA will only use the appended signature if the key used by the xattr signature isn't present in the IMA or platform keyring. Because modsig verification needs to convert from an integrity keyring id to the keyring itself, add an integrity_keyring_from_id() function in digsig.c so that integrity_modsig_verify() can use it. Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-08-05ima: Factor xattr_verify() out of ima_appraise_measurement()Thiago Jung Bauermann1-60/+81
Verify xattr signature in a separate function so that the logic in ima_appraise_measurement() remains clear when it gains the ability to also verify an appended module signature. The code in the switch statement is unchanged except for having to dereference the status and cause variables (since they're now pointers), and fixing the style of a block comment to appease checkpatch. Suggested-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-08-05ima: Add modsig appraise_type option for module-style appended signaturesThiago Jung Bauermann6-2/+62
Introduce the modsig keyword to the IMA policy syntax to specify that a given hook should expect the file to have the IMA signature appended to it. Here is how it can be used in a rule: appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig|modsig With this rule, IMA will accept either a signature stored in the extended attribute or an appended signature. For now, the rule above will behave exactly the same as if appraise_type=imasig was specified. The actual modsig implementation will be introduced separately. Suggested-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-08-05integrity: Select CONFIG_KEYS instead of depending on itThiago Jung Bauermann1-1/+1
This avoids a dependency cycle in soon-to-be-introduced CONFIG_IMA_APPRAISE_MODSIG: it will select CONFIG_MODULE_SIG_FORMAT which in turn selects CONFIG_KEYS. Kconfig then complains that CONFIG_INTEGRITY_SIGNATURE depends on CONFIG_KEYS. Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-08-05selinux: always return a secid from the network caches if we find onePaul Moore3-47/+38
Previously if we couldn't find an entry in the cache and we failed to allocate memory for a new cache entry we would fail the network object label lookup; this is obviously not ideal. This patch fixes this so that we return the object label even if we can't cache the object at this point in time due to memory pressure. The GitHub issue tracker is below: * https://github.com/SELinuxProject/selinux-kernel/issues/3 Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-08-05selinux: policydb - rename type_val_to_struct_arrayOndrej Mosnacek3-11/+11
The name is overly long and inconsistent with the other *_val_to_struct members. Dropping the "_array" prefix makes the code easier to read and gets rid of one line over 80 characters warning. Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-08-05selinux: policydb - fix some checkpatch.pl warningsOndrej Mosnacek1-4/+8
Fix most of the code style warnings discovered when moving code around. Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-08-05selinux: shuffle around policydb.c to get rid of forward declarationsPaul Moore1-189/+187
No code changes, but move a lot of the policydb destructors higher up so we can get rid of a forward declaration. This patch does expose a few old checkpatch.pl errors, but those will be dealt with in a separate (set of) patches. Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-08-02Merge tag 'selinux-pr-20190801' of ↵Linus Torvalds1-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux fix from Paul Moore: "One more small fix for a potential memory leak in an error path" * tag 'selinux-pr-20190801' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: fix memory leak in policydb_init()
2019-08-01ima: initialize the "template" field with the default templateMimi Zohar1-2/+4
IMA policy rules are walked sequentially. Depending on the ordering of the policy rules, the "template" field might be defined in one rule, but will be replaced by subsequent, applicable rules, even if the rule does not explicitly define the "template" field. This patch initializes the "template" once and only replaces the "template", when explicitly defined. Fixes: 19453ce0bcfb ("IMA: support for per policy rule template formats") Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-07-31selinux: fix memory leak in policydb_init()Ondrej Mosnacek1-1/+5
Since roles_init() adds some entries to the role hash table, we need to destroy also its keys/values on error, otherwise we get a memory leak in the error path. Cc: <stable@vger.kernel.org> Reported-by: syzbot+fee3a14d4cdf92646287@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-07-28Merge tag 'meminit-v5.3-rc2' of ↵Linus Torvalds1-0/+7
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull structleak fix from Kees Cook: "Disable gcc-based stack variable auto-init under KASAN (Arnd Bergmann). This fixes a bunch of build warnings under KASAN and the gcc-plugin-based stack auto-initialization features (which are arguably redundant, so better to let KASAN control this)" * tag 'meminit-v5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: structleak: disable STRUCTLEAK_BYREF in combination with KASAN_STACK
2019-07-26Merge tag 'selinux-pr-20190726' of ↵Linus Torvalds1-0/+5
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux fix from Paul Moore: "One small SELinux patch to add some proper bounds/overflow checking when adding a new sid/secid" * tag 'selinux-pr-20190726' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: check sidtab limit before adding a new entry
2019-07-25structleak: disable STRUCTLEAK_BYREF in combination with KASAN_STACKArnd Bergmann1-0/+7
The combination of KASAN_STACK and GCC_PLUGIN_STRUCTLEAK_BYREF leads to much larger kernel stack usage, as seen from the warnings about functions that now exceed the 2048 byte limit: drivers/media/i2c/tvp5150.c:253:1: error: the frame size of 3936 bytes is larger than 2048 bytes drivers/media/tuners/r820t.c:1327:1: error: the frame size of 2816 bytes is larger than 2048 bytes drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c:16552:1: error: the frame size of 3144 bytes is larger than 2048 bytes [-Werror=frame-larger-than=] fs/ocfs2/aops.c:1892:1: error: the frame size of 2088 bytes is larger than 2048 bytes fs/ocfs2/dlm/dlmrecovery.c:737:1: error: the frame size of 2088 bytes is larger than 2048 bytes fs/ocfs2/namei.c:1677:1: error: the frame size of 2584 bytes is larger than 2048 bytes fs/ocfs2/super.c:1186:1: error: the frame size of 2640 bytes is larger than 2048 bytes fs/ocfs2/xattr.c:3678:1: error: the frame size of 2176 bytes is larger than 2048 bytes net/bluetooth/l2cap_core.c:7056:1: error: the frame size of 2144 bytes is larger than 2048 bytes [-Werror=frame-larger-than=] net/bluetooth/l2cap_core.c: In function 'l2cap_recv_frame': net/bridge/br_netlink.c:1505:1: error: the frame size of 2448 bytes is larger than 2048 bytes net/ieee802154/nl802154.c:548:1: error: the frame size of 2232 bytes is larger than 2048 bytes net/wireless/nl80211.c:1726:1: error: the frame size of 2224 bytes is larger than 2048 bytes net/wireless/nl80211.c:2357:1: error: the frame size of 4584 bytes is larger than 2048 bytes net/wireless/nl80211.c:5108:1: error: the frame size of 2760 bytes is larger than 2048 bytes net/wireless/nl80211.c:6472:1: error: the frame size of 2112 bytes is larger than 2048 bytes The structleak plugin was previously disabled for CONFIG_COMPILE_TEST, but meant we missed some bugs, so this time we should address them. The frame size warnings are distracting, and risking a kernel stack overflow is generally not beneficial to performance, so it may be best to disallow that particular combination. This can be done by turning off either one. I picked the dependency in GCC_PLUGIN_STRUCTLEAK_BYREF and GCC_PLUGIN_STRUCTLEAK_BYREF_ALL, as this option is designed to make uninitialized stack usage less harmful when enabled on its own, but it also prevents KASAN from detecting those cases in which it was in fact needed. KASAN_STACK is currently implied by KASAN on gcc, but could be made a user selectable option if we want to allow combining (non-stack) KASAN with GCC_PLUGIN_STRUCTLEAK_BYREF. Note that it would be possible to specifically address the files that print the warning, but presumably the overall stack usage is still significantly higher than in other configurations, so this would not address the full problem. I could not test this with CONFIG_INIT_STACK_ALL, which may or may not suffer from a similar problem. Fixes: 81a56f6dcd20 ("gcc-plugins: structleak: Generalize to all variable types") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20190722114134.3123901-1-arnd@arndb.de Signed-off-by: Kees Cook <keescook@chromium.org>
2019-07-24selinux: check sidtab limit before adding a new entryOndrej Mosnacek1-0/+5
We need to error out when trying to add an entry above SIDTAB_MAX in sidtab_reverse_lookup() to avoid overflow on the odd chance that this happens. Cc: stable@vger.kernel.org Fixes: ee1a84fdfeed ("selinux: overhaul sidtab to fix bug and improve performance") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-07-19Merge branch 'work.mount0' of ↵Linus Torvalds4-28/+67
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs mount updates from Al Viro: "The first part of mount updates. Convert filesystems to use the new mount API" * 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) mnt_init(): call shmem_init() unconditionally constify ksys_mount() string arguments don't bother with registering rootfs init_rootfs(): don't bother with init_ramfs_fs() vfs: Convert smackfs to use the new mount API vfs: Convert selinuxfs to use the new mount API vfs: Convert securityfs to use the new mount API vfs: Convert apparmorfs to use the new mount API vfs: Convert openpromfs to use the new mount API vfs: Convert xenfs to use the new mount API vfs: Convert gadgetfs to use the new mount API vfs: Convert oprofilefs to use the new mount API vfs: Convert ibmasmfs to use the new mount API vfs: Convert qib_fs/ipathfs to use the new mount API vfs: Convert efivarfs to use the new mount API vfs: Convert configfs to use the new mount API vfs: Convert binfmt_misc to use the new mount API convenience helper: get_tree_single() convenience helper get_tree_nodev() vfs: Kill sget_userns() ...
2019-07-18proc/sysctl: add shared variables for range checkMatteo Croce3-20/+15
In the sysctl code the proc_dointvec_minmax() function is often used to validate the user supplied value between an allowed range. This function uses the extra1 and extra2 members from struct ctl_table as minimum and maximum allowed value. On sysctl handler declaration, in every source file there are some readonly variables containing just an integer which address is assigned to the extra1 and extra2 members, so the sysctl range is enforced. The special values 0, 1 and INT_MAX are very often used as range boundary, leading duplication of variables like zero=0, one=1, int_max=INT_MAX in different source files: $ git grep -E '\.extra[12].*&(zero|one|int_max)' |wc -l 248 Add a const int array containing the most commonly used values, some macros to refer more easily to the correct array member, and use them instead of creating a local one for every object file. This is the bloat-o-meter output comparing the old and new binary compiled with the default Fedora config: # scripts/bloat-o-meter -d vmlinux.o.old vmlinux.o add/remove: 2/2 grow/shrink: 0/2 up/down: 24/-188 (-164) Data old new delta sysctl_vals - 12 +12 __kstrtab_sysctl_vals - 12 +12 max 14 10 -4 int_max 16 - -16 one 68 - -68 zero 128 28 -100 Total: Before=20583249, After=20583085, chg -0.00% [mcroce@redhat.com: tipc: remove two unused variables] Link: http://lkml.kernel.org/r/20190530091952.4108-1-mcroce@redhat.com [akpm@linux-foundation.org: fix net/ipv6/sysctl_net_ipv6.c] [arnd@arndb.de: proc/sysctl: make firmware loader table conditional] Link: http://lkml.kernel.org/r/20190617130014.1713870-1-arnd@arndb.de [akpm@linux-foundation.org: fix fs/eventpoll.c] Link: http://lkml.kernel.org/r/20190430180111.10688-1-mcroce@redhat.com Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Kees Cook <keescook@chromium.org> Reviewed-by: Aaron Tomlin <atomlin@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16Merge tag 'docs/v5.3-1' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull rst conversion of docs from Mauro Carvalho Chehab: "As agreed with Jon, I'm sending this big series directly to you, c/c him, as this series required a special care, in order to avoid conflicts with other trees" * tag 'docs/v5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (77 commits) docs: kbuild: fix build with pdf and fix some minor issues docs: block: fix pdf output docs: arm: fix a breakage with pdf output docs: don't use nested tables docs: gpio: add sysfs interface to the admin-guide docs: locking: add it to the main index docs: add some directories to the main documentation index docs: add SPDX tags to new index files docs: add a memory-devices subdir to driver-api docs: phy: place documentation under driver-api docs: serial: move it to the driver-api docs: driver-api: add remaining converted dirs to it docs: driver-api: add xilinx driver API documentation docs: driver-api: add a series of orphaned documents docs: admin-guide: add a series of orphaned documents docs: cgroup-v1: add it to the admin-guide book docs: aoe: add it to the driver-api book docs: add some documentation dirs to the driver-api book docs: driver-model: move it to the driver-api book docs: lp855x-driver.rst: add it to the driver-api book ...
2019-07-15LSM: SafeSetID: fix use of literal -1 in capable hookJann Horn1-1/+1
The capable() hook returns an error number. -EPERM is actually the same as -1, so this doesn't make a difference in behavior. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Micah Morton <mortonm@chromium.org>
2019-07-15LSM: SafeSetID: verify transitive constrainednessJann Horn1-1/+37
Someone might write a ruleset like the following, expecting that it securely constrains UID 1 to UIDs 1, 2 and 3: 1:2 1:3 However, because no constraints are applied to UIDs 2 and 3, an attacker with UID 1 can simply first switch to UID 2, then switch to any UID from there. The secure way to write this ruleset would be: 1:2 1:3 2:2 3:3 , which uses "transition to self" as a way to inhibit the default-allow policy without allowing anything specific. This is somewhat unintuitive. To make sure that policy authors don't accidentally write insecure policies because of this, let the kernel verify that a new ruleset does not contain any entries that are constrained, but transitively unconstrained. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Micah Morton <mortonm@chromium.org>
2019-07-15LSM: SafeSetID: add read handlerJann Horn2-4/+32
For debugging a running system, it is very helpful to be able to see what policy the system is using. Add a read handler that can dump out a copy of the loaded policy. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Micah Morton <mortonm@chromium.org>
2019-07-15LSM: SafeSetID: rewrite userspace API to atomic updatesJann Horn3-158/+144
The current API of the SafeSetID LSM uses one write() per rule, and applies each written rule instantly. This has several downsides: - While a policy is being loaded, once a single parent-child pair has been loaded, the parent is restricted to that specific child, even if subsequent rules would allow transitions to other child UIDs. This means that during policy loading, set*uid() can randomly fail. - To replace the policy without rebooting, it is necessary to first flush all old rules. This creates a time window in which no constraints are placed on the use of CAP_SETUID. - If we want to perform sanity checks on the final policy, this requires that the policy isn't constructed in a piecemeal fashion without telling the kernel when it's done. Other kernel APIs - including things like the userns code and netfilter - avoid this problem by performing updates atomically. Luckily, SafeSetID hasn't landed in a stable (upstream) release yet, so maybe it's not too late to completely change the API. The new API for SafeSetID is: If you want to change the policy, open "safesetid/whitelist_policy" and write the entire policy, newline-delimited, in there. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Micah Morton <mortonm@chromium.org>
2019-07-15LSM: SafeSetID: fix userns handling in securityfsJann Horn1-3/+3
Looking at current_cred() in write handlers is bad form, stop doing that. Also, let's just require that the write is coming from the initial user namespace. Especially SAFESETID_WHITELIST_FLUSH requires privilege over all namespaces, and SAFESETID_WHITELIST_ADD should probably require it as well. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Micah Morton <mortonm@chromium.org>
2019-07-15LSM: SafeSetID: refactor policy parsingJann Horn1-51/+33
In preparation for changing the policy parsing logic, refactor the line parsing logic to be less verbose and move it into a separate function. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Micah Morton <mortonm@chromium.org>
2019-07-15LSM: SafeSetID: refactor safesetid_security_capable()Jann Horn1-15/+26
At the moment, safesetid_security_capable() has two nested conditional blocks, and one big comment for all the logic. Chop it up and reduce the amount of indentation. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Micah Morton <mortonm@chromium.org>
2019-07-15LSM: SafeSetID: refactor policy hash tableJann Horn2-44/+37
parent_kuid and child_kuid are kuids, there is no reason to make them uint64_t. (And anyway, in the kernel, the normal name for that would be u64, not uint64_t.) check_setuid_policy_hashtable_key() and check_setuid_policy_hashtable_key_value() are basically the same thing, merge them. Also fix the comment that claimed that (1<<8)==128. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Micah Morton <mortonm@chromium.org>
2019-07-15LSM: SafeSetID: fix check for setresuid(new1, new2, new3)Jann Horn1-90/+35
With the old code, when a process with the (real,effective,saved) UID set (1,1,1) calls setresuid(2,3,4), safesetid_task_fix_setuid() only checks whether the transition 1->2 is permitted; the transitions 1->3 and 1->4 are not checked. Fix this. This is also a good opportunity to refactor safesetid_task_fix_setuid() to be less verbose - having one branch per set*uid() syscall is unnecessary. Note that this slightly changes semantics: The UID transition check for UIDs that were not in the old cred struct is now always performed against the policy of the RUID. I think that's more consistent anyway, since the RUID is also the one that decides whether any policy is enforced at all. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Micah Morton <mortonm@chromium.org>
2019-07-15LSM: SafeSetID: fix pr_warn() to include newlineJann Horn1-2/+2
Fix the pr_warn() calls in the SafeSetID LSM to have newlines at the end. Without this, denial messages will be buffered as incomplete lines in log_output(), and will then only show up once something else prints into dmesg. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Micah Morton <mortonm@chromium.org>
2019-07-15docs: cgroup-v1: add it to the admin-guide bookMauro Carvalho Chehab1-1/+1
Those files belong to the admin guide, so add them. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15docs: x86: move two x86-specific files to x86 arch dirMauro Carvalho Chehab1-1/+1
Those two docs belong to the x86 architecture: Documentation/Intel-IOMMU.txt -> Documentation/x86/intel-iommu.rst Documentation/intel_txt.txt -> Documentation/x86/intel_txt.rst Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-12Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-0/+29
Merge updates from Andrew Morton: "Am experimenting with splitting MM up into identifiable subsystems perhaps with a view to gitifying it in complex ways. Also with more verbose "incoming" emails. Most of MM is here and a few other trees. Subsystems affected by this patch series: - hotfixes - iommu - scripts - arch/sh - ocfs2 - mm:slab-generic - mm:slub - mm:kmemleak - mm:kasan - mm:cleanups - mm:debug - mm:pagecache - mm:swap - mm:memcg - mm:gup - mm:pagemap - mm:infrastructure - mm:vmalloc - mm:initialization - mm:pagealloc - mm:vmscan - mm:tools - mm:proc - mm:ras - mm:oom-kill hotfixes: mm: vmscan: scan anonymous pages on file refaults mm/nvdimm: add is_ioremap_addr and use that to check ioremap address mm/memcontrol: fix wrong statistics in memory.stat mm/z3fold.c: lock z3fold page before __SetPageMovable() nilfs2: do not use unexported cpu_to_le32()/le32_to_cpu() in uapi header MAINTAINERS: nilfs2: update email address iommu: include/linux/dmar.h: replace single-char identifiers in macros scripts: scripts/decode_stacktrace: match basepath using shell prefix operator, not regex scripts/decode_stacktrace: look for modules with .ko.debug extension scripts/spelling.txt: drop "sepc" from the misspelling list scripts/spelling.txt: add spelling fix for prohibited scripts/decode_stacktrace: Accept dash/underscore in modules scripts/spelling.txt: add more spellings to spelling.txt arch/sh: arch/sh/configs/sdk7786_defconfig: remove CONFIG_LOGFS sh: config: remove left-over BACKLIGHT_LCD_SUPPORT sh: prevent warnings when using iounmap ocfs2: fs: ocfs: fix spelling mistake "hearbeating" -> "heartbeat" ocfs2/dlm: use struct_size() helper ocfs2: add last unlock times in locking_state ocfs2: add locking filter debugfs file ocfs2: add first lock wait time in locking_state ocfs: no need to check return value of debugfs_create functions fs/ocfs2/dlmglue.c: unneeded variable: "status" ocfs2: use kmemdup rather than duplicating its implementation mm:slab-generic: Patch series "mm/slab: Improved sanity checking": mm/slab: validate cache membership under freelist hardening mm/slab: sanity-check page type when looking up cache lkdtm/heap: add tests for freelist hardening mm:slub: mm/slub.c: avoid double string traverse in kmem_cache_flags() slub: don't panic for memcg kmem cache creation failure mm:kmemleak: mm/kmemleak.c: fix check for softirq context mm/kmemleak.c: change error at _write when kmemleak is disabled docs: kmemleak: add more documentation details mm:kasan: mm/kasan: print frame description for stack bugs Patch series "Bitops instrumentation for KASAN", v5: lib/test_kasan: add bitops tests x86: use static_cpu_has in uaccess region to avoid instrumentation asm-generic, x86: add bitops instrumentation for KASAN Patch series "mm/kasan: Add object validation in ksize()", v3: mm/kasan: introduce __kasan_check_{read,write} mm/kasan: change kasan_check_{read,write} to return boolean lib/test_kasan: Add test for double-kzfree detection mm/slab: refactor common ksize KASAN logic into slab_common.c mm/kasan: add object validation in ksize() mm:cleanups: include/linux/pfn_t.h: remove pfn_t_to_virt() Patch series "remove ARCH_SELECT_MEMORY_MODEL where it has no effect": arm: remove ARCH_SELECT_MEMORY_MODEL s390: remove ARCH_SELECT_MEMORY_MODEL sparc: remove ARCH_SELECT_MEMORY_MODEL mm/gup.c: make follow_page_mask() static mm/memory.c: trivial clean up in insert_page() mm: make !CONFIG_HUGE_PAGE wrappers into static inlines include/linux/mm_types.h: ifdef struct vm_area_struct::swap_readahead_info mm: remove the account_page_dirtied export mm/page_isolation.c: change the prototype of undo_isolate_page_range() include/linux/vmpressure.h: use spinlock_t instead of struct spinlock mm: remove the exporting of totalram_pages include/linux/pagemap.h: document trylock_page() return value mm:debug: mm/failslab.c: by default, do not fail allocations with direct reclaim only Patch series "debug_pagealloc improvements": mm, debug_pagelloc: use static keys to enable debugging mm, page_alloc: more extensive free page checking with debug_pagealloc mm, debug_pagealloc: use a page type instead of page_ext flag mm:pagecache: Patch series "fix filler_t callback type mismatches", v2: mm/filemap.c: fix an overly long line in read_cache_page mm/filemap: don't cast ->readpage to filler_t for do_read_cache_page jffs2: pass the correct prototype to read_cache_page 9p: pass the correct prototype to read_cache_page mm/filemap.c: correct the comment about VM_FAULT_RETRY mm:swap: mm, swap: fix race between swapoff and some swap operations mm/swap_state.c: simplify total_swapcache_pages() with get_swap_device() mm, swap: use rbtree for swap_extent mm/mincore.c: fix race between swapoff and mincore mm:memcg: memcg, oom: no oom-kill for __GFP_RETRY_MAYFAIL memcg, fsnotify: no oom-kill for remote memcg charging mm, memcg: introduce memory.events.local mm: memcontrol: dump memory.stat during cgroup OOM Patch series "mm: reparent slab memory on cgroup removal", v7: mm: memcg/slab: postpone kmem_cache memcg pointer initialization to memcg_link_cache() mm: memcg/slab: rename slab delayed deactivation functions and fields mm: memcg/slab: generalize postponed non-root kmem_cache deactivation mm: memcg/slab: introduce __memcg_kmem_uncharge_memcg() mm: memcg/slab: unify SLAB and SLUB page accounting mm: memcg/slab: don't check the dying flag on kmem_cache creation mm: memcg/slab: synchronize access to kmem_cache dying flag using a spinlock mm: memcg/slab: rework non-root kmem_cache lifecycle management mm: memcg/slab: stop setting page->mem_cgroup pointer for slab pages mm: memcg/slab: reparent memcg kmem_caches on cgroup removal mm, memcg: add a memcg_slabinfo debugfs file mm:gup: Patch series "switch the remaining architectures to use generic GUP", v4: mm: use untagged_addr() for get_user_pages_fast addresses mm: simplify gup_fast_permitted mm: lift the x86_32 PAE version of gup_get_pte to common code MIPS: use the generic get_user_pages_fast code sh: add the missing pud_page definition sh: use the generic get_user_pages_fast code sparc64: add the missing pgd_page definition sparc64: define untagged_addr() sparc64: use the generic get_user_pages_fast code mm: rename CONFIG_HAVE_GENERIC_GUP to CONFIG_HAVE_FAST_GUP mm: reorder code blocks in gup.c mm: consolidate the get_user_pages* implementations mm: validate get_user_pages_fast flags mm: move the powerpc hugepd code to mm/gup.c mm: switch gup_hugepte to use try_get_compound_head mm: mark the page referenced in gup_hugepte mm/gup: speed up check_and_migrate_cma_pages() on huge page mm/gup.c: remove some BUG_ONs from get_gate_page() mm/gup.c: mark undo_dev_pagemap as __maybe_unused mm:pagemap: asm-generic, x86: introduce generic pte_{alloc,free}_one[_kernel] alpha: switch to generic version of pte allocation arm: switch to generic version of pte allocation arm64: switch to generic version of pte allocation csky: switch to generic version of pte allocation m68k: sun3: switch to generic version of pte allocation mips: switch to generic version of pte allocation nds32: switch to generic version of pte allocation nios2: switch to generic version of pte allocation parisc: switch to generic version of pte allocation riscv: switch to generic version of pte allocation um: switch to generic version of pte allocation unicore32: switch to generic version of pte allocation mm/pgtable: drop pgtable_t variable from pte_fn_t functions mm/memory.c: fail when offset == num in first check of __vm_map_pages() mm:infrastructure: mm/mmu_notifier: use hlist_add_head_rcu() mm:vmalloc: Patch series "Some cleanups for the KVA/vmalloc", v5: mm/vmalloc.c: remove "node" argument mm/vmalloc.c: preload a CPU with one object for split purpose mm/vmalloc.c: get rid of one single unlink_va() when merge mm/vmalloc.c: switch to WARN_ON() and move it under unlink_va() mm/vmalloc.c: spelling> s/informaion/information/ mm:initialization: mm/large system hash: use vmalloc for size > MAX_ORDER when !hashdist mm/large system hash: clear hashdist when only one node with memory is booted mm:pagealloc: arm64: move jump_label_init() before parse_early_param() Patch series "add init_on_alloc/init_on_free boot options", v10: mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options mm: init: report memory auto-initialization features at boot time mm:vmscan: mm: vmscan: remove double slab pressure by inc'ing sc->nr_scanned mm: vmscan: correct some vmscan counters for THP swapout mm:tools: tools/vm/slabinfo: order command line options tools/vm/slabinfo: add partial slab listing to -X tools/vm/slabinfo: add option to sort by partial slabs tools/vm/slabinfo: add sorting info to help menu mm:proc: proc: use down_read_killable mmap_sem for /proc/pid/maps proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup proc: use down_read_killable mmap_sem for /proc/pid/pagemap proc: use down_read_killable mmap_sem for /proc/pid/clear_refs proc: use down_read_killable mmap_sem for /proc/pid/map_files mm: use down_read_killable for locking mmap_sem in access_remote_vm mm: smaps: split PSS into components mm: vmalloc: show number of vmalloc pages in /proc/meminfo mm:ras: mm/memory-failure.c: clarify error message mm:oom-kill: mm: memcontrol: use CSS_TASK_ITER_PROCS at mem_cgroup_scan_tasks() mm, oom: refactor dump_tasks for memcg OOMs mm, oom: remove redundant task_in_mem_cgroup() check oom: decouple mems_allowed from oom_unkillable_task mm/oom_kill.c: remove redundant OOM score normalization in select_bad_process()" * akpm: (147 commits) mm/oom_kill.c: remove redundant OOM score normalization in select_bad_process() oom: decouple mems_allowed from oom_unkillable_task mm, oom: remove redundant task_in_mem_cgroup() check mm, oom: refactor dump_tasks for memcg OOMs mm: memcontrol: use CSS_TASK_ITER_PROCS at mem_cgroup_scan_tasks() mm/memory-failure.c: clarify error message mm: vmalloc: show number of vmalloc pages in /proc/meminfo mm: smaps: split PSS into components mm: use down_read_killable for locking mmap_sem in access_remote_vm proc: use down_read_killable mmap_sem for /proc/pid/map_files proc: use down_read_killable mmap_sem for /proc/pid/clear_refs proc: use down_read_killable mmap_sem for /proc/pid/pagemap proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup proc: use down_read_killable mmap_sem for /proc/pid/maps tools/vm/slabinfo: add sorting info to help menu tools/vm/slabinfo: add option to sort by partial slabs tools/vm/slabinfo: add partial slab listing to -X tools/vm/slabinfo: order command line options mm: vmscan: correct some vmscan counters for THP swapout mm: vmscan: remove double slab pressure by inc'ing sc->nr_scanned ...
2019-07-12mm: security: introduce init_on_alloc=1 and init_on_free=1 boot optionsAlexander Potapenko1-0/+29
Patch series "add init_on_alloc/init_on_free boot options", v10. Provide init_on_alloc and init_on_free boot options. These are aimed at preventing possible information leaks and making the control-flow bugs that depend on uninitialized values more deterministic. Enabling either of the options guarantees that the memory returned by the page allocator and SL[AU]B is initialized with zeroes. SLOB allocator isn't supported at the moment, as its emulation of kmem caches complicates handling of SLAB_TYPESAFE_BY_RCU caches correctly. Enabling init_on_free also guarantees that pages and heap objects are initialized right after they're freed, so it won't be possible to access stale data by using a dangling pointer. As suggested by Michal Hocko, right now we don't let the heap users to disable initialization for certain allocations. There's not enough evidence that doing so can speed up real-life cases, and introducing ways to opt-out may result in things going out of control. This patch (of 2): The new options are needed to prevent possible information leaks and make control-flow bugs that depend on uninitialized values more deterministic. This is expected to be on-by-default on Android and Chrome OS. And it gives the opportunity for anyone else to use it under distros too via the boot args. (The init_on_free feature is regularly requested by folks where memory forensics is included in their threat models.) init_on_alloc=1 makes the kernel initialize newly allocated pages and heap objects with zeroes. Initialization is done at allocation time at the places where checks for __GFP_ZERO are performed. init_on_free=1 makes the kernel initialize freed pages and heap objects with zeroes upon their deletion. This helps to ensure sensitive data doesn't leak via use-after-free accesses. Both init_on_alloc=1 and init_on_free=1 guarantee that the allocator returns zeroed memory. The two exceptions are slab caches with constructors and SLAB_TYPESAFE_BY_RCU flag. Those are never zero-initialized to preserve their semantics. Both init_on_alloc and init_on_free default to zero, but those defaults can be overridden with CONFIG_INIT_ON_ALLOC_DEFAULT_ON and CONFIG_INIT_ON_FREE_DEFAULT_ON. If either SLUB poisoning or page poisoning is enabled, those options take precedence over init_on_alloc and init_on_free: initialization is only applied to unpoisoned allocations. Slowdown for the new features compared to init_on_free=0, init_on_alloc=0: hackbench, init_on_free=1: +7.62% sys time (st.err 0.74%) hackbench, init_on_alloc=1: +7.75% sys time (st.err 2.14%) Linux build with -j12, init_on_free=1: +8.38% wall time (st.err 0.39%) Linux build with -j12, init_on_free=1: +24.42% sys time (st.err 0.52%) Linux build with -j12, init_on_alloc=1: -0.13% wall time (st.err 0.42%) Linux build with -j12, init_on_alloc=1: +0.57% sys time (st.err 0.40%) The slowdown for init_on_free=0, init_on_alloc=0 compared to the baseline is within the standard error. The new features are also going to pave the way for hardware memory tagging (e.g. arm64's MTE), which will require both on_alloc and on_free hooks to set the tags for heap objects. With MTE, tagging will have the same cost as memory initialization. Although init_on_free is rather costly, there are paranoid use-cases where in-memory data lifetime is desired to be minimized. There are various arguments for/against the realism of the associated threat models, but given that we'll need the infrastructure for MTE anyway, and there are people who want wipe-on-free behavior no matter what the performance cost, it seems reasonable to include it in this series. [glider@google.com: v8] Link: http://lkml.kernel.org/r/20190626121943.131390-2-glider@google.com [glider@google.com: v9] Link: http://lkml.kernel.org/r/20190627130316.254309-2-glider@google.com [glider@google.com: v10] Link: http://lkml.kernel.org/r/20190628093131.199499-2-glider@google.com Link: http://lkml.kernel.org/r/20190617151050.92663-2-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Michal Hocko <mhocko@suse.cz> [page and dmapool parts Acked-by: James Morris <jamorris@linux.microsoft.com>] Cc: Christoph Lameter <cl@linux.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Sandeep Patil <sspatil@android.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Jann Horn <jannh@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-11Merge tag 'loadpin-v5.3-rc1' of ↵Linus Torvalds1-0/+48
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull security/loadpin updates from Kees Cook: - Allow exclusion of specific file types (Ke Wu) * tag 'loadpin-v5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: security/loadpin: Allow to exclude specific file types
2019-07-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds1-1/+4
Pull networking updates from David Miller: "Some highlights from this development cycle: 1) Big refactoring of ipv6 route and neigh handling to support nexthop objects configurable as units from userspace. From David Ahern. 2) Convert explored_states in BPF verifier into a hash table, significantly decreased state held for programs with bpf2bpf calls, from Alexei Starovoitov. 3) Implement bpf_send_signal() helper, from Yonghong Song. 4) Various classifier enhancements to mvpp2 driver, from Maxime Chevallier. 5) Add aRFS support to hns3 driver, from Jian Shen. 6) Fix use after free in inet frags by allocating fqdirs dynamically and reworking how rhashtable dismantle occurs, from Eric Dumazet. 7) Add act_ctinfo packet classifier action, from Kevin Darbyshire-Bryant. 8) Add TFO key backup infrastructure, from Jason Baron. 9) Remove several old and unused ISDN drivers, from Arnd Bergmann. 10) Add devlink notifications for flash update status to mlxsw driver, from Jiri Pirko. 11) Lots of kTLS offload infrastructure fixes, from Jakub Kicinski. 12) Add support for mv88e6250 DSA chips, from Rasmus Villemoes. 13) Various enhancements to ipv6 flow label handling, from Eric Dumazet and Willem de Bruijn. 14) Support TLS offload in nfp driver, from Jakub Kicinski, Dirk van der Merwe, and others. 15) Various improvements to axienet driver including converting it to phylink, from Robert Hancock. 16) Add PTP support to sja1105 DSA driver, from Vladimir Oltean. 17) Add mqprio qdisc offload support to dpaa2-eth, from Ioana Radulescu. 18) Add devlink health reporting to mlx5, from Moshe Shemesh. 19) Convert stmmac over to phylink, from Jose Abreu. 20) Add PTP PHC (Physical Hardware Clock) support to mlxsw, from Shalom Toledo. 21) Add nftables SYNPROXY support, from Fernando Fernandez Mancera. 22) Convert tcp_fastopen over to use SipHash, from Ard Biesheuvel. 23) Track spill/fill of constants in BPF verifier, from Alexei Starovoitov. 24) Support bounded loops in BPF, from Alexei Starovoitov. 25) Various page_pool API fixes and improvements, from Jesper Dangaard Brouer. 26) Just like ipv4, support ref-countless ipv6 route handling. From Wei Wang. 27) Support VLAN offloading in aquantia driver, from Igor Russkikh. 28) Add AF_XDP zero-copy support to mlx5, from Maxim Mikityanskiy. 29) Add flower GRE encap/decap support to nfp driver, from Pieter Jansen van Vuuren. 30) Protect against stack overflow when using act_mirred, from John Hurley. 31) Allow devmap map lookups from eBPF, from Toke Høiland-Jørgensen. 32) Use page_pool API in netsec driver, Ilias Apalodimas. 33) Add Google gve network driver, from Catherine Sullivan. 34) More indirect call avoidance, from Paolo Abeni. 35) Add kTLS TX HW offload support to mlx5, from Tariq Toukan. 36) Add XDP_REDIRECT support to bnxt_en, from Andy Gospodarek. 37) Add MPLS manipulation actions to TC, from John Hurley. 38) Add sending a packet to connection tracking from TC actions, and then allow flower classifier matching on conntrack state. From Paul Blakey. 39) Netfilter hw offload support, from Pablo Neira Ayuso" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2080 commits) net/mlx5e: Return in default case statement in tx_post_resync_params mlx5: Return -EINVAL when WARN_ON_ONCE triggers in mlx5e_tls_resync(). net: dsa: add support for BRIDGE_MROUTER attribute pkt_sched: Include const.h net: netsec: remove static declaration for netsec_set_tx_de() net: netsec: remove superfluous if statement netfilter: nf_tables: add hardware offload support net: flow_offload: rename tc_cls_flower_offload to flow_cls_offload net: flow_offload: add flow_block_cb_is_busy() and use it net: sched: remove tcf block API drivers: net: use flow block API net: sched: use flow block API net: flow_offload: add flow_block_cb_{priv, incref, decref}() net: flow_offload: add list handling functions net: flow_offload: add flow_block_cb_alloc() and flow_block_cb_free() net: flow_offload: rename TCF_BLOCK_BINDER_TYPE_* to FLOW_BLOCK_BINDER_TYPE_* net: flow_offload: rename TC_BLOCK_{UN}BIND to FLOW_BLOCK_{UN}BIND net: flow_offload: add flow_block_cb_setup_simple() net: hisilicon: Add an tx_desc to adapt HI13X1_GMAC net: hisilicon: Add an rx_desc to adapt HI13X1_GMAC ...
2019-07-10Revert "Merge tag 'keys-acl-20190703' of ↵Linus Torvalds22-629/+187
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs" This reverts merge 0f75ef6a9cff49ff612f7ce0578bced9d0b38325 (and thus effectively commits 7a1ade847596 ("keys: Provide KEYCTL_GRANT_PERMISSION") 2e12256b9a76 ("keys: Replace uid/gid/perm permissions checking with an ACL") that the merge brought in). It turns out that it breaks booting with an encrypted volume, and Eric biggers reports that it also breaks the fscrypt tests [1] and loading of in-kernel X.509 certificates [2]. The root cause of all the breakage is likely the same, but David Howells is off email so rather than try to work it out it's getting reverted in order to not impact the rest of the merge window. [1] https://lore.kernel.org/lkml/20190710011559.GA7973@sol.localdomain/ [2] https://lore.kernel.org/lkml/20190710013225.GB7973@sol.localdomain/ Link: https://lore.kernel.org/lkml/CAHk-=wjxoeMJfeBahnWH=9zShKp2bsVy527vo3_y8HfOdhwAAw@mail.gmail.com/ Reported-by: Eric Biggers <ebiggers@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-09Merge tag 'docs-5.3' of git://git.lwn.net/linuxLinus Torvalds1-1/+1
Pull Documentation updates from Jonathan Corbet: "It's been a relatively busy cycle for docs: - A fair pile of RST conversions, many from Mauro. These create more than the usual number of simple but annoying merge conflicts with other trees, unfortunately. He has a lot more of these waiting on the wings that, I think, will go to you directly later on. - A new document on how to use merges and rebases in kernel repos, and one on Spectre vulnerabilities. - Various improvements to the build system, including automatic markup of function() references because some people, for reasons I will never understand, were of the opinion that :c:func:``function()`` is unattractive and not fun to type. - We now recommend using sphinx 1.7, but still support back to 1.4. - Lots of smaller improvements, warning fixes, typo fixes, etc" * tag 'docs-5.3' of git://git.lwn.net/linux: (129 commits) docs: automarkup.py: ignore exceptions when seeking for xrefs docs: Move binderfs to admin-guide Disable Sphinx SmartyPants in HTML output doc: RCU callback locks need only _bh, not necessarily _irq docs: format kernel-parameters -- as code Doc : doc-guide : Fix a typo platform: x86: get rid of a non-existent document Add the RCU docs to the core-api manual Documentation: RCU: Add TOC tree hooks Documentation: RCU: Rename txt files to rst Documentation: RCU: Convert RCU UP systems to reST Documentation: RCU: Convert RCU linked list to reST Documentation: RCU: Convert RCU basic concepts to reST docs: filesystems: Remove uneeded .rst extension on toctables scripts/sphinx-pre-install: fix out-of-tree build docs: zh_CN: submitting-drivers.rst: Remove a duplicated Documentation/ Documentation: PGP: update for newer HW devices Documentation: Add section about CPU vulnerabilities for Spectre Documentation: platform: Delete x86-laptop-drivers.txt docs: Note that :c:func: should no longer be used ...
2019-07-09Merge branch 'next-lsm' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull capabilities update from James Morris: "Minor fixes for capabilities: - Update the commoncap.c code to utilize XATTR_SECURITY_PREFIX_LEN, from Carmeli tamir. - Make the capability hooks static, from Yue Haibing" * 'next-lsm' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: security/commoncap: Use xattr security prefix len security: Make capability_hooks static
2019-07-08Merge branch 'siginfo-linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull force_sig() argument change from Eric Biederman: "A source of error over the years has been that force_sig has taken a task parameter when it is only safe to use force_sig with the current task. The force_sig function is built for delivering synchronous signals such as SIGSEGV where the userspace application caused a synchronous fault (such as a page fault) and the kernel responded with a signal. Because the name force_sig does not make this clear, and because the force_sig takes a task parameter the function force_sig has been abused for sending other kinds of signals over the years. Slowly those have been fixed when the oopses have been tracked down. This set of changes fixes the remaining abusers of force_sig and carefully rips out the task parameter from force_sig and friends making this kind of error almost impossible in the future" * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (27 commits) signal/x86: Move tsk inside of CONFIG_MEMORY_FAILURE in do_sigbus signal: Remove the signal number and task parameters from force_sig_info signal: Factor force_sig_info_to_task out of force_sig_info signal: Generate the siginfo in force_sig signal: Move the computation of force into send_signal and correct it. signal: Properly set TRACE_SIGNAL_LOSE_INFO in __send_signal signal: Remove the task parameter from force_sig_fault signal: Use force_sig_fault_to_task for the two calls that don't deliver to current signal: Explicitly call force_sig_fault on current signal/unicore32: Remove tsk parameter from __do_user_fault signal/arm: Remove tsk parameter from __do_user_fault signal/arm: Remove tsk parameter from ptrace_break signal/nds32: Remove tsk parameter from send_sigtrap signal/riscv: Remove tsk parameter from do_trap signal/sh: Remove tsk parameter from force_sig_info_fault signal/um: Remove task parameter from send_sigtrap signal/x86: Remove task parameter from send_sigtrap signal: Remove task parameter from force_sig_mceerr signal: Remove task parameter from force_sig signal: Remove task parameter from force_sigsegv ...
2019-07-08Merge branch 'for-5.3' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "Documentation updates and the addition of cgroup_parse_float() which will be used by new controllers including blk-iocost" * 'for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: docs: cgroup-v1: convert docs to ReST and rename to *.rst cgroup: Move cgroup_parse_float() implementation out of CONFIG_SYSFS cgroup: add cgroup_parse_float()
2019-07-08Merge branch 'next-integrity' of ↵Linus Torvalds16-79/+378
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity updates from Mimi Zohar: "Bug fixes, code clean up, and new features: - IMA policy rules can be defined in terms of LSM labels, making the IMA policy dependent on LSM policy label changes, in particular LSM label deletions. The new environment, in which IMA-appraisal is being used, frequently updates the LSM policy and permits LSM label deletions. - Prevent an mmap'ed shared file opened for write from also being mmap'ed execute. In the long term, making this and other similar changes at the VFS layer would be preferable. - The IMA per policy rule template format support is needed for a couple of new/proposed features (eg. kexec boot command line measurement, appended signatures, and VFS provided file hashes). - Other than the "boot-aggregate" record in the IMA measuremeent list, all other measurements are of file data. Measuring and storing the kexec boot command line in the IMA measurement list is the first buffer based measurement included in the measurement list" * 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: integrity: Introduce struct evm_xattr ima: Update MAX_TEMPLATE_NAME_LEN to fit largest reasonable definition KEXEC: Call ima_kexec_cmdline to measure the boot command line args IMA: Define a new template field buf IMA: Define a new hook to measure the kexec boot command line arguments IMA: support for per policy rule template formats integrity: Fix __integrity_init_keyring() section mismatch ima: Use designated initializers for struct ima_event_data ima: use the lsm policy update notifier LSM: switch to blocking policy update notifiers x86/ima: fix the Kconfig dependency for IMA_ARCH_POLICY ima: Make arch_policy_entry static ima: prevent a file already mmap'ed write to be mmap'ed execute x86/ima: check EFI SetupMode too
2019-07-08Merge tag 'keys-acl-20190703' of ↵Linus Torvalds22-187/+629
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull keyring ACL support from David Howells: "This changes the permissions model used by keys and keyrings to be based on an internal ACL by the following means: - Replace the permissions mask internally with an ACL that contains a list of ACEs, each with a specific subject with a permissions mask. Potted default ACLs are available for new keys and keyrings. ACE subjects can be macroised to indicate the UID and GID specified on the key (which remain). Future commits will be able to add additional subject types, such as specific UIDs or domain tags/namespaces. Also split a number of permissions to give finer control. Examples include splitting the revocation permit from the change-attributes permit, thereby allowing someone to be granted permission to revoke a key without allowing them to change the owner; also the ability to join a keyring is split from the ability to link to it, thereby stopping a process accessing a keyring by joining it and thus acquiring use of possessor permits. - Provide a keyctl to allow the granting or denial of one or more permits to a specific subject. Direct access to the ACL is not granted, and the ACL cannot be viewed" * tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: keys: Provide KEYCTL_GRANT_PERMISSION keys: Replace uid/gid/perm permissions checking with an ACL
2019-07-08Merge tag 'keys-namespace-20190627' of ↵Linus Torvalds11-243/+389
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull keyring namespacing from David Howells: "These patches help make keys and keyrings more namespace aware. Firstly some miscellaneous patches to make the process easier: - Simplify key index_key handling so that the word-sized chunks assoc_array requires don't have to be shifted about, making it easier to add more bits into the key. - Cache the hash value in the key so that we don't have to calculate on every key we examine during a search (it involves a bunch of multiplications). - Allow keying_search() to search non-recursively. Then the main patches: - Make it so that keyring names are per-user_namespace from the point of view of KEYCTL_JOIN_SESSION_KEYRING so that they're not accessible cross-user_namespace. keyctl_capabilities() shows KEYCTL_CAPS1_NS_KEYRING_NAME for this. - Move the user and user-session keyrings to the user_namespace rather than the user_struct. This prevents them propagating directly across user_namespaces boundaries (ie. the KEY_SPEC_* flags will only pick from the current user_namespace). - Make it possible to include the target namespace in which the key shall operate in the index_key. This will allow the possibility of multiple keys with the same description, but different target domains to be held in the same keyring. keyctl_capabilities() shows KEYCTL_CAPS1_NS_KEY_TAG for this. - Make it so that keys are implicitly invalidated by removal of a domain tag, causing them to be garbage collected. - Institute a network namespace domain tag that allows keys to be differentiated by the network namespace in which they operate. New keys that are of a type marked 'KEY_TYPE_NET_DOMAIN' are assigned the network domain in force when they are created. - Make it so that the desired network namespace can be handed down into the request_key() mechanism. This allows AFS, NFS, etc. to request keys specific to the network namespace of the superblock. This also means that the keys in the DNS record cache are thenceforth namespaced, provided network filesystems pass the appropriate network namespace down into dns_query(). For DNS, AFS and NFS are good, whilst CIFS and Ceph are not. Other cache keyrings, such as idmapper keyrings, also need to set the domain tag - for which they need access to the network namespace of the superblock" * tag 'keys-namespace-20190627' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: keys: Pass the network namespace into request_key mechanism keys: Network namespace domain tag keys: Garbage collect keys for which the domain has been removed keys: Include target namespace in match criteria keys: Move the user and user-session keyrings to the user_namespace keys: Namespace keyring names keys: Add a 'recurse' flag for keyring searches keys: Cache the hash value to avoid lots of recalculation keys: Simplify key description management
2019-07-08Merge tag 'keys-request-20190626' of ↵Linus Torvalds8-106/+180
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull request_key improvements from David Howells: "These are all request_key()-related, including a fix and some improvements: - Fix the lack of a Link permission check on a key found by request_key(), thereby enabling request_key() to link keys that don't grant this permission to the target keyring (which must still grant Write permission). Note that the key must be in the caller's keyrings already to be found. - Invalidate used request_key authentication keys rather than revoking them, so that they get cleaned up immediately rather than hanging around till the expiry time is passed. - Move the RCU locks outwards from the keyring search functions so that a request_key_rcu() can be provided. This can be called in RCU mode, so it can't sleep and can't upcall - but it can be called from LOOKUP_RCU pathwalk mode. - Cache the latest positive result of request_key*() temporarily in task_struct so that filesystems that make a lot of request_key() calls during pathwalk can take advantage of it to avoid having to redo the searching. This requires CONFIG_KEYS_REQUEST_CACHE=y. It is assumed that the key just found is likely to be used multiple times in each step in an RCU pathwalk, and is likely to be reused for the next step too. Note that the cleanup of the cache is done on TIF_NOTIFY_RESUME, just before userspace resumes, and on exit" * tag 'keys-request-20190626' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: keys: Kill off request_key_async{,_with_auxdata} keys: Cache result of request_key*() temporarily in task_struct keys: Provide request_key_rcu() keys: Move the RCU locks outwards from the keyring search functions keys: Invalidate used request_key authentication keys keys: Fix request_key() lack of Link perm check on found key
2019-07-08Merge tag 'keys-misc-20190619' of ↵Linus Torvalds8-78/+369
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull misc keyring updates from David Howells: "These are some miscellaneous keyrings fixes and improvements: - Fix a bunch of warnings from sparse, including missing RCU bits and kdoc-function argument mismatches - Implement a keyctl to allow a key to be moved from one keyring to another, with the option of prohibiting key replacement in the destination keyring. - Grant Link permission to possessors of request_key_auth tokens so that upcall servicing daemons can more easily arrange things such that only the necessary auth key is passed to the actual service program, and not all the auth keys a daemon might possesss. - Improvement in lookup_user_key(). - Implement a keyctl to allow keyrings subsystem capabilities to be queried. The keyutils next branch has commits to make available, document and test the move-key and capabilities code: https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/log They're currently on the 'next' branch" * tag 'keys-misc-20190619' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: keys: Add capability-checking keyctl function keys: Reuse keyring_index_key::desc_len in lookup_user_key() keys: Grant Link permission to possessers of request_key auth keys keys: Add a keyctl to move a key between keyrings keys: Hoist locking out of __key_link_begin() keys: Break bits out of key_unlink() keys: Change keyring_serialise_link_sem to a mutex keys: sparse: Fix kdoc mismatches keys: sparse: Fix incorrect RCU accesses keys: sparse: Fix key_fs[ug]id_changed()
2019-07-08Merge tag 'selinux-pr-20190702' of ↵Linus Torvalds3-23/+31
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: "Like the audit pull request this is a little early due to some upcoming vacation plans and uncertain network access while I'm away. Also like the audit PR, the list of patches here is pretty minor, the highlights include: - Explicitly use __le variables to make sure "sparse" can verify proper byte endian handling. - Remove some BUG_ON()s that are no longer needed. - Allow zero-byte writes to the "keycreate" procfs attribute without requiring key:create to make it easier for userspace to reset the keycreate label. - Consistently log the "invalid_context" field as an untrusted string in the AUDIT_SELINUX_ERR audit records" * tag 'selinux-pr-20190702' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: format all invalid context as untrusted selinux: fix empty write to keycreate file selinux: remove some no-op BUG_ONs selinux: provide __le variables explicitly
2019-07-08Merge branch 'locking-core-for-linus' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle are: - rwsem scalability improvements, phase #2, by Waiman Long, which are rather impressive: "On a 2-socket 40-core 80-thread Skylake system with 40 reader and writer locking threads, the min/mean/max locking operations done in a 5-second testing window before the patchset were: 40 readers, Iterations Min/Mean/Max = 1,807/1,808/1,810 40 writers, Iterations Min/Mean/Max = 1,807/50,344/151,255 After the patchset, they became: 40 readers, Iterations Min/Mean/Max = 30,057/31,359/32,741 40 writers, Iterations Min/Mean/Max = 94,466/95,845/97,098" There's a lot of changes to the locking implementation that makes it similar to qrwlock, including owner handoff for more fair locking. Another microbenchmark shows how across the spectrum the improvements are: "With a locking microbenchmark running on 5.1 based kernel, the total locking rates (in kops/s) on a 2-socket Skylake system with equal numbers of readers and writers (mixed) before and after this patchset were: # of Threads Before Patch After Patch ------------ ------------ ----------- 2 2,618 4,193 4 1,202 3,726 8 802 3,622 16 729 3,359 32 319 2,826 64 102 2,744" The changes are extensive and the patch-set has been through several iterations addressing various locking workloads. There might be more regressions, but unless they are pathological I believe we want to use this new implementation as the baseline going forward. - jump-label optimizations by Daniel Bristot de Oliveira: the primary motivation was to remove IPI disturbance of isolated RT-workload CPUs, which resulted in the implementation of batched jump-label updates. Beyond the improvement of the real-time characteristics kernel, in one test this patchset improved static key update overhead from 57 msecs to just 1.4 msecs - which is a nice speedup as well. - atomic64_t cross-arch type cleanups by Mark Rutland: over the last ~10 years of atomic64_t existence the various types used by the APIs only had to be self-consistent within each architecture - which means they became wildly inconsistent across architectures. Mark puts and end to this by reworking all the atomic64 implementations to use 's64' as the base type for atomic64_t, and to ensure that this type is consistently used for parameters and return values in the API, avoiding further problems in this area. - A large set of small improvements to lockdep by Yuyang Du: type cleanups, output cleanups, function return type and othr cleanups all around the place. - A set of percpu ops cleanups and fixes by Peter Zijlstra. - Misc other changes - please see the Git log for more details" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (82 commits) locking/lockdep: increase size of counters for lockdep statistics locking/atomics: Use sed(1) instead of non-standard head(1) option locking/lockdep: Move mark_lock() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING x86/jump_label: Make tp_vec_nr static x86/percpu: Optimize raw_cpu_xchg() x86/percpu, sched/fair: Avoid local_clock() x86/percpu, x86/irq: Relax {set,get}_irq_regs() x86/percpu: Relax smp_processor_id() x86/percpu: Differentiate this_cpu_{}() and __this_cpu_{}() locking/rwsem: Guard against making count negative locking/rwsem: Adaptive disabling of reader optimistic spinning locking/rwsem: Enable time-based spinning on reader-owned rwsem locking/rwsem: Make rwsem->owner an atomic_long_t locking/rwsem: Enable readers spinning on writer locking/rwsem: Clarify usage of owner's nonspinaable bit locking/rwsem: Wake up almost all readers in wait queue locking/rwsem: More optimal RT task handling of null owner locking/rwsem: Always release wait_lock before waking up tasks locking/rwsem: Implement lock handoff to prevent lock starvation locking/rwsem: Make rwsem_spin_on_owner() return owner state ...
2019-07-07security/commoncap: Use xattr security prefix lenCarmeli Tamir1-2/+2
Using the existing defined XATTR_SECURITY_PREFIX_LEN instead of sizeof(XATTR_SECURITY_PREFIX) - 1. Pretty simple cleanup. Signed-off-by: Carmeli Tamir <carmeli.tamir@gmail.com> Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2019-07-04vfs: Convert smackfs to use the new mount APIDavid Howells1-12/+22
Convert the smackfs filesystem to the new internal mount API as the old one will be obsoleted and removed. This allows greater flexibility in communication of mount parameters between userspace, the VFS and the filesystem. See Documentation/filesystems/mount_api.txt for more information. Signed-off-by: David Howells <dhowells@redhat.com> cc: Casey Schaufler <casey@schaufler-ca.com> cc: linux-security-module@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-07-04vfs: Convert selinuxfs to use the new mount APIDavid Howells1-5/+15
Convert the selinuxfs filesystem to the new internal mount API as the old one will be obsoleted and removed. This allows greater flexibility in communication of mount parameters between userspace, the VFS and the filesystem. See Documentation/filesystems/mount_api.txt for more information. Signed-off-by: David Howells <dhowells@redhat.com> cc: Paul Moore <paul@paul-moore.com> cc: Stephen Smalley <sds@tycho.nsa.gov> cc: Eric Paris <eparis@parisplace.org> cc: selinux@vger.kernel.org cc: linux-security-module@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-07-04vfs: Convert securityfs to use the new mount APIDavid Howells1-6/+15
Convert the securityfs filesystem to the new internal mount API as the old one will be obsoleted and removed. This allows greater flexibility in communication of mount parameters between userspace, the VFS and the filesystem. See Documentation/filesystems/mount_api.txt for more information. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-security-module@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-07-04vfs: Convert apparmorfs to use the new mount APIDavid Howells1-5/+15
Convert the apparmorfs filesystem to the new internal mount API as the old one will be obsoleted and removed. This allows greater flexibility in communication of mount parameters between userspace, the VFS and the filesystem. See Documentation/filesystems/mount_api.txt for more information. Signed-off-by: David Howells <dhowells@redhat.com> cc: John Johansen <john.johansen@canonical.com> cc: apparmor@lists.ubuntu.com cc: linux-security-module@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-07-03keys: Provide KEYCTL_GRANT_PERMISSIONDavid Howells4-1/+133
Provide a keyctl() operation to grant/remove permissions. The grant operation, wrapped by libkeyutils, looks like: int ret = keyctl_grant_permission(key_serial_t key, enum key_ace_subject_type type, unsigned int subject, unsigned int perm); Where key is the key to be modified, type and subject represent the subject to which permission is to be granted (or removed) and perm is the set of permissions to be granted. 0 is returned on success. SET_SECURITY permission is required for this. The subject type currently must be KEY_ACE_SUBJ_STANDARD for the moment (other subject types will come along later). For subject type KEY_ACE_SUBJ_STANDARD, the following subject values are available: KEY_ACE_POSSESSOR The possessor of the key KEY_ACE_OWNER The owner of the key KEY_ACE_GROUP The key's group KEY_ACE_EVERYONE Everyone perm lists the permissions to be granted: KEY_ACE_VIEW Can view the key metadata KEY_ACE_READ Can read the key content KEY_ACE_WRITE Can update/modify the key content KEY_ACE_SEARCH Can find the key by searching/requesting KEY_ACE_LINK Can make a link to the key KEY_ACE_SET_SECURITY Can set security KEY_ACE_INVAL Can invalidate KEY_ACE_REVOKE Can revoke KEY_ACE_JOIN Can join this keyring KEY_ACE_CLEAR Can clear this keyring If an ACE already exists for the subject, then the permissions mask will be overwritten; if perm is 0, it will be deleted. Currently, the internal ACL is limited to a maximum of 16 entries. For example: int ret = keyctl_grant_permission(key, KEY_ACE_SUBJ_STANDARD, KEY_ACE_OWNER, KEY_ACE_VIEW | KEY_ACE_READ); Signed-off-by: David Howells <dhowells@redhat.com>
2019-07-01selinux: format all invalid context as untrustedRichard Guy Briggs1-10/+19
The userspace tools expect all fields of the same name to be logged consistently with the same encoding. Since the invalid_context fields contain untrusted strings in selinux_inode_setxattr() and selinux_setprocattr(), encode all instances of this field the same way as though they were untrusted even though compute_sid_handle_invalid_context() and security_sid_mls_copy() are trusted. Please see github issue https://github.com/linux-audit/audit-kernel/issues/57 Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-06-30integrity: Introduce struct evm_xattrThiago Jung Bauermann3-7/+14
Even though struct evm_ima_xattr_data includes a fixed-size array to hold a SHA1 digest, most of the code ignores the array and uses the struct to mean "type indicator followed by data of unspecified size" and tracks the real size of what the struct represents in a separate length variable. The only exception to that is the EVM code, which correctly uses the definition of struct evm_ima_xattr_data. So make this explicit in the code by removing the length specification from the array in struct evm_ima_xattr_data. Also, change the name of the element from digest to data since in most places the array doesn't hold a digest. A separate struct evm_xattr is introduced, with the original definition of evm_ima_xattr_data to be used in the places that actually expect that definition, specifically the EVM HMAC code. Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-06-30ima: Update MAX_TEMPLATE_NAME_LEN to fit largest reasonable definitionThiago Jung Bauermann1-1/+7
MAX_TEMPLATE_NAME_LEN is used when restoring measurements carried over from a kexec. It should be set to the length of a template containing all fields except for 'd' and 'n', which don't need to be accounted for since they shouldn't be defined in the same template description as 'd-ng' and 'n-ng'. That length is greater than the current 15, so update using a sizeof() to show where the number comes from and also can be visually shown to be correct. The sizeof() is calculated at compile time. Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-06-30IMA: Define a new template field bufPrakhar Srivastava5-1/+33
A buffer(kexec boot command line arguments) measured into IMA measuremnt list cannot be appraised, without already being aware of the buffer contents. Since hashes are non-reversible, raw buffer is needed for validation or regenerating hash for appraisal/attestation. Add support to store/read the buffer contents in HEX. The kexec cmdline hash is stored in the "d-ng" field of the template data. It can be verified using sudo cat /sys/kernel/security/integrity/ima/ascii_runtime_measurements | grep kexec-cmdline | cut -d' ' -f 6 | xxd -r -p | sha256sum - Add two new fields to ima_event_data to hold the buf and buf_len - Add a new template field 'buf' to be used to store/read the buffer data. - Updated process_buffer_meaurement to add the buffer to ima_event_data. process_buffer_measurement added in "Define a new IMA hook to measure the boot command line arguments" - Add a new template policy name ima-buf to represent 'd-ng|n-ng|buf' Signed-off-by: Prakhar Srivastava <prsriva02@gmail.com> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-06-27keys: Replace uid/gid/perm permissions checking with an ACLDavid Howells21-186/+496
Replace the uid/gid/perm permissions checking on a key with an ACL to allow the SETATTR and SEARCH permissions to be split. This will also allow a greater range of subjects to represented. ============ WHY DO THIS? ============ The problem is that SETATTR and SEARCH cover a slew of actions, not all of which should be grouped together. For SETATTR, this includes actions that are about controlling access to a key: (1) Changing a key's ownership. (2) Changing a key's security information. (3) Setting a keyring's restriction. And actions that are about managing a key's lifetime: (4) Setting an expiry time. (5) Revoking a key. and (proposed) managing a key as part of a cache: (6) Invalidating a key. Managing a key's lifetime doesn't really have anything to do with controlling access to that key. Expiry time is awkward since it's more about the lifetime of the content and so, in some ways goes better with WRITE permission. It can, however, be set unconditionally by a process with an appropriate authorisation token for instantiating a key, and can also be set by the key type driver when a key is instantiated, so lumping it with the access-controlling actions is probably okay. As for SEARCH permission, that currently covers: (1) Finding keys in a keyring tree during a search. (2) Permitting keyrings to be joined. (3) Invalidation. But these don't really belong together either, since these actions really need to be controlled separately. Finally, there are number of special cases to do with granting the administrator special rights to invalidate or clear keys that I would like to handle with the ACL rather than key flags and special checks. =============== WHAT IS CHANGED =============== The SETATTR permission is split to create two new permissions: (1) SET_SECURITY - which allows the key's owner, group and ACL to be changed and a restriction to be placed on a keyring. (2) REVOKE - which allows a key to be revoked. The SEARCH permission is split to create: (1) SEARCH - which allows a keyring to be search and a key to be found. (2) JOIN - which allows a keyring to be joined as a session keyring. (3) INVAL - which allows a key to be invalidated. The WRITE permission is also split to create: (1) WRITE - which allows a key's content to be altered and links to be added, removed and replaced in a keyring. (2) CLEAR - which allows a keyring to be cleared completely. This is split out to make it possible to give just this to an administrator. (3) REVOKE - see above. Keys acquire ACLs which consist of a series of ACEs, and all that apply are unioned together. An ACE specifies a subject, such as: (*) Possessor - permitted to anyone who 'possesses' a key (*) Owner - permitted to the key owner (*) Group - permitted to the key group (*) Everyone - permitted to everyone Note that 'Other' has been replaced with 'Everyone' on the assumption that you wouldn't grant a permit to 'Other' that you wouldn't also grant to everyone else. Further subjects may be made available by later patches. The ACE also specifies a permissions mask. The set of permissions is now: VIEW Can view the key metadata READ Can read the key content WRITE Can update/modify the key content SEARCH Can find the key by searching/requesting LINK Can make a link to the key SET_SECURITY Can change owner, ACL, expiry INVAL Can invalidate REVOKE Can revoke JOIN Can join this keyring CLEAR Can clear this keyring The KEYCTL_SETPERM function is then deprecated. The KEYCTL_SET_TIMEOUT function then is permitted if SET_SECURITY is set, or if the caller has a valid instantiation auth token. The KEYCTL_INVALIDATE function then requires INVAL. The KEYCTL_REVOKE function then requires REVOKE. The KEYCTL_JOIN_SESSION_KEYRING function then requires JOIN to join an existing keyring. The JOIN permission is enabled by default for session keyrings and manually created keyrings only. ====================== BACKWARD COMPATIBILITY ====================== To maintain backward compatibility, KEYCTL_SETPERM will translate the permissions mask it is given into a new ACL for a key - unless KEYCTL_SET_ACL has been called on that key, in which case an error will be returned. It will convert possessor, owner, group and other permissions into separate ACEs, if each portion of the mask is non-zero. SETATTR permission turns on all of INVAL, REVOKE and SET_SECURITY. WRITE permission turns on WRITE, REVOKE and, if a keyring, CLEAR. JOIN is turned on if a keyring is being altered. The KEYCTL_DESCRIBE function translates the ACL back into a permissions mask to return depending on possessor, owner, group and everyone ACEs. It will make the following mappings: (1) INVAL, JOIN -> SEARCH (2) SET_SECURITY -> SETATTR (3) REVOKE -> WRITE if SETATTR isn't already set (4) CLEAR -> WRITE Note that the value subsequently returned by KEYCTL_DESCRIBE may not match the value set with KEYCTL_SETATTR. ======= TESTING ======= This passes the keyutils testsuite for all but a couple of tests: (1) tests/keyctl/dh_compute/badargs: The first wrong-key-type test now returns EOPNOTSUPP rather than ENOKEY as READ permission isn't removed if the type doesn't have ->read(). You still can't actually read the key. (2) tests/keyctl/permitting/valid: The view-other-permissions test doesn't work as Other has been replaced with Everyone in the ACL. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-27keys: Pass the network namespace into request_key mechanismDavid Howells4-17/+36
Create a request_key_net() function and use it to pass the network namespace domain tag into DNS revolver keys and rxrpc/AFS keys so that keys for different domains can coexist in the same keyring. Signed-off-by: David Howells <dhowells@redhat.com> cc: netdev@vger.kernel.org cc: linux-nfs@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: linux-afs@lists.infradead.org
2019-06-26keys: Network namespace domain tagDavid Howells1-1/+6
Create key domain tags for network namespaces and make it possible to automatically tag keys that are used by networked services (e.g. AF_RXRPC, AFS, DNS) with the default network namespace if not set by the caller. This allows keys with the same description but in different namespaces to coexist within a keyring. Signed-off-by: David Howells <dhowells@redhat.com> cc: netdev@vger.kernel.org cc: linux-nfs@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: linux-afs@lists.infradead.org
2019-06-26keys: Garbage collect keys for which the domain has been removedDavid Howells2-1/+17
If a key operation domain (such as a network namespace) has been removed then attempt to garbage collect all the keys that use it. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-26keys: Include target namespace in match criteriaDavid Howells5-4/+39
Currently a key has a standard matching criteria of { type, description } and this is used to only allow keys with unique criteria in a keyring. This means, however, that you cannot have keys with the same type and description but a different target namespace in the same keyring. This is a potential problem for a containerised environment where, say, a container is made up of some parts of its mount space involving netfs superblocks from two different network namespaces. This is also a problem for shared system management keyrings such as the DNS records keyring or the NFS idmapper keyring that might contain keys from different network namespaces. Fix this by including a namespace component in a key's matching criteria. Keyring types are marked to indicate which, if any, namespace is relevant to keys of that type, and that namespace is set when the key is created from the current task's namespace set. The capability bit KEYCTL_CAPS1_NS_KEY_TAG is set if the kernel is employing this feature. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-26keys: Move the user and user-session keyrings to the user_namespaceDavid Howells5-104/+187
Move the user and user-session keyrings to the user_namespace struct rather than pinning them from the user_struct struct. This prevents these keyrings from propagating across user-namespaces boundaries with regard to the KEY_SPEC_* flags, thereby making them more useful in a containerised environment. The issue is that a single user_struct may be represent UIDs in several different namespaces. The way the patch does this is by attaching a 'register keyring' in each user_namespace and then sticking the user and user-session keyrings into that. It can then be searched to retrieve them. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jann Horn <jannh@google.com>
2019-06-26keys: Namespace keyring namesDavid Howells2-57/+45
Keyring names are held in a single global list that any process can pick from by means of keyctl_join_session_keyring (provided the keyring grants Search permission). This isn't very container friendly, however. Make the following changes: (1) Make default session, process and thread keyring names begin with a '.' instead of '_'. (2) Keyrings whose names begin with a '.' aren't added to the list. Such keyrings are system specials. (3) Replace the global list with per-user_namespace lists. A keyring adds its name to the list for the user_namespace that it is currently in. (4) When a user_namespace is deleted, it just removes itself from the keyring name list. The global keyring_name_lock is retained for accessing the name lists. This allows (4) to work. This can be tested by: # keyctl newring foo @s 995906392 # unshare -U $ keyctl show ... 995906392 --alswrv 65534 65534 \_ keyring: foo ... $ keyctl session foo Joined session keyring: 935622349 As can be seen, a new session keyring was created. The capability bit KEYCTL_CAPS1_NS_KEYRING_NAME is set if the kernel is employing this feature. Signed-off-by: David Howells <dhowells@redhat.com> cc: Eric W. Biederman <ebiederm@xmission.com>
2019-06-26keys: Add a 'recurse' flag for keyring searchesDavid Howells8-9/+22
Add a 'recurse' flag for keyring searches so that the flag can be omitted and recursion disabled, thereby allowing just the nominated keyring to be searched and none of the children. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-26keys: Cache the hash value to avoid lots of recalculationDavid Howells3-16/+22
Cache the hash of the key's type and description in the index key so that we're not recalculating it every time we look at a key during a search. The hash function does a bunch of multiplications, so evading those is probably worthwhile - especially as this is done for every key examined during a search. This also allows the methods used by assoc_array to get chunks of index-key to be simplified. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-26keys: Simplify key description managementDavid Howells4-49/+30
Simplify key description management by cramming the word containing the length with the first few chars of the description also. This simplifies the code that generates the index-key used by assoc_array. It should speed up key searching a bit too. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-26keys: Kill off request_key_async{,_with_auxdata}David Howells1-50/+0
Kill off request_key_async{,_with_auxdata}() as they're not currently used. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-24IMA: Define a new hook to measure the kexec boot command line argumentsPrakhar Srivastava4-0/+81
Currently during soft reboot(kexec_file_load) boot command line arguments are not measured. Define hooks needed to measure kexec command line arguments during soft reboot(kexec_file_load). - A new ima hook ima_kexec_cmdline is defined to be called by the kexec code. - A new function process_buffer_measurement is defined to measure the buffer hash into the IMA measurement list. - A new func policy KEXEC_CMDLINE is defined to control the measurement. Signed-off-by: Prakhar Srivastava <prsriva02@gmail.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-06-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller17-71/+65
Minor SPDX change conflict. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-21Merge tag 'spdx-5.2-rc6' of ↵Linus Torvalds15-61/+15
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx Pull still more SPDX updates from Greg KH: "Another round of SPDX updates for 5.2-rc6 Here is what I am guessing is going to be the last "big" SPDX update for 5.2. It contains all of the remaining GPLv2 and GPLv2+ updates that were "easy" to determine by pattern matching. The ones after this are going to be a bit more difficult and the people on the spdx list will be discussing them on a case-by-case basis now. Another 5000+ files are fixed up, so our overall totals are: Files checked: 64545 Files with SPDX: 45529 Compared to the 5.1 kernel which was: Files checked: 63848 Files with SPDX: 22576 This is a huge improvement. Also, we deleted another 20000 lines of boilerplate license crud, always nice to see in a diffstat" * tag 'spdx-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: (65 commits) treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 507 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 506 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 505 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 504 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 503 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 502 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 501 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 499 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 498 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 497 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 496 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 495 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 491 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 490 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 489 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 488 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 487 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 486 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 485 ...
2019-06-20apparmor: increase left match history buffer sizeJohn Johansen2-5/+4
There have been cases reported where a history buffer size of 8 was not enough to resolve conflict overlaps. Increase the buffer to and get rid of the size element which is currently just storing the constant WB_HISTORY_SIZE. Signed-off-by: John Johansen <john.johansen@canonical.com>
2019-06-20apparmor: Switch to GFP_KERNEL where possibleSebastian Andrzej Siewior3-12/+12
After removing preempt_disable() from get_buffers() it is possible to replace a few GFP_ATOMIC allocations with GFP_KERNEL. Replace GFP_ATOMIC allocations with GFP_KERNEL where the context looks to bee preepmtible. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: John Johansen <john.johansen@canonical.com>
2019-06-20apparmor: Use a memory pool instead per-CPU cachesSebastian Andrzej Siewior5-111/+164
The get_buffers() macro may provide one or two buffers to the caller. Those buffers are pre-allocated on init for each CPU. By default it allocates 2* 2 * MAX_PATH * POSSIBLE_CPU which equals 64KiB on a system with 4 CPUs or 1MiB with 64 CPUs and so on. Replace the per-CPU buffers with a common memory pool which is shared across all CPUs. The pool grows on demand and never shrinks. The pool starts with two (UP) or four (SMP) elements. By using this pool it is possible to request a buffer and keeping preemption enabled which avoids the hack in profile_transition(). It has been pointed out by Tetsuo Handa that GFP_KERNEL allocations for small amount of memory do not fail. In order not to have an endless retry, __GFP_RETRY_MAYFAIL is passed (so the memory allocation is not repeated until success) and retried once hoping that in the meantime a buffer has been returned to the pool. Since now NULL is possible all allocation paths check the buffer pointer and return -ENOMEM on failure. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: John Johansen <john.johansen@canonical.com>
2019-06-20apparmor: Force type-casting of current->real_credBharath Vedartham1-1/+1
This patch fixes the sparse warning: warning: cast removes address space '<asn:4>' of expression. Signed-off-by: Bharath Vedartham <linux.bhar@gmail.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
2019-06-19IMA: support for per policy rule template formatsMatthew Garrett7-27/+76
Admins may wish to log different measurements using different IMA templates. Add support for overriding the default template on a per-rule basis. Inspired-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-06-19keys: Cache result of request_key*() temporarily in task_structDavid Howells2-0/+55
If a filesystem uses keys to hold authentication tokens, then it needs a token for each VFS operation that might perform an authentication check - either by passing it to the server, or using to perform a check based on authentication data cached locally. For open files this isn't a problem, since the key should be cached in the file struct since it represents the subject performing operations on that file descriptor. During pathwalk, however, there isn't anywhere to cache the key, except perhaps in the nameidata struct - but that isn't exposed to the filesystems. Further, a pathwalk can incur a lot of operations, calling one or more of the following, for instance: ->lookup() ->permission() ->d_revalidate() ->d_automount() ->get_acl() ->getxattr() on each dentry/inode it encounters - and each one may need to call request_key(). And then, at the end of pathwalk, it will call the actual operation: ->mkdir() ->mknod() ->getattr() ->open() ... which may need to go and get the token again. However, it is very likely that all of the operations on a single dentry/inode - and quite possibly a sequence of them - will all want to use the same authentication token, which suggests that caching it would be a good idea. To this end: (1) Make it so that a positive result of request_key() and co. that didn't require upcalling to userspace is cached temporarily in task_struct. (2) The cache is 1 deep, so a new result displaces the old one. (3) The key is released by exit and by notify-resume. (4) The cache is cleared in a newly forked process. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-19keys: Provide request_key_rcu()David Howells1-0/+44
Provide a request_key_rcu() function that can be used to request a key under RCU conditions. It can only search and check permissions; it cannot allocate a new key, upcall or wait for an upcall to complete. It may return a partially constructed key. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-19keys: Move the RCU locks outwards from the keyring search functionsDavid Howells6-60/+75
Move the RCU locks outwards from the keyring search functions so that it will become possible to provide an RCU-capable partial request_key() function in a later commit. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-19keys: Invalidate used request_key authentication keysDavid Howells2-3/+3
Invalidate used request_key authentication keys rather than revoking them so that they get cleaned up immediately rather than potentially hanging around. There doesn't seem any need to keep the revoked keys around. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-19keys: Fix request_key() lack of Link perm check on found keyDavid Howells1-0/+10
The request_key() syscall allows a process to gain access to the 'possessor' permits of any key that grants it Search permission by virtue of request_key() not checking whether a key it finds grants Link permission to the caller. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-19treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500Thomas Gleixner15-61/+15
Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19keys: Add capability-checking keyctl functionDavid Howells3-0/+40
Add a keyctl function that requests a set of capability bits to find out what features are supported. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-18apparmor: reset pos on failure to unpack for various functionsMike Salvatore1-8/+39
Each function that manipulates the aa_ext struct should reset it's "pos" member on failure. This ensures that, on failure, no changes are made to the state of the aa_ext struct. There are paths were elements are optional and the error path is used to indicate the optional element is not present. This means instead of just aborting on error the unpack stream can become unsynchronized on optional elements, if using one of the affected functions. Cc: stable@vger.kernel.org Fixes: 736ec752d95e ("AppArmor: policy routines for loading and unpacking policy") Signed-off-by: Mike Salvatore <mike.salvatore@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
2019-06-18apparmor: enforce nullbyte at end of tag stringJann Horn1-1/+1
A packed AppArmor policy contains null-terminated tag strings that are read by unpack_nameX(). However, unpack_nameX() uses string functions on them without ensuring that they are actually null-terminated, potentially leading to out-of-bounds accesses. Make sure that the tag string is null-terminated before passing it to strcmp(). Cc: stable@vger.kernel.org Fixes: 736ec752d95e ("AppArmor: policy routines for loading and unpacking policy") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
2019-06-18apparmor: fix PROFILE_MEDIATES for untrusted inputJohn Johansen1-1/+10
While commit 11c236b89d7c2 ("apparmor: add a default null dfa") ensure every profile has a policy.dfa it does not resize the policy.start[] to have entries for every possible start value. Which means PROFILE_MEDIATES is not safe to use on untrusted input. Unforunately commit b9590ad4c4f2 ("apparmor: remove POLICY_MEDIATES_SAFE") did not take into account the start value usage. The input string in profile_query_cb() is user controlled and is not properly checked to be within the limited start[] entries, even worse it can't be as userspace policy is allowed to make us of entries types the kernel does not know about. This mean usespace can currently cause the kernel to access memory up to 240 entries beyond the start array bounds. Cc: stable@vger.kernel.org Fixes: b9590ad4c4f2 ("apparmor: remove POLICY_MEDIATES_SAFE") Signed-off-by: John Johansen <john.johansen@canonical.com>
2019-06-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller93-484/+133
Honestly all the conflicts were simple overlapping changes, nothing really interesting to report. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-17integrity: Fix __integrity_init_keyring() section mismatchGeert Uytterhoeven1-2/+3
With gcc-4.6.3: WARNING: vmlinux.o(.text.unlikely+0x24c64): Section mismatch in reference from the function __integrity_init_keyring() to the function .init.text:set_platform_trusted_keys() The function __integrity_init_keyring() references the function __init set_platform_trusted_keys(). This is often because __integrity_init_keyring lacks a __init annotation or the annotation of set_platform_trusted_keys is wrong. Indeed, if the compiler decides not to inline __integrity_init_keyring(), a warning is issued. Fix this by adding the missing __init annotation. Fixes: 9dc92c45177ab70e ("integrity: Define a trusted platform keyring") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Nayna Jain <nayna@linux.ibm.com> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-06-17locking/lockdep: Rename lockdep_assert_held_exclusive() -> ↵Nikolay Borisov1-4/+4
lockdep_assert_held_write() All callers of lockdep_assert_held_exclusive() use it to verify the correct locking state of either a semaphore (ldisc_sem in tty, mmap_sem for perf events, i_rwsem of inode for dax) or rwlock by apparmor. Thus it makes sense to rename _exclusive to _write since that's the semantics callers care. Additionally there is already lockdep_assert_held_read(), which this new naming is more consistent with. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190531100651.3969-1-nborisov@suse.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-14Smack: Restore the smackfsdef mount option and add missing prefixesCasey Schaufler1-5/+7
The 5.1 mount system rework changed the smackfsdef mount option to smackfsdefault. This fixes the regression by making smackfsdef treated the same way as smackfsdefault. Also fix the smack_param_specs[] to have "smack" prefixes on all the names. This isn't visible to a user unless they either: (a) Try to mount a filesystem that's converted to the internal mount API and that implements the ->parse_monolithic() context operation - and only then if they call security_fs_context_parse_param() rather than security_sb_eat_lsm_opts(). There are no examples of this upstream yet, but nfs will probably want to do this for nfs2 or nfs3. (b) Use fsconfig() to configure the filesystem - in which case security_fs_context_parse_param() will be called. This issue is that smack_sb_eat_lsm_opts() checks for the "smack" prefix on the options, but smack_fs_context_parse_param() does not. Fixes: c3300aaf95fb ("smack: get rid of match_token()") Fixes: 2febd254adc4 ("smack: Implement filesystem context security hooks") Cc: stable@vger.kernel.org Reported-by: Jose Bollo <jose.bollo@iot.bzh> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-14docs: cgroup-v1: convert docs to ReST and rename to *.rstMauro Carvalho Chehab1-1/+1
Convert the cgroup-v1 files to ReST format, in order to allow a later addition to the admin-guide. The conversion is actually: - add blank lines and identation in order to identify paragraphs; - fix tables markups; - add some lists markups; - mark literal blocks; - adjust title markups. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2019-06-14Merge tag 'v5.2-rc4' into mauroJonathan Corbet131-616/+157
We need to pick up post-rc1 changes to various document files so they don't get lost in Mauro's massive RST conversion push.