commit 06b0d88bfe617c354acfda490172e4d72bc632c8 Author: Greg Kroah-Hartman Date: Thu Oct 5 09:47:47 2017 +0200 Linux 4.13.5 commit fd7ea1828b0dc2a3501b403f4254e9b34ea8a8da Author: Vladis Dronov Date: Mon Sep 4 16:00:50 2017 +0200 video: fbdev: aty: do not leak uninitialized padding in clk to userspace commit 8e75f7a7a00461ef6d91797a60b606367f6e344d upstream. 'clk' is copied to a userland with padding byte(s) after 'vclk_post_div' field unitialized, leaking data from the stack. Fix this ensuring all of 'clk' is initialized to zero. References: https://github.com/torvalds/linux/pull/441 Reported-by: sohu0106 Signed-off-by: Vladis Dronov Signed-off-by: Bartlomiej Zolnierkiewicz Signed-off-by: Greg Kroah-Hartman commit 841453bb0a270f902272cc7211442d477a355818 Author: Paolo Bonzini Date: Thu Sep 28 17:58:41 2017 +0200 KVM: VMX: use cmpxchg64 commit c0a1666bcb2a33e84187a15eabdcd54056be9a97 upstream. This fixes a compilation failure on 32-bit systems. Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit d96820ac7491bb5e469b16ef4fadda63dfdd6fd2 Author: Haozhong Zhang Date: Mon Sep 18 09:56:50 2017 +0800 KVM: VMX: remove WARN_ON_ONCE in kvm_vcpu_trigger_posted_interrupt commit 5753743fa5108b8f98bd61e40dc63f641b26c768 upstream. WARN_ON_ONCE(pi_test_sn(&vmx->pi_desc)) in kvm_vcpu_trigger_posted_interrupt() intends to detect the violation of invariant that VT-d PI notification event is not suppressed when vcpu is in the guest mode. Because the two checks for the target vcpu mode and the target suppress field cannot be performed atomically, the target vcpu mode may change in between. If that does happen, WARN_ON_ONCE() here may raise false alarms. As the previous patch fixed the real invariant breaker, remove this WARN_ON_ONCE() to avoid false alarms, and document the allowed cases instead. Signed-off-by: Haozhong Zhang Reported-by: "Ramamurthy, Venkatesh" Reported-by: Dan Williams Reviewed-by: Paolo Bonzini Fixes: 28b835d60fcc ("KVM: Update Posted-Interrupts Descriptor when vCPU is preempted") Signed-off-by: Radim Krčmář Signed-off-by: Greg Kroah-Hartman commit 18d3d3bee06e9f0823c0df4f1e5581d2a7947f3a Author: Haozhong Zhang Date: Mon Sep 18 09:56:49 2017 +0800 KVM: VMX: do not change SN bit in vmx_update_pi_irte() commit dc91f2eb1a4021eb6705c15e474942f84ab9b211 upstream. In kvm_vcpu_trigger_posted_interrupt() and pi_pre_block(), KVM assumes that PI notification events should not be suppressed when the target vCPU is not blocked. vmx_update_pi_irte() sets the SN field before changing an interrupt from posting to remapping, but it does not check the vCPU mode. Therefore, the change of SN field may break above the assumption. Besides, I don't see reasons to suppress notification events here, so remove the changes of SN field to avoid race condition. Signed-off-by: Haozhong Zhang Reported-by: "Ramamurthy, Venkatesh" Reported-by: Dan Williams Reviewed-by: Paolo Bonzini Fixes: 28b835d60fcc ("KVM: Update Posted-Interrupts Descriptor when vCPU is preempted") Signed-off-by: Radim Krčmář Signed-off-by: Greg Kroah-Hartman commit 7c6bcb52072cb117e7792dca856944f05182bd19 Author: Eric Biggers Date: Mon Oct 2 11:01:40 2017 -0700 x86/fpu: Don't let userspace set bogus xcomp_bv commit 814fb7bb7db5433757d76f4c4502c96fc53b0b5e upstream. On x86, userspace can use the ptrace() or rt_sigreturn() system calls to set a task's extended state (xstate) or "FPU" registers. ptrace() can set them for another task using the PTRACE_SETREGSET request with NT_X86_XSTATE, while rt_sigreturn() can set them for the current task. In either case, registers can be set to any value, but the kernel assumes that the XSAVE area itself remains valid in the sense that the CPU can restore it. However, in the case where the kernel is using the uncompacted xstate format (which it does whenever the XSAVES instruction is unavailable), it was possible for userspace to set the xcomp_bv field in the xstate_header to an arbitrary value. However, all bits in that field are reserved in the uncompacted case, so when switching to a task with nonzero xcomp_bv, the XRSTOR instruction failed with a #GP fault. This caused the WARN_ON_FPU(err) in copy_kernel_to_xregs() to be hit. In addition, since the error is otherwise ignored, the FPU registers from the task previously executing on the CPU were leaked. Fix the bug by checking that the user-supplied value of xcomp_bv is 0 in the uncompacted case, and returning an error otherwise. The reason for validating xcomp_bv rather than simply overwriting it with 0 is that we want userspace to see an error if it (incorrectly) provides an XSAVE area in compacted format rather than in uncompacted format. Note that as before, in case of error we clear the task's FPU state. This is perhaps non-ideal, especially for PTRACE_SETREGSET; it might be better to return an error before changing anything. But it seems the "clear on error" behavior is fine for now, and it's a little tricky to do otherwise because it would mean we couldn't simply copy the full userspace state into kernel memory in one __copy_from_user(). This bug was found by syzkaller, which hit the above-mentioned WARN_ON_FPU(): WARNING: CPU: 1 PID: 0 at ./arch/x86/include/asm/fpu/internal.h:373 __switch_to+0x5b5/0x5d0 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.13.0 #453 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff9ba2bc8e42c0 task.stack: ffffa78cc036c000 RIP: 0010:__switch_to+0x5b5/0x5d0 RSP: 0000:ffffa78cc08bbb88 EFLAGS: 00010082 RAX: 00000000fffffffe RBX: ffff9ba2b8bf2180 RCX: 00000000c0000100 RDX: 00000000ffffffff RSI: 000000005cb10700 RDI: ffff9ba2b8bf36c0 RBP: ffffa78cc08bbbd0 R08: 00000000929fdf46 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: ffff9ba2bc8e42c0 R13: 0000000000000000 R14: ffff9ba2b8bf3680 R15: ffff9ba2bf5d7b40 FS: 00007f7e5cb10700(0000) GS:ffff9ba2bf400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004005cc CR3: 0000000079fd5000 CR4: 00000000001406e0 Call Trace: Code: 84 00 00 00 00 00 e9 11 fd ff ff 0f ff 66 0f 1f 84 00 00 00 00 00 e9 e7 fa ff ff 0f ff 66 0f 1f 84 00 00 00 00 00 e9 c2 fa ff ff <0f> ff 66 0f 1f 84 00 00 00 00 00 e9 d4 fc ff ff 66 66 2e 0f 1f Here is a C reproducer. The expected behavior is that the program spin forever with no output. However, on a buggy kernel running on a processor with the "xsave" feature but without the "xsaves" feature (e.g. Sandy Bridge through Broadwell for Intel), within a second or two the program reports that the xmm registers were corrupted, i.e. were not restored correctly. With CONFIG_X86_DEBUG_FPU=y it also hits the above kernel warning. #define _GNU_SOURCE #include #include #include #include #include #include #include #include int main(void) { int pid = fork(); uint64_t xstate[512]; struct iovec iov = { .iov_base = xstate, .iov_len = sizeof(xstate) }; if (pid == 0) { bool tracee = true; for (int i = 0; i < sysconf(_SC_NPROCESSORS_ONLN) && tracee; i++) tracee = (fork() != 0); uint32_t xmm0[4] = { [0 ... 3] = tracee ? 0x00000000 : 0xDEADBEEF }; asm volatile(" movdqu %0, %%xmm0\n" " mov %0, %%rbx\n" "1: movdqu %%xmm0, %0\n" " mov %0, %%rax\n" " cmp %%rax, %%rbx\n" " je 1b\n" : "+m" (xmm0) : : "rax", "rbx", "xmm0"); printf("BUG: xmm registers corrupted! tracee=%d, xmm0=%08X%08X%08X%08X\n", tracee, xmm0[0], xmm0[1], xmm0[2], xmm0[3]); } else { usleep(100000); ptrace(PTRACE_ATTACH, pid, 0, 0); wait(NULL); ptrace(PTRACE_GETREGSET, pid, NT_X86_XSTATE, &iov); xstate[65] = -1; ptrace(PTRACE_SETREGSET, pid, NT_X86_XSTATE, &iov); ptrace(PTRACE_CONT, pid, 0, 0); wait(NULL); } return 1; } Note: the program only tests for the bug using the ptrace() system call. The bug can also be reproduced using the rt_sigreturn() system call, but only when called from a 32-bit program, since for 64-bit programs the kernel restores the FPU state from the signal frame by doing XRSTOR directly from userspace memory (with proper error checking). Reported-by: Dmitry Vyukov Signed-off-by: Eric Biggers Reviewed-by: Kees Cook Reviewed-by: Rik van Riel Acked-by: Dave Hansen Cc: Andrew Morton Cc: Andy Lutomirski Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Eric Biggers Cc: Fenghua Yu Cc: Kevin Hao Cc: Linus Torvalds Cc: Michael Halcrow Cc: Oleg Nesterov Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Wanpeng Li Cc: Yu-cheng Yu Cc: kernel-hardening@lists.openwall.com Fixes: 0b29643a5843 ("x86/xsaves: Change compacted format xsave area header") Link: http://lkml.kernel.org/r/20170922174156.16780-2-ebiggers3@gmail.com Link: http://lkml.kernel.org/r/20170923130016.21448-25-mingo@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit ac9275090890e205860590cb4720567a283c55d6 Author: Laurent Dufour Date: Mon Sep 4 10:32:15 2017 +0200 x86/mm: Fix fault error path using unsafe vma pointer commit a3c4fb7c9c2ebfd50b8c60f6c069932bb319bc37 upstream. commit 7b2d0dbac489 ("x86/mm/pkeys: Pass VMA down in to fault signal generation code") passes down a vma pointer to the error path, but that is done once the mmap_sem is released when calling mm_fault_error() from __do_page_fault(). This is dangerous as the vma structure is no more safe to be used once the mmap_sem has been released. As only the protection key value is required in the error processing, we could just pass down this value. Fix it by passing a pointer to a protection key value down to the fault signal generation code. The use of a pointer allows to keep the check generating a warning message in fill_sig_info_pkey() when the vma was not known. If the pointer is valid, the protection value can be accessed by deferencing the pointer. [ tglx: Made *pkey u32 as that's the type which is passed in siginfo ] Fixes: 7b2d0dbac489 ("x86/mm/pkeys: Pass VMA down in to fault signal generation code") Signed-off-by: Laurent Dufour Signed-off-by: Thomas Gleixner Cc: linux-mm@kvack.org Cc: Dave Hansen Link: http://lkml.kernel.org/r/1504513935-12742-1-git-send-email-ldufour@linux.vnet.ibm.com Signed-off-by: Greg Kroah-Hartman commit 5bf264461b2a45d28cdc1ba90ffa9bd52e667d2c Author: Viresh Kumar Date: Thu Sep 21 10:44:36 2017 -0700 PM / OPP: Call notifier without holding opp_table->lock commit e4d8ae00169f7686e1da5a62e5cf797d12bf8822 upstream. The notifier callbacks may want to call some OPP helper routines which may try to take the same opp_table->lock again and cause a deadlock. One such usecase was reported by Chanwoo Choi, where calling dev_pm_opp_disable() leads us to the devfreq's OPP notifier handler, which further calls dev_pm_opp_find_freq_floor() and it deadlocks. We don't really need the opp_table->lock to be held across the notifier call though, all we want to make sure is that the 'opp' doesn't get freed while being used from within the notifier chain. We can do it with help of dev_pm_opp_get/put() as well. Let's do it. Fixes: 5b650b388844 "PM / OPP: Take kref from _find_opp_table()" Reported-by: Chanwoo Choi Tested-by: Chanwoo Choi Reviewed-by: Stephen Boyd Reviewed-by: Chanwoo Choi Signed-off-by: Viresh Kumar Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 211d5eabde61c1d7759b98f1b101abd35b330e28 Author: Ville Syrjälä Date: Mon Sep 18 23:00:59 2017 +0300 platform/x86: fujitsu-laptop: Don't oops when FUJ02E3 is not presnt commit ce7c47d60bda6c7f09ccf16e978d971c8fa16ff0 upstream. My Fujitsu-Siemens Lifebook S6120 doesn't have the FUJ02E3 device, but it does have FUJ02B1. That means we do register the backlight device (and it even seems to work), but the code will oops as soon as we try to set the backlight brightness because it's trying to call call_fext_func() with a NULL device. Let's just skip those function calls when the FUJ02E3 device is not present. Cc: Jonathan Woithe Cc: Andy Shevchenko Signed-off-by: Ville Syrjälä Signed-off-by: Darren Hart (VMware) Signed-off-by: Greg Kroah-Hartman commit 19546fe8d26102da9d6b1895a96a24cac608828b Author: satoru takeuchi Date: Tue Sep 12 22:42:52 2017 +0900 btrfs: prevent to set invalid default subvolid commit 6d6d282932d1a609e60dc4467677e0e863682f57 upstream. `btrfs sub set-default` succeeds to set an ID which isn't corresponding to any fs/file tree. If such the bad ID is set to a filesystem, we can't mount this filesystem without specifying `subvol` or `subvolid` mount options. Fixes: 6ef5ed0d386b ("Btrfs: add ioctl and incompat flag to set the default mount subvol") Signed-off-by: Satoru Takeuchi Reviewed-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 6bc44a96c98c1a70616612b66a5f89f0c0e4253e Author: Naohiro Aota Date: Fri Sep 8 17:48:55 2017 +0900 btrfs: propagate error to btrfs_cmp_data_prepare caller commit 78ad4ce014d025f41b8dde3a81876832ead643cf upstream. btrfs_cmp_data_prepare() (almost) always returns 0 i.e. ignoring errors from gather_extent_pages(). While the pages are freed by btrfs_cmp_data_free(), cmp->num_pages still has > 0. Then, btrfs_extent_same() try to access the already freed pages causing faults (or violates PageLocked assertion). This patch just return the error as is so that the caller stop the process. Signed-off-by: Naohiro Aota Fixes: f441460202cb ("btrfs: fix deadlock with extent-same and readpage") Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit d2a30525077f726ee205a5e21e2f4856b96cb9b7 Author: Naohiro Aota Date: Fri Sep 1 17:59:07 2017 +0900 btrfs: finish ordered extent cleaning if no progress is found commit 67c003f90fd68062d92a7ffade36f9b2a9098bd8 upstream. __endio_write_update_ordered() repeats the search until it reaches the end of the specified range. This works well with direct IO path, because before the function is called, it's ensured that there are ordered extents filling whole the range. It's not the case, however, when it's called from run_delalloc_range(): it is possible to have error in the midle of the loop in e.g. run_delalloc_nocow(), so that there exisits the range not covered by any ordered extents. By cleaning such "uncomplete" range, __endio_write_update_ordered() stucks at offset where there're no ordered extents. Since the ordered extents are created from head to tail, we can stop the search if there are no offset progress. Fixes: 524272607e88 ("btrfs: Handle delalloc error correctly to avoid ordered extent hang") Signed-off-by: Naohiro Aota Reviewed-by: Qu Wenruo Reviewed-by: Josef Bacik Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 5c95ce1ebfdf3949035dabcae1cbf6f0bf3aa97f Author: Naohiro Aota Date: Fri Sep 1 17:58:47 2017 +0900 btrfs: clear ordered flag on cleaning up ordered extents commit 63d71450c8d817649a79e37d685523f988b9cc98 upstream. Commit 524272607e88 ("btrfs: Handle delalloc error correctly to avoid ordered extent hang") introduced btrfs_cleanup_ordered_extents() to cleanup submitted ordered extents. However, it does not clear the ordered bit (Private2) of corresponding pages. Thus, the following BUG occurs from free_pages_check_bad() (on btrfs/125 with nospace_cache). BUG: Bad page state in process btrfs pfn:3fa787 page:ffffdf2acfe9e1c0 count:0 mapcount:0 mapping: (null) index:0xd flags: 0x8000000000002008(uptodate|private_2) raw: 8000000000002008 0000000000000000 000000000000000d 00000000ffffffff raw: ffffdf2acf5c1b20 ffffb443802238b0 0000000000000000 0000000000000000 page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set bad because of flags: 0x2000(private_2) This patch clears the flag same as other places calling btrfs_dec_test_ordered_pending() for every page in the specified range. Fixes: 524272607e88 ("btrfs: Handle delalloc error correctly to avoid ordered extent hang") Signed-off-by: Naohiro Aota Reviewed-by: Qu Wenruo Reviewed-by: Josef Bacik Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 632ecb4ee6b3b129e1567296d730fc538e113fbf Author: Naohiro Aota Date: Fri Aug 25 14:15:14 2017 +0900 btrfs: fix NULL pointer dereference from free_reloc_roots() commit bb166d7207432d3c7d10c45dc052f12ba3a2121d upstream. __del_reloc_root should be called before freeing up reloc_root->node. If not, calling __del_reloc_root() dereference reloc_root->node, causing the system BUG. Fixes: 6bdf131fac23 ("Btrfs: don't leak reloc root nodes on error") Signed-off-by: Naohiro Aota Reviewed-by: Nikolay Borisov Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 8c64ccdccea9f9c3c7c7ae16dc0c94221ab50805 Author: Nicolai Stange Date: Mon Sep 11 09:45:40 2017 +0200 PCI: Fix race condition with driver_override commit 9561475db680f7144d2223a409dd3d7e322aca03 upstream. The driver_override implementation is susceptible to a race condition when different threads are reading vs. storing a different driver override. Add locking to avoid the race condition. This is in close analogy to commit 6265539776a0 ("driver core: platform: fix race condition with driver_override") from Adrian Salido. Fixes: 782a985d7af2 ("PCI: Introduce new device binding path using pci_dev.driver_override") Signed-off-by: Nicolai Stange Signed-off-by: Bjorn Helgaas Signed-off-by: Greg Kroah-Hartman commit 8d73e57d868eb8f4ce1f8ed1addd57fe8d0c080f Author: Shaohua Li Date: Thu Sep 21 10:23:35 2017 -0700 md: separate request handling commit 393debc23c7820211d1c8253dd6a8408a7628fe7 upstream. With commit cc27b0c78c79, pers->make_request could bail out without handling the bio. If that happens, we should retry. The commit fixes md_make_request but not other call sites. Separate the request handling part, so other call sites can use it. Reported-by: Nate Dailey Fix: cc27b0c78c79(md: fix deadlock between mddev_suspend() and md_write_start()) Reviewed-by: NeilBrown Signed-off-by: Shaohua Li Signed-off-by: Greg Kroah-Hartman commit 93f1f1b25b32e591c9c3b03e3e338d494cf21068 Author: Shaohua Li Date: Thu Sep 21 09:55:28 2017 -0700 md: fix a race condition for flush request handling commit 79bf31a3b2a7ca467cfec8ff97d359a77065d01f upstream. md_submit_flush_data calls pers->make_request, which missed the suspend check. Fix it with the new md_handle_request API. Reported-by: Nate Dailey Tested-by: Nate Dailey Fix: cc27b0c78c79(md: fix deadlock between mddev_suspend() and md_write_start()) Reviewed-by: NeilBrown Signed-off-by: Shaohua Li Signed-off-by: Greg Kroah-Hartman commit d3d86d580864a0b4430b3cd53e0487fc7a1c72c7 Author: Peter Zijlstra Date: Fri Sep 22 17:48:06 2017 +0200 futex: Fix pi_state->owner serialization commit c74aef2d06a9f59cece89093eecc552933cba72a upstream. There was a reported suspicion about a race between exit_pi_state_list() and put_pi_state(). The same report mentioned the comment with put_pi_state() said it should be called with hb->lock held, and it no longer is in all places. As it turns out, the pi_state->owner serialization is indeed broken. As per the new rules: 734009e96d19 ("futex: Change locking rules") pi_state->owner should be serialized by pi_state->pi_mutex.wait_lock. For the sites setting pi_state->owner we already hold wait_lock (where required) but exit_pi_state_list() and put_pi_state() were not and raced on clearing it. Fixes: 734009e96d19 ("futex: Change locking rules") Reported-by: Gratian Crisan Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Thomas Gleixner Cc: dvhart@infradead.org Link: https://lkml.kernel.org/r/20170922154806.jd3ffltfk24m4o4y@hirez.programming.kicks-ass.net Signed-off-by: Greg Kroah-Hartman commit 82f25e73c4043d14ee70ced261975264380b46c2 Author: Lucas Stach Date: Mon Sep 11 15:29:31 2017 +0200 etnaviv: fix gem object list corruption commit 518417525f3652c12fb5fad6da4ade66c0072fa3 upstream. All manipulations of the gem_object list need to be protected by the list mutex, as GEM objects can be created and freed in parallel. This fixes a kernel memory corruption. Signed-off-by: Lucas Stach Signed-off-by: Greg Kroah-Hartman commit 2790c64d1e4c8c299d86b827882caf580eb5f2cf Author: Lucas Stach Date: Fri Sep 8 16:24:32 2017 +0200 etnaviv: fix submit error path commit 5a642e6bc49f59922e19ebd639e74f72753fc77b upstream. If the gpu submit fails, bail out to avoid accessing a potentially unititalized fence. Signed-off-by: Lucas Stach Signed-off-by: Greg Kroah-Hartman commit 67c423c9e885e34bc2829a0baae4cbf54312502d Author: Richard Genoud Date: Wed Sep 27 14:49:17 2017 +0200 mtd: nand: atmel: fix buffer overflow in atmel_pmecc_user commit 36de80740008e6a4a55115b4a92e2059e47c1cba upstream. When calculating the size needed by struct atmel_pmecc_user *user, the dmu and delta buffer sizes were forgotten. This lead to a memory corruption (especially with a large ecc_strength). Link: http://lkml.kernel.org/r/1506503157.3016.5.camel@gmail.com Fixes: f88fc122cc34 ("mtd: nand: Cleanup/rework the atmel_nand driver") Reported-by: Richard Genoud Pointed-at-by: Boris Brezillon Signed-off-by: Richard Genoud Reviewed-by: Nicolas Ferre Signed-off-by: Boris Brezillon Signed-off-by: Greg Kroah-Hartman commit d45d8cd4717dd954ad338f5a0fc3652e93fbb2a2 Author: Boris Brezillon Date: Mon Sep 25 10:19:57 2017 +0200 mtd: Fix partition alignment check on multi-erasesize devices commit 7e439681af82984045efc215437ebb2ca8d33a4c upstream. Commit 1eeef2d7483a ("mtd: handle partitioning on devices with 0 erasesize") introduced a regression on heterogeneous erase region devices. Alignment of the partition was tested against the master eraseblock size which can be bigger than the slave one, thus leading to some partitions being marked as read-only. Update wr_alignment to match this slave erasesize after this erasesize has been determined by picking the biggest erasesize of all the regions embedded in the MTD partition. Reported-by: Mathias Thore Fixes: 1eeef2d7483a ("mtd: handle partitioning on devices with 0 erasesize") Signed-off-by: Boris Brezillon Tested-by: Mathias Thore Reviewed-by: Mathias Thore Signed-off-by: Greg Kroah-Hartman commit 2cfa35c2f21437421d6afeda3be7d7341d2b15af Author: Jeffy Chen Date: Thu Sep 28 12:37:31 2017 +0800 irq/generic-chip: Don't replace domain's name commit 72364d320644c12948786962673772f271039a4a upstream. When generic irq chips are allocated for an irq domain the domain name is set to the irq chip name. That was done to have named domains before the recent changes which enforce domain naming were done. Since then the overwrite causes a memory leak when the domain name is dynamically allocated and even worse it would cause the domain free code to free the wrong name pointer, which might point to a constant. Remove the name assignment to prevent this. Fixes: d59f6617eef0 ("genirq: Allow fwnode to carry name information only") Signed-off-by: Jeffy Chen Signed-off-by: Thomas Gleixner Link: https://lkml.kernel.org/r/20170928043731.4764-1-jeffy.chen@rock-chips.com Signed-off-by: Greg Kroah-Hartman commit de8c137cb712c65f59f4e97e5a3b90d75b7edf59 Author: Ethan Zhao Date: Mon Sep 4 13:59:34 2017 +0800 sched/sysctl: Check user input value of sysctl_sched_time_avg commit 5ccba44ba118a5000cccc50076b0344632459779 upstream. System will hang if user set sysctl_sched_time_avg to 0: [root@XXX ~]# sysctl kernel.sched_time_avg_ms=0 Stack traceback for pid 0 0xffff883f6406c600 0 0 1 3 R 0xffff883f6406cf50 *swapper/3 ffff883f7ccc3ae8 0000000000000018 ffffffff810c4dd0 0000000000000000 0000000000017800 ffff883f7ccc3d78 0000000000000003 ffff883f7ccc3bf8 ffffffff810c4fc9 ffff883f7ccc3c08 00000000810c5043 ffff883f7ccc3c08 Call Trace: [] ? update_group_capacity+0x110/0x200 [] ? update_sd_lb_stats+0x109/0x600 [] ? find_busiest_group+0x47/0x530 [] ? load_balance+0x194/0x900 [] ? update_rq_clock.part.83+0x1a/0xe0 [] ? rebalance_domains+0x152/0x290 [] ? run_rebalance_domains+0xdc/0x1d0 [] ? __do_softirq+0xfb/0x320 [] ? irq_exit+0x125/0x130 [] ? scheduler_ipi+0x97/0x160 [] ? smp_reschedule_interrupt+0x29/0x30 [] ? reschedule_interrupt+0x6e/0x80 [] ? cpuidle_enter_state+0xcc/0x230 [] ? cpuidle_enter_state+0x9c/0x230 [] ? cpuidle_enter+0x17/0x20 [] ? cpu_startup_entry+0x38c/0x420 [] ? start_secondary+0x173/0x1e0 Because divide-by-zero error happens in function: update_group_capacity() update_cpu_capacity() scale_rt_capacity() { ... total = sched_avg_period() + delta; used = div_u64(avg, total); ... } To fix this issue, check user input value of sysctl_sched_time_avg, keep it unchanged when hitting invalid input, and set the minimum limit of sysctl_sched_time_avg to 1 ms. Reported-by: James Puthukattukaran Signed-off-by: Ethan Zhao Signed-off-by: Peter Zijlstra (Intel) Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: efault@gmx.de Cc: ethan.kernel@gmail.com Cc: keescook@chromium.org Cc: mcgrof@kernel.org Link: http://lkml.kernel.org/r/1504504774-18253-1-git-send-email-ethan.zhao@oracle.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit c8b679ba7c65457e45af7c086c5df0748522aa2f Author: Al Viro Date: Fri Sep 29 13:43:15 2017 -0400 fix infoleak in waitid(2) commit 6c85501f2fabcfc4fc6ed976543d252c4eaf4be9 upstream. kernel_waitid() can return a PID, an error or 0. rusage is filled in the first case and waitid(2) rusage should've been copied out exactly in that case, *not* whenever kernel_waitid() has not returned an error. Compat variant shares that braino; none of kernel_wait4() callers do, so the below ought to fix it. Reported-and-tested-by: Alexander Potapenko Fixes: ce72a16fa705 ("wait4(2)/waitid(2): separate copying rusage to userland") Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman commit 00dfbf106b0002c0b5bf0d8a34243d280258fa76 Author: Ross Zwisler Date: Mon Sep 18 14:46:03 2017 -0700 xfs: validate bdev support for DAX inode flag commit 6851a3db7e224bbb85e23b3c64a506c9e0904382 upstream. Currently only the blocksize is checked, but we should really be calling bdev_dax_supported() which also tests to make sure we can get a struct dax_device and that the dax_direct_access() path is working. This is the same check that we do for the "-o dax" mount option in xfs_fs_fill_super(). This does not fix the race issues that caused the XFS DAX inode option to be disabled, so that option will still be disabled. If/when we re-enable it, though, I think we will want this issue to have been fixed. I also do think that we want to fix this in stable kernels. Signed-off-by: Ross Zwisler Reviewed-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Signed-off-by: Darrick J. Wong Signed-off-by: Greg Kroah-Hartman commit 27920625f93e59b2b24bdadf946ae4b61f885f40 Author: Jim Mattson Date: Tue Sep 12 13:02:54 2017 -0700 kvm: nVMX: Don't allow L2 to access the hardware CR8 commit 51aa68e7d57e3217192d88ce90fd5b8ef29ec94f upstream. If L1 does not specify the "use TPR shadow" VM-execution control in vmcs12, then L0 must specify the "CR8-load exiting" and "CR8-store exiting" VM-execution controls in vmcs02. Failure to do so will give the L2 VM unrestricted read/write access to the hardware CR8. This fixes CVE-2017-12154. Signed-off-by: Jim Mattson Reviewed-by: David Hildenbrand Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit c69768cc3eb4fb41a0f3c47c6d6575fd8ead57b7 Author: Jan H. Schönherr Date: Thu Sep 7 19:02:30 2017 +0100 KVM: VMX: Do not BUG() on out-of-bounds guest IRQ commit 3a8b0677fc6180a467e26cc32ce6b0c09a32f9bb upstream. The value of the guest_irq argument to vmx_update_pi_irte() is ultimately coming from a KVM_IRQFD API call. Do not BUG() in vmx_update_pi_irte() if the value is out-of bounds. (Especially, since KVM as a whole seems to hang after that.) Instead, print a message only once if we find that we don't have a route for a certain IRQ (which can be out-of-bounds or within the array). This fixes CVE-2017-1000252. Fixes: efc644048ecde54 ("KVM: x86: Update IRTE for posted-interrupts") Signed-off-by: Jan H. Schönherr Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 4549616830dd2967676b882342948a2102a34c53 Author: Boqun Feng Date: Fri Sep 29 19:01:45 2017 +0800 kvm/x86: Handle async PF in RCU read-side critical sections commit b862789aa5186d5ea3a024b7cfe0f80c3a38b980 upstream. Sasha Levin reported a WARNING: | WARNING: CPU: 0 PID: 6974 at kernel/rcu/tree_plugin.h:329 | rcu_preempt_note_context_switch kernel/rcu/tree_plugin.h:329 [inline] | WARNING: CPU: 0 PID: 6974 at kernel/rcu/tree_plugin.h:329 | rcu_note_context_switch+0x16c/0x2210 kernel/rcu/tree.c:458 ... | CPU: 0 PID: 6974 Comm: syz-fuzzer Not tainted 4.13.0-next-20170908+ #246 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS | 1.10.1-1ubuntu1 04/01/2014 | Call Trace: ... | RIP: 0010:rcu_preempt_note_context_switch kernel/rcu/tree_plugin.h:329 [inline] | RIP: 0010:rcu_note_context_switch+0x16c/0x2210 kernel/rcu/tree.c:458 | RSP: 0018:ffff88003b2debc8 EFLAGS: 00010002 | RAX: 0000000000000001 RBX: 1ffff1000765bd85 RCX: 0000000000000000 | RDX: 1ffff100075d7882 RSI: ffffffffb5c7da20 RDI: ffff88003aebc410 | RBP: ffff88003b2def30 R08: dffffc0000000000 R09: 0000000000000001 | R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003b2def08 | R13: 0000000000000000 R14: ffff88003aebc040 R15: ffff88003aebc040 | __schedule+0x201/0x2240 kernel/sched/core.c:3292 | schedule+0x113/0x460 kernel/sched/core.c:3421 | kvm_async_pf_task_wait+0x43f/0x940 arch/x86/kernel/kvm.c:158 | do_async_page_fault+0x72/0x90 arch/x86/kernel/kvm.c:271 | async_page_fault+0x22/0x30 arch/x86/entry/entry_64.S:1069 | RIP: 0010:format_decode+0x240/0x830 lib/vsprintf.c:1996 | RSP: 0018:ffff88003b2df520 EFLAGS: 00010283 | RAX: 000000000000003f RBX: ffffffffb5d1e141 RCX: ffff88003b2df670 | RDX: 0000000000000001 RSI: dffffc0000000000 RDI: ffffffffb5d1e140 | RBP: ffff88003b2df560 R08: dffffc0000000000 R09: 0000000000000000 | R10: ffff88003b2df718 R11: 0000000000000000 R12: ffff88003b2df5d8 | R13: 0000000000000064 R14: ffffffffb5d1e140 R15: 0000000000000000 | vsnprintf+0x173/0x1700 lib/vsprintf.c:2136 | sprintf+0xbe/0xf0 lib/vsprintf.c:2386 | proc_self_get_link+0xfb/0x1c0 fs/proc/self.c:23 | get_link fs/namei.c:1047 [inline] | link_path_walk+0x1041/0x1490 fs/namei.c:2127 ... This happened when the host hit a page fault, and delivered it as in an async page fault, while the guest was in an RCU read-side critical section. The guest then tries to reschedule in kvm_async_pf_task_wait(), but rcu_preempt_note_context_switch() would treat the reschedule as a sleep in RCU read-side critical section, which is not allowed (even in preemptible RCU). Thus the WARN. To cure this, make kvm_async_pf_task_wait() go to the halt path if the PF happens in a RCU read-side critical section. Reported-by: Sasha Levin Cc: "Paul E. McKenney" Cc: Peter Zijlstra Signed-off-by: Boqun Feng Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 9c66f63a37e4d276e79b0f2521f783100d58f090 Author: Ladi Prosek Date: Fri Sep 22 07:53:15 2017 +0200 KVM: nVMX: fix HOST_CR3/HOST_CR4 cache commit 44889942b6eb356eab27ce25fe10701adfec7776 upstream. For nested virt we maintain multiple VMCS that can run on a vCPU. So it is incorrect to keep vmcs_host_cr3 and vmcs_host_cr4, whose purpose is caching the value of the rarely changing HOST_CR3 and HOST_CR4 VMCS fields, in vCPU-wide data structures. Hyper-V nested on KVM runs into this consistently for me with PCID enabled. CR3 is updated with a new value, unlikely(cr3 != vmx->host_state.vmcs_host_cr3) fires, and the currently loaded VMCS is updated. Then we switch from L2 to L1 and the next exit reverts CR3 to its old value. Fixes: d6e41f1151fe ("x86/mm, KVM: Teach KVM's VMX code that CR3 isn't a constant") Signed-off-by: Ladi Prosek Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 76cfd1e82903ff9417be9a3f45bf094ffcd223f4 Author: Paolo Bonzini Date: Tue Jun 6 12:57:06 2017 +0200 KVM: VMX: simplify and fix vmx_vcpu_pi_load commit 31afb2ea2b10a7d17ce3db4cdb0a12b63b2fe08a upstream. The simplify part: do not touch pi_desc.nv, we can set it when the VCPU is first created. Likewise, pi_desc.sn is only handled by vmx_vcpu_pi_load, do not touch it in __pi_post_block. The fix part: do not check kvm_arch_has_assigned_device, instead check the SN bit to figure out whether vmx_vcpu_pi_put ran before. This matches what the previous patch did in pi_post_block. Cc: Huangweidong Cc: Gonglei Cc: wangxin Cc: Radim Krčmář Tested-by: Longpeng (Mike) Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit e275005508a7a16e74c0e3dd49e31b7050b2763d Author: Paolo Bonzini Date: Tue Jun 6 12:57:05 2017 +0200 KVM: VMX: avoid double list add with VT-d posted interrupts commit 8b306e2f3c41939ea528e6174c88cfbfff893ce1 upstream. In some cases, for example involving hot-unplug of assigned devices, pi_post_block can forget to remove the vCPU from the blocked_vcpu_list. When this happens, the next call to pi_pre_block corrupts the list. Fix this in two ways. First, check vcpu->pre_pcpu in pi_pre_block and WARN instead of adding the element twice in the list. Second, always do the list removal in pi_post_block if vcpu->pre_pcpu is set (not -1). The new code keeps interrupts disabled for the whole duration of pi_pre_block/pi_post_block. This is not strictly necessary, but easier to follow. For the same reason, PI.ON is checked only after the cmpxchg, and to handle it we just call the post-block code. This removes duplication of the list removal code. Cc: Huangweidong Cc: Gonglei Cc: wangxin Cc: Radim Krčmář Tested-by: Longpeng (Mike) Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 5170abd410a8eb619af4701ec5360832d97b29cf Author: Paolo Bonzini Date: Tue Jun 6 12:57:04 2017 +0200 KVM: VMX: extract __pi_post_block commit cd39e1176d320157831ce030b4c869bd2d5eb142 upstream. Simple code movement patch, preparing for the next one. Cc: Huangweidong Cc: Gonglei Cc: wangxin Cc: Radim Krčmář Tested-by: Longpeng (Mike) Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit b50697878f0d882b06fcf2eefad32b3be8f4b1a8 Author: Will Deacon Date: Fri Sep 29 12:27:41 2017 +0100 arm64: fault: Route pte translation faults via do_translation_fault commit 760bfb47c36a07741a089bf6a28e854ffbee7dc9 upstream. We currently route pte translation faults via do_page_fault, which elides the address check against TASK_SIZE before invoking the mm fault handling code. However, this can cause issues with the path walking code in conjunction with our word-at-a-time implementation because load_unaligned_zeropad can end up faulting in kernel space if it reads across a page boundary and runs into a page fault (e.g. by attempting to read from a guard region). In the case of such a fault, load_unaligned_zeropad has registered a fixup to shift the valid data and pad with zeroes, however the abort is reported as a level 3 translation fault and we dispatch it straight to do_page_fault, despite it being a kernel address. This results in calling a sleeping function from atomic context: BUG: sleeping function called from invalid context at arch/arm64/mm/fault.c:313 in_atomic(): 0, irqs_disabled(): 0, pid: 10290 Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [...] [] ___might_sleep+0x134/0x144 [] __might_sleep+0x7c/0x8c [] do_page_fault+0x140/0x330 [] do_mem_abort+0x54/0xb0 Exception stack(0xfffffffb20247a70 to 0xfffffffb20247ba0) [...] [] el1_da+0x18/0x78 [] path_parentat+0x44/0x88 [] filename_parentat+0x5c/0xd8 [] filename_create+0x4c/0x128 [] SyS_mkdirat+0x50/0xc8 [] el0_svc_naked+0x24/0x28 Code: 36380080 d5384100 f9400800 9402566d (d4210000) ---[ end trace 2d01889f2bca9b9f ]--- Fix this by dispatching all translation faults to do_translation_faults, which avoids invoking the page fault logic for faults on kernel addresses. Reported-by: Ankit Jain Signed-off-by: Will Deacon Signed-off-by: Catalin Marinas Signed-off-by: Greg Kroah-Hartman commit 4bf30dffc293273ce28291d9129825c2bb1b14f3 Author: Will Deacon Date: Fri Sep 29 11:29:55 2017 +0100 arm64: mm: Use READ_ONCE when dereferencing pointer to pte table commit f069faba688701c4d56b6c3452a130f97bf02e95 upstream. On kernels built with support for transparent huge pages, different CPUs can access the PMD concurrently due to e.g. fast GUP or page_vma_mapped_walk and they must take care to use READ_ONCE to avoid value tearing or caching of stale values by the compiler. Unfortunately, these functions call into our pgtable macros, which don't use READ_ONCE, and compiler caching has been observed to cause the following crash during ext4 writeback: PC is at check_pte+0x20/0x170 LR is at page_vma_mapped_walk+0x2e0/0x540 [...] Process doio (pid: 2463, stack limit = 0xffff00000f2e8000) Call trace: [] check_pte+0x20/0x170 [] page_vma_mapped_walk+0x2e0/0x540 [] page_mkclean_one+0xac/0x278 [] rmap_walk_file+0xf0/0x238 [] rmap_walk+0x64/0xa0 [] page_mkclean+0x90/0xa8 [] clear_page_dirty_for_io+0x84/0x2a8 [] mpage_submit_page+0x34/0x98 [] mpage_process_page_bufs+0x164/0x170 [] mpage_prepare_extent_to_map+0x134/0x2b8 [] ext4_writepages+0x484/0xe30 [] do_writepages+0x44/0xe8 [] __filemap_fdatawrite_range+0xbc/0x110 [] file_write_and_wait_range+0x48/0xd8 [] ext4_sync_file+0x80/0x4b8 [] vfs_fsync_range+0x64/0xc0 [] SyS_msync+0x194/0x1e8 This is because page_vma_mapped_walk loads the PMD twice before calling pte_offset_map: the first time without READ_ONCE (where it gets all zeroes due to a concurrent pmdp_invalidate) and the second time with READ_ONCE (where it sees a valid table pointer due to a concurrent pmd_populate). However, the compiler inlines everything and caches the first value in a register, which is subsequently used in pte_offset_phys which returns a junk pointer that is later dereferenced when attempting to access the relevant pte. This patch fixes the issue by using READ_ONCE in pte_offset_phys to ensure that a stale value is not used. Whilst this is a point fix for a known failure (and simple to backport), a full fix moving all of our page table accessors over to {READ,WRITE}_ONCE and consistently using READ_ONCE in page_vma_mapped_walk is in the works for a future kernel release. Cc: Jon Masters Cc: Timur Tabi Fixes: f27176cfc363 ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()") Tested-by: Richard Ruigrok Signed-off-by: Will Deacon Signed-off-by: Catalin Marinas Signed-off-by: Greg Kroah-Hartman commit adf56f6f9ebb364e13bfbdb365dd3cba0b254fae Author: Marc Zyngier Date: Tue Sep 26 15:57:16 2017 +0100 arm64: Make sure SPsel is always set commit 5371513fb338fb9989c569dc071326d369d6ade8 upstream. When the kernel is entered at EL2 on an ARMv8.0 system, we construct the EL1 pstate and make sure this uses the the EL1 stack pointer (we perform an exception return to EL1h). But if the kernel is either entered at EL1 or stays at EL2 (because we're on a VHE-capable system), we fail to set SPsel, and use whatever stack selection the higher exception level has choosen for us. Let's not take any chance, and make sure that SPsel is set to one before we decide the mode we're going to run in. Acked-by: Mark Rutland Signed-off-by: Marc Zyngier Signed-off-by: Will Deacon Signed-off-by: Catalin Marinas Signed-off-by: Greg Kroah-Hartman commit 257ac6ecaba725861b15caa26f1f2d384e144b9b Author: Oleg Nesterov Date: Wed Sep 27 09:25:30 2017 -0600 seccomp: fix the usage of get/put_seccomp_filter() in seccomp_get_filter() commit 66a733ea6b611aecf0119514d2dddab5f9d6c01e upstream. As Chris explains, get_seccomp_filter() and put_seccomp_filter() can end up using different filters. Once we drop ->siglock it is possible for task->seccomp.filter to have been replaced by SECCOMP_FILTER_FLAG_TSYNC. Fixes: f8e529ed941b ("seccomp, ptrace: add support for dumping seccomp filters") Reported-by: Chris Salls Signed-off-by: Oleg Nesterov [tycho: add __get_seccomp_filter vs. open coding refcount_inc()] Signed-off-by: Tycho Andersen [kees: tweak commit log] Signed-off-by: Kees Cook Signed-off-by: Greg Kroah-Hartman commit 6cb8922f9c6810ad69c45ade8d4d710d55d813bf Author: Kees Cook Date: Thu Sep 7 16:32:46 2017 -0700 selftests/seccomp: Support glibc 2.26 siginfo_t.h commit 10859f3855db4c6f10dc7974ff4b3a292f3de8e0 upstream. The 2.26 release of glibc changed how siginfo_t is defined, and the earlier work-around to using the kernel definition are no longer needed. The old way needs to stay around for a while, though. Reported-by: Seth Forshee Cc: Andy Lutomirski Cc: Will Drewry Cc: Shuah Khan Cc: linux-kselftest@vger.kernel.org Signed-off-by: Kees Cook Tested-by: Seth Forshee Signed-off-by: Shuah Khan Signed-off-by: Greg Kroah-Hartman commit b3fa97213993d1fb0f6dfa0f0bdac1680729d355 Author: Steven Rostedt (VMware) Date: Fri Sep 22 17:36:32 2017 -0400 extable: Enable RCU if it is not watching in kernel_text_address() commit e8cac8b1d10589be45671a5ade0926a639b543b7 upstream. If kernel_text_address() is called when RCU is not watching, it can cause an RCU bug because is_module_text_address(), the is_kprobe_*insn_slot() and is_bpf_text_address() functions require the use of RCU. Only enable RCU if it is not currently watching before it calls is_module_text_address(). The use of rcu_nmi_enter() is used to enable RCU because kernel_text_address() can happen pretty much anywhere (like an NMI), and even from within an NMI. It is called via save_stack_trace() that can be called by any WARN() or tracing function, which can happen while RCU is not watching (for example, going to or coming from idle, or during CPU take down or bring up). Fixes: 0be964be0 ("module: Sanitize RCU usage and locking") Acked-by: Paul E. McKenney Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit eb4cadd0abf27dccbcf8119467f0c5a94aae76f7 Author: Steven Rostedt (VMware) Date: Fri Sep 22 17:22:19 2017 -0400 extable: Consolidate *kernel_text_address() functions commit 9aadde91b3c035413c806619beb3e3ef6e697953 upstream. The functionality between kernel_text_address() and _kernel_text_address() is the same except that _kernel_text_address() does a little more (that function needs a rename, but that can be done another time). Instead of having duplicate code in both, simply have _kernel_text_address() calls kernel_text_address() instead. This is marked for stable because there's an RCU bug that can happen if one of these functions gets called while RCU is not watching. That fix depends on this fix to keep from having to write the fix twice. Fixes: 0be964be0 ("module: Sanitize RCU usage and locking") Acked-by: Paul E. McKenney Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit 4d4b18be3ac192f10feb6c08e2725e339625f008 Author: Adrian Hunter Date: Mon Sep 18 15:16:08 2017 +0300 mmc: sdhci-pci: Fix voltage switch for some Intel host controllers commit 6ae033689d7b1a419def78e8e990b0eab8bb6419 upstream. Some Intel host controllers (e.g. CNP) use an ACPI device-specific method to ensure correct voltage switching. Fix voltage switch for those, by adding a call to the DSM. Signed-off-by: Adrian Hunter Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 55c2ca358b290ae6d33866725384bae725671653 Author: Paul E. McKenney Date: Fri Sep 22 14:10:22 2017 -0700 rcu: Allow for page faults in NMI handlers commit 28585a832602747cbfa88ad8934013177a3aae38 upstream. A number of architecture invoke rcu_irq_enter() on exception entry in order to allow RCU read-side critical sections in the exception handler when the exception is from an idle or nohz_full CPU. This works, at least unless the exception happens in an NMI handler. In that case, rcu_nmi_enter() would already have exited the extended quiescent state, which would mean that rcu_irq_enter() would (incorrectly) cause RCU to think that it is again in an extended quiescent state. This will in turn result in lockdep splats in response to later RCU read-side critical sections. This commit therefore causes rcu_irq_enter() and rcu_irq_exit() to take no action if there is an rcu_nmi_enter() in effect, thus avoiding the unscheduled return to RCU quiescent state. This in turn should make the kernel safe for on-demand RCU voyeurism. Link: http://lkml.kernel.org/r/20170922211022.GA18084@linux.vnet.ibm.com Fixes: 0be964be0 ("module: Sanitize RCU usage and locking") Reported-by: Steven Rostedt Signed-off-by: Paul E. McKenney Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit 61d8dbf2387accf406f0d87c30aab0051af2a267 Author: Steve Wise Date: Wed Sep 13 09:52:32 2017 -0700 iw_cxgb4: put ep reference in pass_accept_req() commit 3d318605f5e32ff44fb290d9b67573b34213c4c8 upstream. The listening endpoint should always be dereferenced at the end of pass_accept_req(). Fixes: f86fac79afec ("RDMA/iw_cxgb4: atomic find and reference for listening endpoints") Signed-off-by: Steve Wise Signed-off-by: Doug Ledford Signed-off-by: Greg Kroah-Hartman commit 351a522363a1cf11c8dd35c4ca73726bd9a19ab0 Author: Steve Wise Date: Tue Sep 5 11:52:34 2017 -0700 iw_cxgb4: remove the stid on listen create failure commit 8b1bbf36b7452c4acb20e91948eaa5e225ea6978 upstream. If a listen create fails, then the server tid (stid) is incorrectly left in the stid idr table, which can cause a touch-after-free if the stid is looked up and the already freed endpoint is touched. So make sure and remove it in the error path. Signed-off-by: Steve Wise Signed-off-by: Doug Ledford Signed-off-by: Greg Kroah-Hartman commit 966176739139d0a5c915c6a59d0ab711310f9639 Author: Steve Wise Date: Tue Sep 5 11:52:33 2017 -0700 iw_cxgb4: drop listen destroy replies if no ep found commit 3c8415cc7aff467faba25841fb859660ac14a04e upstream. If the thread waiting for a CLOSE_LISTSRV_RPL times out and bails, then we need to handle a subsequent CPL if it arrives and the stid has been released. In this case silently drop it. Signed-off-by: Steve Wise Signed-off-by: Doug Ledford Signed-off-by: Greg Kroah-Hartman commit 0ea30c797c6f6f2e1606ab8960387a88e68f6fe4 Author: Christoph Hellwig Date: Thu Sep 7 13:54:35 2017 +0200 bsg-lib: don't free job in bsg_prepare_job commit f507b54dccfd8000c517d740bc45f20c74532d18 upstream. The job structure is allocated as part of the request, so we should not free it in the error path of bsg_prepare_job. Signed-off-by: Christoph Hellwig Reviewed-by: Ming Lei Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit f148de59e2c6ce65bfca18312844d090c1fc9f8b Author: Andreas Gruenbacher Date: Tue Sep 19 07:15:35 2017 -0500 gfs2: Fix debugfs glocks dump commit 10201655b085df8e000822e496e5d4016a167a36 upstream. The switch to rhashtables (commit 88ffbf3e03) broke the debugfs glock dump (/sys/kernel/debug/gfs2//glocks) for dumps bigger than a single buffer: the right function for restarting an rhashtable iteration from the beginning of the hash table is rhashtable_walk_enter; rhashtable_walk_stop + rhashtable_walk_start will just resume from the current position. Signed-off-by: Andreas Gruenbacher Signed-off-by: Bob Peterson Signed-off-by: Greg Kroah-Hartman commit 676b7ae6c40434dcff827e0f67725de25ddd32fc Author: Mikulas Patocka Date: Wed Sep 13 09:17:57 2017 -0400 brd: fix overflow in __brd_direct_access commit 02a4843618fb35f847cf8c31cd3893873aa0edde upstream. The code in __brd_direct_access multiplies the pgoff variable by page size and divides it by 512. It can cause overflow on 32-bit architectures. The overflow happens if we create ramdisk larger than 4G and use it as a sparse device. This patch replaces multiplication and division with multiplication by the number of sectors per page. Reviewed-by: Dan Williams Signed-off-by: Mikulas Patocka Fixes: 1647b9b959c7 ("brd: add dax_operations support") Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit f0e85f56f70f172ada0d06b26baff47823695629 Author: Vladis Dronov Date: Wed Sep 13 00:21:21 2017 +0200 nl80211: check for the required netlink attributes presence commit e785fa0a164aa11001cba931367c7f94ffaff888 upstream. nl80211_set_rekey_data() does not check if the required attributes NL80211_REKEY_DATA_{REPLAY_CTR,KEK,KCK} are present when processing NL80211_CMD_SET_REKEY_OFFLOAD request. This request can be issued by users with CAP_NET_ADMIN privilege and may result in NULL dereference and a system crash. Add a check for the required attributes presence. This patch is based on the patch by bo Zhang. This fixes CVE-2017-12153. References: https://bugzilla.redhat.com/show_bug.cgi?id=1491046 Fixes: e5497d766ad ("cfg80211/nl80211: support GTK rekey offload") Reported-by: bo Zhang Signed-off-by: Vladis Dronov Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit a25137c56253648559d2420f29b8ce3270faa17d Author: Ilya Dryomov Date: Mon Sep 18 12:21:37 2017 +0200 libceph: don't allow bidirectional swap of pg-upmap-items commit 29a0cfbf91ba997591535a4f7246835ce8328141 upstream. This reverts most of commit f53b7665c8ce ("libceph: upmap semantic changes"). We need to prevent duplicates in the final result. For example, we can currently take [1,2,3] and apply [(1,2)] and get [2,2,3] or [1,2,3] and apply [(3,2)] and get [1,2,2] The rest of the system is not prepared to handle duplicates in the result set like this. The reverted piece was intended to allow [1,2,3] and [(1,2),(2,1)] to get [2,1,3] to reorder primaries. First, this bidirectional swap is hard to implement in a way that also prevents dups. For example, [1,2,3] and [(1,4),(2,3),(3,4)] would give [4,3,4] but would we just drop the last step we'd have [4,3,3] which is also invalid, etc. Simpler to just not handle bidirectional swaps. In practice, they are not needed: if you just want to choose a different primary then use primary_affinity, or pg_upmap (not pg_upmap_items). Link: http://tracker.ceph.com/issues/21410 Signed-off-by: Ilya Dryomov Reviewed-by: Sage Weil Signed-off-by: Greg Kroah-Hartman commit 0cc22b028e3ef4ef75651820183aa7b7e820b98b Author: Andreas Gruenbacher Date: Mon Sep 25 12:23:03 2017 +0200 vfs: Return -ENXIO for negative SEEK_HOLE / SEEK_DATA offsets commit fc46820b27a2d9a46f7e90c9ceb4a64a1bc5fab8 upstream. In generic_file_llseek_size, return -ENXIO for negative offsets as well as offsets beyond EOF. This affects filesystems which don't implement SEEK_HOLE / SEEK_DATA internally, possibly because they don't support holes. Fixes xfstest generic/448. Signed-off-by: Andreas Gruenbacher Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit cc6985ec8f095aaa9b1b9f4bee1d292e772ca468 Author: Steve French Date: Fri Sep 22 01:40:27 2017 -0500 SMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flags commit 1013e760d10e614dc10b5624ce9fc41563ba2e65 upstream. Signed-off-by: Steve French Reviewed-by: Ronnie Sahlberg Reviewed-by: Pavel Shilovsky Signed-off-by: Greg Kroah-Hartman commit 95ed592c1d3db331e6d367a50dee62efe6bd7633 Author: Steve French Date: Thu Sep 21 21:32:29 2017 -0500 SMB3: handle new statx fields commit 6e70e26dc52be62c1f39f81b5f71fa5e643677aa upstream. We weren't returning the creation time or the two easily supported attributes (ENCRYPTED or COMPRESSED) for the getattr call to allow statx to return these fields. Signed-off-by: Steve French Reviewed-by: Ronnie Sahlberg Acked-by: Jeff Layton Reviewed-by: Pavel Shilovsky Signed-off-by: Greg Kroah-Hartman commit 6fcd886168b3ac005fb87b030379bd72315783be Author: Steve French Date: Wed Sep 20 19:57:18 2017 -0500 SMB: Validate negotiate (to protect against downgrade) even if signing off commit 0603c96f3af50e2f9299fa410c224ab1d465e0f9 upstream. As long as signing is supported (ie not a guest user connection) and connection is SMB3 or SMB3.02, then validate negotiate (protect against man in the middle downgrade attacks). We had been doing this only when signing was required, not when signing was just enabled, but this more closely matches recommended SMB3 behavior and is better security. Suggested by Metze. Signed-off-by: Steve French Reviewed-by: Jeremy Allison Acked-by: Stefan Metzmacher Reviewed-by: Ronnie Sahlberg Signed-off-by: Greg Kroah-Hartman commit 2720aef9a67a61914adef8745395c045779d54d5 Author: Steve French Date: Tue Sep 19 18:40:03 2017 -0500 SMB3: Warn user if trying to sign connection that authenticated as guest commit c721c38957fb19982416f6be71aae7b30630d83b upstream. It can be confusing if user ends up authenticated as guest but they requested signing (server will return error validating signed packets) so add log message for this. Signed-off-by: Steve French Reviewed-by: Ronnie Sahlberg Signed-off-by: Greg Kroah-Hartman commit 9d7f9bd42dc9663abb7ef5be400785488036e85c Author: Steve French Date: Tue Sep 19 11:43:47 2017 -0500 SMB3: Fix endian warning commit 590d08d3da45e9fed423b08ab38d71886c07abc8 upstream. Multi-dialect negotiate patch had a minor endian error. Signed-off-by: Steve French Reviewed-by: Ronnie Sahlberg Signed-off-by: Greg Kroah-Hartman commit b9572c351396f6edb7d33319a0f3e914dcfda993 Author: Steve French Date: Mon Sep 18 18:18:45 2017 -0500 Fix SMB3.1.1 guest authentication to Samba commit 23586b66d84ba3184b8820277f3fc42761640f87 upstream. Samba rejects SMB3.1.1 dialect (vers=3.1.1) negotiate requests from the kernel client due to the two byte pad at the end of the negotiate contexts. Signed-off-by: Steve French Reviewed-by: Ronnie Sahlberg Signed-off-by: Greg Kroah-Hartman commit 3158f228fc2b0b6d5500498ccfc4b308c9b08b5d Author: Alex Estrin Date: Tue Sep 26 06:06:22 2017 -0700 Revert "IB/ipoib: Update broadcast object if PKey value was changed in index 0" commit 612601d0013f03de9dc134809f242ba6da9ca252 upstream. commit 9a9b8112699d will cause core to fail UD QP from being destroyed on ipoib unload, therefore cause resources leakage. On pkey change event above patch modifies mgid before calling underlying driver to detach it from QP. Drivers' detach_mcast() will fail to find modified mgid it was never given to attach in a first place. Core qp->usecnt will never go down, so ib_destroy_qp() will fail. IPoIB driver actually does take care of new broadcast mgid based on new pkey by destroying an old mcast object in ipoib_mcast_dev_flush()) .... if (priv->broadcast) { rb_erase(&priv->broadcast->rb_node, &priv->multicast_tree); list_add_tail(&priv->broadcast->list, &remove_list); priv->broadcast = NULL; } ... then in restarted ipoib_macst_join_task() creating a new broadcast mcast object, sending join request and on completion tells the driver to attach to reinitialized QP: ... if (!priv->broadcast) { ... broadcast = ipoib_mcast_alloc(dev, 0); ... memcpy(broadcast->mcmember.mgid.raw, priv->dev->broadcast + 4, sizeof (union ib_gid)); priv->broadcast = broadcast; ... Fixes: 9a9b8112699d ("IB/ipoib: Update broadcast object if PKey value was changed in index 0") Reviewed-by: Mike Marciniszyn Reviewed-by: Dennis Dalessandro Signed-off-by: Alex Estrin Signed-off-by: Dennis Dalessandro Reviewed-by: Feras Daoud Signed-off-by: Doug Ledford Signed-off-by: Greg Kroah-Hartman commit f6919da4b4ced09be9b2297a29de61daad1c6bbf Author: Rafael J. Wysocki Date: Tue Sep 19 02:22:39 2017 +0200 PM: core: Fix device_pm_check_callbacks() commit 157c460e10cb6eca29ccbd0f023db159d0c55ec7 upstream. The device_pm_check_callbacks() function doesn't check legacy ->suspend and ->resume callback pointers under the device's bus type, class and driver, so in some cases it may set the no_pm_callbacks flag for the device incorrectly and then the callbacks may be skipped during system suspend/resume, which shouldn't happen. Fixes: aa8e54b55947 (PM / sleep: Go direct_complete if driver has no callbacks) Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit d0587b313ed847f0a74e73ec3400de93af91617f Author: Gerald Schaefer Date: Mon Sep 18 16:51:51 2017 +0200 s390/mm: fix write access check in gup_huge_pmd() commit ba385c0594e723d41790ecfb12c610e6f90c7785 upstream. The check for the _SEGMENT_ENTRY_PROTECT bit in gup_huge_pmd() is the wrong way around. It must not be set for write==1, and not be checked for write==0. Fix this similar to how it was fixed for ptes long time ago in commit 25591b070336 ("[S390] fix get_user_pages_fast"). One impact of this bug would be unnecessarily using the gup slow path for write==0 on r/w mappings. A potentially more severe impact would be that gup_huge_pmd() will succeed for write==1 on r/o mappings. Signed-off-by: Gerald Schaefer Signed-off-by: Martin Schwidefsky Signed-off-by: Greg Kroah-Hartman commit a43acb30af52cc9c0a4f8896ad689d4819196e33 Author: Gerald Schaefer Date: Mon Sep 18 16:10:35 2017 +0200 s390/mm: make pmdp_invalidate() do invalidation only commit 91c575b335766effa6103eba42a82aea560c365f upstream. Commit 227be799c39a ("s390/mm: uninline pmdp_xxx functions from pgtable.h") inadvertently changed the behavior of pmdp_invalidate(), so that it now clears the pmd instead of just marking it as invalid. Fix this by restoring the original behavior. A possible impact of the misbehaving pmdp_invalidate() would be the MADV_DONTNEED races (see commits ced10803 and 58ceeb6b), although we should not have any negative impact on the related dirty/young flags, since those flags are not set by the hardware on s390. Fixes: 227be799c39a ("s390/mm: uninline pmdp_xxx functions from pgtable.h") Signed-off-by: Gerald Schaefer Signed-off-by: Martin Schwidefsky Signed-off-by: Greg Kroah-Hartman commit 62c7d518f491fd985e253cc2cfc5aaa625f4fe58 Author: Pu Hou Date: Tue Sep 5 05:17:24 2017 +0200 s390/perf: fix bug when creating per-thread event commit fc3100d64f0ae383ae8d845989103da06d62763b upstream. A per-thread event could not be created correctly like below: perf record --per-thread -e rB0000 -- sleep 1 Error: The sys_perf_event_open() syscall returned with 19 (No such device) for event (rB0000). /bin/dmesg may provide additional information. No CONFIG_PERF_EVENTS=y kernel support configured? This bug was introduced by: commit c311c797998c1e70eade463dd60b843da4f1a203 Author: Alexey Dobriyan Date: Mon May 8 15:56:15 2017 -0700 cpumask: make "nr_cpumask_bits" unsigned If a per-thread event is not attached to any CPU, the cpu field in struct perf_event is -1. The above commit converts the CPU number to unsigned int, which result in an illegal CPU number. Fixes: c311c797998c ("cpumask: make "nr_cpumask_bits" unsigned") Cc: Alexey Dobriyan Acked-by: Heiko Carstens Signed-off-by: Pu Hou Signed-off-by: Martin Schwidefsky Signed-off-by: Greg Kroah-Hartman commit 7d4b45badc7e4e192b83f92937fee38e220123f7 Author: Paul Burton Date: Tue Sep 19 22:07:18 2017 -0700 MIPS: Fix perf event init commit fd0b19ed5389187829b854900511c9195875bb42 upstream. Commit c311c797998c ("cpumask: make "nr_cpumask_bits" unsigned") modified mipspmu_event_init() to cast the struct perf_event cpu field to an unsigned integer before it is compared with nr_cpumask_bits (and *ahem* did so without copying the linux-mips mailing list or any MIPS developers...). This is broken because the cpu field may be -1 for events which follow a process rather than being affine to a particular CPU. When this is the case the cast to an unsigned int results in a value equal to ULONG_MAX, which is always greater than nr_cpumask_bits so we always fail mipspmu_event_init() and return -ENODEV. The check against nr_cpumask_bits seems nonsensical anyway, so this patch simply removes it. The cpu field is going to either be -1 or a valid CPU number. Comparing it with nr_cpumask_bits is effectively checking that it's a valid cpu number, but it seems safe to rely on the core perf events code to ensure that's the case. The end result is that this fixes use of perf on MIPS when not constraining events to a particular CPU, and fixes the "perf list hw" command which fails to list any events without this. Signed-off-by: Paul Burton Fixes: c311c797998c ("cpumask: make "nr_cpumask_bits" unsigned") Cc: Alexey Dobriyan Cc: Andrew Morton Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17323/ Signed-off-by: Ralf Baechle Signed-off-by: Greg Kroah-Hartman commit 5075520e4b0b2ec8d21ae7a992e6a073e46c8507 Author: Gustavo Romero Date: Wed Sep 13 22:13:48 2017 -0400 powerpc/tm: Flush TM only if CPU has TM feature commit c1fa0768a8713b135848f78fd43ffc208d8ded70 upstream. Commit cd63f3c ("powerpc/tm: Fix saving of TM SPRs in core dump") added code to access TM SPRs in flush_tmregs_to_thread(). However flush_tmregs_to_thread() does not check if TM feature is available on CPU before trying to access TM SPRs in order to copy live state to thread structures. flush_tmregs_to_thread() is indeed guarded by CONFIG_PPC_TRANSACTIONAL_MEM but it might be the case that kernel was compiled with CONFIG_PPC_TRANSACTIONAL_MEM enabled and ran on a CPU without TM feature available, thus rendering the execution of TM instructions that are treated by the CPU as illegal instructions. The fix is just to add proper checking in flush_tmregs_to_thread() if CPU has the TM feature before accessing any TM-specific resource, returning immediately if TM is no available on the CPU. Adding that checking in flush_tmregs_to_thread() instead of in places where it is called, like in vsr_get() and vsr_set(), is better because avoids the same problem cropping up elsewhere. Fixes: cd63f3c ("powerpc/tm: Fix saving of TM SPRs in core dump") Signed-off-by: Gustavo Romero Reviewed-by: Cyril Bur Signed-off-by: Michael Ellerman Signed-off-by: Greg Kroah-Hartman commit 3cbae7ad201315399c562818ef0b77bd74b091d2 Author: Tyrel Datwyler Date: Wed Sep 20 17:02:52 2017 -0400 powerpc/pseries: Fix parent_dn reference leak in add_dt_node() commit b537ca6fede69a281dc524983e5e633d79a10a08 upstream. A reference to the parent device node is held by add_dt_node() for the node to be added. If the call to dlpar_configure_connector() fails add_dt_node() returns ENOENT and that reference is not freed. Add a call to of_node_put(parent_dn) prior to bailing out after a failed dlpar_configure_connector() call. Fixes: 8d5ff320766f ("powerpc/pseries: Make dlpar_configure_connector parent node aware") Signed-off-by: Tyrel Datwyler Signed-off-by: Michael Ellerman Signed-off-by: Greg Kroah-Hartman commit 098088ab5be28b2f353e28c857ede1b530203a4f Author: Benjamin Herrenschmidt Date: Thu Sep 7 16:35:40 2017 +1000 powerpc/eeh: Create PHB PEs after EEH is initialized commit 3e77adeea3c5393c9b624832f65441e92867f618 upstream. Otherwise we end up not yet having computed the right diag data size on powernv where EEH initialization is delayed, thus causing memory corruption later on when calling OPAL. Fixes: 5cb1f8fdddb7 ("powerpc/powernv/pci: Dynamically allocate PHB diag data") Signed-off-by: Benjamin Herrenschmidt Acked-by: Russell Currey Signed-off-by: Michael Ellerman Signed-off-by: Greg Kroah-Hartman commit 8a7ab21fe58132104e8742d669cfd83092352f7b Author: Dan Williams Date: Mon Sep 18 14:48:58 2017 -0700 libnvdimm, namespace: fix btt claim class crash commit 33a56086712561b8b9cdc881e0317f4c36861f72 upstream. Maurice reports: BUG: unable to handle kernel NULL pointer dereference at 0000000000000028 IP: holder_class_store+0x253/0x2b0 [libnvdimm] ...while trying to reconfigure an NVDIMM-N namespace into 'sector' / 'btt' mode. The crash points to this line: (gdb) li *(holder_class_store+0x253) 0x7773 is in holder_class_store (drivers/nvdimm/namespace_devs.c:1420). 1415 for (i = 0; i < nd_region->ndr_mappings; i++) { 1416 struct nd_mapping *nd_mapping = &nd_region->mapping[i]; 1417 struct nvdimm_drvdata *ndd = to_ndd(nd_mapping); 1418 struct nd_namespace_index *nsindex; 1419 1420 nsindex = to_namespace_index(ndd, ndd->ns_current); ...where we are failing because ndd is NULL due to NVDIMM-N dimms not supporting labels. Long story short, default to the BTTv1 format in the label-less / NVDIMM-N case. Fixes: 14e494542636 ("libnvdimm, btt: BTT updates for UEFI 2.7 format") Cc: Vishal Verma Reported-by: Maurice A. Saldivar Tested-by: Maurice A. Saldivar Signed-off-by: Dan Williams Signed-off-by: Greg Kroah-Hartman commit 069276fabc8c6aef5c4f95442fef68138f5e8020 Author: Eric Biggers Date: Mon Sep 18 11:37:23 2017 -0700 KEYS: prevent KEYCTL_READ on negative key commit 37863c43b2c6464f252862bf2e9768264e961678 upstream. Because keyctl_read_key() looks up the key with no permissions requested, it may find a negatively instantiated key. If the key is also possessed, we went ahead and called ->read() on the key. But the key payload will actually contain the ->reject_error rather than the normal payload. Thus, the kernel oopses trying to read the user_key_payload from memory address (int)-ENOKEY = 0x00000000ffffff82. Fortunately the payload data is stored inline, so it shouldn't be possible to abuse this as an arbitrary memory read primitive... Reproducer: keyctl new_session keyctl request2 user desc '' @s keyctl read $(keyctl show | awk '/user: desc/ {print $1}') It causes a crash like the following: BUG: unable to handle kernel paging request at 00000000ffffff92 IP: user_read+0x33/0xa0 PGD 36a54067 P4D 36a54067 PUD 0 Oops: 0000 [#1] SMP CPU: 0 PID: 211 Comm: keyctl Not tainted 4.14.0-rc1 #337 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014 task: ffff90aa3b74c3c0 task.stack: ffff9878c0478000 RIP: 0010:user_read+0x33/0xa0 RSP: 0018:ffff9878c047bee8 EFLAGS: 00010246 RAX: 0000000000000001 RBX: ffff90aa3d7da340 RCX: 0000000000000017 RDX: 0000000000000000 RSI: 00000000ffffff82 RDI: ffff90aa3d7da340 RBP: ffff9878c047bf00 R08: 00000024f95da94f R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f58ece69740(0000) GS:ffff90aa3e200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000ffffff92 CR3: 0000000036adc001 CR4: 00000000003606f0 Call Trace: keyctl_read_key+0xac/0xe0 SyS_keyctl+0x99/0x120 entry_SYSCALL_64_fastpath+0x1f/0xbe RIP: 0033:0x7f58ec787bb9 RSP: 002b:00007ffc8d401678 EFLAGS: 00000206 ORIG_RAX: 00000000000000fa RAX: ffffffffffffffda RBX: 00007ffc8d402800 RCX: 00007f58ec787bb9 RDX: 0000000000000000 RSI: 00000000174a63ac RDI: 000000000000000b RBP: 0000000000000004 R08: 00007ffc8d402809 R09: 0000000000000020 R10: 0000000000000000 R11: 0000000000000206 R12: 00007ffc8d402800 R13: 00007ffc8d4016e0 R14: 0000000000000000 R15: 0000000000000000 Code: e5 41 55 49 89 f5 41 54 49 89 d4 53 48 89 fb e8 a4 b4 ad ff 85 c0 74 09 80 3d b9 4c 96 00 00 74 43 48 8b b3 20 01 00 00 4d 85 ed <0f> b7 5e 10 74 29 4d 85 e4 74 24 4c 39 e3 4c 89 e2 4c 89 ef 48 RIP: user_read+0x33/0xa0 RSP: ffff9878c047bee8 CR2: 00000000ffffff92 Fixes: 61ea0c0ba904 ("KEYS: Skip key state checks when checking for possession") Signed-off-by: Eric Biggers Signed-off-by: David Howells Signed-off-by: Greg Kroah-Hartman commit 943f8697a9b3785a4b20a4b01816f11cccadfd0f Author: Eric Biggers Date: Mon Sep 18 11:37:03 2017 -0700 KEYS: prevent creating a different user's keyrings commit 237bbd29f7a049d310d907f4b2716a7feef9abf3 upstream. It was possible for an unprivileged user to create the user and user session keyrings for another user. For example: sudo -u '#3000' sh -c 'keyctl add keyring _uid.4000 "" @u keyctl add keyring _uid_ses.4000 "" @u sleep 15' & sleep 1 sudo -u '#4000' keyctl describe @u sudo -u '#4000' keyctl describe @us This is problematic because these "fake" keyrings won't have the right permissions. In particular, the user who created them first will own them and will have full access to them via the possessor permissions, which can be used to compromise the security of a user's keys: -4: alswrv-----v------------ 3000 0 keyring: _uid.4000 -5: alswrv-----v------------ 3000 0 keyring: _uid_ses.4000 Fix it by marking user and user session keyrings with a flag KEY_FLAG_UID_KEYRING. Then, when searching for a user or user session keyring by name, skip all keyrings that don't have the flag set. Fixes: 69664cf16af4 ("keys: don't generate user and user session keyrings unless they're accessed") Signed-off-by: Eric Biggers Signed-off-by: David Howells Signed-off-by: Greg Kroah-Hartman commit b81931fd5be18454fd510c8e22be9ba4c6276bcf Author: Eric Biggers Date: Mon Sep 18 11:36:45 2017 -0700 KEYS: fix writing past end of user-supplied buffer in keyring_read() commit e645016abc803dafc75e4b8f6e4118f088900ffb upstream. Userspace can call keyctl_read() on a keyring to get the list of IDs of keys in the keyring. But if the user-supplied buffer is too small, the kernel would write the full list anyway --- which will corrupt whatever userspace memory happened to be past the end of the buffer. Fix it by only filling the space that is available. Fixes: b2a4df200d57 ("KEYS: Expand the capacity of a keyring") Signed-off-by: Eric Biggers Signed-off-by: David Howells Signed-off-by: Greg Kroah-Hartman commit 2142feb3c25a7158871e141dd1f7bd5ae133c36b Author: Jason A. Donenfeld Date: Wed Sep 20 16:58:39 2017 +0200 security/keys: rewrite all of big_key crypto commit 428490e38b2e352812e0b765d8bceafab0ec441d upstream. This started out as just replacing the use of crypto/rng with get_random_bytes_wait, so that we wouldn't use bad randomness at boot time. But, upon looking further, it appears that there were even deeper underlying cryptographic problems, and that this seems to have been committed with very little crypto review. So, I rewrote the whole thing, trying to keep to the conventions introduced by the previous author, to fix these cryptographic flaws. It makes no sense to seed crypto/rng at boot time and then keep using it like this, when in fact there's already get_random_bytes_wait, which can ensure there's enough entropy and be a much more standard way of generating keys. Since this sensitive material is being stored untrusted, using ECB and no authentication is simply not okay at all. I find it surprising and a bit horrifying that this code even made it past basic crypto review, which perhaps points to some larger issues. This patch moves from using AES-ECB to using AES-GCM. Since keys are uniquely generated each time, we can set the nonce to zero. There was also a race condition in which the same key would be reused at the same time in different threads. A mutex fixes this issue now. So, to summarize, this commit fixes the following vulnerabilities: * Low entropy key generation, allowing an attacker to potentially guess or predict keys. * Unauthenticated encryption, allowing an attacker to modify the cipher text in particular ways in order to manipulate the plaintext, which is is even more frightening considering the next point. * Use of ECB mode, allowing an attacker to trivially swap blocks or compare identical plaintext blocks. * Key re-use. * Faulty memory zeroing. Signed-off-by: Jason A. Donenfeld Reviewed-by: Eric Biggers Signed-off-by: David Howells Cc: Herbert Xu Cc: Kirill Marinushkin Cc: security@kernel.org Signed-off-by: Greg Kroah-Hartman commit cac291644b897b6a7f62eb373f20cd46a63ea8f2 Author: Jason A. Donenfeld Date: Wed Sep 20 16:58:38 2017 +0200 security/keys: properly zero out sensitive key material in big_key commit 910801809b2e40a4baedd080ef5d80b4a180e70e upstream. Error paths forgot to zero out sensitive material, so this patch changes some kfrees into a kzfrees. Signed-off-by: Jason A. Donenfeld Signed-off-by: David Howells Reviewed-by: Eric Biggers Cc: Herbert Xu Cc: Kirill Marinushkin Cc: security@kernel.org Signed-off-by: Greg Kroah-Hartman commit f18f482fd512c863ba1952a42872671329b7efbf Author: LEROY Christophe Date: Wed Sep 13 12:44:57 2017 +0200 crypto: talitos - fix hashing commit 886a27c0fc8a34633aadb0986dba11d8c150ae2e upstream. md5sum on some files gives wrong result Exemple: With the md5sum from libkcapi: c15115c05bad51113f81bdaee735dd09 test With the original md5sum: bbdf41d80ba7e8b2b7be3a0772be76cb test This patch fixes this issue Signed-off-by: Christophe Leroy Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 3f934b4fa7ff69ee4e1344401b887fa0eed52173 Author: LEROY Christophe Date: Wed Sep 13 12:44:51 2017 +0200 crypto: talitos - fix sha224 commit afd62fa26343be6445479e75de9f07092a061459 upstream. Kernel crypto tests report the following error at startup [ 2.752626] alg: hash: Test 4 failed for sha224-talitos [ 2.757907] 00000000: 30 e2 86 e2 e7 8a dd 0d d7 eb 9f d5 83 fe f1 b0 00000010: 2d 5a 6c a5 f9 55 ea fd 0e 72 05 22 This patch fixes it Signed-off-by: Christophe Leroy Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit b91a1fa5f120762e0d34686b067124c698eac478 Author: LEROY Christophe Date: Tue Sep 12 11:03:39 2017 +0200 crypto: talitos - Don't provide setkey for non hmac hashing algs. commit 56136631573baa537a15e0012055ffe8cfec1a33 upstream. Today, md5sum fails with error -ENOKEY because a setkey function is set for non hmac hashing algs, see strace output below: mmap(NULL, 378880, PROT_READ, MAP_SHARED, 6, 0) = 0x77f50000 accept(3, 0, NULL) = 7 vmsplice(5, [{"bin/\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 378880}], 1, SPLICE_F_MORE|SPLICE_F_GIFT) = 262144 splice(4, NULL, 7, NULL, 262144, SPLICE_F_MORE) = -1 ENOKEY (Required key not available) write(2, "Generation of hash for file kcap"..., 50) = 50 munmap(0x77f50000, 378880) = 0 This patch ensures that setkey() function is set only for hmac hashing. Signed-off-by: Christophe Leroy Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit ccf363f3e786d8fa40dcf625c2b08d22cb8d669c Author: Stephan Mueller Date: Thu Sep 14 17:10:28 2017 +0200 crypto: drbg - fix freeing of resources commit bd6227a150fdb56e7bb734976ef6e53a2c1cb334 upstream. During the change to use aligned buffers, the deallocation code path was not updated correctly. The current code tries to free the aligned buffer pointer and not the original buffer pointer as it is supposed to. Thus, the code is updated to free the original buffer pointer and set the aligned buffer pointer that is used throughout the code to NULL. Fixes: 3cfc3b9721123 ("crypto: drbg - use aligned buffers") CC: Herbert Xu Signed-off-by: Stephan Mueller Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 25c83d1654a81a034a9db6e176620b7ea00a238b Author: Alex Deucher Date: Fri Sep 15 11:55:27 2017 -0400 drm/radeon: disable hard reset in hibernate for APUs commit 820608548737e315c6f93e3099b4e65bde062334 upstream. Fixes a hibernation regression on APUs. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=191571 Fixes: 274ad65c9d02bdc (drm/radeon: hard reset r600 and newer GPU when hibernating.) Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit 8b7e23bffecac5002408ef188fa766be13d42f2e Author: Jean Delvare Date: Mon Sep 11 17:43:56 2017 +0200 drm/amdgpu: revert tile table update for oland commit 4cf97582b46f123a4b7cd88d999f1806c2eb4093 upstream. Several users have complained that the tile table update broke Oland support. Despite several attempts to fix it, the root cause is still unknown at this point and no solution is available. As it is not acceptable to leave a known regression breaking a major functionality in the kernel for several releases, let's just reverse this optimization for now. It can be implemented again later if and only if the breakage is understood and fixed. As there were no complaints for Hainan so far, only the Oland part of the offending commit is reverted. Optimization is preserved on Hainan, so this commit isn't an actual revert of the original. This fixes bug #194761: https://bugzilla.kernel.org/show_bug.cgi?id=194761 Reviewed-by: Marek Olšák Signed-off-by: Jean Delvare Fixes: f8d9422ef80c ("drm/amdgpu: update tile table for oland/hainan") Cc: Flora Cui Cc: Junwei Zhang Cc: Alex Deucher Cc: Marek Olšák Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit c1ccc53fccbfc353727ebf76bb7daf6170678642 Author: Uma Shankar Date: Tue Sep 5 15:14:31 2017 +0530 Revert "drm/i915/bxt: Disable device ready before shutdown command" commit abeae421b03d800d33894df7fbca6d00c70c358e upstream. This reverts commit bbdf0b2ff32a ("drm/i915/bxt: Disable device ready before shutdown command"). Disable device ready before shutdown command was added previously to avoid a split screen issue seen on dual link DSI panels. As of now, dual link is not supported and will need some rework in the upstream code. For single link DSI panels, the change is not required. This will cause failure in sending SHUTDOWN packet during disable. Hence reverting the change. Will handle the change as part of dual link enabling in upstream. Fixes: bbdf0b2ff32a ("drm/i915/bxt: Disable device ready before shutdown command") Signed-off-by: Uma Shankar Signed-off-by: Vidya Srinivas Signed-off-by: Jani Nikula Link: https://patchwork.freedesktop.org/patch/msgid/1504604671-17237-1-git-send-email-vidya.srinivas@intel.com (cherry picked from commit 33c8d8870c67faf3161898a56af98ac3c1c71450) Signed-off-by: Rodrigo Vivi Signed-off-by: Greg Kroah-Hartman commit 7b2ac245fbfe8084361f444bb2ec8a67c7bf8835 Author: Changbin Du Date: Fri Aug 18 17:49:58 2017 +0800 drm/i915/gvt: Fix incorrect PCI BARs reporting commit 7b4dc3c0da0d66e7b20a826c537d41bb73e4df54 upstream. Looking at our virtual PCI device, we can see surprising Region 4 and Region 5. 00:10.0 VGA compatible controller: Intel Corporation Sky Lake Integrated Graphics (rev 06) (prog-if 00 [VGA controller]) .... Region 0: Memory at 140000000 (64-bit, non-prefetchable) [size=16M] Region 2: Memory at 180000000 (64-bit, prefetchable) [size=1G] Region 4: Memory at (32-bit, non-prefetchable) Region 5: Memory at (32-bit, non-prefetchable) Expansion ROM at febd6000 [disabled] [size=2K] The fact is that we only implemented BAR0 and BAR2. Surprising Region 4 and Region 5 are shown because we report their size as 0xffffffff. They should report size 0 instead. BTW, the physical GPU has a PIO BAR. GVTg hasn't implemented PIO access, so we ignored this BAR for vGPU device. v2: fix BAR size value calculation. Link: https://bugzilla.redhat.com/show_bug.cgi?id=1458032 Signed-off-by: Changbin Du Signed-off-by: Zhenyu Wang (cherry picked from commit f1751362d6357a90bc6e53176cec715ff2dbed74) Signed-off-by: Rodrigo Vivi Signed-off-by: Greg Kroah-Hartman commit 3b930a9669b4a717d513c3524a930051e722890e Author: Marek Szyprowski Date: Thu Sep 14 14:01:00 2017 +0200 drm/exynos: Fix locking in the suspend/resume paths commit 5baf6bb0fd2388742a0846cc7bcacee6dec78235 upstream. Commit 48a92916729b ("drm/exynos: use drm_for_each_connector_iter()") replaced unsafe drm_for_each_connector() with drm_for_each_connector_iter() and removed surrounding drm_modeset_lock calls. However, that lock was there not only to protect unsafe drm_for_each_connector(), but it was also required to be held by the dpms code which was called from the loop body. This patch restores those drm_modeset_lock calls to fix broken suspend and resume of Exynos DRM subsystem in v4.13 kernel. Fixes: 48a92916729b ("drm/exynos: use drm_for_each_connector_iter()") Signed-off-by: Marek Szyprowski Acked-by: Krzysztof Kozlowski Signed-off-by: Inki Dae Signed-off-by: Greg Kroah-Hartman commit 93db68552b47a6d398e442c7ea5eaa51249389a2 Author: Guilherme G. Piccoli Date: Tue Sep 19 12:11:55 2017 -0300 scsi: aacraid: Add a small delay after IOP reset commit d1b490939d8c117a06dfc562c41d933f71d30289 upstream. Commit 0e9973ed3382 ("scsi: aacraid: Add periodic checks to see IOP reset status") changed the way driver checks if a reset succeeded. Now, after an IOP reset, aacraid immediately start polling a register to verify the reset is complete. This behavior cause regressions on the reset path in PowerPC (at least). Since the delay after the IOP reset was removed by the aforementioned patch, the fact driver just starts to read a register instantly after the reset was issued (by writing in another register) "corrupts" the reset procedure, which ends up failing all the time. The issue highly impacted kdump on PowerPC, since on kdump path we proactively issue a reset in adapter (through the reset_devices kernel parameter). This patch (re-)adds a delay right after IOP reset is issued. Empirically we measured that 3 seconds is enough, but for safety reasons we delay for 5s (and since it was 30s before, 5s is still a small amount). For reference, without this patch we observe the following messages on kdump kernel boot process: [ 76.294] aacraid 0003:01:00.0: IOP reset failed [ 76.294] aacraid 0003:01:00.0: ARC Reset attempt failed [ 86.524] aacraid 0003:01:00.0: adapter kernel panic'd ff. [ 86.524] aacraid 0003:01:00.0: Controller reset type is 3 [ 86.524] aacraid 0003:01:00.0: Issuing IOP reset [146.534] aacraid 0003:01:00.0: IOP reset failed [146.534] aacraid 0003:01:00.0: ARC Reset attempt failed Fixes: 0e9973ed3382 ("scsi: aacraid: Add periodic checks to see IOP reset status") Signed-off-by: Guilherme G. Piccoli Acked-by: Dave Carroll Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 52a2ae0dafdfa0f340cb8f5dc424e0851f2616bc Author: Dave Carroll Date: Fri Sep 15 11:04:28 2017 -0600 scsi: aacraid: Fix 2T+ drives on SmartIOC-2000 commit 6c92f7dbf25c36f35320e4ae0b508676410bac04 upstream. The logic for supporting large drives was previously tied to 4Kn support for SmartIOC-2000. As SmartIOC-2000 does not support volumes using 4Kn drives, use the intended option flag AAC_OPT_NEW_COMM_64 to determine support for volumes greater than 2T. Signed-off-by: Dave Carroll Reviewed-by: Christoph Hellwig Reviewed-by: Raghava Aditya Renukunta Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 1fc547cd09e2d7295656fde40a34706448bf6098 Author: Xin Long Date: Sun Aug 27 20:25:26 2017 +0800 scsi: scsi_transport_iscsi: fix the issue that iscsi_if_rx doesn't parse nlmsg properly commit c88f0e6b06f4092995688211a631bb436125d77b upstream. ChunYu found a kernel crash by syzkaller: [ 651.617875] kasan: CONFIG_KASAN_INLINE enabled [ 651.618217] kasan: GPF could be caused by NULL-ptr deref or user memory access [ 651.618731] general protection fault: 0000 [#1] SMP KASAN [ 651.621543] CPU: 1 PID: 9539 Comm: scsi Not tainted 4.11.0.cov #32 [ 651.621938] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 651.622309] task: ffff880117780000 task.stack: ffff8800a3188000 [ 651.622762] RIP: 0010:skb_release_data+0x26c/0x590 [...] [ 651.627260] Call Trace: [ 651.629156] skb_release_all+0x4f/0x60 [ 651.629450] consume_skb+0x1a5/0x600 [ 651.630705] netlink_unicast+0x505/0x720 [ 651.632345] netlink_sendmsg+0xab2/0xe70 [ 651.633704] sock_sendmsg+0xcf/0x110 [ 651.633942] ___sys_sendmsg+0x833/0x980 [ 651.637117] __sys_sendmsg+0xf3/0x240 [ 651.638820] SyS_sendmsg+0x32/0x50 [ 651.639048] entry_SYSCALL_64_fastpath+0x1f/0xc2 It's caused by skb_shared_info at the end of sk_buff was overwritten by ISCSI_KEVENT_IF_ERROR when parsing nlmsg info from skb in iscsi_if_rx. During the loop if skb->len == nlh->nlmsg_len and both are sizeof(*nlh), ev = nlmsg_data(nlh) will acutally get skb_shinfo(SKB) instead and set a new value to skb_shinfo(SKB)->nr_frags by ev->type. This patch is to fix it by checking nlh->nlmsg_len properly there to avoid over accessing sk_buff. Reported-by: ChunYu Wang Signed-off-by: Xin Long Acked-by: Chris Leech Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 954b490e78e5e67ddaf5e37b481c1b899497fb60 Author: Dennis Yang Date: Wed Sep 6 11:02:35 2017 +0800 md/raid5: preserve STRIPE_ON_UNPLUG_LIST in break_stripe_batch_list commit 184a09eb9a2fe425e49c9538f1604b05ed33cfef upstream. In release_stripe_plug(), if a stripe_head has its STRIPE_ON_UNPLUG_LIST set, it indicates that this stripe_head is already in the raid5_plug_cb list and release_stripe() would be called instead to drop a reference count. Otherwise, the STRIPE_ON_UNPLUG_LIST bit would be set for this stripe_head and it will get queued into the raid5_plug_cb list. Since break_stripe_batch_list() did not preserve STRIPE_ON_UNPLUG_LIST, A stripe could be re-added to plug list while it is still on that list in the following situation. If stripe_head A is added to another stripe_head B's batch list, in this case A will have its batch_head != NULL and be added into the plug list. After that, stripe_head B gets handled and called break_stripe_batch_list() to reset all the batched stripe_head(including A which is still on the plug list)'s state and reset their batch_head to NULL. Before the plug list gets processed, if there is another write request comes in and get stripe_head A, A will have its batch_head == NULL (cleared by calling break_stripe_batch_list() on B) and be added to plug list once again. Signed-off-by: Dennis Yang Signed-off-by: Shaohua Li Signed-off-by: Greg Kroah-Hartman commit da0a7f82078e1c542b6e5991e48cbf39a2c07b22 Author: Shaohua Li Date: Fri Aug 25 10:40:02 2017 -0700 md/raid5: fix a race condition in stripe batch commit 3664847d95e60a9a943858b7800f8484669740fc upstream. We have a race condition in below scenario, say have 3 continuous stripes, sh1, sh2 and sh3, sh1 is the stripe_head of sh2 and sh3: CPU1 CPU2 CPU3 handle_stripe(sh3) stripe_add_to_batch_list(sh3) -> lock(sh2, sh3) -> lock batch_lock(sh1) -> add sh3 to batch_list of sh1 -> unlock batch_lock(sh1) clear_batch_ready(sh1) -> lock(sh1) and batch_lock(sh1) -> clear STRIPE_BATCH_READY for all stripes in batch_list -> unlock(sh1) and batch_lock(sh1) ->clear_batch_ready(sh3) -->test_and_clear_bit(STRIPE_BATCH_READY, sh3) --->return 0 as sh->batch == NULL -> sh3->batch_head = sh1 -> unlock (sh2, sh3) In CPU1, handle_stripe will continue handle sh3 even it's in batch stripe list of sh1. By moving sh3->batch_head assignment in to batch_lock, we make it impossible to clear STRIPE_BATCH_READY before batch_head is set. Thanks Stephane for helping debug this tricky issue. Reported-and-tested-by: Stephane Thiell Signed-off-by: Shaohua Li Signed-off-by: Greg Kroah-Hartman commit 38f8ae6d625eeebb73c47e4a2876066af3e62cdb Author: Steven Rostedt (VMware) Date: Thu Sep 21 13:00:21 2017 -0400 tracing: Remove RCU work arounds from stack tracer commit 15516c89acce948debc4c598e03c3fee53045797 upstream. Currently the stack tracer calls rcu_irq_enter() to make sure RCU is watching when it records a stack trace. But if the stack tracer is triggered while tracing inside of a rcu_irq_enter(), calling rcu_irq_enter() unconditionally can be problematic. The reason for having rcu_irq_enter() in the first place has been fixed from within the saving of the stack trace code, and there's no reason for doing it in the stack tracer itself. Just remove it. Fixes: 0be964be0 ("module: Sanitize RCU usage and locking") Acked-by: Paul E. McKenney Suggested-by: "Paul E. McKenney" Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit 422f1a31123dd1b75a1964acabb63c23566ced16 Author: Bo Yan Date: Mon Sep 18 10:03:35 2017 -0700 tracing: Erase irqsoff trace with empty write commit 8dd33bcb7050dd6f8c1432732f930932c9d3a33e upstream. One convenient way to erase trace is "echo > trace". However, this is currently broken if the current tracer is irqsoff tracer. This is because irqsoff tracer use max_buffer as the default trace buffer. Set the max_buffer as the one to be cleared when it's the trace buffer currently in use. Link: http://lkml.kernel.org/r/1505754215-29411-1-git-send-email-byan@nvidia.com Cc: Fixes: 4acd4d00f ("tracing: give easy way to clear trace buffer") Signed-off-by: Bo Yan Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit 990a94beadbbb75fffcf8bd894561ab2e05231d7 Author: Tahsin Erdogan Date: Sun Sep 17 03:23:48 2017 -0700 tracing: Fix trace_pipe behavior for instance traces commit 75df6e688ccd517e339a7c422ef7ad73045b18a2 upstream. When reading data from trace_pipe, tracing_wait_pipe() performs a check to see if tracing has been turned off after some data was read. Currently, this check always looks at global trace state, but it should be checking the trace instance where trace_pipe is located at. Because of this bug, cat instances/i1/trace_pipe in the following script will immediately exit instead of waiting for data: cd /sys/kernel/debug/tracing echo 0 > tracing_on mkdir -p instances/i1 echo 1 > instances/i1/tracing_on echo 1 > instances/i1/events/sched/sched_process_exec/enable cat instances/i1/trace_pipe Link: http://lkml.kernel.org/r/20170917102348.1615-1-tahsin@google.com Fixes: 10246fa35d4f ("tracing: give easy way to clear trace buffer") Signed-off-by: Tahsin Erdogan Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit a9bb94fb1107455a0300d0061feda6d81ae5e33a Author: Benjamin Herrenschmidt Date: Wed Sep 6 15:20:55 2017 +1000 KVM: PPC: Book3S HV: Don't access XIVE PIPR register using byte accesses commit d222af072380c4470295c07d84ecb15f4937e365 upstream. The XIVE interrupt controller on POWER9 machines doesn't support byte accesses to any register in the thread management area other than the CPPR (current processor priority register). In particular, when reading the PIPR (pending interrupt priority register), we need to do a 32-bit or 64-bit load. Fixes: 2c4fb78f78b6 ("KVM: PPC: Book3S HV: Workaround POWER9 DD1.0 bug causing IPB bit loss") Signed-off-by: Benjamin Herrenschmidt Signed-off-by: Paul Mackerras Signed-off-by: Greg Kroah-Hartman commit 14233b6e4dade947d10d04b0440982443aa9d4d5 Author: Paul Mackerras Date: Tue Sep 12 13:47:23 2017 +1000 KVM: PPC: Book3S HV: Fix bug causing host SLB to be restored incorrectly commit 67f8a8c1151c9ef3d1285905d1e66ebb769ecdf7 upstream. Aneesh Kumar reported seeing host crashes when running recent kernels on POWER8. The symptom was an oops like this: Unable to handle kernel paging request for data at address 0xf00000000786c620 Faulting instruction address: 0xc00000000030e1e4 Oops: Kernel access of bad area, sig: 11 [#1] LE SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: powernv_op_panel CPU: 24 PID: 6663 Comm: qemu-system-ppc Tainted: G W 4.13.0-rc7-43932-gfc36c59 #2 task: c000000fdeadfe80 task.stack: c000000fdeb68000 NIP: c00000000030e1e4 LR: c00000000030de6c CTR: c000000000103620 REGS: c000000fdeb6b450 TRAP: 0300 Tainted: G W (4.13.0-rc7-43932-gfc36c59) MSR: 9000000000009033 CR: 24044428 XER: 20000000 CFAR: c00000000030e134 DAR: f00000000786c620 DSISR: 40000000 SOFTE: 0 GPR00: 0000000000000000 c000000fdeb6b6d0 c0000000010bd000 000000000000e1b0 GPR04: c00000000115e168 c000001fffa6e4b0 c00000000115d000 c000001e1b180386 GPR08: f000000000000000 c000000f9a8913e0 f00000000786c600 00007fff587d0000 GPR12: c000000fdeb68000 c00000000fb0f000 0000000000000001 00007fff587cffff GPR16: 0000000000000000 c000000000000000 00000000003fffff c000000fdebfe1f8 GPR20: 0000000000000004 c000000fdeb6b8a8 0000000000000001 0008000000000040 GPR24: 07000000000000c0 00007fff587cffff c000000fdec20bf8 00007fff587d0000 GPR28: c000000fdeca9ac0 00007fff587d0000 00007fff587c0000 00007fff587d0000 NIP [c00000000030e1e4] __get_user_pages_fast+0x434/0x1070 LR [c00000000030de6c] __get_user_pages_fast+0xbc/0x1070 Call Trace: [c000000fdeb6b6d0] [c00000000139dab8] lock_classes+0x0/0x35fe50 (unreliable) [c000000fdeb6b7e0] [c00000000030ef38] get_user_pages_fast+0xf8/0x120 [c000000fdeb6b830] [c000000000112318] kvmppc_book3s_hv_page_fault+0x308/0xf30 [c000000fdeb6b960] [c00000000010e10c] kvmppc_vcpu_run_hv+0xfdc/0x1f00 [c000000fdeb6bb20] [c0000000000e915c] kvmppc_vcpu_run+0x2c/0x40 [c000000fdeb6bb40] [c0000000000e5650] kvm_arch_vcpu_ioctl_run+0x110/0x300 [c000000fdeb6bbe0] [c0000000000d6468] kvm_vcpu_ioctl+0x528/0x900 [c000000fdeb6bd40] [c0000000003bc04c] do_vfs_ioctl+0xcc/0x950 [c000000fdeb6bde0] [c0000000003bc930] SyS_ioctl+0x60/0x100 [c000000fdeb6be30] [c00000000000b96c] system_call+0x58/0x6c Instruction dump: 7ca81a14 2fa50000 41de0010 7cc8182a 68c60002 78c6ffe2 0b060000 3cc2000a 794a3664 390610d8 e9080000 7d485214 7d435378 790507e1 408202f0 ---[ end trace fad4a342d0414aa2 ]--- It turns out that what has happened is that the SLB entry for the vmmemap region hasn't been reloaded on exit from a guest, and it has the wrong page size. Then, when the host next accesses the vmemmap region, it gets a page fault. Commit a25bd72badfa ("powerpc/mm/radix: Workaround prefetch issue with KVM", 2017-07-24) modified the guest exit code so that it now only clears out the SLB for hash guest. The code tests the radix flag and puts the result in a non-volatile CR field, CR2, and later branches based on CR2. Unfortunately, the kvmppc_save_tm function, which gets called between those two points, modifies all the user-visible registers in the case where the guest was in transactional or suspended state, except for a few which it restores (namely r1, r2, r9 and r13). Thus the hash/radix indication in CR2 gets corrupted. This fixes the problem by re-doing the comparison just before the result is needed. For good measure, this also adds comments next to the call sites of kvmppc_save_tm and kvmppc_restore_tm pointing out that non-volatile register state will be lost. Fixes: a25bd72badfa ("powerpc/mm/radix: Workaround prefetch issue with KVM") Tested-by: Aneesh Kumar K.V Signed-off-by: Paul Mackerras Signed-off-by: Greg Kroah-Hartman commit 6c2413877e0e058e2d426c3746152bcbf3128dd6 Author: Paul Mackerras Date: Mon Sep 11 16:05:30 2017 +1000 KVM: PPC: Book3S HV: Hold kvm->lock around call to kvmppc_update_lpcr commit cf5f6f3125241853462334b1bc696f3c3c492178 upstream. Commit 468808bd35c4 ("KVM: PPC: Book3S HV: Set process table for HPT guests on POWER9", 2017-01-30) added a call to kvmppc_update_lpcr() which doesn't hold the kvm->lock mutex around the call, as required. This adds the lock/unlock pair, and for good measure, includes the kvmppc_setup_partition_table() call in the locked region, since it is altering global state of the VM. This error appears not to have any fatal consequences for the host; the consequences would be that the VCPUs could end up running with different LPCR values, or an update to the LPCR value by userspace using the one_reg interface could get overwritten, or the update done by kvmhv_configure_mmu() could get overwritten. Fixes: 468808bd35c4 ("KVM: PPC: Book3S HV: Set process table for HPT guests on POWER9") Signed-off-by: Paul Mackerras Signed-off-by: Greg Kroah-Hartman commit bbbbdfcb53297a7bce3d328f393710f413b39d47 Author: Thomas Gleixner Date: Wed Sep 13 23:29:03 2017 +0200 genirq: Fix cpumask check in __irq_startup_managed() commit 9cb067ef8a10bb13112e4d1c0ea996ec96527422 upstream. The result of cpumask_any_and() is invalid when result greater or equal nr_cpu_ids. The current check is checking for greater only. Fix it. Fixes: 761ea388e8c4 ("genirq: Handle managed irqs gracefully in irq_startup()") Signed-off-by: Thomas Gleixner Cc: Boris Ostrovsky Cc: Juergen Gross Cc: Tony Luck Cc: Chen Yu Cc: Marc Zyngier Cc: Alok Kataria Cc: Joerg Roedel Cc: "Rafael J. Wysocki" Cc: Steven Rostedt Cc: Christoph Hellwig Cc: Peter Zijlstra Cc: Borislav Petkov Cc: Paolo Bonzini Cc: Rui Zhang Cc: "K. Y. Srinivasan" Cc: Arjan van de Ven Cc: Dan Williams Cc: Len Brown Link: http://lkml.kernel.org/r/20170913213152.272283444@linutronix.de Signed-off-by: Greg Kroah-Hartman commit c43ceff9ed70be01e82e763415234ee87b5fa5d4 Author: John Keeping Date: Wed Sep 6 10:35:40 2017 +0100 genirq/msi: Fix populating multiple interrupts commit 596a7a1d0989c621c3ae49be73a1d1f9de22eb5a upstream. On allocating the interrupts routed via a wire-to-MSI bridge, the allocator iterates over the MSI descriptors to build the hierarchy, but fails to use the descriptor interrupt number, and instead uses the base number, generating the wrong IRQ domain mappings. The fix is to use the MSI descriptor interrupt number when setting up the interrupt instead of the base interrupt for the allocation range. The only saving grace is that although the MSI descriptors are allocated in bulk, the wired interrupts are only allocated one by one (so desc->irq == virq) and the bug went unnoticed so far. Fixes: 2145ac9310b60 ("genirq/msi: Add msi_domain_populate_irqs") Signed-off-by: John Keeping Signed-off-by: Thomas Gleixner Reviewed-by: Marc Zyngier Link: http://lkml.kernel.org/r/20170906103540.373864a2.john@metanate.com Signed-off-by: Greg Kroah-Hartman commit 693057fcf9c6ec84dd4cce032485c502e1485c6f Author: Thomas Gleixner Date: Tue Sep 5 10:12:20 2017 +0200 genirq: Make sparse_irq_lock protect what it should protect commit 12ac1d0f6c3e95732d144ffa65c8b20fbd9aa462 upstream. for_each_active_irq() iterates the sparse irq allocation bitmap. The caller must hold sparse_irq_lock. Several code pathes expect that an active bit in the sparse bitmap also has a valid interrupt descriptor. Unfortunately that's not true. The (de)allocation is a two step process, which holds the sparse_irq_lock only across the queue/remove from the radix tree and the set/clear in the allocation bitmap. If a iteration locks sparse_irq_lock between the two steps, then it might see an active bit but the corresponding irq descriptor is NULL. If that is dereferenced unconditionally, then the kernel oopses. Of course, all iterator sites could be audited and fixed, but.... There is no reason why the sparse_irq_lock needs to be dropped between the two steps, in fact the code becomes simpler when the mutex is held across both and the semantics become more straight forward, so future problems of missing NULL pointer checks in the iteration are avoided and all existing sites are fixed in one go. Expand the lock held sections so both operations are covered and the bitmap and the radixtree are in sync. Fixes: a05a900a51c7 ("genirq: Make sparse_lock a mutex") Reported-and-tested-by: Huang Ying Signed-off-by: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman commit 05975ba8b0cd9efed2104d01d6b68f29cf2f8177 Author: Johannes Berg Date: Wed Sep 6 15:01:42 2017 +0200 mac80211: fix deadlock in driver-managed RX BA session start commit bde59c475e0883e4c4294bcd9b9c7e08ae18c828 upstream. When an RX BA session is started by the driver, and it has to tell mac80211 about it, the corresponding bit in tid_rx_manage_offl gets set and the BA session work is scheduled. Upon testing this bit, it will call __ieee80211_start_rx_ba_session(), thus deadlocking as it already holds the ampdu_mlme.mtx, which that acquires again. Fix this by adding ___ieee80211_start_rx_ba_session(), a version of the function that requires the mutex already held. Fixes: 699cb58c8a52 ("mac80211: manage RX BA session offload without SKB queue") Reported-by: Matteo Croce Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit 0451fbad733205821563d4ad1a736e127a5e63fb Author: Avraham Stern Date: Fri Aug 18 15:33:57 2017 +0300 mac80211: flush hw_roc_start work before cancelling the ROC commit 6e46d8ce894374fc135c96a8d1057c6af1fef237 upstream. When HW ROC is supported it is possible that after the HW notified that the ROC has started, the ROC was cancelled and another ROC was added while the hw_roc_start worker is waiting on the mutex (since cancelling the ROC and adding another one also holds the same mutex). As a result, the hw_roc_start worker will continue to run after the new ROC is added but before it is actually started by the HW. This may result in notifying userspace that the ROC has started before it actually does, or in case of management tx ROC, in an attempt to tx while not on the right channel. In addition, when the driver will notify mac80211 that the second ROC has started, mac80211 will warn that this ROC has already been notified. Fix this by flushing the hw_roc_start work before cancelling an ROC. Cc: stable@vger.kernel.org Signed-off-by: Avraham Stern Signed-off-by: Luca Coelho Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit 1181978d7a56ccad9c9cad8f92daace6ec8183fa Author: Beni Lev Date: Tue Jul 25 11:25:25 2017 +0300 mac80211_hwsim: Use proper TX power commit 9de981f507474f326e42117858dc9a9321331ae5 upstream. In struct ieee80211_tx_info, control.vif pointer and rate_driver_data[0] falls on the same place, depending on the union usage. During the whole TX process, the union is referred to as a control struct, which holds the vif that is later used in the tx flow, especially in order to derive the used tx power. Referring direcly to rate_driver_data[0] and assigning a value to it, overwrites the vif pointer, hence making all later references irrelevant. Moreover, rate_driver_data[0] isn't used later in the flow in order to retrieve the channel that it is pointing to. Signed-off-by: Beni Lev Signed-off-by: Luca Coelho Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit 9470810b784ccfc1c16fa27bd98c60d31798437d Author: Johannes Berg Date: Thu Jun 22 12:20:30 2017 +0200 mac80211: fix VLAN handling with TXQs commit 53168215909281a09d3afc6fb51a9d4f81f74d39 upstream. With TXQs, the AP_VLAN interfaces are resolved to their owner AP interface when enqueuing the frame, which makes sense since the frame really goes out on that as far as the driver is concerned. However, this introduces a problem: frames to be encrypted with a VLAN-specific GTK will now be encrypted with the AP GTK, since the information about which virtual interface to use to select the key is taken from the TXQ. Fix this by preserving info->control.vif and using that in the dequeue function. This now requires doing the driver-mapping in the dequeue as well. Since there's no way to filter the frames that are sitting on a TXQ, drop all frames, which may affect other interfaces, when an AP_VLAN is removed. Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit 1ae6f05d4204d3a128bb9ba2c42e2a6c4ac687f1 Author: Steve French Date: Sun Sep 17 10:41:35 2017 -0500 SMB3: Add support for multidialect negotiate (SMB2.1 and later) commit 9764c02fcbad40001fd3f63558d918e4d519bb75 upstream. With the need to discourage use of less secure dialect, SMB1 (CIFS), we temporarily upgraded the dialect to SMB3 in 4.13, but since there are various servers which only support SMB2.1 (2.1 is more secure than CIFS/SMB1) but not optimal for a default dialect - add support for multidialect negotiation. cifs.ko will now request SMB2.1 or later (ie SMB2.1 or SMB3.0, SMB3.02) and the server will pick the latest most secure one it can support. In addition since we are sending multidialect negotiate, add support for secure negotiate to validate that a man in the middle didn't downgrade us. Signed-off-by: Steve French Reviewed-by: Pavel Shilovsky Signed-off-by: Greg Kroah-Hartman commit 7bcaa27f339f9b85f9d78740fd2490e76e8646fc Author: Christoph Hellwig Date: Thu Sep 7 13:54:36 2017 +0200 scsi: scsi_transport_fc: fix NULL pointer dereference in fc_bsg_job_timeout commit b468b6a4969f9bdddb31d484f151bfa03fbee767 upstream. bsg-lib now embeddeds the job structure into the request, and req->special can't be used anymore. Signed-off-by: Christoph Hellwig Reviewed-by: Ming Lei Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit ca2d34891acd1fedf7a7e8d2a9cbad7b90997b08 Author: John Ogness Date: Thu Sep 14 11:42:17 2017 +0200 fs/proc: Report eip/esp in /prod/PID/stat for coredumping commit fd7d56270b526ca3ed0c224362e3c64a0f86687a upstream. Commit 0a1eb2d474ed ("fs/proc: Stop reporting eip and esp in /proc/PID/stat") stopped reporting eip/esp because it is racy and dangerous for executing tasks. The comment adds: As far as I know, there are no use programs that make any material use of these fields, so just get rid of them. However, existing userspace core-dump-handler applications (for example, minicoredumper) are using these fields since they provide an excellent cross-platform interface to these valuable pointers. So that commit introduced a user space visible regression. Partially revert the change and make the readout possible for tasks with the proper permissions and only if the target task has the PF_DUMPCORE flag set. Fixes: 0a1eb2d474ed ("fs/proc: Stop reporting eip and esp in> /proc/PID/stat") Reported-by: Marco Felsch Signed-off-by: John Ogness Reviewed-by: Andy Lutomirski Cc: Tycho Andersen Cc: Kees Cook Cc: Peter Zijlstra Cc: Brian Gerst Cc: Tetsuo Handa Cc: Borislav Petkov Cc: Al Viro Cc: Linux API Cc: Andrew Morton Cc: Linus Torvalds Link: http://lkml.kernel.org/r/87poatfwg6.fsf@linutronix.de Signed-off-by: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman commit 3b4692fe4111e75c035f256e8732c95328e42f60 Author: Adrian Hunter Date: Thu Sep 7 10:40:35 2017 +0300 mmc: block: Fix incorrectly initialized requests commit 01f5bbd17a8066b58dba9b5049fad504bce67322 upstream. mmc_init_request() depends on card->bouncesz so it must be calculated before blk_init_allocated_queue() starts allocating requests. Reported-by: Seraphime Kirkovski Fixes: 304419d8a7e9 ("mmc: core: Allocate per-request data using the..") Signed-off-by: Adrian Hunter Tested-by: Seraphime Kirkovski Signed-off-by: Ulf Hansson Tested-by: Pavel Machek Signed-off-by: Greg Kroah-Hartman commit 33897f02a9696eddb9ced9d91daf17d4fbade69a Author: Hyunchul Lee Date: Mon Jul 31 16:22:20 2017 +0900 dm integrity: do not check integrity for failed read operations commit b7e326f7b7375392d06f9cfbc27a7c63181f69d7 upstream. Even though read operations fail, dm_integrity_map_continue() calls integrity_metadata() to check integrity. In this case, just complete these. This also makes it so read I/O errors do not generate integrity warnings in the kernel log. Signed-off-by: Hyunchul Lee Acked-by: Milan Broz Acked-by: Mikulas Patocka Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit bfc0ab41a82c8b7e68aae241b19bd716dd678521 Author: Mikulas Patocka Date: Thu Aug 31 21:47:43 2017 -0400 dax: remove the pmem_dax_ops->flush abstraction commit c3ca015fab6df124c933b91902f3f2a3473f9da5 upstream. Commit abebfbe2f731 ("dm: add ->flush() dax operation support") is buggy. A DM device may be composed of multiple underlying devices and all of them need to be flushed. That commit just routes the flush request to the first device and ignores the other devices. It could be fixed by adding more complex logic to the device mapper. But there is only one implementation of the method pmem_dax_ops->flush - that is pmem_dax_flush() - and it calls arch_wb_cache_pmem(). Consequently, we don't need the pmem_dax_ops->flush abstraction at all, we can call arch_wb_cache_pmem() directly from dax_flush() because dax_dev->ops->flush can't ever reach anything different from arch_wb_cache_pmem(). It should be also pointed out that for some uses of persistent memory it is needed to flush only a very small amount of data (such as 1 cacheline), and it would be overkill if we go through that device mapper machinery for a single flushed cache line. Fix this by removing the pmem_dax_ops->flush abstraction and call arch_wb_cache_pmem() directly from dax_flush(). Also, remove the device mapper code that forwards the flushes. Fixes: abebfbe2f731 ("dm: add ->flush() dax operation support") Signed-off-by: Mikulas Patocka Reviewed-by: Dan Williams Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 8ce9fe2f84576f67e6a2407ff18995c54470e990 Author: Christoph Hellwig Date: Wed Sep 6 12:19:57 2017 +0200 nvme-pci: propagate (some) errors from host memory buffer setup commit 9620cfba97a8b88ae91f0e275e8ff110b578bb6e upstream. We want to catch command execution errors when resetting the device, so propagate errors from the Set Features when setting up the host memory buffer. We keep ignoring memory allocation failures, as the spec clearly says that the controller must work without a host memory buffer. Signed-off-by: Christoph Hellwig Reviewed-by: Keith Busch Signed-off-by: Greg Kroah-Hartman commit 2e81e134465ed6fb4287d25d4e274ea436bf6936 Author: Akinobu Mita Date: Wed Sep 6 12:15:31 2017 +0200 nvme-pci: use appropriate initial chunk size for HMB allocation commit 30f92d62e5b41a94de2d0bbd677a6ea2fcfed74f upstream. The initial chunk size for host memory buffer allocation is currently PAGE_SIZE << MAX_ORDER. MAX_ORDER order allocation is usually failed without CONFIG_DMA_CMA. So the HMB allocation is retried with chunk size PAGE_SIZE << (MAX_ORDER - 1) in general, but there is no problem if the retry allocation works correctly. Signed-off-by: Akinobu Mita [hch: rebased] Signed-off-by: Christoph Hellwig Reviewed-by: Keith Busch Signed-off-by: Greg Kroah-Hartman commit 4230ffe5eeb424ad2e2ae7ca598770791af069dc Author: Christoph Hellwig Date: Mon Sep 11 12:08:43 2017 -0400 nvme-pci: fix host memory buffer allocation fallback commit 92dc689563170b90ba844b8a2eb95e8a5eda2e83 upstream. nvme_alloc_host_mem currently contains two loops that are interwinded, and the outer retry loop turns out to be broken. Fix this by untangling the two. Based on a report an initial patch from Akinobu Mita. Signed-off-by: Christoph Hellwig Reported-by: Akinobu Mita Tested-by: Akinobu Mita Reviewed-by: Keith Busch Signed-off-by: Greg Kroah-Hartman commit 78e73b27866e4012de133cbd11a4f9f71b42cbcd Author: Shu Wang Date: Fri Sep 8 18:48:33 2017 +0800 cifs: release auth_key.response for reconnect. commit f5c4ba816315d3b813af16f5571f86c8d4e897bd upstream. There is a race that cause cifs reconnect in cifs_mount, - cifs_mount - cifs_get_tcp_session - [ start thread cifs_demultiplex_thread - cifs_read_from_socket: -ECONNABORTED - DELAY_WORK smb2_reconnect_server ] - cifs_setup_session - [ smb2_reconnect_server ] auth_key.response was allocated in cifs_setup_session, and will release when the session destoried. So when session re- connect, auth_key.response should be check and released. Tested with my system: CIFS VFS: Free previous auth_key.response = ffff8800320bbf80 A simple auth_key.response allocation call trace: - cifs_setup_session - SMB2_sess_setup - SMB2_sess_auth_rawntlmssp_authenticate - build_ntlmssp_auth_blob - setup_ntlmv2_rsp Signed-off-by: Shu Wang Signed-off-by: Steve French Reviewed-by: Ronnie Sahlberg Signed-off-by: Greg Kroah-Hartman commit 9b2f0de91300994ddec13491a8e28683dbd0b8df Author: Shu Wang Date: Thu Sep 7 16:03:27 2017 +0800 cifs: release cifs root_cred after exit_cifs commit 94183331e815617246b1baa97e0916f358c794bb upstream. memory leak was found by kmemleak. exit_cifs_spnego should be called before cifs module removed, or cifs root_cred will not be released. kmemleak report: unreferenced object 0xffff880070a3ce40 (size 192): backtrace: kmemleak_alloc+0x4a/0xa0 kmem_cache_alloc+0xc7/0x1d0 prepare_kernel_cred+0x20/0x120 init_cifs_spnego+0x2d/0x170 [cifs] 0xffffffffc07801f3 do_one_initcall+0x51/0x1b0 do_init_module+0x60/0x1fd load_module+0x161e/0x1b60 SYSC_finit_module+0xa9/0x100 SyS_finit_module+0xe/0x10 Signed-off-by: Shu Wang Signed-off-by: Steve French Reviewed-by: Ronnie Sahlberg Signed-off-by: Greg Kroah-Hartman commit aa4ccfd1e2726f0d71d0ad08aeea9efbc6601e48 Author: Ronnie Sahlberg Date: Fri Sep 8 10:37:35 2017 +1000 cifs: check rsp for NULL before dereferencing in SMB2_open commit bf2afee14e07de16d3cafc67edbfc2a3cc65e4bc upstream. In SMB2_open there are several paths where the SendReceive2 call will return an error before it sets rsp_iov.iov_base thus leaving iov_base uninitialized. Thus we need to check rsp before we dereference it in the call to get_rfc1002_length(). A report of this issue was previously reported in http://www.spinics.net/lists/linux-cifs/msg12846.html RH-bugzilla : 1476151 Version 2 : * Lets properly initialize rsp_iov before we use it. Signed-off-by: Ronnie Sahlberg Reviewed-by: Pavel Shilovsky Signed-off-by: Steve French Reported-by: Xiaoli Feng Signed-off-by: Greg Kroah-Hartman