commit 95f15f5ed6e68b479c73da842012108c567c6999 Author: Greg Kroah-Hartman Date: Tue Aug 16 09:35:15 2016 +0200 Linux 4.7.1 commit 9cab14cefa37ebda994441163d0deb8b244666fe Author: Vegard Nossum Date: Thu Jul 14 23:02:47 2016 -0400 ext4: fix reference counting bug on block allocation error commit 554a5ccc4e4a20c5f3ec859de0842db4b4b9c77e upstream. If we hit this error when mounted with errors=continue or errors=remount-ro: EXT4-fs error (device loop0): ext4_mb_mark_diskspace_used:2940: comm ext4.exe: Allocating blocks 5090-6081 which overlap fs metadata then ext4_mb_new_blocks() will call ext4_mb_release_context() and try to continue. However, ext4_mb_release_context() is the wrong thing to call here since we are still actually using the allocation context. Instead, just error out. We could retry the allocation, but there is a possibility of getting stuck in an infinite loop instead, so this seems safer. [ Fixed up so we don't return EAGAIN to userspace. --tytso ] Fixes: 8556e8f3b6 ("ext4: Don't allow new groups to be added during block allocation") Signed-off-by: Vegard Nossum Signed-off-by: Theodore Ts'o Cc: Aneesh Kumar K.V Signed-off-by: Greg Kroah-Hartman commit b58d417d8af61e9ef36deced326d156c53cba228 Author: Vegard Nossum Date: Thu Jul 14 23:21:35 2016 -0400 ext4: short-cut orphan cleanup on error commit c65d5c6c81a1f27dec5f627f67840726fcd146de upstream. If we encounter a filesystem error during orphan cleanup, we should stop. Otherwise, we may end up in an infinite loop where the same inode is processed again and again. EXT4-fs (loop0): warning: checktime reached, running e2fsck is recommended EXT4-fs error (device loop0): ext4_mb_generate_buddy:758: group 2, block bitmap and bg descriptor inconsistent: 6117 vs 0 free clusters Aborting journal on device loop0-8. EXT4-fs (loop0): Remounting filesystem read-only EXT4-fs error (device loop0) in ext4_free_blocks:4895: Journal has aborted EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted EXT4-fs error (device loop0) in ext4_ext_remove_space:3068: IO failure EXT4-fs error (device loop0) in ext4_ext_truncate:4667: Journal has aborted EXT4-fs error (device loop0) in ext4_orphan_del:2927: Journal has aborted EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted EXT4-fs (loop0): Inode 16 (00000000618192a0): orphan list check failed! [...] EXT4-fs (loop0): Inode 16 (0000000061819748): orphan list check failed! [...] EXT4-fs (loop0): Inode 16 (0000000061819bf0): orphan list check failed! [...] See-also: c9eb13a9105 ("ext4: fix hang when processing corrupted orphaned inode list") Cc: Jan Kara Signed-off-by: Vegard Nossum Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 9cfaab733a0143f19de96c97c1876d56800478a4 Author: Theodore Ts'o Date: Tue Jul 5 20:01:52 2016 -0400 ext4: validate s_reserved_gdt_blocks on mount commit 5b9554dc5bf008ae7f68a52e3d7e76c0920938a2 upstream. If s_reserved_gdt_blocks is extremely large, it's possible for ext4_init_block_bitmap(), which is called when ext4 sets up an uninitialized block bitmap, to corrupt random kernel memory. Add the same checks which e2fsck has --- it must never be larger than blocksize / sizeof(__u32) --- and then add a backup check in ext4_init_block_bitmap() in case the superblock gets modified after the file system is mounted. Reported-by: Vegard Nossum Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit efc588bf340f52863f9faa30970928e829c1043a Author: Vegard Nossum Date: Mon Jul 4 11:03:00 2016 -0400 ext4: don't call ext4_should_journal_data() on the journal inode commit 6a7fd522a7c94cdef0a3b08acf8e6702056e635c upstream. If ext4_fill_super() fails early, it's possible for ext4_evict_inode() to call ext4_should_journal_data() before superblock options and flags are fully set up. In that case, the iput() on the journal inode can end up causing a BUG(). Work around this problem by reordering the tests so we only call ext4_should_journal_data() after we know it's not the journal inode. Fixes: 2d859db3e4 ("ext4: fix data corruption in inodes with journalled data") Fixes: 2b405bfa84 ("ext4: fix data=journal fast mount/umount hang") Cc: Jan Kara Signed-off-by: Vegard Nossum Signed-off-by: Theodore Ts'o Reviewed-by: Jan Kara Signed-off-by: Greg Kroah-Hartman commit 4e23a593dd50748f0c725ba3556ec3d00380e126 Author: Jan Kara Date: Mon Jul 4 10:14:01 2016 -0400 ext4: fix deadlock during page writeback commit 646caa9c8e196880b41cd3e3d33a2ebc752bdb85 upstream. Commit 06bd3c36a733 (ext4: fix data exposure after a crash) uncovered a deadlock in ext4_writepages() which was previously much harder to hit. After this commit xfstest generic/130 reproduces the deadlock on small filesystems. The problem happens when ext4_do_update_inode() sets LARGE_FILE feature and marks current inode handle as synchronous. That subsequently results in ext4_journal_stop() called from ext4_writepages() to block waiting for transaction commit while still holding page locks, reference to io_end, and some prepared bio in mpd structure each of which can possibly block transaction commit from completing and thus results in deadlock. Fix the problem by releasing page locks, io_end reference, and submitting prepared bio before calling ext4_journal_stop(). [ Changed to defer the call to ext4_journal_stop() only if the handle is synchronous. --tytso ] Reported-and-tested-by: Eryu Guan Signed-off-by: Theodore Ts'o Signed-off-by: Jan Kara Signed-off-by: Greg Kroah-Hartman commit 7fc7b085daded6124e1bab55f24c087358c340fd Author: Vegard Nossum Date: Thu Jun 30 11:53:46 2016 -0400 ext4: check for extents that wrap around commit f70749ca42943faa4d4dcce46dfdcaadb1d0c4b6 upstream. An extent with lblock = 4294967295 and len = 1 will pass the ext4_valid_extent() test: ext4_lblk_t last = lblock + len - 1; if (len == 0 || lblock > last) return 0; since last = 4294967295 + 1 - 1 = 4294967295. This would later trigger the BUG_ON(es->es_lblk + es->es_len < es->es_lblk) in ext4_es_end(). We can simplify it by removing the - 1 altogether and changing the test to use lblock + len <= lblock, since now if len = 0, then lblock + 0 == lblock and it fails, and if len > 0 then lblock + len > lblock in order to pass (i.e. it doesn't overflow). Fixes: 5946d0893 ("ext4: check for overlapping extents in ext4_valid_extent_entries()") Fixes: 2f974865f ("ext4: check for zero length extent explicitly") Cc: Eryu Guan Signed-off-by: Phil Turnbull Signed-off-by: Vegard Nossum Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 5b3045bc26bf4f7ee0817672daef69e65f63a252 Author: Thomas Petazzoni Date: Thu Jun 16 16:48:52 2016 +0200 serial: mvebu-uart: free the IRQ in ->shutdown() commit c2c1659b4f8f9e19fe82a4fd06cca4b3d59090ce upstream. As suggested by the serial port infrastructure documentation, the IRQ is requested in ->startup(). However, it is never freed in the ->shutdown() hook. With simple systems that open the serial port once for all and always have at least one process that keep the serial port opened, there was no problem. But with a more complicated system (*cough* systemd *cough*), the serial port is opened/closed many times, which at some point no processes having the serial port open at all. Due to this ->startup() gets called again, tries to request_irq() again, which fails. Fixes: 30530791a7a0 ("serial: mvebu-uart: initial support for Armada-3700 serial port") Cc: Ofer Heifetz Signed-off-by: Thomas Petazzoni Signed-off-by: Greg Kroah-Hartman Signed-off-by: Greg Kroah-Hartman commit 1539c4d5d3df046cb36c3fed2ea435c34abc4e05 Author: Herbert Xu Date: Tue Jul 12 13:17:57 2016 +0800 crypto: scatterwalk - Fix test in scatterwalk_done commit 5f070e81bee35f1b7bd1477bb223a873ff657803 upstream. When there is more data to be processed, the current test in scatterwalk_done may prevent us from calling pagedone even when we should. In particular, if we're on an SG entry spanning multiple pages where the last page is not a full page, we will incorrectly skip calling pagedone on the second last page. This patch fixes this by adding a separate test for whether we've reached the end of a page. Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit cab0805d7cf18537660ad51b6b9d991263676921 Author: Herbert Xu Date: Wed Jun 15 22:27:05 2016 +0800 crypto: gcm - Filter out async ghash if necessary commit b30bdfa86431afbafe15284a3ad5ac19b49b88e3 upstream. As it is if you ask for a sync gcm you may actually end up with an async one because it does not filter out async implementations of ghash. This patch fixes this by adding the necessary filter when looking for ghash. Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit efbe5a352f3b593a7210826a6398bddf3a527814 Author: Andreas Herrmann Date: Fri Jul 22 17:14:11 2016 +0200 Revert "cpufreq: pcc-cpufreq: update default value of cpuinfo_transition_latency" commit da7d3abe1c9e5ebac2cf86f97e9e89888a5e2094 upstream. This reverts commit 790d849bf811a8ab5d4cd2cce0f6fda92f6aebf2. Using a v4.7-rc7 kernel on a HP ProLiant triggered following messages pcc-cpufreq: (v1.10.00) driver loaded with frequency limits: 1200 MHz, 2800 MHz cpufreq: ondemand governor failed, too long transition latency of HW, fallback to performance governor The last line was shown for each CPU in the system. Testing v4.5 (where commit 790d849b was integrated) triggered similar messages. Same behaviour on a 2nd HP Proliant system. So commit 790d849bf (cpufreq: pcc-cpufreq: update default value of cpuinfo_transition_latency) causes the system to use performance governor which, I guess, was not the intention of the patch. Enabling debug output in pcc-cpufreq provides following verbose output: pcc-cpufreq: (v1.10.00) driver loaded with frequency limits: 1200 MHz, 2800 MHz pcc_get_offset: for CPU 0: pcc_cpu_data input_offset: 0x44, pcc_cpu_data output_offset: 0x48 init: policy->max is 2800000, policy->min is 1200000 get: get_freq for CPU 0 get: SUCCESS: (virtual) output_offset for cpu 0 is 0xffffc9000d7c0048, contains a value of: 0xff06. Speed is: 168000 MHz cpufreq: ondemand governor failed, too long transition latency of HW, fallback to performance governor target: CPU 0 should go to target freq: 2800000 (virtual) input_offset is 0xffffc9000d7c0044 target: was SUCCESSFUL for cpu 0 I am asking to revert 790d849bf to re-enable usage of ondemand governor with pcc-cpufreq. Fixes: 790d849bf (cpufreq: pcc-cpufreq: update default value of cpuinfo_transition_latency) Signed-off-by: Andreas Herrmann Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 84603f74a6c18be9e7c5bb2548448c42cc28e2c5 Author: Wei Fang Date: Wed Jul 6 11:32:20 2016 +0800 fs/dcache.c: avoid soft-lockup in dput() commit 47be61845c775643f1aa4d2a54343549f943c94c upstream. We triggered soft-lockup under stress test which open/access/write/close one file concurrently on more than five different CPUs: WARN: soft lockup - CPU#0 stuck for 11s! [who:30631] ... [] dput+0x100/0x298 [] terminate_walk+0x4c/0x60 [] path_lookupat+0x5cc/0x7a8 [] filename_lookup+0x38/0xf0 [] user_path_at_empty+0x78/0xd0 [] user_path_at+0x1c/0x28 [] SyS_faccessat+0xb4/0x230 ->d_lock trylock may failed many times because of concurrently operations, and dput() may execute a long time. Fix this by replacing cpu_relax() with cond_resched(). dput() used to be sleepable, so make it sleepable again should be safe. Signed-off-by: Wei Fang Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman commit 378eef490f449c6583ce63e32ac8b9ab27d3cdb4 Author: Michal Hocko Date: Thu Jul 28 15:48:44 2016 -0700 Revert "mm, mempool: only set __GFP_NOMEMALLOC if there are free elements" commit 4e390b2b2f34b8daaabf2df1df0cf8f798b87ddb upstream. This reverts commit f9054c70d28b ("mm, mempool: only set __GFP_NOMEMALLOC if there are free elements"). There has been a report about OOM killer invoked when swapping out to a dm-crypt device. The primary reason seems to be that the swapout out IO managed to completely deplete memory reserves. Ondrej was able to bisect and explained the issue by pointing to f9054c70d28b ("mm, mempool: only set __GFP_NOMEMALLOC if there are free elements"). The reason is that the swapout path is not throttled properly because the md-raid layer needs to allocate from the generic_make_request path which means it allocates from the PF_MEMALLOC context. dm layer uses mempool_alloc in order to guarantee a forward progress which used to inhibit access to memory reserves when using page allocator. This has changed by f9054c70d28b ("mm, mempool: only set __GFP_NOMEMALLOC if there are free elements") which has dropped the __GFP_NOMEMALLOC protection when the memory pool is depleted. If we are running out of memory and the only way forward to free memory is to perform swapout we just keep consuming memory reserves rather than throttling the mempool allocations and allowing the pending IO to complete up to a moment when the memory is depleted completely and there is no way forward but invoking the OOM killer. This is less than optimal. The original intention of f9054c70d28b was to help with the OOM situations where the oom victim depends on mempool allocation to make a forward progress. David has mentioned the following backtrace: schedule schedule_timeout io_schedule_timeout mempool_alloc __split_and_process_bio dm_request generic_make_request submit_bio mpage_readpages ext4_readpages __do_page_cache_readahead ra_submit filemap_fault handle_mm_fault __do_page_fault do_page_fault page_fault We do not know more about why the mempool is depleted without being replenished in time, though. In any case the dm layer shouldn't depend on any allocations outside of the dedicated pools so a forward progress should be guaranteed. If this is not the case then the dm should be fixed rather than papering over the problem and postponing it to later by accessing more memory reserves. mempools are a mechanism to maintain dedicated memory reserves to guaratee forward progress. Allowing them an unbounded access to the page allocator memory reserves is going against the whole purpose of this mechanism. Bisected by Ondrej Kozina. [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/20160721145309.GR26379@dhcp22.suse.cz Signed-off-by: Michal Hocko Reported-by: Ondrej Kozina Reviewed-by: Johannes Weiner Acked-by: NeilBrown Cc: David Rientjes Cc: Mikulas Patocka Cc: Ondrej Kozina Cc: Tetsuo Handa Cc: Mel Gorman Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 15b595ed1c3ab573d14020e5b75f9fc8d1142534 Author: Wei Fang Date: Mon Jul 25 21:17:04 2016 +0800 fuse: fix wrong assignment of ->flags in fuse_send_init() commit 9446385f05c9af25fed53dbed3cc75763730be52 upstream. FUSE_HAS_IOCTL_DIR should be assigned to ->flags, it may be a typo. Signed-off-by: Wei Fang Signed-off-by: Miklos Szeredi Fixes: 69fe05c90ed5 ("fuse: add missing INIT flags") Signed-off-by: Greg Kroah-Hartman commit 6628347ed38e1f53750b3536b338cb69cef981fb Author: Maxim Patlasov Date: Tue Jul 19 18:12:26 2016 -0700 fuse: fuse_flush must check mapping->flags for errors commit 9ebce595f63a407c5cec98f98f9da8459b73740a upstream. fuse_flush() calls write_inode_now() that triggers writeback, but actual writeback will happen later, on fuse_sync_writes(). If an error happens, fuse_writepage_end() will set error bit in mapping->flags. So, we have to check mapping->flags after fuse_sync_writes(). Signed-off-by: Maxim Patlasov Signed-off-by: Miklos Szeredi Fixes: 4d99ff8f12eb ("fuse: Turn writeback cache on") Signed-off-by: Greg Kroah-Hartman commit 3720e4ab16610e909e0cda92067f7493899fef4b Author: Alexey Kuznetsov Date: Tue Jul 19 12:48:01 2016 -0700 fuse: fsync() did not return IO errors commit ac7f052b9e1534c8248f814b6f0068ad8d4a06d2 upstream. Due to implementation of fuse writeback filemap_write_and_wait_range() does not catch errors. We have to do this directly after fuse_sync_writes() Signed-off-by: Alexey Kuznetsov Signed-off-by: Maxim Patlasov Signed-off-by: Miklos Szeredi Fixes: 4d99ff8f12eb ("fuse: Turn writeback cache on") Signed-off-by: Greg Kroah-Hartman commit 536c97709ac4309b3bd14d868100b05476e5c8b1 Author: Josh Poimboeuf Date: Thu Jul 28 23:15:21 2016 +0200 x86/power/64: Fix hibernation return address corruption commit 4ce827b4cc58bec7952591b96cce2b28553e4d5b upstream. In kernel bug 150021, a kernel panic was reported when restoring a hibernate image. Only a picture of the oops was reported, so I can't paste the whole thing here. But here are the most interesting parts: kernel tried to execute NX-protected page - exploit attempt? (uid: 0) BUG: unable to handle kernel paging request at ffff8804615cfd78 ... RIP: ffff8804615cfd78 RSP: ffff8804615f0000 RBP: ffff8804615cfdc0 ... Call Trace: do_signal+0x23 exit_to_usermode_loop+0x64 ... The RIP is on the same page as RBP, so it apparently started executing on the stack. The bug was bisected to commit ef0f3ed5a4ac (x86/asm/power: Create stack frames in hibernate_asm_64.S), which in retrospect seems quite dangerous, since that code saves and restores the stack pointer from a global variable ('saved_context'). There are a lot of moving parts in the hibernate save and restore paths, so I don't know exactly what caused the panic. Presumably, a FRAME_END was executed without the corresponding FRAME_BEGIN, or vice versa. That would corrupt the return address on the stack and would be consistent with the details of the above panic. [ rjw: One major problem is that by the time the FRAME_BEGIN in restore_registers() is executed, the stack pointer value may not be valid any more. Namely, the stack area pointed to by it previously may have been overwritten by some image memory contents and that page frame may now be used for whatever different purpose it had been allocated for before hibernation. In that case, the FRAME_BEGIN will corrupt that memory. ] Instead of doing the frame pointer save/restore around the bounds of the affected functions, just do it around the call to swsusp_save(). That has the same effect of ensuring that if swsusp_save() sleeps, the frame pointers will be correct. It's also a much more obviously safe way to do it than the original patch. And objtool still doesn't report any warnings. Fixes: ef0f3ed5a4ac (x86/asm/power: Create stack frames in hibernate_asm_64.S) Link: https://bugzilla.kernel.org/show_bug.cgi?id=150021 Reported-by: Andre Reinke Tested-by: Andre Reinke Signed-off-by: Josh Poimboeuf Acked-by: Ingo Molnar Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 74de73fd1233c710e71760a1ddd21d381bc553f9 Author: Borislav Petkov Date: Mon Jun 6 17:10:43 2016 +0200 x86/microcode: Fix suspend to RAM with builtin microcode commit 4b703305d98bf7350d4b2953ee39a3aa2eeb1778 upstream. Usually, after we have found the proper microcode blob for the current machine, we stash it away for later use with save_microcode_in_initrd(). However, with builtin microcode which doesn't come from the initrd, we don't call that function because CONFIG_BLK_DEV_INITRD=n and even if set, we don't have a valid initrd. In order to fix this, let's make save_microcode_in_initrd() an fs_initcall which runs before rootfs_initcall() as this was the time it was called previously through: rootfs_initcall(populate_rootfs) |-> free_initrd() |-> free_initrd_mem() |-> save_microcode_in_initrd() Also, we make it run independently from initrd functionality being present or not. And since it is called in the microcode loader only now, we can also make it static. Reported-and-tested-by: Jim Bos Signed-off-by: Borislav Petkov Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/1465225850-7352-3-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 061783ef7814be4b2d625d763910a4b77a77cb11 Author: Vladimir Davydov Date: Tue Aug 2 14:03:01 2016 -0700 radix-tree: account nodes to memcg only if explicitly requested commit 05eb6e7263185a6bb0de9501ccf2addc52429414 upstream. Radix trees may be used not only for storing page cache pages, so unconditionally accounting radix tree nodes to the current memory cgroup is bad: if a radix tree node is used for storing data shared among different cgroups we risk pinning dead memory cgroups forever. So let's only account radix tree nodes if it was explicitly requested by passing __GFP_ACCOUNT to INIT_RADIX_TREE. Currently, we only want to account page cache entries, so mark mapping->page_tree so. Fixes: 58e698af4c63 ("radix-tree: account radix_tree_node to memory cgroup") Link: http://lkml.kernel.org/r/1470057188-7864-1-git-send-email-vdavydov@virtuozzo.com Signed-off-by: Vladimir Davydov Acked-by: Johannes Weiner Acked-by: Michal Hocko Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 4a226bf30f44fd5cb4781666df73e228d37a46d8 Author: Fabian Frederick Date: Tue Aug 2 14:03:07 2016 -0700 sysv, ipc: fix security-layer leaking commit 9b24fef9f0410fb5364245d6cc2bd044cc064007 upstream. Commit 53dad6d3a8e5 ("ipc: fix race with LSMs") updated ipc_rcu_putref() to receive rcu freeing function but used generic ipc_rcu_free() instead of msg_rcu_free() which does security cleaning. Running LTP msgsnd06 with kmemleak gives the following: cat /sys/kernel/debug/kmemleak unreferenced object 0xffff88003c0a11f8 (size 8): comm "msgsnd06", pid 1645, jiffies 4294672526 (age 6.549s) hex dump (first 8 bytes): 1b 00 00 00 01 00 00 00 ........ backtrace: kmemleak_alloc+0x23/0x40 kmem_cache_alloc_trace+0xe1/0x180 selinux_msg_queue_alloc_security+0x3f/0xd0 security_msg_queue_alloc+0x2e/0x40 newque+0x4e/0x150 ipcget+0x159/0x1b0 SyS_msgget+0x39/0x40 entry_SYSCALL_64_fastpath+0x13/0x8f Manfred Spraul suggested to fix sem.c as well and Davidlohr Bueso to only use ipc_rcu_free in case of security allocation failure in newary() Fixes: 53dad6d3a8e ("ipc: fix race with LSMs") Link: http://lkml.kernel.org/r/1470083552-22966-1-git-send-email-fabf@skynet.be Signed-off-by: Fabian Frederick Cc: Davidlohr Bueso Cc: Manfred Spraul Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit aa56f0bd5d67d2eb0e59d6bc20578f83858ff43f Author: Vegard Nossum Date: Fri Jul 29 10:40:31 2016 +0200 block: fix use-after-free in seq file commit 77da160530dd1dc94f6ae15a981f24e5f0021e84 upstream. I got a KASAN report of use-after-free: ================================================================== BUG: KASAN: use-after-free in klist_iter_exit+0x61/0x70 at addr ffff8800b6581508 Read of size 8 by task trinity-c1/315 ============================================================================= BUG kmalloc-32 (Not tainted): kasan: bad access detected ----------------------------------------------------------------------------- Disabling lock debugging due to kernel taint INFO: Allocated in disk_seqf_start+0x66/0x110 age=144 cpu=1 pid=315 ___slab_alloc+0x4f1/0x520 __slab_alloc.isra.58+0x56/0x80 kmem_cache_alloc_trace+0x260/0x2a0 disk_seqf_start+0x66/0x110 traverse+0x176/0x860 seq_read+0x7e3/0x11a0 proc_reg_read+0xbc/0x180 do_loop_readv_writev+0x134/0x210 do_readv_writev+0x565/0x660 vfs_readv+0x67/0xa0 do_preadv+0x126/0x170 SyS_preadv+0xc/0x10 do_syscall_64+0x1a1/0x460 return_from_SYSCALL_64+0x0/0x6a INFO: Freed in disk_seqf_stop+0x42/0x50 age=160 cpu=1 pid=315 __slab_free+0x17a/0x2c0 kfree+0x20a/0x220 disk_seqf_stop+0x42/0x50 traverse+0x3b5/0x860 seq_read+0x7e3/0x11a0 proc_reg_read+0xbc/0x180 do_loop_readv_writev+0x134/0x210 do_readv_writev+0x565/0x660 vfs_readv+0x67/0xa0 do_preadv+0x126/0x170 SyS_preadv+0xc/0x10 do_syscall_64+0x1a1/0x460 return_from_SYSCALL_64+0x0/0x6a CPU: 1 PID: 315 Comm: trinity-c1 Tainted: G B 4.7.0+ #62 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 ffffea0002d96000 ffff880119b9f918 ffffffff81d6ce81 ffff88011a804480 ffff8800b6581500 ffff880119b9f948 ffffffff8146c7bd ffff88011a804480 ffffea0002d96000 ffff8800b6581500 fffffffffffffff4 ffff880119b9f970 Call Trace: [] dump_stack+0x65/0x84 [] print_trailer+0x10d/0x1a0 [] object_err+0x2f/0x40 [] kasan_report_error+0x221/0x520 [] __asan_report_load8_noabort+0x3e/0x40 [] klist_iter_exit+0x61/0x70 [] class_dev_iter_exit+0x9/0x10 [] disk_seqf_stop+0x3a/0x50 [] seq_read+0x4b2/0x11a0 [] proc_reg_read+0xbc/0x180 [] do_loop_readv_writev+0x134/0x210 [] do_readv_writev+0x565/0x660 [] vfs_readv+0x67/0xa0 [] do_preadv+0x126/0x170 [] SyS_preadv+0xc/0x10 This problem can occur in the following situation: open() - pread() - .seq_start() - iter = kmalloc() // succeeds - seqf->private = iter - .seq_stop() - kfree(seqf->private) - pread() - .seq_start() - iter = kmalloc() // fails - .seq_stop() - class_dev_iter_exit(seqf->private) // boom! old pointer As the comment in disk_seqf_stop() says, stop is called even if start failed, so we need to reinitialise the private pointer to NULL when seq iteration stops. An alternative would be to set the private pointer to NULL when the kmalloc() in disk_seqf_start() fails. Signed-off-by: Vegard Nossum Acked-by: Tejun Heo Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 64494393dff3ce45bfd38b2a382478d8f310c857 Author: David Howells Date: Wed Jul 27 11:42:38 2016 +0100 x86/syscalls/64: Add compat_sys_keyctl for 32-bit userspace commit f7d665627e103e82d34306c7d3f6f46f387c0d8b upstream. x86_64 needs to use compat_sys_keyctl for 32-bit userspace rather than calling sys_keyctl(). The latter will work in a lot of cases, thereby hiding the issue. Reported-by: Stephan Mueller Tested-by: Stephan Mueller Signed-off-by: David Howells Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: H. Peter Anvin Cc: Josh Poimboeuf Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: keyrings@vger.kernel.org Cc: linux-security-module@vger.kernel.org Link: http://lkml.kernel.org/r/146961615805.14395.5581949237156769439.stgit@warthog.procyon.org.uk Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit a0c2cc8e60fac319c3adfaaef349b87ead672925 Author: Vladimir Davydov Date: Thu Aug 11 15:33:03 2016 -0700 mm: memcontrol: fix memcg id ref counter on swap charge move commit 615d66c37c755c49ce022c9e5ac0875d27d2603d upstream. Since commit 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure after many small jobs") swap entries do not pin memcg->css.refcnt directly. Instead, they pin memcg->id.ref. So we should adjust the reference counters accordingly when moving swap charges between cgroups. Fixes: 73f576c04b941 ("mm: memcontrol: fix cgroup creation failure after many small jobs") Link: http://lkml.kernel.org/r/9ce297c64954a42dc90b543bc76106c4a94f07e8.1470219853.git.vdavydov@virtuozzo.com Signed-off-by: Vladimir Davydov Acked-by: Michal Hocko Acked-by: Johannes Weiner Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 0571b96f5fe82580bc84c19b826cc63cd37df0dd Author: Vladimir Davydov Date: Thu Aug 11 15:33:00 2016 -0700 mm: memcontrol: fix swap counter leak on swapout from offline cgroup commit 1f47b61fb4077936465dcde872a4e5cc4fe708da upstream. An offline memory cgroup might have anonymous memory or shmem left charged to it and no swap. Since only swap entries pin the id of an offline cgroup, such a cgroup will have no id and so an attempt to swapout its anon/shmem will not store memory cgroup info in the swap cgroup map. As a result, memcg->swap or memcg->memsw will never get uncharged from it and any of its ascendants. Fix this by always charging swapout to the first ancestor cgroup that hasn't released its id yet. [hannes@cmpxchg.org: add comment to mem_cgroup_swapout] [vdavydov@virtuozzo.com: use WARN_ON_ONCE() in mem_cgroup_id_get_online()] Link: http://lkml.kernel.org/r/20160803123445.GJ13263@esperanza Fixes: 73f576c04b941 ("mm: memcontrol: fix cgroup creation failure after many small jobs") Link: http://lkml.kernel.org/r/5336daa5c9a32e776067773d9da655d2dc126491.1470219853.git.vdavydov@virtuozzo.com Signed-off-by: Vladimir Davydov Acked-by: Johannes Weiner Acked-by: Michal Hocko Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 80f35a804be48ac0fe1f3b3ae73bb60a7320d3c3 Author: Theodore Ts'o Date: Sun Jul 3 17:01:26 2016 -0400 random: strengthen input validation for RNDADDTOENTCNT commit 86a574de4590ffe6fd3f3ca34cdcf655a78e36ec upstream. Don't allow RNDADDTOENTCNT or RNDADDENTROPY to accept a negative entropy value. It doesn't make any sense to subtract from the entropy counter, and it can trigger a warning: random: negative entropy/overflow: pool input count -40000 ------------[ cut here ]------------ WARNING: CPU: 3 PID: 6828 at drivers/char/random.c:670[< none >] credit_entropy_bits+0x21e/0xad0 drivers/char/random.c:670 Modules linked in: CPU: 3 PID: 6828 Comm: a.out Not tainted 4.7.0-rc4+ #4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 ffffffff880b58e0 ffff88005dd9fcb0 ffffffff82cc838f ffffffff87158b40 fffffbfff1016b1c 0000000000000000 0000000000000000 ffffffff87158b40 ffffffff83283dae 0000000000000009 ffff88005dd9fcf8 ffffffff8136d27f Call Trace: [< inline >] __dump_stack lib/dump_stack.c:15 [] dump_stack+0x12e/0x18f lib/dump_stack.c:51 [] __warn+0x19f/0x1e0 kernel/panic.c:516 [] warn_slowpath_null+0x2c/0x40 kernel/panic.c:551 [] credit_entropy_bits+0x21e/0xad0 drivers/char/random.c:670 [< inline >] credit_entropy_bits_safe drivers/char/random.c:734 [] random_ioctl+0x21d/0x250 drivers/char/random.c:1546 [< inline >] vfs_ioctl fs/ioctl.c:43 [] do_vfs_ioctl+0x18c/0xff0 fs/ioctl.c:674 [< inline >] SYSC_ioctl fs/ioctl.c:689 [] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:680 [] entry_SYSCALL_64_fastpath+0x23/0xc1 arch/x86/entry/entry_64.S:207 ---[ end trace 5d4902b2ba842f1f ]--- This was triggered using the test program: // autogenerated by syzkaller (http://github.com/google/syzkaller) int main() { int fd = open("/dev/random", O_RDWR); int val = -5000; ioctl(fd, RNDADDTOENTCNT, &val); return 0; } It's harmless in that (a) only root can trigger it, and (b) after complaining the code never does let the entropy count go negative, but it's better to simply not allow this userspace from passing in a negative entropy value altogether. Google-Bug-Id: #29575089 Reported-By: Dmitry Vyukov Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 3f0153ace2051b3f62c24503740b935e82451c22 Author: John Johansen Date: Wed Nov 18 11:41:05 2015 -0800 apparmor: fix ref count leak when profile sha1 hash is read commit 0b938a2e2cf0b0a2c8bac9769111545aff0fee97 upstream. Signed-off-by: John Johansen Acked-by: Seth Arnold Signed-off-by: Greg Kroah-Hartman commit 70e23611b051f548d8b8256162c6d5de2276e056 Author: Bart Van Assche Date: Tue Jul 19 10:03:44 2016 -0700 IB/hfi1: Disable by default commit a154a8cd080b437969ef194dee365bbb60a3b38a upstream. There is a strict policy in the Linux kernel that new drivers must be disabled by default. Hence leave out the "default m" line from Kconfig. Fixes: f48ad614c100 ("IB/hfi1: Move driver out of staging") Signed-off-by: Bart Van Assche Cc: Jubin John Cc: Dennis Dalessandro Cc: Ira Weiny Cc: Mike Marciniszyn Acked-by: Dennis Dalessandro Signed-off-by: Doug Ledford Signed-off-by: Greg Kroah-Hartman commit cd1a4c6dd65dd5fa47c953cc035adc214fa9ce23 Author: David Howells Date: Wed Jul 27 11:43:37 2016 +0100 KEYS: 64-bit MIPS needs to use compat_sys_keyctl for 32-bit userspace commit 20f06ed9f61a185c6dabd662c310bed6189470df upstream. MIPS64 needs to use compat_sys_keyctl for 32-bit userspace rather than calling sys_keyctl. The latter will work in a lot of cases, thereby hiding the issue. Reported-by: Stephan Mueller Signed-off-by: David Howells Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: linux-security-module@vger.kernel.org Cc: keyrings@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/13832/ Signed-off-by: Ralf Baechle Signed-off-by: Greg Kroah-Hartman commit 65413c151b68bae1ffa261b215c23441eefcfb10 Author: Dave Weinstein Date: Thu Jul 28 11:55:41 2016 -0700 arm: oabi compat: add missing access checks commit 7de249964f5578e67b99699c5f0b405738d820a2 upstream. Add access checks to sys_oabi_epoll_wait() and sys_oabi_semtimedop(). This fixes CVE-2016-3857, a local privilege escalation under CONFIG_OABI_COMPAT. Reported-by: Chiachih Wu Reviewed-by: Kees Cook Reviewed-by: Nicolas Pitre Signed-off-by: Dave Weinstein Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 89c2f98f31c17a81ed736b101e10b2472777e307 Author: Soheil Hassas Yeganeh Date: Fri Jul 29 09:34:02 2016 -0400 tcp: consider recv buf for the initial window scale [ Upstream commit f626300a3e776ccc9671b0dd94698fb3aa315966 ] tcp_select_initial_window() intends to advertise a window scaling for the maximum possible window size. To do so, it considers the maximum of net.ipv4.tcp_rmem[2] and net.core.rmem_max as the only possible upper-bounds. However, users with CAP_NET_ADMIN can use SO_RCVBUFFORCE to set the socket's receive buffer size to values larger than net.ipv4.tcp_rmem[2] and net.core.rmem_max. Thus, SO_RCVBUFFORCE is effectively ignored by tcp_select_initial_window(). To fix this, consider the maximum of net.ipv4.tcp_rmem[2], net.core.rmem_max and socket's initial buffer space. Fixes: b0573dea1fb3 ("[NET]: Introduce SO_{SND,RCV}BUFFORCE socket options") Signed-off-by: Soheil Hassas Yeganeh Suggested-by: Neal Cardwell Acked-by: Neal Cardwell Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit bb51025f301e45215def59891dc081687ff89470 Author: Beniamino Galvani Date: Tue Jul 26 12:24:53 2016 +0200 macsec: ensure rx_sa is set when validation is disabled [ Upstream commit e3a3b626010a14fe067f163c2c43409d5afcd2a9 ] macsec_decrypt() is not called when validation is disabled and so macsec_skb_cb(skb)->rx_sa is not set; but it is used later in macsec_post_decrypt(), ensure that it's always initialized. Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: Beniamino Galvani Acked-by: Sabrina Dubroca Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 6dc47138a6a08ac06f1b87157d5c8c5a5dff01c7 Author: Manish Chopra Date: Mon Jul 25 19:07:46 2016 +0300 qed: Fix setting/clearing bit in completion bitmap [ Upstream commit 59d3f1ceb69b54569685d0c34dff16a1e0816b19 ] Slowpath completion handling is incorrectly changing SPQ_RING_SIZE bits instead of a single one. Fixes: 76a9a3642a0b ("qed: fix handling of concurrent ramrods") Signed-off-by: Manish Chopra Signed-off-by: Yuval Mintz Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 77719251041381423bf6fb4ae9d529e2a42adb55 Author: Vegard Nossum Date: Sat Jul 23 09:42:35 2016 +0200 net/sctp: terminate rhashtable walk correctly [ Upstream commit 5fc382d87517707ad77ea4c9c12e2a3fde2c838a ] I was seeing a lot of these: BUG: sleeping function called from invalid context at mm/slab.h:388 in_atomic(): 0, irqs_disabled(): 0, pid: 14971, name: trinity-c2 Preemption disabled at:[] rhashtable_walk_start+0x46/0x150 [] preempt_count_add+0x1fb/0x280 [] _raw_spin_lock+0x12/0x40 [] console_unlock+0x2f7/0x930 [] vprintk_emit+0x2fb/0x520 [] vprintk_default+0x1a/0x20 [] printk+0x94/0xb0 [] print_stack_trace+0xe0/0x170 [] ___might_sleep+0x3be/0x460 [] __might_sleep+0x90/0x1a0 [] kmem_cache_alloc+0x153/0x1e0 [] rhashtable_walk_init+0xfe/0x2d0 [] sctp_transport_walk_start+0x1e/0x60 [] sctp_transport_seq_start+0x4d/0x150 [] seq_read+0x27b/0x1180 [] proc_reg_read+0xbc/0x180 [] __vfs_read+0xdb/0x610 [] vfs_read+0xea/0x2d0 [] SyS_pread64+0x11b/0x150 [] do_syscall_64+0x19c/0x410 [] return_from_SYSCALL_64+0x0/0x6a [] 0xffffffffffffffff Apparently we always need to call rhashtable_walk_stop(), even when rhashtable_walk_start() fails: * rhashtable_walk_start - Start a hash table walk * @iter: Hash table iterator * * Start a hash table walk. Note that we take the RCU lock in all * cases including when we return an error. So you must always call * rhashtable_walk_stop to clean up. otherwise we never call rcu_read_unlock() and we get the splat above. Fixes: 53fa1036 ("sctp: fix some rhashtable functions using in sctp proc/diag") See-also: 53fa1036 ("sctp: fix some rhashtable functions using in sctp proc/diag") See-also: f2dba9c6 ("rhashtable: Introduce rhashtable_walk_*") Cc: Xin Long Cc: Herbert Xu Cc: stable@vger.kernel.org Signed-off-by: Vegard Nossum Acked-by: Marcelo Ricardo Leitner Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e8f7ce7bc0ba489d18955796fb5b31291a345b65 Author: Vegard Nossum Date: Sat Jul 23 07:43:50 2016 +0200 net/irda: fix NULL pointer dereference on memory allocation failure [ Upstream commit d3e6952cfb7ba5f4bfa29d4803ba91f96ce1204d ] I ran into this: kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 2 PID: 2012 Comm: trinity-c3 Not tainted 4.7.0-rc7+ #19 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 task: ffff8800b745f2c0 ti: ffff880111740000 task.ti: ffff880111740000 RIP: 0010:[] [] irttp_connect_request+0x36/0x710 RSP: 0018:ffff880111747bb8 EFLAGS: 00010286 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000069dd8358 RDX: 0000000000000009 RSI: 0000000000000027 RDI: 0000000000000048 RBP: ffff880111747c00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000069dd8358 R11: 1ffffffff0759723 R12: 0000000000000000 R13: ffff88011a7e4780 R14: 0000000000000027 R15: 0000000000000000 FS: 00007fc738404700(0000) GS:ffff88011af00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fc737fdfb10 CR3: 0000000118087000 CR4: 00000000000006e0 Stack: 0000000000000200 ffff880111747bd8 ffffffff810ee611 ffff880119f1f220 ffff880119f1f4f8 ffff880119f1f4f0 ffff88011a7e4780 ffff880119f1f232 ffff880119f1f220 ffff880111747d58 ffffffff82bca542 0000000000000000 Call Trace: [] irda_connect+0x562/0x1190 [] SYSC_connect+0x202/0x2a0 [] SyS_connect+0x9/0x10 [] do_syscall_64+0x19c/0x410 [] entry_SYSCALL64_slow_path+0x25/0x25 Code: 41 89 ca 48 89 e5 41 57 41 56 41 55 41 54 41 89 d7 53 48 89 fb 48 83 c7 48 48 89 fa 41 89 f6 48 c1 ea 03 48 83 ec 20 4c 8b 65 10 <0f> b6 04 02 84 c0 74 08 84 c0 0f 8e 4c 04 00 00 80 7b 48 00 74 RIP [] irttp_connect_request+0x36/0x710 RSP ---[ end trace 4cda2588bc055b30 ]--- The problem is that irda_open_tsap() can fail and leave self->tsap = NULL, and then irttp_connect_request() almost immediately dereferences it. Cc: stable@vger.kernel.org Signed-off-by: Vegard Nossum Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2f9abe462b23e1f78cc55de4d92de58a63c9b710 Author: Marcelo Ricardo Leitner Date: Sat Jul 23 00:32:48 2016 -0300 sctp: fix BH handling on socket backlog [ Upstream commit eefc1b1d105ee4d2ce907833ce675f1e9599b5e3 ] Now that the backlog processing is called with BH enabled, we have to disable BH before taking the socket lock via bh_lock_sock() otherwise it may dead lock: sctp_backlog_rcv() bh_lock_sock(sk); if (sock_owned_by_user(sk)) { if (sk_add_backlog(sk, skb, sk->sk_rcvbuf)) sctp_chunk_free(chunk); else backloged = 1; } else sctp_inq_push(inqueue, chunk); bh_unlock_sock(sk); while sctp_inq_push() was disabling/enabling BH, but enabling BH triggers pending softirq, which then may try to re-lock the socket in sctp_rcv(). [ 219.187215] [ 219.187217] [] _raw_spin_lock+0x20/0x30 [ 219.187223] [] sctp_rcv+0x48c/0xba0 [sctp] [ 219.187225] [] ? nf_iterate+0x62/0x80 [ 219.187226] [] ip_local_deliver_finish+0x94/0x1e0 [ 219.187228] [] ip_local_deliver+0x6f/0xf0 [ 219.187229] [] ? ip_rcv_finish+0x3b0/0x3b0 [ 219.187230] [] ip_rcv_finish+0xd8/0x3b0 [ 219.187232] [] ip_rcv+0x282/0x3a0 [ 219.187233] [] ? update_curr+0x66/0x180 [ 219.187235] [] __netif_receive_skb_core+0x524/0xa90 [ 219.187236] [] ? update_cfs_shares+0x30/0xf0 [ 219.187237] [] ? __enqueue_entity+0x6c/0x70 [ 219.187239] [] ? enqueue_entity+0x204/0xdf0 [ 219.187240] [] __netif_receive_skb+0x18/0x60 [ 219.187242] [] process_backlog+0x9e/0x140 [ 219.187243] [] net_rx_action+0x22c/0x370 [ 219.187245] [] __do_softirq+0x112/0x2e7 [ 219.187247] [] do_softirq_own_stack+0x1c/0x30 [ 219.187247] [ 219.187248] [] do_softirq.part.14+0x38/0x40 [ 219.187249] [] __local_bh_enable_ip+0x7d/0x80 [ 219.187254] [] sctp_inq_push+0x68/0x80 [sctp] [ 219.187258] [] sctp_backlog_rcv+0x151/0x1c0 [sctp] [ 219.187260] [] __release_sock+0x87/0xf0 [ 219.187261] [] release_sock+0x30/0xa0 [ 219.187265] [] sctp_accept+0x17d/0x210 [sctp] [ 219.187266] [] ? prepare_to_wait_event+0xf0/0xf0 [ 219.187268] [] inet_accept+0x3c/0x130 [ 219.187269] [] SYSC_accept4+0x103/0x210 [ 219.187271] [] ? _raw_spin_unlock_bh+0x1a/0x20 [ 219.187272] [] ? release_sock+0x8c/0xa0 [ 219.187276] [] ? sctp_inet_listen+0x62/0x1b0 [sctp] [ 219.187277] [] SyS_accept+0x10/0x20 Fixes: 860fbbc343bf ("sctp: prepare for socket backlog behavior change") Cc: Eric Dumazet Signed-off-by: Marcelo Ricardo Leitner Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 201e07fe162762ac07cf074aa8b0cfdfe665eda9 Author: Mike Manning Date: Fri Jul 22 18:32:11 2016 +0100 net: ipv6: Always leave anycast and multicast groups on link down [ Upstream commit ea06f7176413e2538d13bb85b65387d0917943d9 ] Default kernel behavior is to delete IPv6 addresses on link down, which entails deletion of the multicast and the subnet-router anycast addresses. These deletions do not happen with sysctl setting to keep global IPv6 addresses on link down, so every link down/up causes an increment of the anycast and multicast refcounts. These bogus refcounts may stop these addrs from being removed on subsequent calls to delete them. The solution is to leave the groups for the multicast and subnet anycast on link down for the callflow when global IPv6 addresses are kept. Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional") Signed-off-by: Mike Manning Acked-by: David Ahern Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 5986b7b7284dbb5a3ea117c36608b19504b862a5 Author: Ido Schimmel Date: Fri Jul 22 14:56:20 2016 +0300 bridge: Fix incorrect re-injection of LLDP packets [ Upstream commit baedbe55884c003819f5c8c063ec3d2569414296 ] Commit 8626c56c8279 ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict") caused LLDP packets arriving through a bridge port to be re-injected to the Rx path with skb->dev set to the bridge device, but this breaks the lldpad daemon. The lldpad daemon opens a packet socket with protocol set to ETH_P_LLDP for any valid device on the system, which doesn't not include soft devices such as bridge and VLAN. Since packet sockets (ptype_base) are processed in the Rx path after the Rx handler, LLDP packets with skb->dev set to the bridge device never reach the lldpad daemon. Fix this by making the bridge's Rx handler re-inject LLDP packets with RX_HANDLER_PASS, which effectively restores the behaviour prior to the mentioned commit. This means netfilter will never receive LLDP packets coming through a bridge port, as I don't see a way in which we can have okfn() consume the packet without breaking existing behaviour. I've already carried out a similar fix for STP packets in commit 56fae404fb2c ("bridge: Fix incorrect re-injection of STP packets"). Fixes: 8626c56c8279 ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict") Signed-off-by: Ido Schimmel Reviewed-by: Jiri Pirko Cc: Florian Westphal Cc: John Fastabend Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 746ac22c5abcb497611aa6435964a78b8928e70f Author: Mark Bloch Date: Thu Jul 21 11:52:55 2016 +0300 net/bonding: Enforce active-backup policy for IPoIB bonds [ Upstream commit 1533e77315220dc1d5ec3bd6d9fe32e2aa0a74c0 ] When using an IPoIB bond currently only active-backup mode is a valid use case and this commit strengthens it. Since commit 2ab82852a270 ("net/bonding: Enable bonding to enslave netdevices not supporting set_mac_address()") was introduced till 4.7-rc1, IPoIB didn't support the set_mac_address ndo, and hence the fail over mac policy always applied to IPoIB bonds. With the introduction of commit 492a7e67ff83 ("IB/IPoIB: Allow setting the device address"), that doesn't hold and practically IPoIB bonds are broken as of that. To fix it, lets go to fail over mac if the device doesn't support the ndo OR this is IPoIB device. As a by-product, this commit also prevents a stack corruption which occurred when trying to copy 20 bytes (IPoIB) device address to a sockaddr struct that has only 16 bytes of storage. Signed-off-by: Mark Bloch Signed-off-by: Or Gerlitz Signed-off-by: Saeed Mahameed Acked-by: Andy Gospodarek Signed-off-by: Jay Vosburgh Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ec3bdcb8073254fe4ef98aa55ff61c788e1ebaa3 Author: Daniel Borkmann Date: Mon Jul 25 18:06:12 2016 +0200 udp: use sk_filter_trim_cap for udp{,6}_queue_rcv_skb [ Upstream commit ba66bbe5480a012108958a71cff88b23dce84956 ] After a612769774a3 ("udp: prevent bugcheck if filter truncates packet too much"), there followed various other fixes for similar cases such as f4979fcea7fd ("rose: limit sk_filter trim to payload"). Latter introduced a new helper sk_filter_trim_cap(), where we can pass the trim limit directly to the socket filter handling. Make use of it here as well with sizeof(struct udphdr) as lower cap limit and drop the extra skb->len test in UDP's input path. Signed-off-by: Daniel Borkmann Cc: Willem de Bruijn Acked-by: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 99ffed7450336acce9045a2eb6a4ff66c6d19457 Author: Miklos Szeredi Date: Wed Aug 3 13:44:27 2016 +0200 vfs: fix deadlock in file_remove_privs() on overlayfs commit c1892c37769cf89c7e7ba57528ae2ccb5d153c9b upstream. file_remove_privs() is called with inode lock on file_inode(), which proceeds to calling notify_change() on file->f_path.dentry. Which triggers the WARN_ON_ONCE(!inode_is_locked(inode)) in addition to deadlocking later when ovl_setattr tries to lock the underlying inode again. Fix this mess by not mixing the layers, but doing everything on underlying dentry/inode. Signed-off-by: Miklos Szeredi Fixes: 07a2daab49c5 ("ovl: Copy up underlying inode's ->i_mode to overlay inode") Signed-off-by: Greg Kroah-Hartman commit 37fe52815ecb74f6aa1efd398ba0704f83c70550 Author: Scott Bauer Date: Wed Jul 27 19:11:29 2016 -0600 vfs: ioctl: prevent double-fetch in dedupe ioctl commit 10eec60ce79187686e052092e5383c99b4420a20 upstream. This prevents a double-fetch from user space that can lead to to an undersized allocation and heap overflow. Fixes: 54dbc1517237 ("vfs: hoist the btrfs deduplication ioctl to the vfs") Signed-off-by: Scott Bauer Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 3f66cf7a98699128acf210d6f671b3c60b9555f9 Author: Vegard Nossum Date: Fri Jul 15 00:22:07 2016 -0400 ext4: verify extent header depth commit 7bc9491645118c9461bd21099c31755ff6783593 upstream. Although the extent tree depth of 5 should enough be for the worst case of 2*32 extents of length 1, the extent tree code does not currently to merge nodes which are less than half-full with a sibling node, or to shrink the tree depth if possible. So it's possible, at least in theory, for the tree depth to be greater than 5. However, even in the worst case, a tree depth of 32 is highly unlikely, and if the file system is maliciously corrupted, an insanely large eh_depth can cause memory allocation failures that will trigger kernel warnings (here, eh_depth = 65280): JBD2: ext4.exe wants too many credits credits:195849 rsv_credits:0 max:256 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 50 at fs/jbd2/transaction.c:293 start_this_handle+0x569/0x580 CPU: 0 PID: 50 Comm: ext4.exe Not tainted 4.7.0-rc5+ #508 Stack: 604a8947 625badd8 0002fd09 00000000 60078643 00000000 62623910 601bf9bc 62623970 6002fc84 626239b0 900000125 Call Trace: [<6001c2dc>] show_stack+0xdc/0x1a0 [<601bf9bc>] dump_stack+0x2a/0x2e [<6002fc84>] __warn+0x114/0x140 [<6002fdff>] warn_slowpath_null+0x1f/0x30 [<60165829>] start_this_handle+0x569/0x580 [<60165d4e>] jbd2__journal_start+0x11e/0x220 [<60146690>] __ext4_journal_start_sb+0x60/0xa0 [<60120a81>] ext4_truncate+0x131/0x3a0 [<60123677>] ext4_setattr+0x757/0x840 [<600d5d0f>] notify_change+0x16f/0x2a0 [<600b2b16>] do_truncate+0x76/0xc0 [<600c3e56>] path_openat+0x806/0x1300 [<600c55c9>] do_filp_open+0x89/0xf0 [<600b4074>] do_sys_open+0x134/0x1e0 [<600b4140>] SyS_open+0x20/0x30 [<6001ea68>] handle_syscall+0x88/0x90 [<600295fd>] userspace+0x3fd/0x500 [<6001ac55>] fork_handler+0x85/0x90 ---[ end trace 08b0b88b6387a244 ]--- [ Commit message modified and the extent tree depath check changed from 5 to 32 -- tytso ] Cc: Darrick J. Wong Signed-off-by: Vegard Nossum Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman