commit 1a5b69df15c7c8df6d2eac76f21783554a37db19 Author: Willy Tarreau Date: Fri Jan 29 22:12:37 2016 +0100 Linux 2.6.32.70 Signed-off-by: Willy Tarreau commit 24c14705e2d60d414eaeee2a405b6793e693ed7e Author: Paolo Bonzini Date: Thu Jan 7 13:50:38 2016 +0100 kvm: x86: only channel 0 of the i8254 is linked to the HPET commit e5e57e7a03b1cdcb98e4aed135def2a08cbf3257 upstream. While setting the KVM PIT counters in 'kvm_pit_load_count', if 'hpet_legacy_start' is set, the function disables the timer on channel[0], instead of the respective index 'channel'. This is because channels 1-3 are not linked to the HPET. Fix the caller to only activate the special HPET processing for channel 0. Reported-by: P J P Fixes: 0185604c2d82c560dab2f2933a18f797e74ab5a8 Signed-off-by: Paolo Bonzini Signed-off-by: Ben Hutchings (cherry picked from commit ef90cf3d0b59e3b1dcfe94d1a241107667e6e96a) Signed-off-by: Willy Tarreau commit 4d1805f9990db66894c32bc12ec8d61778d53533 Author: Andrew Honig Date: Wed Nov 18 14:50:23 2015 -0800 KVM: x86: Reload pit counters for all channels when restoring state commit 0185604c2d82c560dab2f2933a18f797e74ab5a8 upstream. Currently if userspace restores the pit counters with a count of 0 on channels 1 or 2 and the guest attempts to read the count on those channels, then KVM will perform a mod of 0 and crash. This will ensure that 0 values are converted to 65536 as per the spec. This is CVE-2015-7513. Signed-off-by: Andy Honig Signed-off-by: Paolo Bonzini [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings (cherry picked from commit 08b8d1a6ccdefd3d517d04c472b7f42f51b3059b) Signed-off-by: Willy Tarreau commit c7bde2000d1dfadb741c10925cea49e8ab1dc039 Author: Andrew Banman Date: Tue Dec 29 14:54:25 2015 -0800 mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone() commit 5f0f2887f4de9508dcf438deab28f1de8070c271 upstream. test_pages_in_a_zone() does not account for the possibility of missing sections in the given pfn range. pfn_valid_within always returns 1 when CONFIG_HOLES_IN_ZONE is not set, allowing invalid pfns from missing sections to pass the test, leading to a kernel oops. Wrap an additional pfn loop with PAGES_PER_SECTION granularity to check for missing sections before proceeding into the zone-check code. This also prevents a crash from offlining memory devices with missing sections. Despite this, it may be a good idea to keep the related patch '[PATCH 3/3] drivers: memory: prohibit offlining of memory blocks with missing sections' because missing sections in a memory block may lead to other problems not covered by the scope of this fix. Signed-off-by: Andrew Banman Acked-by: Alex Thorlton Cc: Russ Anderson Cc: Alex Thorlton Cc: Yinghai Lu Cc: Greg KH Cc: Seth Jennings Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Ben Hutchings (cherry picked from commit 17f6a291c98199d7ce15a850ce5f548ceef628bc) Signed-off-by: Willy Tarreau commit 73734375c75ef304e2b4629bc48af9031be6f672 Author: Daniel Kiper Date: Tue May 24 17:12:51 2011 -0700 mm: add SECTION_ALIGN_UP() and SECTION_ALIGN_DOWN() macro commit a539f3533b78e39a22723d6d3e1e11b6c14454d9 upstream. Add SECTION_ALIGN_UP() and SECTION_ALIGN_DOWN() macro which aligns given pfn to upper section and lower section boundary accordingly. Required for the latest memory hotplug support for the Xen balloon driver. Signed-off-by: Daniel Kiper Reviewed-by: Konrad Rzeszutek Wilk David Rientjes Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds [wt: only needed for next patch] Signed-off-by: Willy Tarreau commit 5a7fbabb106b0ba1b30921537f2e5928e153bdf3 Author: Andrey Ryabinin Date: Mon Dec 21 12:54:45 2015 +0300 ipv6/addrlabel: fix ip6addrlbl_get() commit e459dfeeb64008b2d23bdf600f03b3605dbb8152 upstream. ip6addrlbl_get() has never worked. If ip6addrlbl_hold() succeeded, ip6addrlbl_get() will exit with '-ESRCH'. If ip6addrlbl_hold() failed, ip6addrlbl_get() will use about to be free ip6addrlbl_entry pointer. Fix this by inverting ip6addrlbl_hold() check. Fixes: 2a8cc6c89039 ("[IPV6] ADDRCONF: Support RFC3484 configurable address selection policy table.") Signed-off-by: Andrey Ryabinin Reviewed-by: Cong Wang Acked-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings (cherry picked from commit 39b214ba1a357359f9c0be6ef8d21f2e5187567a) Signed-off-by: Willy Tarreau commit 3bf5fe19a9537b0976a4d7452b8b8702c2fb07e8 Author: Helge Deller Date: Mon Dec 21 10:03:30 2015 +0100 parisc: Fix syscall restarts commit 71a71fb5374a23be36a91981b5614590b9e722c3 upstream. On parisc syscalls which are interrupted by signals sometimes failed to restart and instead returned -ENOSYS which in the worst case lead to userspace crashes. A similiar problem existed on MIPS and was fixed by commit e967ef02 ("MIPS: Fix restart of indirect syscalls"). On parisc the current syscall restart code assumes that all syscall callers load the syscall number in the delay slot of the ble instruction. That's how it is e.g. done in the unistd.h header file: ble 0x100(%sr2, %r0) ldi #syscall_nr, %r20 Because of that assumption the current code never restored %r20 before returning to userspace. This assumption is at least not true for code which uses the glibc syscall() function, which instead uses this syntax: ble 0x100(%sr2, %r0) copy regX, %r20 where regX depend on how the compiler optimizes the code and register usage. This patch fixes this problem by adding code to analyze how the syscall number is loaded in the delay branch and - if needed - copy the syscall number to regX prior returning to userspace for the syscall restart. Signed-off-by: Helge Deller Cc: Mathieu Desnoyers [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings (cherry picked from commit 9f2dcffefc599c424ad1dd402e3a96da60639308) Signed-off-by: Willy Tarreau commit 912fcae9d4986486aa4d866d7c119a7a8ad37053 Author: Ed Swierk Date: Mon Jan 12 21:10:30 2015 -0800 MIPS: Fix restart of indirect syscalls commit e967ef022e00bb7c2e5b1a42007abfdd52055050 upstream. When 32-bit MIPS userspace invokes a syscall indirectly via syscall(number, arg1, ..., arg7), the kernel looks up the actual syscall based on the given number, shifts the other arguments to the left, and jumps to the syscall. If the syscall is interrupted by a signal and indicates it needs to be restarted by the kernel (by returning ERESTARTNOINTR for example), the syscall must be called directly, since the number is no longer the first argument, and the other arguments are now staged for a direct call. Before shifting the arguments, store the syscall number in pt_regs->regs[2]. This gets copied temporarily into pt_regs->regs[0] after the syscall returns. If the syscall needs to be restarted, handle_signal()/do_signal() copies the number back to pt_regs->reg[2], which ends up in $v0 once control returns to userspace. Signed-off-by: Ed Swierk Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/8929/ Signed-off-by: Ralf Baechle [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings (cherry picked from commit 08f865bba9c705aef95268a33393698e5385587e) Signed-off-by: Willy Tarreau commit 533030ca4271aea2f1fc95a3c64840654feebafe Author: Alan Stern Date: Wed Dec 16 13:32:38 2015 -0500 USB: fix invalid memory access in hub_activate() commit e50293ef9775c5f1cf3fcc093037dd6a8c5684ea upstream. Commit 8520f38099cc ("USB: change hub initialization sleeps to delayed_work") changed the hub_activate() routine to make part of it run in a workqueue. However, the commit failed to take a reference to the usb_hub structure or to lock the hub interface while doing so. As a result, if a hub is plugged in and quickly unplugged before the work routine can run, the routine will try to access memory that has been deallocated. Or, if the hub is unplugged while the routine is running, the memory may be deallocated while it is in active use. This patch fixes the problem by taking a reference to the usb_hub at the start of hub_activate() and releasing it at the end (when the work is finished), and by locking the hub interface while the work routine is running. It also adds a check at the start of the routine to see if the hub has already been disconnected, in which nothing should be done. Signed-off-by: Alan Stern Reported-by: Alexandru Cornea Tested-by: Alexandru Cornea Fixes: 8520f38099cc ("USB: change hub initialization sleeps to delayed_work") Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 3.2: add prototype for hub_release() before first use] Signed-off-by: Ben Hutchings (cherry picked from commit 10037421b529bc1fc18994e94e37d745184c4ea9) [wt: made a few changes : - adjusted context due to some autopm code being added only in 2.6.33 - no device_{lock,unlock}() in 2.6.32, use up/down(&->sem) instead] Signed-off-by: Willy Tarreau commit 00972bcd1fec22d93d6aa68e010b93d53a48b0fe Author: Dan Carpenter Date: Wed Dec 16 14:06:37 2015 +0300 USB: ipaq.c: fix a timeout loop commit abdc9a3b4bac97add99e1d77dc6d28623afe682b upstream. The code expects the loop to end with "retries" set to zero but, because it is a post-op, it will end set to -1. I have fixed this by moving the decrement inside the loop. Fixes: 014aa2a3c32e ('USB: ipaq: minor ipaq_open() cleanup.') Signed-off-by: Dan Carpenter Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings (cherry picked from commit 53a68d3f1629de82ddeb4e0882b0727fc230a6f3) Signed-off-by: Willy Tarreau commit 22a1caa0c96e94e710fcf2a9a01fdd3f091a7171 Author: Michael Holzheu Date: Thu Dec 17 19:06:02 2015 +0100 s390/dis: Fix handling of format specifiers commit 272fa59ccb4fc802af28b1d699c2463db6a71bf7 upstream. The print_insn() function returns strings like "lghi %r1,0". To escape the '%' character in sprintf() a second '%' is used. For example "lghi %%r1,0" is converted into "lghi %r1,0". After print_insn() the output string is passed to printk(). Because format specifiers like "%r" or "%f" are ignored by printk() this works by chance most of the time. But for instructions with control registers like "lctl %c6,%c6,780" this fails because printk() interprets "%c" as character format specifier. Fix this problem and escape the '%' characters twice. For example "lctl %%%%c6,%%%%c6,780" is then converted by sprintf() into "lctl %%c6,%%c6,780" and by printk() into "lctl %c6,%c6,780". Signed-off-by: Michael Holzheu Signed-off-by: Martin Schwidefsky [bwh: Backported to 3.2: drop the OPERAND_VR case] Signed-off-by: Ben Hutchings (cherry picked from commit 45f32e359d36dd045e4af0d1bea8bbbafd81eeb7) Signed-off-by: Willy Tarreau commit 2bafc5d56fdfebb9e74a9c6f2e50033498b7b424 Author: Johan Hovold Date: Mon Dec 14 16:16:19 2015 +0100 spi: fix parent-device reference leak commit 157f38f993919b648187ba341bfb05d0e91ad2f6 upstream. Fix parent-device reference leak due to SPI-core taking an unnecessary reference to the parent when allocating the master structure, a reference that was never released. Note that driver core takes its own reference to the parent when the master device is registered. Fixes: 49dce689ad4e ("spi doesn't need class_device") Signed-off-by: Johan Hovold Signed-off-by: Mark Brown Signed-off-by: Ben Hutchings (cherry picked from commit 40ddf30ca43b50019304303949c9b2676cfbd820) Signed-off-by: Willy Tarreau commit 0e0eaf7e451cc7543d1afd3bc379a7a103782f52 Author: Tilman Schmidt Date: Tue Dec 15 18:11:30 2015 +0100 ser_gigaset: fix deallocation of platform device structure commit 4c5e354a974214dfb44cd23fa0429327693bc3ea upstream. When shutting down the device, the struct ser_cardstate must not be kfree()d immediately after the call to platform_device_unregister() since the embedded struct platform_device is still in use. Move the kfree() call to the release method instead. Signed-off-by: Tilman Schmidt Fixes: 2869b23e4b95 ("drivers/isdn/gigaset: new M101 driver (v2)") Reported-by: Sasha Levin Signed-off-by: Paul Bolle Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings (cherry picked from commit 521e4101aaf46b2f37088453a1e935604cdd7f2a) Signed-off-by: Willy Tarreau commit 1fe6e68743f848b6bf49782508474b45544f8f40 Author: Dan Carpenter Date: Tue Dec 15 13:07:52 2015 +0300 mISDN: fix a loop count commit 40d24c4d8a7430aa4dfd7a665fa3faf3b05b673f upstream. There are two issue here. 1) cnt starts as maxloop + 1 so all these loops iterate one more time than intended. 2) At the end of the loop we test for "if (maxloop && !cnt)" but for the first two loops, we end with cnt equal to -1. Changing this to a pre-op means we end with cnt set to 0. Fixes: cae86d4a4e56 ('mISDN: Add driver for Infineon ISDN chipset family') Signed-off-by: Dan Carpenter Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings (cherry picked from commit 7eb2a0151e7c8a95d1be33a923718fb690452c61) Signed-off-by: Willy Tarreau commit 0dbf4344f37cd793e40f8e91ca19875cf849e7b0 Author: Peter Hurley Date: Fri Nov 27 14:25:08 2015 -0500 tty: Fix GPF in flush_to_ldisc() commit 9ce119f318ba1a07c29149301f1544b6c4bea52a upstream. A line discipline which does not define a receive_buf() method can can cause a GPF if data is ever received [1]. Oddly, this was known to the author of n_tracesink in 2011, but never fixed. [1] GPF report BUG: unable to handle kernel NULL pointer dereference at (null) IP: [< (null)>] (null) PGD 3752d067 PUD 37a7b067 PMD 0 Oops: 0010 [#1] SMP KASAN Modules linked in: CPU: 2 PID: 148 Comm: kworker/u10:2 Not tainted 4.4.0-rc2+ #51 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Workqueue: events_unbound flush_to_ldisc task: ffff88006da94440 ti: ffff88006db60000 task.ti: ffff88006db60000 RIP: 0010:[<0000000000000000>] [< (null)>] (null) RSP: 0018:ffff88006db67b50 EFLAGS: 00010246 RAX: 0000000000000102 RBX: ffff88003ab32f88 RCX: 0000000000000102 RDX: 0000000000000000 RSI: ffff88003ab330a6 RDI: ffff88003aabd388 RBP: ffff88006db67c48 R08: ffff88003ab32f9c R09: ffff88003ab31fb0 R10: ffff88003ab32fa8 R11: 0000000000000000 R12: dffffc0000000000 R13: ffff88006db67c20 R14: ffffffff863df820 R15: ffff88003ab31fb8 FS: 0000000000000000(0000) GS:ffff88006dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 0000000037938000 CR4: 00000000000006e0 Stack: ffffffff829f46f1 ffff88006da94bf8 ffff88006da94bf8 0000000000000000 ffff88003ab31fb0 ffff88003aabd438 ffff88003ab31ff8 ffff88006430fd90 ffff88003ab32f9c ffffed0007557a87 1ffff1000db6cf78 ffff88003ab32078 Call Trace: [] process_one_work+0x8f1/0x17a0 kernel/workqueue.c:2030 [] worker_thread+0xd4/0x1180 kernel/workqueue.c:2162 [] kthread+0x1cf/0x270 drivers/block/aoe/aoecmd.c:1302 [] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468 Code: Bad RIP value. RIP [< (null)>] (null) RSP CR2: 0000000000000000 ---[ end trace a587f8947e54d6ea ]--- Reported-by: Dmitry Vyukov Signed-off-by: Peter Hurley Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings (cherry picked from commit b23324ffa8ef8cc96865db76db938905d61d949a) [wt: applied to drivers/char/tty_buffer.c instead] Signed-off-by: Willy Tarreau commit d4b6a10d1fc87c0078f6d9b107d91029bff6cc8a Author: James Bottomley Date: Fri Dec 11 09:16:38 2015 -0800 ses: fix additional element traversal bug commit 5e1033561da1152c57b97ee84371dba2b3d64c25 upstream. KASAN found that our additional element processing scripts drop off the end of the VPD page into unallocated space. The reason is that not every element has additional information but our traversal routines think they do, leading to them expecting far more additional information than is present. Fix this by adding a gate to the traversal routine so that it only processes elements that are expected to have additional information (list is in SES-2 section 6.1.13.1: Additional Element Status diagnostic page overview) Reported-by: Pavel Tikhomirov Tested-by: Pavel Tikhomirov Signed-off-by: James Bottomley Signed-off-by: Ben Hutchings (cherry picked from commit 344d6d02690a650607e6372bce8fe40e647a51b5) Signed-off-by: Willy Tarreau commit f49fbe9ebc13e57ba5b2d91a1c1ffe0b4edf0cb1 Author: James Bottomley Date: Tue Dec 8 09:00:31 2015 -0800 ses: Fix problems with simple enclosures commit 3417c1b5cb1fdc10261dbed42b05cc93166a78fd upstream. Simple enclosure implementations (mostly USB) are allowed to return only page 8 to every diagnostic query. That really confuses our implementation because we assume the return is the page we asked for and end up doing incorrect offsets based on bogus information leading to accesses outside of allocated ranges. Fix that by checking the page code of the return and giving an error if it isn't the one we asked for. This should fix reported bugs with USB storage by simply refusing to attach to enclosures that behave like this. It's also good defensive practise now that we're starting to see more USB enclosures. Reported-by: Andrea Gelmini Reviewed-by: Ewan D. Milne Reviewed-by: Tomas Henzl Signed-off-by: James Bottomley Signed-off-by: Ben Hutchings (cherry picked from commit 25ef938516d26c3db9cb1f5b927ad7ee95be1022) Signed-off-by: Willy Tarreau commit 8a90c7575b8103e48f5d9daa1557d319148fb445 Author: Johannes Berg Date: Thu Dec 10 10:37:51 2015 +0100 rfkill: copy the name into the rfkill struct commit b7bb110008607a915298bf0f47d25886ecb94477 upstream. Some users of rfkill, like NFC and cfg80211, use a dynamic name when allocating rfkill, in those cases dev_name(). Therefore, the pointer passed to rfkill_alloc() might not be valid forever, I specifically found the case that the rfkill name was quite obviously an invalid pointer (or at least garbage) when the wiphy had been renamed. Fix this by making a copy of the rfkill name in rfkill_alloc(). Signed-off-by: Johannes Berg Signed-off-by: Ben Hutchings (cherry picked from commit 6f23bc6f6be370267332a0278a4646126836baee) Signed-off-by: Willy Tarreau commit 24ba3c534f7f6e17fb3fbdcee9427f05a0fd074e Author: Eric Dumazet Date: Wed May 1 05:24:03 2013 +0000 af_unix: fix a fatal race with bit fields commit 60bc851ae59bfe99be6ee89d6bc50008c85ec75d upstream. Using bit fields is dangerous on ppc64/sparc64, as the compiler [1] uses 64bit instructions to manipulate them. If the 64bit word includes any atomic_t or spinlock_t, we can lose critical concurrent changes. This is happening in af_unix, where unix_sk(sk)->gc_candidate/ gc_maybe_cycle/lock share the same 64bit word. This leads to fatal deadlock, as one/several cpus spin forever on a spinlock that will never be available again. A safer way would be to use a long to store flags. This way we are sure compiler/arch wont do bad things. As we own unix_gc_lock spinlock when clearing or setting bits, we can use the non atomic __set_bit()/__clear_bit(). recursion_level can share the same 64bit location with the spinlock, as it is set only with this spinlock held. [1] bug fixed in gcc-4.8.0 : http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52080 Reported-by: Ambrose Feinstein Signed-off-by: Eric Dumazet Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings (cherry picked from commit 2ee9cbe7e7bfe2d36374288b818aa31b2c4981db) [wt: adjusted context] Signed-off-by: Willy Tarreau commit 7e0e67b0bbd024c074cd9ac61c2a8f8ebb9be619 Author: Marcelo Ricardo Leitner Date: Fri Dec 4 15:14:04 2015 -0200 sctp: update the netstamp_needed counter when copying sockets [ Upstream commit 01ce63c90170283a9855d1db4fe81934dddce648 ] Dmitry Vyukov reported that SCTP was triggering a WARN on socket destroy related to disabling sock timestamp. When SCTP accepts an association or peel one off, it copies sock flags but forgot to call net_enable_timestamp() if a packet timestamping flag was copied, leading to extra calls to net_disable_timestamp() whenever such clones were closed. The fix is to call net_enable_timestamp() whenever we copy a sock with that flag on, like tcp does. Reported-by: Dmitry Vyukov Signed-off-by: Marcelo Ricardo Leitner Acked-by: Vlad Yasevich Signed-off-by: David S. Miller [bwh: Backported to 3.2: SK_FLAGS_TIMESTAMP is newly defined] Signed-off-by: Ben Hutchings (cherry picked from commit d85242d91610acbe4f905624a5758a01ae7bb32c) Signed-off-by: Willy Tarreau commit 224994fa6d46e7aba39b8bd1d4121dd949642848 Author: Daniel Borkmann Date: Fri Nov 20 00:11:56 2015 +0100 net, scm: fix PaX detected msg_controllen overflow in scm_detach_fds [ Upstream commit 6900317f5eff0a7070c5936e5383f589e0de7a09 ] David and HacKurx reported a following/similar size overflow triggered in a grsecurity kernel, thanks to PaX's gcc size overflow plugin: (Already fixed in later grsecurity versions by Brad and PaX Team.) [ 1002.296137] PAX: size overflow detected in function scm_detach_fds net/core/scm.c:314 cicus.202_127 min, count: 4, decl: msg_controllen; num: 0; context: msghdr; [ 1002.296145] CPU: 0 PID: 3685 Comm: scm_rights_recv Not tainted 4.2.3-grsec+ #7 [ 1002.296149] Hardware name: Apple Inc. MacBookAir5,1/Mac-66F35F19FE2A0D05, [...] [ 1002.296153] ffffffff81c27366 0000000000000000 ffffffff81c27375 ffffc90007843aa8 [ 1002.296162] ffffffff818129ba 0000000000000000 ffffffff81c27366 ffffc90007843ad8 [ 1002.296169] ffffffff8121f838 fffffffffffffffc fffffffffffffffc ffffc90007843e60 [ 1002.296176] Call Trace: [ 1002.296190] [] dump_stack+0x45/0x57 [ 1002.296200] [] report_size_overflow+0x38/0x60 [ 1002.296209] [] scm_detach_fds+0x2ce/0x300 [ 1002.296220] [] unix_stream_read_generic+0x609/0x930 [ 1002.296228] [] unix_stream_recvmsg+0x4f/0x60 [ 1002.296236] [] ? unix_set_peek_off+0x50/0x50 [ 1002.296243] [] sock_recvmsg+0x47/0x60 [ 1002.296248] [] ___sys_recvmsg+0xe2/0x1e0 [ 1002.296257] [] __sys_recvmsg+0x46/0x80 [ 1002.296263] [] SyS_recvmsg+0x2c/0x40 [ 1002.296271] [] entry_SYSCALL_64_fastpath+0x12/0x85 Further investigation showed that this can happen when an *odd* number of fds are being passed over AF_UNIX sockets. In these cases CMSG_LEN(i * sizeof(int)) and CMSG_SPACE(i * sizeof(int)), where i is the number of successfully passed fds, differ by 4 bytes due to the extra CMSG_ALIGN() padding in CMSG_SPACE() to an 8 byte boundary on 64 bit. The padding is used to align subsequent cmsg headers in the control buffer. When the control buffer passed in from the receiver side *lacks* these 4 bytes (e.g. due to buggy/wrong API usage), then msg->msg_controllen will overflow in scm_detach_fds(): int cmlen = CMSG_LEN(i * sizeof(int)); <--- cmlen w/o tail-padding err = put_user(SOL_SOCKET, &cm->cmsg_level); if (!err) err = put_user(SCM_RIGHTS, &cm->cmsg_type); if (!err) err = put_user(cmlen, &cm->cmsg_len); if (!err) { cmlen = CMSG_SPACE(i * sizeof(int)); <--- cmlen w/ 4 byte extra tail-padding msg->msg_control += cmlen; msg->msg_controllen -= cmlen; <--- iff no tail-padding space here ... } ... wrap-around F.e. it will wrap to a length of 18446744073709551612 bytes in case the receiver passed in msg->msg_controllen of 20 bytes, and the sender properly transferred 1 fd to the receiver, so that its CMSG_LEN results in 20 bytes and CMSG_SPACE in 24 bytes. In case of MSG_CMSG_COMPAT (scm_detach_fds_compat()), I haven't seen an issue in my tests as alignment seems always on 4 byte boundary. Same should be in case of native 32 bit, where we end up with 4 byte boundaries as well. In practice, passing msg->msg_controllen of 20 to recvmsg() while receiving a single fd would mean that on successful return, msg->msg_controllen is being set by the kernel to 24 bytes instead, thus more than the input buffer advertised. It could f.e. become an issue if such application later on zeroes or copies the control buffer based on the returned msg->msg_controllen elsewhere. Maximum number of fds we can send is a hard upper limit SCM_MAX_FD (253). Going over the code, it seems like msg->msg_controllen is not being read after scm_detach_fds() in scm_recv() anymore by the kernel, good! Relevant recvmsg() handler are unix_dgram_recvmsg() (unix_seqpacket_recvmsg()) and unix_stream_recvmsg(). Both return back to their recvmsg() caller, and ___sys_recvmsg() places the updated length, that is, new msg_control - old msg_control pointer into msg->msg_controllen (hence the 24 bytes seen in the example). Long time ago, Wei Yongjun fixed something related in commit 1ac70e7ad24a ("[NET]: Fix function put_cmsg() which may cause usr application memory overflow"). RFC3542, section 20.2. says: The fields shown as "XX" are possible padding, between the cmsghdr structure and the data, and between the data and the next cmsghdr structure, if required by the implementation. While sending an application may or may not include padding at the end of last ancillary data in msg_controllen and implementations must accept both as valid. On receiving a portable application must provide space for padding at the end of the last ancillary data as implementations may copy out the padding at the end of the control message buffer and include it in the received msg_controllen. When recvmsg() is called if msg_controllen is too small for all the ancillary data items including any trailing padding after the last item an implementation may set MSG_CTRUNC. Since we didn't place MSG_CTRUNC for already quite a long time, just do the same as in 1ac70e7ad24a to avoid an overflow. Btw, even man-page author got this wrong :/ See db939c9b26e9 ("cmsg.3: Fix error in SCM_RIGHTS code sample"). Some people must have copied this (?), thus it got triggered in the wild (reported several times during boot by David and HacKurx). No Fixes tag this time as pre 2002 (that is, pre history tree). Reported-by: David Sterba Reported-by: HacKurx Cc: PaX Team Cc: Emese Revfy Cc: Brad Spengler Cc: Wei Yongjun Cc: Eric Dumazet Reviewed-by: Hannes Frederic Sowa Signed-off-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings (cherry picked from commit 831a2a17da39d93cf68981ff99cce5d31551044b) Signed-off-by: Willy Tarreau commit 03bc31b9e864a317ef68935c18a41418f2031732 Author: Eric Dumazet Date: Thu Nov 26 08:18:14 2015 -0800 tcp: initialize tp->copied_seq in case of cross SYN connection [ Upstream commit 142a2e7ece8d8ac0e818eb2c91f99ca894730e2a ] Dmitry provided a syzkaller (http://github.com/google/syzkaller) generated program that triggers the WARNING at net/ipv4/tcp.c:1729 in tcp_recvmsg() : WARN_ON(tp->copied_seq != tp->rcv_nxt && !(flags & (MSG_PEEK | MSG_TRUNC))); His program is specifically attempting a Cross SYN TCP exchange, that we support (for the pleasure of hackers ?), but it looks we lack proper tcp->copied_seq initialization. Thanks again Dmitry for your report and testings. Signed-off-by: Eric Dumazet Reported-by: Dmitry Vyukov Tested-by: Dmitry Vyukov Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings (cherry picked from commit 6cfa9781d3bf950eed455369966bbdf9d05871c5) Signed-off-by: Willy Tarreau commit ac5da7cfac88c0626165ec822f1bdde0d2752a96 Author: Jan Stancek Date: Tue Dec 8 13:57:51 2015 -0500 ipmi: move timer init to before irq is setup commit 27f972d3e00b50639deb4cc1392afaeb08d3cecc upstream. We encountered a panic on boot in ipmi_si on a dell per320 due to an uninitialized timer as follows. static int smi_start_processing(void *send_info, ipmi_smi_t intf) { /* Try to claim any interrupts. */ if (new_smi->irq_setup) new_smi->irq_setup(new_smi); --> IRQ arrives here and irq handler tries to modify uninitialized timer which triggers BUG_ON(!timer->function) in __mod_timer(). Call Trace: [] start_new_msg+0x47/0x80 [ipmi_si] [] start_check_enables+0x4e/0x60 [ipmi_si] [] smi_event_handler+0x1e8/0x640 [ipmi_si] [] ? __rcu_process_callbacks+0x54/0x350 [] si_irq_handler+0x3c/0x60 [ipmi_si] [] handle_IRQ_event+0x60/0x170 [] handle_edge_irq+0xde/0x180 [] handle_irq+0x49/0xa0 [] do_IRQ+0x6c/0xf0 [] ret_from_intr+0x0/0x11 /* Set up the timer that drives the interface. */ setup_timer(&new_smi->si_timer, smi_timeout, (long)new_smi); The following patch fixes the problem. To: Openipmi-developer@lists.sourceforge.net To: Corey Minyard CC: linux-kernel@vger.kernel.org Signed-off-by: Jan Stancek Signed-off-by: Tony Camuso Signed-off-by: Corey Minyard [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings (cherry picked from commit 207ffa8cbcf6e780d862316a94c95a7e70d1e72e) Signed-off-by: Willy Tarreau commit 788c44ff3dc52cc20cec88200bda758acb57cc66 Author: Sasha Levin Date: Mon Nov 30 20:34:20 2015 -0500 sched/core: Remove false-positive warning from wake_up_process() commit 119d6f6a3be8b424b200dcee56e74484d5445f7e upstream. Because wakeups can (fundamentally) be late, a task might not be in the expected state. Therefore testing against a task's state is racy, and can yield false positives. Signed-off-by: Sasha Levin Signed-off-by: Peter Zijlstra (Intel) Acked-by: Linus Torvalds Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: oleg@redhat.com Fixes: 9067ac85d533 ("wake_up_process() should be never used to wakeup a TASK_STOPPED/TRACED task") Link: http://lkml.kernel.org/r/1448933660-23082-1-git-send-email-sasha.levin@oracle.com Signed-off-by: Ingo Molnar [bwh: Backported to 3.2: adjust filename] Signed-off-by: Ben Hutchings (cherry picked from commit 0e796c1b57fdd97fc040b0b78ff7fea6c0a4a39d) Signed-off-by: Willy Tarreau commit 5f4876761cb8a57db5cb8dc856961b6a24944c67 Author: Andrew Lunn Date: Tue Dec 1 16:31:08 2015 +0100 ipv4: igmp: Allow removing groups from a removed interface commit 4eba7bb1d72d9bde67d810d09bf62dc207b63c5c upstream. When a multicast group is joined on a socket, a struct ip_mc_socklist is appended to the sockets mc_list containing information about the joined group. If the interface is hot unplugged, this entry becomes stale. Prior to commit 52ad353a5344f ("igmp: fix the problem when mc leave group") it was possible to remove the stale entry by performing a IP_DROP_MEMBERSHIP, passing either the old ifindex or ip address on the interface. However, this fix enforces that the interface must still exist. Thus with time, the number of stale entries grows, until sysctl_igmp_max_memberships is reached and then it is not possible to join and more groups. The previous patch fixes an issue where a IP_DROP_MEMBERSHIP is performed without specifying the interface, either by ifindex or ip address. However here we do supply one of these. So loosen the restriction on device existence to only apply when the interface has not been specified. This then restores the ability to clean up the stale entries. Signed-off-by: Andrew Lunn Fixes: 52ad353a5344f "(igmp: fix the problem when mc leave group") Signed-off-by: David S. Miller [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings (cherry picked from commit b1fa852672fb1ea4b0e34bfa1136bb65dba66151) Signed-off-by: Willy Tarreau commit abffc10c391bd6e46f9fe4ddef1c557307a8dece Author: Peter Hurley Date: Fri Nov 27 14:18:39 2015 -0500 wan/x25: Fix use-after-free in x25_asy_open_tty() commit ee9159ddce14bc1dec9435ae4e3bd3153e783706 upstream. The N_X25 line discipline may access the previous line discipline's closed and already-freed private data on open [1]. The tty->disc_data field _never_ refers to valid data on entry to the line discipline's open() method. Rather, the ldisc is expected to initialize that field for its own use for the lifetime of the instance (ie. from open() to close() only). [1] [ 634.336761] ================================================================== [ 634.338226] BUG: KASAN: use-after-free in x25_asy_open_tty+0x13d/0x490 at addr ffff8800a743efd0 [ 634.339558] Read of size 4 by task syzkaller_execu/8981 [ 634.340359] ============================================================================= [ 634.341598] BUG kmalloc-512 (Not tainted): kasan: bad access detected ... [ 634.405018] Call Trace: [ 634.405277] dump_stack (lib/dump_stack.c:52) [ 634.405775] print_trailer (mm/slub.c:655) [ 634.406361] object_err (mm/slub.c:662) [ 634.406824] kasan_report_error (mm/kasan/report.c:138 mm/kasan/report.c:236) [ 634.409581] __asan_report_load4_noabort (mm/kasan/report.c:279) [ 634.411355] x25_asy_open_tty (drivers/net/wan/x25_asy.c:559 (discriminator 1)) [ 634.413997] tty_ldisc_open.isra.2 (drivers/tty/tty_ldisc.c:447) [ 634.414549] tty_set_ldisc (drivers/tty/tty_ldisc.c:567) [ 634.415057] tty_ioctl (drivers/tty/tty_io.c:2646 drivers/tty/tty_io.c:2879) [ 634.423524] do_vfs_ioctl (fs/ioctl.c:43 fs/ioctl.c:607) [ 634.427491] SyS_ioctl (fs/ioctl.c:622 fs/ioctl.c:613) [ 634.427945] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:188) Reported-and-tested-by: Sasha Levin Signed-off-by: Peter Hurley Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings (cherry picked from commit 485274cf9a69504cf4a7ef182c508f3541eee84e) Signed-off-by: Willy Tarreau commit 86ba50116c357109d57300ff08bf4ee66463cae3 Author: Jeff Layton Date: Wed Nov 25 13:50:11 2015 -0500 nfs: if we have no valid attrs, then don't declare the attribute cache valid commit c812012f9ca7cf89c9e1a1cd512e6c3b5be04b85 upstream. If we pass in an empty nfs_fattr struct to nfs_update_inode, it will (correctly) not update any of the attributes, but it then clears the NFS_INO_INVALID_ATTR flag, which indicates that the attributes are up to date. Don't clear the flag if the fattr struct has no valid attrs to apply. Reviewed-by: Steve French Signed-off-by: Jeff Layton Signed-off-by: Trond Myklebust Signed-off-by: Ben Hutchings (cherry picked from commit ddab0155aa9589600861e3f65d505982b719496a) Signed-off-by: Willy Tarreau commit 74147da1b621c713937081929a7ffc175e9c4691 Author: David Turner Date: Tue Nov 24 14:34:37 2015 -0500 ext4: Fix handling of extended tv_sec commit a4dad1ae24f850410c4e60f22823cba1289b8d52 upstream. In ext4, the bottom two bits of {a,c,m}time_extra are used to extend the {a,c,m}time fields, deferring the year 2038 problem to the year 2446. When decoding these extended fields, for times whose bottom 32 bits would represent a negative number, sign extension causes the 64-bit extended timestamp to be negative as well, which is not what's intended. This patch corrects that issue, so that the only negative {a,c,m}times are those between 1901 and 1970 (as per 32-bit signed timestamps). Some older kernels might have written pre-1970 dates with 1,1 in the extra bits. This patch treats those incorrectly-encoded dates as pre-1970, instead of post-2311, until kernel 4.20 is released. Hopefully by then e2fsck will have fixed up the bad data. Also add a comment explaining the encoding of ext4's extra {a,c,m}time bits. Signed-off-by: David Turner Signed-off-by: Theodore Ts'o Reported-by: Mark Harris Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=23732 Signed-off-by: Ben Hutchings (cherry picked from commit 6dfd5f6abf1825fc351f663bf630603f9b78251b) Signed-off-by: Willy Tarreau commit dce8179bf6a93d91e5c1edcde71a91cbb91c9f0b Author: Jan Kara Date: Mon Nov 23 13:09:51 2015 +0100 vfs: Avoid softlockups with sendfile(2) commit c2489e07c0a71a56fb2c84bc0ee66cddfca7d068 upstream. The following test program from Dmitry can cause softlockups or RCU stalls as it copies 1GB from tmpfs into eventfd and we don't have any scheduling point at that path in sendfile(2) implementation: int r1 = eventfd(0, 0); int r2 = memfd_create("", 0); unsigned long n = 1<<30; fallocate(r2, 0, 0, n); sendfile(r1, r2, 0, n); Add cond_resched() into __splice_from_pipe() to fix the problem. CC: Dmitry Vyukov Signed-off-by: Jan Kara Signed-off-by: Al Viro Signed-off-by: Ben Hutchings (cherry picked from commit 47ae562efbd0fd8440959304915291865b945c67) Signed-off-by: Willy Tarreau commit 54146dba78db0ee7ff31cb8311c78aebfd21c2ba Author: Al Viro Date: Mon Nov 23 21:11:08 2015 -0500 fix sysvfs symlinks commit 0ebf7f10d67a70e120f365018f1c5fce9ddc567d upstream. The thing got broken back in 2002 - sysvfs does *not* have inline symlinks; even short ones have bodies stored in the first block of file. sysv_symlink() handles that correctly; unfortunately, attempting to look an existing symlink up will end up confusing them for inline symlinks, and interpret the block number containing the body as the body itself. Nobody has noticed until now, which says something about the level of testing sysvfs gets ;-/ Signed-off-by: Al Viro [bwh: Backported to 3.2: - Adjust context - Also delete unused sysv_fast_symlink_inode_operations] Signed-off-by: Ben Hutchings (cherry picked from commit 081c769778f7a41f4eb3d633df26ad572575c980) Signed-off-by: Willy Tarreau commit de743b3d3670be21ab126559ae6734c02be99756 Author: Roman Gushchin Date: Mon Oct 12 16:33:44 2015 +0300 fuse: break infinite loop in fuse_fill_write_pages() commit 3ca8138f014a913f98e6ef40e939868e1e9ea876 upstream. I got a report about unkillable task eating CPU. Further investigation shows, that the problem is in the fuse_fill_write_pages() function. If iov's first segment has zero length, we get an infinite loop, because we never reach iov_iter_advance() call. Fix this by calling iov_iter_advance() before repeating an attempt to copy data from userspace. A similar problem is described in 124d3b7041f ("fix writev regression: pan hanging unkillable and un-straceable"). If zero-length segmend is followed by segment with invalid address, iov_iter_fault_in_readable() checks only first segment (zero-length), iov_iter_copy_from_user_atomic() skips it, fails at second and returns zero -> goto again without skipping zero-length segment. Patch calls iov_iter_advance() before goto again: we'll skip zero-length segment at second iteraction and iov_iter_fault_in_readable() will detect invalid address. Special thanks to Konstantin Khlebnikov, who helped a lot with the commit description. Cc: Andrew Morton Cc: Maxim Patlasov Cc: Konstantin Khlebnikov Signed-off-by: Roman Gushchin Signed-off-by: Miklos Szeredi Fixes: ea9b9907b82a ("fuse: implement perform_write") Signed-off-by: Ben Hutchings (cherry picked from commit a5b234167a1ff46f311f5835828eec2f971b9bb4) [wt: adjusted context, as commit 478e0841b from 3.1 was never backported to call mark_page_accessed() eventhough it seems it should have been] Signed-off-by: Willy Tarreau commit 0ee4375197988ba13f7b055cee71d5bdbe5ee658 Author: lucien Date: Thu Nov 12 13:07:07 2015 +0800 sctp: translate host order to network order when setting a hmacid commit ed5a377d87dc4c87fb3e1f7f698cba38cd893103 upstream. now sctp auth cannot work well when setting a hmacid manually, which is caused by that we didn't use the network order for hmacid, so fix it by adding the transformation in sctp_auth_ep_set_hmacs. even we set hmacid with the network order in userspace, it still can't work, because of this condition in sctp_auth_ep_set_hmacs(): if (id > SCTP_AUTH_HMAC_ID_MAX) return -EOPNOTSUPP; so this wasn't working before and thus it won't break compatibility. Fixes: 65b07e5d0d09 ("[SCTP]: API updates to suport SCTP-AUTH extensions.") Signed-off-by: Xin Long Signed-off-by: Marcelo Ricardo Leitner Acked-by: Neil Horman Acked-by: Vlad Yasevich Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings (cherry picked from commit 98d37e7fed9568f37d4a4a150a3773cfafef91ed) Signed-off-by: Willy Tarreau commit ba76e37468ac7158282460b06c03bbe71174ce8e Author: David S. Miller Date: Tue Dec 15 15:39:08 2015 -0500 bluetooth: Validate socket address length in sco_sock_bind(). commit 5233252fce714053f0151680933571a2da9cbfb4 upstream. Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings Signed-off-by: Willy Tarreau commit 073a63f16e3f34ef520270d998d2afea9b79d35f Author: Hannes Frederic Sowa Date: Mon Dec 14 23:30:43 2015 +0100 net: fix warnings in 'make htmldocs' by moving macro definition out of field declaration commit 7bbadd2d1009575dad675afc16650ebb5aa10612 upstream. Docbook does not like the definition of macros inside a field declaration and adds a warning. Move the definition out. Fixes: 79462ad02e86180 ("net: add validation for the socket syscall protocol argument") Reported-by: kbuild test robot Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller [bwh: Backported to 3.2: keep open-coding U8_MAX] Signed-off-by: Ben Hutchings (cherry picked from commit 35da6d62b608fd3beb576d72e48e2bc866d714c9) [wt: adjusted context] Signed-off-by: Willy Tarreau commit be9b6c294668e0686e8ac69f617c6c1963a91472 Author: Hannes Frederic Sowa Date: Mon Dec 14 22:03:39 2015 +0100 net: add validation for the socket syscall protocol argument commit 79462ad02e861803b3840cc782248c7359451cd9 upstream. 郭永刚 reported that one could simply crash the kernel as root by using a simple program: int socket_fd; struct sockaddr_in addr; addr.sin_port = 0; addr.sin_addr.s_addr = INADDR_ANY; addr.sin_family = 10; socket_fd = socket(10,3,0x40000000); connect(socket_fd , &addr,16); AF_INET, AF_INET6 sockets actually only support 8-bit protocol identifiers. inet_sock's skc_protocol field thus is sized accordingly, thus larger protocol identifiers simply cut off the higher bits and store a zero in the protocol fields. This could lead to e.g. NULL function pointer because as a result of the cut off inet_num is zero and we call down to inet_autobind, which is NULL for raw sockets. kernel: Call Trace: kernel: [] ? inet_autobind+0x2e/0x70 kernel: [] inet_dgram_connect+0x54/0x80 kernel: [] SYSC_connect+0xd9/0x110 kernel: [] ? ptrace_notify+0x5b/0x80 kernel: [] ? syscall_trace_enter_phase2+0x108/0x200 kernel: [] SyS_connect+0xe/0x10 kernel: [] tracesys_phase2+0x84/0x89 I found no particular commit which introduced this problem. CVE: CVE-2015-8543 Cc: Cong Wang Reported-by: 郭永刚 Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller [bwh: Backported to 2.6.32: open-code U8_MAX; adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Willy Tarreau commit 67fbe958a57bfa443b2e4f7c3a15d3e109082e4b Author: David Howells Date: Fri Dec 18 01:34:26 2015 +0000 KEYS: Fix race between read and revoke commit b4a1b4f5047e4f54e194681125c74c0aa64d637d upstream. This fixes CVE-2015-7550. There's a race between keyctl_read() and keyctl_revoke(). If the revoke happens between keyctl_read() checking the validity of a key and the key's semaphore being taken, then the key type read method will see a revoked key. This causes a problem for the user-defined key type because it assumes in its read method that there will always be a payload in a non-revoked key and doesn't check for a NULL pointer. Fix this by making keyctl_read() check the validity of a key after taking semaphore instead of before. I think the bug was introduced with the original keyrings code. This was discovered by a multithreaded test program generated by syzkaller (http://github.com/google/syzkaller). Here's a cleaned up version: #include #include #include void *thr0(void *arg) { key_serial_t key = (unsigned long)arg; keyctl_revoke(key); return 0; } void *thr1(void *arg) { key_serial_t key = (unsigned long)arg; char buffer[16]; keyctl_read(key, buffer, 16); return 0; } int main() { key_serial_t key = add_key("user", "%", "foo", 3, KEY_SPEC_USER_KEYRING); pthread_t th[5]; pthread_create(&th[0], 0, thr0, (void *)(unsigned long)key); pthread_create(&th[1], 0, thr1, (void *)(unsigned long)key); pthread_create(&th[2], 0, thr0, (void *)(unsigned long)key); pthread_create(&th[3], 0, thr1, (void *)(unsigned long)key); pthread_join(th[0], 0); pthread_join(th[1], 0); pthread_join(th[2], 0); pthread_join(th[3], 0); return 0; } Build as: cc -o keyctl-race keyctl-race.c -lkeyutils -lpthread Run as: while keyctl-race; do :; done as it may need several iterations to crash the kernel. The crash can be summarised as: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: [] user_read+0x56/0xa3 ... Call Trace: [] keyctl_read_key+0xb6/0xd7 [] SyS_keyctl+0x83/0xe0 [] entry_SYSCALL_64_fastpath+0x12/0x6f Reported-by: Dmitry Vyukov Signed-off-by: David Howells Tested-by: Dmitry Vyukov Signed-off-by: James Morris [bwh: Backported to 2.6.32: adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Willy Tarreau commit 5ccf7b4d5211c75460601b6d291b5f252ab2fca2 Author: Eric Dumazet Date: Wed Dec 30 08:51:12 2015 -0500 udp: properly support MSG_PEEK with truncated buffers commit 197c949e7798fbf28cfadc69d9ca0c2abbf93191 upstream. Backport of this upstream commit into stable kernels : 89c22d8c3b27 ("net: Fix skb csum races when peeking") exposed a bug in udp stack vs MSG_PEEK support, when user provides a buffer smaller than skb payload. In this case, skb_copy_and_csum_datagram_iovec(skb, sizeof(struct udphdr), msg->msg_iov); returns -EFAULT. This bug does not happen in upstream kernels since Al Viro did a great job to replace this into : skb_copy_and_csum_datagram_msg(skb, sizeof(struct udphdr), msg); This variant is safe vs short buffers. For the time being, instead reverting Herbert Xu patch and add back skb->ip_summed invalid changes, simply store the result of udp_lib_checksum_complete() so that we avoid computing the checksum a second time, and avoid the problematic skb_copy_and_csum_datagram_iovec() call. This patch can be applied on recent kernels as it avoids a double checksumming, then backported to stable kernels as a bug fix. Signed-off-by: Eric Dumazet Acked-by: Herbert Xu Signed-off-by: David S. Miller [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings (cherry picked from commit 18a6eba2eabbcb50a78210b16f7dd43d888a537b) Signed-off-by: Willy Tarreau commit d699b47f3c87c2fb30ed548ffacae85cebf2cafe Author: Willy Tarreau Date: Sat Jan 23 11:16:20 2016 +0100 Revert "net: add length argument to skb_copy_and_csum_datagram_iovec" This reverts commit c507639ba963bb47e3f515670a7cace76af76ab6. As reported by Michal Kubecek, this fix doesn't handle truncated reads correctly. Next patch from Eric fixes it better. Signed-off-by: Willy Tarreau commit 6e5577bfd9e561ddb3167c4fcc815a8032b7952c Author: Ben Hutchings Date: Tue Dec 1 02:50:10 2015 +0000 ext4: Fix null dereference in ext4_fill_super() Fix failure paths in ext4_fill_super() that can lead to a null dereference. This was designated CVE-2015-8324. Mostly extracted from commit 744692dc0598 ("ext4: use ext4_get_block_write in buffer write"). However there's one more incorrect goto to fix, removed upstream in commit cf40db137cc2 ("ext4: remove failed journal checksum check"). Reference: https://bugs.openvz.org/browse/OVZ-6541 Signed-off-by: Ben Hutchings Signed-off-by: Willy Tarreau commit 60bc010667ef06e0fb08d5ec599c0977adc2ac72 Author: Rainer Weikusat Date: Fri Nov 20 22:07:23 2015 +0000 unix: avoid use-after-free in ep_remove_wait_queue commit 7d267278a9ece963d77eefec61630223fce08c6c upstream. Rainer Weikusat writes: An AF_UNIX datagram socket being the client in an n:1 association with some server socket is only allowed to send messages to the server if the receive queue of this socket contains at most sk_max_ack_backlog datagrams. This implies that prospective writers might be forced to go to sleep despite none of the message presently enqueued on the server receive queue were sent by them. In order to ensure that these will be woken up once space becomes again available, the present unix_dgram_poll routine does a second sock_poll_wait call with the peer_wait wait queue of the server socket as queue argument (unix_dgram_recvmsg does a wake up on this queue after a datagram was received). This is inherently problematic because the server socket is only guaranteed to remain alive for as long as the client still holds a reference to it. In case the connection is dissolved via connect or by the dead peer detection logic in unix_dgram_sendmsg, the server socket may be freed despite "the polling mechanism" (in particular, epoll) still has a pointer to the corresponding peer_wait queue. There's no way to forcibly deregister a wait queue with epoll. Based on an idea by Jason Baron, the patch below changes the code such that a wait_queue_t belonging to the client socket is enqueued on the peer_wait queue of the server whenever the peer receive queue full condition is detected by either a sendmsg or a poll. A wake up on the peer queue is then relayed to the ordinary wait queue of the client socket via wake function. The connection to the peer wait queue is again dissolved if either a wake up is about to be relayed or the client socket reconnects or a dead peer is detected or the client socket is itself closed. This enables removing the second sock_poll_wait from unix_dgram_poll, thus avoiding the use-after-free, while still ensuring that no blocked writer sleeps forever. Signed-off-by: Rainer Weikusat Fixes: ec0d215f9420 ("af_unix: fix 'poll for write'/connected DGRAM sockets") Reviewed-by: Jason Baron Signed-off-by: David S. Miller [bwh: Backported to 2.6.32: - Access sk_sleep directly, not through sk_sleep() function - Adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Willy Tarreau commit 31fefb1f987a7958f75bb19c296807e53d9f3644 Author: Quentin Casasnovas Date: Tue Nov 24 17:13:21 2015 -0500 RDS: fix race condition when sending a message on unbound socket commit 8c7188b23474cca017b3ef354c4a58456f68303a upstream. Sasha's found a NULL pointer dereference in the RDS connection code when sending a message to an apparently unbound socket. The problem is caused by the code checking if the socket is bound in rds_sendmsg(), which checks the rs_bound_addr field without taking a lock on the socket. This opens a race where rs_bound_addr is temporarily set but where the transport is not in rds_bind(), leading to a NULL pointer dereference when trying to dereference 'trans' in __rds_conn_create(). Vegard wrote a reproducer for this issue, so kindly ask him to share if you're interested. I cannot reproduce the NULL pointer dereference using Vegard's reproducer with this patch, whereas I could without. Complete earlier incomplete fix to CVE-2015-6937: 74e98eb08588 ("RDS: verify the underlying transport exists before creating a connection") Cc: David S. Miller Reviewed-by: Vegard Nossum Reviewed-by: Sasha Levin Acked-by: Santosh Shilimkar Signed-off-by: Quentin Casasnovas Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings Signed-off-by: Willy Tarreau commit 42fc512469e78939c1e419d3310c47de55bdcbb8 Author: Ben Hutchings Date: Sun Nov 1 16:22:53 2015 +0000 ppp, slip: Validate VJ compression slot parameters completely commit 4ab42d78e37a294ac7bc56901d563c642e03c4ae upstream. Currently slhc_init() treats out-of-range values of rslots and tslots as equivalent to 0, except that if tslots is too large it will dereference a null pointer (CVE-2015-7799). Add a range-check at the top of the function and make it return an ERR_PTR() on error instead of NULL. Change the callers accordingly. Compile-tested only. Reported-by: 郭永刚 References: http://article.gmane.org/gmane.comp.security.oss.general/17908 Signed-off-by: Ben Hutchings Signed-off-by: David S. Miller [bwh: Backported to 2.6.32: adjust filenames, context, indentation] Signed-off-by: Willy Tarreau commit 1debe900d2802b56dfa14f89e2918453539d85c6 Author: Ben Hutchings Date: Sun Nov 1 16:21:24 2015 +0000 isdn_ppp: Add checks for allocation failure in isdn_ppp_open() commit 0baa57d8dc32db78369d8b5176ef56c5e2e18ab3 upstream. Compile-tested only. Signed-off-by: Ben Hutchings Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 977dc430a2b849161cb4018f568e403d724d2595 Author: WANG Cong Date: Tue Mar 31 11:01:47 2015 -0700 ip6mr: call del_timer_sync() in ip6mr_free_table() commit 7ba0c47c34a1ea5bc7a24ca67309996cce0569b5 upstream. We need to wait for the flying timers, since we are going to free the mrtable right after it. Cc: Hannes Frederic Sowa Signed-off-by: Cong Wang Signed-off-by: David S. Miller Cc: Ben Hutchings [ wt: 2.6.32 has a single table hence a single timer. ip6_mr_init() has the same del_timer() call on the error path, but we don't need to change it since at this point the timer hasn't been started yet ] Signed-off-by: Willy Tarreau