commit 7f62be2f53a2353a7e5c124075ec22a4344e8d57 Author: Willy Tarreau Date: Mon May 19 07:53:10 2014 +0200 Linux 2.6.32.62 Signed-off-by: Willy Tarreau commit f0fa6c0306fc48c9b982c412e926b284906ba62d Author: Matthew Daley Date: Mon Apr 28 19:05:21 2014 +1200 floppy: don't write kernel-only members to FDRAWCMD ioctl output Do not leak kernel-only floppy_raw_cmd structure members to userspace. This includes the linked-list pointer and the pointer to the allocated DMA space. Signed-off-by: Matthew Daley Signed-off-by: Linus Torvalds (cherry picked from commit 2145e15e0557a01b9195d1c7199a1b92cb9be81f) Signed-off-by: Willy Tarreau commit f76b441b9d5879587f53525933d27b16217ca78c Author: Matthew Daley Date: Mon Apr 28 19:05:20 2014 +1200 floppy: ignore kernel-only members in FDRAWCMD ioctl input Always clear out these floppy_raw_cmd struct members after copying the entire structure from userspace so that the in-kernel version is always valid and never left in an interdeterminate state. Signed-off-by: Matthew Daley Signed-off-by: Linus Torvalds (cherry picked from commit ef87dbe7614341c2e7bfe8d32fcb7028cc97442c) [wt: be careful in 2.6.32 we still have the ugly macros everywhere] Signed-off-by: Willy Tarreau commit d751df64d58a9927b9c4b47aa7cf20204d0ab2f6 Author: Daniel Borkmann Date: Mon Jan 6 00:57:54 2014 +0100 netfilter: nf_conntrack_dccp: fix skb_header_pointer API usages commit b22f5126a24b3b2f15448c3f2a254fc10cbc2b92 upstream Some occurences in the netfilter tree use skb_header_pointer() in the following way ... struct dccp_hdr _dh, *dh; ... skb_header_pointer(skb, dataoff, sizeof(_dh), &dh); ... where dh itself is a pointer that is being passed as the copy buffer. Instead, we need to use &_dh as the forth argument so that we're copying the data into an actual buffer that sits on the stack. Currently, we probably could overwrite memory on the stack (e.g. with a possibly mal-formed DCCP packet), but unintentionally, as we only want the buffer to be placed into _dh variable. Fixes: 2bc780499aa3 ("[NETFILTER]: nf_conntrack: add DCCP protocol support") Signed-off-by: Daniel Borkmann Signed-off-by: Pablo Neira Ayuso Signed-off-by: Willy Tarreau commit 0e37e30a8f55140dc1702b2d2008d41155bb3360 Author: Martin Schwidefsky Date: Mon Feb 3 17:37:15 2014 +0100 s390: fix kernel crash due to linkage stack instructions commit 8d7f6690cedb83456edd41c9bd583783f0703bf0 upstream The kernel currently crashes with a low-address-protection exception if a user space process executes an instruction that tries to use the linkage stack. Set the base-ASTE origin and the subspace-ASTE origin of the dispatchable-unit-control-table to point to a dummy ASTE. Set up control register 15 to point to an empty linkage stack with no room left. A user space process with a linkage stack instruction will still crash but with a different exception which is correctly translated to a segmentation fault instead of a kernel oops. Signed-off-by: Martin Schwidefsky [dannf: backported to Debian's 2.6.32] Signed-off-by: Willy Tarreau commit efe55430442061cb01177c88b891a678935a6eae Author: Stephen Smalley Date: Thu Jan 30 11:26:59 2014 -0500 SELinux: Fix kernel BUG on empty security contexts. commit 2172fa709ab32ca60e86179dc67d0857be8e2c98 upstream Setting an empty security context (length=0) on a file will lead to incorrectly dereferencing the type and other fields of the security context structure, yielding a kernel BUG. As a zero-length security context is never valid, just reject all such security contexts whether coming from userspace via setxattr or coming from the filesystem upon a getxattr request by SELinux. Setting a security context value (empty or otherwise) unknown to SELinux in the first place is only possible for a root process (CAP_MAC_ADMIN), and, if running SELinux in enforcing mode, only if the corresponding SELinux mac_admin permission is also granted to the domain by policy. In Fedora policies, this is only allowed for specific domains such as livecd for setting down security contexts that are not defined in the build host policy. Reproducer: su setenforce 0 touch foo setfattr -n security.selinux foo Caveat: Relabeling or removing foo after doing the above may not be possible without booting with SELinux disabled. Any subsequent access to foo after doing the above will also trigger the BUG. BUG output from Matthew Thode: [ 473.893141] ------------[ cut here ]------------ [ 473.962110] kernel BUG at security/selinux/ss/services.c:654! [ 473.995314] invalid opcode: 0000 [#6] SMP [ 474.027196] Modules linked in: [ 474.058118] CPU: 0 PID: 8138 Comm: ls Tainted: G D I 3.13.0-grsec #1 [ 474.116637] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0 07/29/10 [ 474.149768] task: ffff8805f50cd010 ti: ffff8805f50cd488 task.ti: ffff8805f50cd488 [ 474.183707] RIP: 0010:[] [] context_struct_compute_av+0xce/0x308 [ 474.219954] RSP: 0018:ffff8805c0ac3c38 EFLAGS: 00010246 [ 474.252253] RAX: 0000000000000000 RBX: ffff8805c0ac3d94 RCX: 0000000000000100 [ 474.287018] RDX: ffff8805e8aac000 RSI: 00000000ffffffff RDI: ffff8805e8aaa000 [ 474.321199] RBP: ffff8805c0ac3cb8 R08: 0000000000000010 R09: 0000000000000006 [ 474.357446] R10: 0000000000000000 R11: ffff8805c567a000 R12: 0000000000000006 [ 474.419191] R13: ffff8805c2b74e88 R14: 00000000000001da R15: 0000000000000000 [ 474.453816] FS: 00007f2e75220800(0000) GS:ffff88061fc00000(0000) knlGS:0000000000000000 [ 474.489254] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 474.522215] CR2: 00007f2e74716090 CR3: 00000005c085e000 CR4: 00000000000207f0 [ 474.556058] Stack: [ 474.584325] ffff8805c0ac3c98 ffffffff811b549b ffff8805c0ac3c98 ffff8805f1190a40 [ 474.618913] ffff8805a6202f08 ffff8805c2b74e88 00068800d0464990 ffff8805e8aac860 [ 474.653955] ffff8805c0ac3cb8 000700068113833a ffff880606c75060 ffff8805c0ac3d94 [ 474.690461] Call Trace: [ 474.723779] [] ? lookup_fast+0x1cd/0x22a [ 474.778049] [] security_compute_av+0xf4/0x20b [ 474.811398] [] avc_compute_av+0x2a/0x179 [ 474.843813] [] avc_has_perm+0x45/0xf4 [ 474.875694] [] inode_has_perm+0x2a/0x31 [ 474.907370] [] selinux_inode_getattr+0x3c/0x3e [ 474.938726] [] security_inode_getattr+0x1b/0x22 [ 474.970036] [] vfs_getattr+0x19/0x2d [ 475.000618] [] vfs_fstatat+0x54/0x91 [ 475.030402] [] vfs_lstat+0x19/0x1b [ 475.061097] [] SyS_newlstat+0x15/0x30 [ 475.094595] [] ? __audit_syscall_entry+0xa1/0xc3 [ 475.148405] [] system_call_fastpath+0x16/0x1b [ 475.179201] Code: 00 48 85 c0 48 89 45 b8 75 02 0f 0b 48 8b 45 a0 48 8b 3d 45 d0 b6 00 8b 40 08 89 c6 ff ce e8 d1 b0 06 00 48 85 c0 49 89 c7 75 02 <0f> 0b 48 8b 45 b8 4c 8b 28 eb 1e 49 8d 7d 08 be 80 01 00 00 e8 [ 475.255884] RIP [] context_struct_compute_av+0xce/0x308 [ 475.296120] RSP [ 475.328734] ---[ end trace f076482e9d754adc ]--- Reported-by: Matthew Thode Signed-off-by: Stephen Smalley Cc: stable@vger.kernel.org Signed-off-by: Paul Moore Signed-off-by: Willy Tarreau commit a7060b9d99974ff98d0e87f3f3fd2914db2141ea Author: Dan Carpenter Date: Tue Oct 29 19:11:06 2013 +0000 aacraid: missing capable() check in compat ioctl commit f856567b930dfcdbc3323261bf77240ccdde01f5 upstream In commit d496f94d22d1 ('[SCSI] aacraid: fix security weakness') we added a check on CAP_SYS_RAWIO to the ioctl. The compat ioctls need the check as well. Signed-off-by: Dan Carpenter Cc: stable@kernel.org Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit e017ce714b950b8d7ec563e0843a34a6b7d0263c Author: Dan Carpenter Date: Thu Oct 31 21:00:10 2013 +0300 xfs: underflow bug in xfs_attrlist_by_handle() If we allocate less than sizeof(struct attrlist) then we end up corrupting memory or doing a ZERO_PTR_SIZE dereference. This can only be triggered with CAP_SYS_ADMIN. Reported-by: Nico Golde Reported-by: Fabian Yamaguchi Signed-off-by: Dan Carpenter Reviewed-by: Dave Chinner Signed-off-by: Ben Myers (cherry picked from commit 071c529eb672648ee8ca3f90944bcbcc730b4c06) [dannf: backported to Debian's 2.6.32] Signed-off-by: Willy Tarreau commit 857c509645568201e67c3566a8293fd7b9e7cf94 Author: Ursula Braun Date: Wed Nov 6 08:04:52 2013 +0000 qeth: avoid buffer overflow in snmp ioctl commit 6fb392b1a63ae36c31f62bc3fc8630b49d602b62 upstream Check user-defined length in snmp ioctl request and allow request only if it fits into a qeth command buffer. Signed-off-by: Ursula Braun Signed-off-by: Frank Blaschka Reviewed-by: Heiko Carstens Reported-by: Nico Golde Reported-by: Fabian Yamaguchi Cc: Signed-off-by: David S. Miller [jmm: backport 2.6.32] Signed-off-by: Willy Tarreau commit e3123670efb2e10e536fa7fe752d72a4ca5bce9a Author: Andy Honig Date: Tue Nov 19 14:12:18 2013 -0800 KVM: x86: Fix potential divide by 0 in lapic (CVE-2013-6367) commit b963a22e6d1a266a67e9eecc88134713fd54775c upstream Under guest controllable circumstances apic_get_tmcct will execute a divide by zero and cause a crash. If the guest cpuid support tsc deadline timers and performs the following sequence of requests the host will crash. - Set the mode to periodic - Set the TMICT to 0 - Set the mode bits to 11 (neither periodic, nor one shot, nor tsc deadline) - Set the TMICT to non-zero. Then the lapic_timer.period will be 0, but the TMICT will not be. If the guest then reads from the TMCCT then the host will perform a divide by 0. This patch ensures that if the lapic_timer.period is 0, then the division does not occur. Reported-by: Andrew Honig Cc: stable@vger.kernel.org Signed-off-by: Andrew Honig Signed-off-by: Paolo Bonzini [dannf: backported to Debian's 2.6.32] Signed-off-by: Willy Tarreau commit ba6e9bbbd6e3e28d92ff43981c2354307c095b7c Author: Andy Honig Date: Tue Nov 19 00:09:22 2013 +0000 KVM: Improve create VCPU parameter (CVE-2013-4587) commit 338c7dbadd2671189cec7faf64c84d01071b3f96 upstream In multiple functions the vcpu_id is used as an offset into a bitfield. Ag malicious user could specify a vcpu_id greater than 255 in order to set or clear bits in kernel memory. This could be used to elevate priveges in the kernel. This patch verifies that the vcpu_id provided is less than 255. The api documentation already specifies that the vcpu_id must be less than max_vcpus, but this is currently not checked. Reported-by: Andrew Honig Cc: stable@vger.kernel.org Signed-off-by: Andrew Honig Signed-off-by: Paolo Bonzini Signed-off-by: Willy Tarreau commit 4b500c7ff030f35aa6f8badf2f243bc259021e39 Author: Dan Carpenter Date: Tue Oct 29 19:06:04 2013 +0000 uml: check length in exitcode_proc_write() commit 201f99f170df14ba52ea4c52847779042b7a623b upstream We don't cap the size of buffer from the user so we could write past the end of the array here. Only root can write to this file. Reported-by: Nico Golde Reported-by: Fabian Yamaguchi Signed-off-by: Dan Carpenter Cc: stable@kernel.org Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit a35f8f7b89f6c91bae8aa46507556de4185e9e7a Author: Neil Horman Date: Tue Sep 17 12:33:11 2013 +0000 crypto: ansi_cprng - Fix off by one error in non-block size request commit 714b33d15130cbb5ab426456d4e3de842d6c5b8a upstream Stephan Mueller reported to me recently a error in random number generation in the ansi cprng. If several small requests are made that are less than the instances block size, the remainder for loop code doesn't increment rand_data_valid in the last iteration, meaning that the last bytes in the rand_data buffer gets reused on the subsequent smaller-than-a-block request for random data. The fix is pretty easy, just re-code the for loop to make sure that rand_data_valid gets incremented appropriately Signed-off-by: Neil Horman Reported-by: Stephan Mueller CC: Stephan Mueller CC: Petr Matousek CC: Herbert Xu CC: "David S. Miller" Signed-off-by: Herbert Xu Signed-off-by: Willy Tarreau commit 1b3b362f84f5f2dd62d180447ba1a1681f5d993a Author: Mikulas Patocka Date: Mon Oct 21 12:59:30 2013 +0100 dm snapshot: fix data corruption CVE-2013-4299 BugLink: http://bugs.launchpad.net/bugs/1241769 This patch fixes a particular type of data corruption that has been encountered when loading a snapshot's metadata from disk. When we allocate a new chunk in persistent_prepare, we increment ps->next_free and we make sure that it doesn't point to a metadata area by further incrementing it if necessary. When we load metadata from disk on device activation, ps->next_free is positioned after the last used data chunk. However, if this last used data chunk is followed by a metadata area, ps->next_free is positioned erroneously to the metadata area. A newly-allocated chunk is placed at the same location as the metadata area, resulting in data or metadata corruption. This patch changes the code so that ps->next_free skips the metadata area when metadata are loaded in function read_exceptions. The patch also moves a piece of code from persistent_prepare_exception to a separate function skip_metadata to avoid code duplication. CVE-2013-4299 Signed-off-by: Mikulas Patocka Cc: stable@vger.kernel.org Cc: Mike Snitzer Signed-off-by: Alasdair G Kergon (back ported from commit e9c6a182649f4259db704ae15a91ac820e63b0ca) Signed-off-by: Luis Henriques Acked-by: Stefan Bader Signed-off-by: Tim Gardner Signed-off-by: Willy Tarreau commit 74fbc288d30a6da65187bded5834f36097fd738e Author: Hannes Frederic Sowa Date: Thu Aug 1 14:26:10 2013 +0100 ipv6: call udp_push_pending_frames when uncorking a socket with AF_INET pending data CVE-2013-4162 BugLink: http://bugs.launchpad.net/bugs/1205070 We accidentally call down to ip6_push_pending_frames when uncorking pending AF_INET data on a ipv6 socket. This results in the following splat (from Dave Jones): skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev: ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:126! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth +netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ #37 task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000 RIP: 0010:[] [] skb_panic+0x63/0x65 RSP: 0018:ffff8801e6431de8 EFLAGS: 00010282 RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006 RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520 RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800 R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800 FS: 00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4 ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6 ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0 Call Trace: [] skb_push+0x3a/0x40 [] ip6_push_pending_frames+0x1f6/0x4d0 [] ? mark_held_locks+0xbb/0x140 [] udp_v6_push_pending_frames+0x2b9/0x3d0 [] ? udplite_getfrag+0x20/0x20 [] udp_lib_setsockopt+0x1aa/0x1f0 [] ? fget_light+0x387/0x4f0 [] udpv6_setsockopt+0x34/0x40 [] sock_common_setsockopt+0x14/0x20 [] SyS_setsockopt+0x71/0xd0 [] tracesys+0xdd/0xe2 Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 RIP [] skb_panic+0x63/0x65 RSP This patch adds a check if the pending data is of address family AF_INET and directly calls udp_push_ending_frames from udp_v6_push_pending_frames if that is the case. This bug was found by Dave Jones with trinity. (Also move the initialization of fl6 below the AF_INET check, even if not strictly necessary.) Cc: Dave Jones Cc: YOSHIFUJI Hideaki Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller (back ported from commit 8822b64a0fa64a5dd1dfcf837c5b0be83f8c05d1) Signed-off-by: Luis Henriques Acked-by: Brad Figg Signed-off-by: Tim Gardner Signed-off-by: Willy Tarreau commit 65405bf4b82ddf753dea6569948cf686c33d6dba Author: Kees Cook Date: Tue Nov 12 15:11:17 2013 -0800 exec/ptrace: fix get_dumpable() incorrect tests commit d049f74f2dbe71354d43d393ac3a188947811348 upstream The get_dumpable() return value is not boolean. Most users of the function actually want to be testing for non-SUID_DUMP_USER(1) rather than SUID_DUMP_DISABLE(0). The SUID_DUMP_ROOT(2) is also considered a protected state. Almost all places did this correctly, excepting the two places fixed in this patch. Wrong logic: if (dumpable == SUID_DUMP_DISABLE) { /* be protective */ } or if (dumpable == 0) { /* be protective */ } or if (!dumpable) { /* be protective */ } Correct logic: if (dumpable != SUID_DUMP_USER) { /* be protective */ } or if (dumpable != 1) { /* be protective */ } Without this patch, if the system had set the sysctl fs/suid_dumpable=2, a user was able to ptrace attach to processes that had dropped privileges to that user. (This may have been partially mitigated if Yama was enabled.) The macros have been moved into the file that declares get/set_dumpable(), which means things like the ia64 code can see them too. CVE-2013-2929 Reported-by: Vasily Kulikov Signed-off-by: Kees Cook Cc: Tony Luck Cc: Oleg Nesterov Cc: "Eric W. Biederman" Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds [dannf: backported to Debian's 2.6.32] Signed-off-by: Willy Tarreau commit a14cc8a6f5bc2e3c491edc92c07e17ba2f94f1d0 Author: Peter Hurley Date: Sat May 3 14:04:59 2014 +0200 n_tty: Fix n_tty_write crash when echoing in raw mode The tty atomic_write_lock does not provide an exclusion guarantee for the tty driver if the termios settings are LECHO & !OPOST. And since it is unexpected and not allowed to call TTY buffer helpers like tty_insert_flip_string concurrently, this may lead to crashes when concurrect writers call pty_write. In that case the following two writers: * the ECHOing from a workqueue and * pty_write from the process race and can overflow the corresponding TTY buffer like follows. If we look into tty_insert_flip_string_fixed_flag, there is: int space = __tty_buffer_request_room(port, goal, flags); struct tty_buffer *tb = port->buf.tail; ... memcpy(char_buf_ptr(tb, tb->used), chars, space); ... tb->used += space; so the race of the two can result in something like this: A B __tty_buffer_request_room __tty_buffer_request_room memcpy(buf(tb->used), ...) tb->used += space; memcpy(buf(tb->used), ...) ->BOOM B's memcpy is past the tty_buffer due to the previous A's tb->used increment. Since the N_TTY line discipline input processing can output concurrently with a tty write, obtain the N_TTY ldisc output_lock to serialize echo output with normal tty writes. This ensures the tty buffer helper tty_insert_flip_string is not called concurrently and everything is fine. Note that this is nicely reproducible by an ordinary user using forkpty and some setup around that (raw termios + ECHO). And it is present in kernels at least after commit d945cb9cce20ac7143c2de8d88b187f62db99bdc (pty: Rework the pty layer to use the normal buffering logic) in 2.6.31-rc3. js: add more info to the commit log js: switch to bool js: lock unconditionally js: lock only the tty->ops->write call References: CVE-2014-0196 Reported-and-tested-by: Jiri Slaby Signed-off-by: Peter Hurley Signed-off-by: Jiri Slaby Cc: Linus Torvalds Cc: Alan Cox Cc: Signed-off-by: Greg Kroah-Hartman (cherry picked from commit 4291086b1f081b869c6d79e5b7441633dc3ace00) [wt: 2.6.32 has no n_tty_data, so output_lock is in tty, not tty->disc_data] Signed-off-by: Willy Tarreau commit 204466be5875914b7833e80f8404dc2c6ece6249 Author: Liu Yu Date: Wed Apr 30 17:34:09 2014 +0800 tcp_cubic: fix the range of delayed_ack commit b9f47a3aaeab (tcp_cubic: limit delayed_ack ratio to prevent divide error) try to prevent divide error, but there is still a little chance that delayed_ack can reach zero. In case the param cnt get negative value, then ratio+cnt would overflow and may happen to be zero. As a result, min(ratio, ACK_RATIO_LIMIT) will calculate to be zero. In some old kernels, such as 2.6.32, there is a bug that would pass negative param, which then ultimately leads to this divide error. commit 5b35e1e6e9c (tcp: fix tcp_trim_head() to adjust segment count with skb MSS) fixed the negative param issue. However, it's safe that we fix the range of delayed_ack as well, to make sure we do not hit a divide by zero. CC: Stephen Hemminger Signed-off-by: Liu Yu Signed-off-by: Eric Dumazet Acked-by: Neal Cardwell Signed-off-by: David S. Miller (cherry picked from commit 0cda345d1b2201dd15591b163e3c92bad5191745) Signed-off-by: Willy Tarreau commit b7daa2a2f4d09b288026b0c99f86677d513b976d Author: stephen hemminger Date: Wed May 4 10:04:56 2011 +0000 tcp_cubic: limit delayed_ack ratio to prevent divide error TCP Cubic keeps a metric that estimates the amount of delayed acknowledgements to use in adjusting the window. If an abnormally large number of packets are acknowledged at once, then the update could wrap and reach zero. This kind of ACK could only happen when there was a large window and huge number of ACK's were lost. This patch limits the value of delayed ack ratio. The choice of 32 is just a conservative value since normally it should be range of 1 to 4 packets. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller (cherry picked from commit b9f47a3aaeabdce3b42829bbb27765fa340f76ba) [wt: in 2.6.32, this fix is needed for the next one] Signed-off-by: Willy Tarreau commit 18fbf564ddc8a54e5b22961cb50ece22b50cbfba Author: Neal Cardwell Date: Sat Jan 28 17:29:46 2012 +0000 tcp: fix tcp_trim_head() to adjust segment count with skb MSS This commit fixes tcp_trim_head() to recalculate the number of segments in the skb with the skb's existing MSS, so trimming the head causes the skb segment count to be monotonically non-increasing - it should stay the same or go down, but not increase. Previously tcp_trim_head() used the current MSS of the connection. But if there was a decrease in MSS between original transmission and ACK (e.g. due to PMTUD), this could cause tcp_trim_head() to counter-intuitively increase the segment count when trimming bytes off the head of an skb. This violated assumptions in tcp_tso_acked() that tcp_trim_head() only decreases the packet count, so that packets_acked in tcp_tso_acked() could underflow, leading tcp_clean_rtx_queue() to pass u32 pkts_acked values as large as 0xffffffff to ca_ops->pkts_acked(). As an aside, if tcp_trim_head() had really wanted the skb to reflect the current MSS, it should have called tcp_set_skb_tso_segs() unconditionally, since a decrease in MSS would mean that a single-packet skb should now be sliced into multiple segments. Signed-off-by: Neal Cardwell Acked-by: Nandita Dukkipati Acked-by: Ilpo Järvinen Signed-off-by: David S. Miller (cherry picked from commit 5b35e1e6e9ca651e6b291c96d1106043c9af314a) Signed-off-by: Willy Tarreau commit 997029d6b7871d4ac4276948836bee7277c51909 Author: Mikulas Patocka Date: Tue Apr 8 18:04:55 2014 -0400 powernow-k6: reorder frequencies commit 22c73795b101597051924556dce019385a1e2fa0 upstream This patch reorders reported frequencies from the highest to the lowest, just like in other frequency drivers. Signed-off-by: Mikulas Patocka Acked-by: Viresh Kumar Signed-off-by: Rafael J. Wysocki Signed-off-by: Willy Tarreau commit 969d4d2cbfb9f42f40094b0a168de457e8cf3a96 Author: Mikulas Patocka Date: Tue Apr 8 18:04:25 2014 -0400 powernow-k6: correctly initialize default parameters commit d82b922a4acc1781d368aceac2f9da43b038cab2 upstream The powernow-k6 driver used to read the initial multiplier from the powernow register. However, there is a problem with this: * If there was a frequency transition before, the multiplier read from the register corresponds to the current multiplier. * If there was no frequency transition since reset, the field in the register always reads as zero, regardless of the current multiplier that is set using switches on the mainboard and that the CPU is running at. The zero value corresponds to multiplier 4.5, so as a consequence, the powernow-k6 driver always assumes multiplier 4.5. For example, if we have 550MHz CPU with bus frequency 100MHz and multiplier 5.5, the powernow-k6 driver thinks that the multiplier is 4.5 and bus frequency is 122MHz. The powernow-k6 driver then sets the multiplier to 4.5, underclocking the CPU to 450MHz, but reports the current frequency as 550MHz. There is no reliable way how to read the initial multiplier. I modified the driver so that it contains a table of known frequencies (based on parameters of existing CPUs and some common overclocking schemes) and sets the multiplier according to the frequency. If the frequency is unknown (because of unusual overclocking or underclocking), the user must supply the bus speed and maximum multiplier as module parameters. This patch should be backported to all stable kernels. If it doesn't apply cleanly, change it, or ask me to change it. Signed-off-by: Mikulas Patocka Signed-off-by: Rafael J. Wysocki Signed-off-by: Willy Tarreau commit 923a78074bfb799508bea3ca3700ee29da7fc58d Author: Mikulas Patocka Date: Tue Apr 8 18:03:54 2014 -0400 powernow-k6: disable cache when changing frequency commit e20e1d0ac02308e2211306fc67abcd0b2668fb8b upstream I found out that a system with k6-3+ processor is unstable during network server load. The system locks up or the network card stops receiving. The reason for the instability is the CPU frequency scaling. During frequency transition the processor is in "EPM Stop Grant" state. The documentation says that the processor doesn't respond to inquiry requests in this state. Consequently, coherency of processor caches and bus master devices is not maintained, causing the system instability. This patch flushes the cache during frequency transition. It fixes the instability. Other minor changes: * u64 invalue changed to unsigned long because the variable is 32-bit * move the logic to set the multiplier to a separate function powernow_k6_set_cpu_multiplier * preserve lower 5 bits of the powernow port instead of 4 (the voltage field has 5 bits) * mask interrupts when reading the multiplier, so that the port is not open during other activity (running other kernel code with the port open shouldn't cause any misbehavior, but we should better be safe and keep the port closed) This patch should be backported to all stable kernels. If it doesn't apply cleanly, change it, or ask me to change it. Signed-off-by: Mikulas Patocka Signed-off-by: Rafael J. Wysocki Signed-off-by: Willy Tarreau commit 9d73a68445d6a36c9fabd96906663c5db4a685ff Author: Krzysztof Helt Date: Sun Oct 25 19:45:57 2009 +0100 powernow-k6: set transition latency value so ondemand governor can be used Set the transition latency to value smaller than CPUFREQ_ETERNAL so governors other than "performance" work (like the "ondemand" one). The value is found in "AMD PowerNow! Technology Platform Design Guide for Embedded Processors" dated December 2000 (AMD doc #24267A). There is the answer to one of FAQs on page 40 which states that suggested complete transition period is 200 us. Tested on K6-2+ CPU with K6-3 core (model 13, stepping 4). Signed-off-by: Krzysztof Helt Signed-off-by: Dave Jones (cherry picked from commit db2820dd5445a44b4726f15a2bc89b9ded2503eb) [wt: in 2.6.32, we only need this one so that next series applies cleanly] Signed-off-by: Willy Tarreau commit 3ae3f8b7150ecde872a4a033127218f43c75d4b6 Author: Zhu Yanjun Date: Thu Apr 3 16:41:13 2014 +0800 gianfar: disable TX vlan based on kernel 2.6.x 2.6.x kernels require a similar logic change as commit e1653c3e [gianfar: do vlan cleanup] and commit 51b8cbfc [gianfar: fix bug caused by e1653c3e] introduces for newer kernels. Since there is something wrong with tx vlan of gianfar nic driver, in kernel(3.1+), tx vlan is disabled. But in kernel 2.6.x, tx vlan is still enabled. Thus,gianfar nic driver can not support vlan packets and non-vlan packets at the same time. Signed-off-by: Zhu Yanjun Signed-off-by: Willy Tarreau commit 00659c4e9ddd25ac81c2ce1490dd0d2b12ff1f1e Author: Linus Torvalds Date: Sat Jan 11 19:15:52 2014 -0800 x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround Before we do an EMMS in the AMD FXSAVE information leak workaround we need to clear any pending exceptions, otherwise we trap with a floating-point exception inside this code. Reported-by: halfdog Tested-by: Borislav Petkov Link: http://lkml.kernel.org/r/CA%2B55aFxQnY_PCG_n4=0w-VG=YLXL-yr7oMxyy0WU2gCBAf3ydg@mail.gmail.com Signed-off-by: H. Peter Anvin (cherry picked from commit 26bef1318adc1b3a530ecc807ef99346db2aa8b0) [wt: in 2.6.32, patch applies to arch/x86/include/asm/i387.h. There's no static_cpu_has() so we use boot_cpu_has() like other kernels do with gcc3. ] Signed-off-by: Willy Tarreau commit b3dc3e65760a6e83e7b028d11620b9c9fa30982e Author: Dan Carpenter Date: Wed Oct 30 20:12:51 2013 +0300 libertas: potential oops in debugfs If we do a zero size allocation then it will oops. Also we can't be sure the user passes us a NUL terminated string so I've added a terminator. This code can only be triggered by root. Reported-by: Nico Golde Reported-by: Fabian Yamaguchi Signed-off-by: Dan Carpenter Acked-by: Dan Williams Signed-off-by: John W. Linville (cherry picked from commit a497e47d4aec37aaf8f13509f3ef3d1f6a717d88) Signed-off-by: Willy Tarreau commit a533423568c378269da0e8b48e9bfb3bcc1765bf Author: Linus Torvalds Date: Tue Oct 29 10:21:34 2013 -0700 Fix a few incorrectly checked [io_]remap_pfn_range() calls Nico Golde reports a few straggling uses of [io_]remap_pfn_range() that really should use the vm_iomap_memory() helper. This trivially converts two of them to the helper, and comments about why the third one really needs to continue to use remap_pfn_range(), and adds the missing size check. Reported-by: Nico Golde Signed-off-by: Linus Torvalds (cherry picked from commit 7314e613d5ff9f0934f7a0f74ed7973b903315d1) [wt: vm_flags were absent in mainline, Ben removed them in 3.2, but I kept them to minimize changes and avoid any side effect] Signed-off-by: Willy Tarreau commit dc734909be2b9689c3ef84a36c24832b32041d80 Author: Linus Torvalds Date: Tue Apr 16 13:45:37 2013 -0700 vm: add vm_iomap_memory() helper function Various drivers end up replicating the code to mmap() their memory buffers into user space, and our core memory remapping function may be very flexible but it is unnecessarily complicated for the common cases to use. Our internal VM uses pfn's ("page frame numbers") which simplifies things for the VM, and allows us to pass physical addresses around in a denser and more efficient format than passing a "phys_addr_t" around, and having to shift it up and down by the page size. But it just means that drivers end up doing that shifting instead at the interface level. It also means that drivers end up mucking around with internal VM things like the vma details (vm_pgoff, vm_start/end) way more than they really need to. So this just exports a function to map a certain physical memory range into user space (using a phys_addr_t based interface that is much more natural for a driver) and hides all the complexity from the driver. Some drivers will still end up tweaking the vm_page_prot details for things like prefetching or cacheability etc, but that's actually relevant to the driver, rather than caring about what the page offset of the mapping is into the particular IO memory region. Acked-by: Greg Kroah-Hartman Signed-off-by: Linus Torvalds (cherry picked from commit b4cbb197c7e7a68dbad0d491242e3ca67420c13e) [WT: only needed in 2.6.32 for next commit] Signed-off-by: Willy Tarreau commit dba949f0d8c29903fbd76ee425f68275247c7359 Author: Hannes Frederic Sowa Date: Tue Oct 22 00:07:47 2013 +0200 inet: fix possible memory corruption with UDP_CORK and UFO [ This is a simplified -stable version of a set of upstream commits. ] This is a replacement patch only for stable which does fix the problems handled by the following two commits in -net: "ip_output: do skb ufo init for peeked non ufo skb as well" (e93b7d748be887cd7639b113ba7d7ef792a7efb9) "ip6_output: do skb ufo init for peeked non ufo skb as well" (c547dbf55d5f8cf615ccc0e7265e98db27d3fb8b) Three frames are written on a corked udp socket for which the output netdevice has UFO enabled. If the first and third frame are smaller than the mtu and the second one is bigger, we enqueue the second frame with skb_append_datato_frags without initializing the gso fields. This leads to the third frame appended regulary and thus constructing an invalid skb. This fixes the problem by always using skb_append_datato_frags as soon as the first frag got enqueued to the skb without marking the packet as SKB_GSO_UDP. The problem with only two frames for ipv6 was fixed by "ipv6: udp packets following an UFO enqueued packet need also be handled by UFO" (2811ebac2521ceac84f2bdae402455baa6a7fb47). Cc: Jiri Pirko Cc: Eric Dumazet Cc: David Miller Signed-off-by: Hannes Frederic Sowa Signed-off-by: Greg Kroah-Hartman Signed-off-by: Ben Hutchings (cherry picked from commit 5124ae99ac8a8f63d0fca9b75adaef40b20678ff) Signed-off-by: Willy Tarreau commit a0feaea482be55e26370b9f1f91f6eeac9248c66 Author: Hannes Frederic Sowa Date: Sat Sep 21 06:27:00 2013 +0200 ipv6: udp packets following an UFO enqueued packet need also be handled by UFO In the following scenario the socket is corked: If the first UDP packet is larger then the mtu we try to append it to the write queue via ip6_ufo_append_data. A following packet, which is smaller than the mtu would be appended to the already queued up gso-skb via plain ip6_append_data. This causes random memory corruptions. In ip6_ufo_append_data we also have to be careful to not queue up the same skb multiple times. So setup the gso frame only when no first skb is available. This also fixes a shortcoming where we add the current packet's length to cork->length but return early because of a packet > mtu with dontfrag set (instead of sutracting it again). Found with trinity. Cc: YOSHIFUJI Hideaki Signed-off-by: Hannes Frederic Sowa Reported-by: Dmitry Vyukov Signed-off-by: David S. Miller (cherry picked from commit 2811ebac2521ceac84f2bdae402455baa6a7fb47) [wt: 2.6.32 doesn't have dontfrag so remove the optimization] Signed-off-by: Willy Tarreau commit 4d464ca439a6951821f039dd341732fa859df767 Author: Mahesh Rajashekhara Date: Thu Oct 31 14:01:02 2013 +0530 aacraid: prevent invalid pointer dereference It appears that driver runs into a problem here if fibsize is too small because we allocate user_srbcmd with fibsize size only but later we access it until user_srbcmd->sg.count to copy it over to srbcmd. It is not correct to test (fibsize < sizeof(*user_srbcmd)) because this structure already includes one sg element and this is not needed for commands without data. So, we would recommend to add the following (instead of test for fibsize == 0). Signed-off-by: Mahesh Rajashekhara Reported-by: Nico Golde Reported-by: Fabian Yamaguchi Signed-off-by: Linus Torvalds (cherry picked from commit b4789b8e6be3151a955ade74872822f30e8cd914) CVE-2013-6380 Signed-off-by: Willy Tarreau commit 8663707a797e6a473c32cf08eb7597543bbdae79 Author: Nicolas Dichtel Date: Fri Nov 8 11:13:55 2013 +0100 sctp: unbalanced rcu lock in ip_queue_xmit() The bug was introduced in 2.6.32.61 by commit b8710128e201 ("inet: add RCU protection to inet->opt") (it's a backport of upstream commit f6d8bd051c39). In SCTP case, packet is already routed, hence we jump to the label 'packet_routed', but without rcu_read_lock(). After this label, rcu_read_unlock() is called unconditionally. Spotted-by: Guo Fengtian Signed-off-by: Nicolas Dichtel Signed-off-by: Willy Tarreau commit 8753987aa676b02e907d72713d78ff3a9de8a5ad Author: YOSHIFUJI Hideaki Date: Wed Apr 2 12:48:42 2014 +0900 isdnloop: Validate NUL-terminated strings from user. [ Upstream commit 77bc6bed7121936bb2e019a8c336075f4c8eef62 ] Return -EINVAL unless all of user-given strings are correctly NUL-terminated. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit dfce42782f140401b052eee1aec9b577a5c12c3b Author: Sasha Levin Date: Sat Mar 29 20:39:35 2014 -0400 rds: prevent dereference of a NULL device in rds_iw_laddr_check [ Upstream commit bf39b4247b8799935ea91d90db250ab608a58e50 ] Binding might result in a NULL device which is later dereferenced without checking. Signed-off-by: Sasha Levin Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 3979c481572d031934ce63d1e3e40555247a959a Author: Dan Carpenter Date: Tue Apr 8 12:23:09 2014 +0300 isdnloop: several buffer overflows [ Upstream commit 7563487cbf865284dcd35e9ef5a95380da046737 ] There are three buffer overflows addressed in this patch. 1) In isdnloop_fake_err() we add an 'E' to a 60 character string and then copy it into a 60 character buffer. I have made the destination buffer 64 characters and I'm changed the sprintf() to a snprintf(). 2) In isdnloop_parse_cmd(), p points to a 6 characters into a 60 character buffer so we have 54 characters. The ->eazlist[] is 11 characters long. I have modified the code to return if the source buffer is too long. 3) In isdnloop_command() the cbuf[] array was 60 characters long but the max length of the string then can be up to 79 characters. I made the cbuf array 80 characters long and changed the sprintf() to snprintf(). I also removed the temporary "dial" buffer and changed it to use "p" directly. Unfortunately, we pass the "cbuf" string from isdnloop_command() to isdnloop_writecmd() which truncates anything over 60 characters to make it fit in card->omsg[]. (It can accept values up to 255 characters so long as there is a '\n' character every 60 characters). For now I have just fixed the memory corruption bug and left the other problems in this driver alone. Signed-off-by: Dan Carpenter Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 4f89fd9ae4e7bb3f8268c742c7d35d267d3c10a2 Author: Pablo Neira Date: Tue Apr 1 19:38:44 2014 +0200 netlink: don't compare the nul-termination in nla_strcmp [ Upstream commit 8b7b932434f5eee495b91a2804f5b64ebb2bc835 ] nla_strcmp compares the string length plus one, so it's implicitly including the nul-termination in the comparison. int nla_strcmp(const struct nlattr *nla, const char *str) { int len = strlen(str) + 1; ... d = memcmp(nla_data(nla), str, len); However, if NLA_STRING is used, userspace can send us a string without the nul-termination. This is a problem since the string comparison will not match as the last byte may be not the nul-termination. Fix this by skipping the comparison of the nul-termination if the attribute data is nul-terminated. Suggested by Thomas Graf. Cc: Florian Westphal Cc: Thomas Graf Signed-off-by: Pablo Neira Ayuso Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit a452ae0af4da31194e70c76b070bc60a31b985c6 Author: Matthew Leach Date: Tue Mar 11 11:58:27 2014 +0000 net: socket: error on a negative msg_namelen [ Upstream commit dbb490b96584d4e958533fb637f08b557f505657 ] When copying in a struct msghdr from the user, if the user has set the msg_namelen parameter to a negative value it gets clamped to a valid size due to a comparison between signed and unsigned values. Ensure the syscall errors when the user passes in a negative value. Signed-off-by: Matthew Leach Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 3c3e5e546a2a8fe6b2d5d5e5e3385430999ce585 Author: Daniel Borkmann Date: Tue Mar 4 16:35:51 2014 +0100 net: sctp: fix skb leakage in COOKIE ECHO path of chunk->auth_chunk [ Upstream commit c485658bae87faccd7aed540fd2ca3ab37992310 ] While working on ec0223ec48a9 ("net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable"), we noticed that there's a skb memory leakage in the error path. Running the same reproducer as in ec0223ec48a9 and by unconditionally jumping to the error label (to simulate an error condition) in sctp_sf_do_5_1D_ce() receive path lets kmemleak detector bark about the unfreed chunk->auth_chunk skb clone: Unreferenced object 0xffff8800b8f3a000 (size 256): comm "softirq", pid 0, jiffies 4294769856 (age 110.757s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 89 ab 75 5e d4 01 58 13 00 00 00 00 00 00 00 00 ..u^..X......... backtrace: [] kmemleak_alloc+0x4e/0xb0 [] kmem_cache_alloc+0xc8/0x210 [] skb_clone+0x49/0xb0 [] sctp_endpoint_bh_rcv+0x1d9/0x230 [sctp] [] sctp_inq_push+0x4c/0x70 [sctp] [] sctp_rcv+0x82e/0x9a0 [sctp] [] ip_local_deliver_finish+0xa8/0x210 [] nf_reinject+0xbf/0x180 [] nfqnl_recv_verdict+0x1d2/0x2b0 [nfnetlink_queue] [] nfnetlink_rcv_msg+0x14b/0x250 [nfnetlink] [] netlink_rcv_skb+0xa9/0xc0 [] nfnetlink_rcv+0x23f/0x408 [nfnetlink] [] netlink_unicast+0x168/0x250 [] netlink_sendmsg+0x2e1/0x3f0 [] sock_sendmsg+0x8b/0xc0 [] ___sys_sendmsg+0x369/0x380 What happens is that commit bbd0d59809f9 clones the skb containing the AUTH chunk in sctp_endpoint_bh_rcv() when having the edge case that an endpoint requires COOKIE-ECHO chunks to be authenticated: ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ----------> <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] --------- ------------------ AUTH; COOKIE-ECHO ----------------> <-------------------- COOKIE-ACK --------------------- When we enter sctp_sf_do_5_1D_ce() and before we actually get to the point where we process (and subsequently free) a non-NULL chunk->auth_chunk, we could hit the "goto nomem_init" path from an error condition and thus leave the cloned skb around w/o freeing it. The fix is to centrally free such clones in sctp_chunk_destroy() handler that is invoked from sctp_chunk_free() after all refs have dropped; and also move both kfree_skb(chunk->auth_chunk) there, so that chunk->auth_chunk is either NULL (since sctp_chunkify() allocs new chunks through kmem_cache_zalloc()) or non-NULL with a valid skb pointer. chunk->skb and chunk->auth_chunk are the only skbs in the sctp_chunk structure that need to be handeled. While at it, we should use consume_skb() for both. It is the same as dev_kfree_skb() but more appropriately named as we are not a device but a protocol. Also, this effectively replaces the kfree_skb() from both invocations into consume_skb(). Functions are the same only that kfree_skb() assumes that the frame was being dropped after a failure (e.g. for tools like drop monitor), usage of consume_skb() seems more appropriate in function sctp_chunk_destroy() though. Fixes: bbd0d59809f9 ("[SCTP]: Implement the receive and verification of AUTH chunk") Signed-off-by: Daniel Borkmann Cc: Vlad Yasevich Cc: Neil Horman Acked-by: Vlad Yasevich Acked-by: Neil Horman Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 57b7a1c227f4d069878e9d4199a11e53e99c3240 Author: Daniel Borkmann Date: Mon Mar 3 17:23:04 2014 +0100 net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable [ Upstream commit ec0223ec48a90cb605244b45f7c62de856403729 ] RFC4895 introduced AUTH chunks for SCTP; during the SCTP handshake RANDOM; CHUNKS; HMAC-ALGO are negotiated (CHUNKS being optional though): ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ----------> <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] --------- -------------------- COOKIE-ECHO --------------------> <-------------------- COOKIE-ACK --------------------- A special case is when an endpoint requires COOKIE-ECHO chunks to be authenticated: ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ----------> <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] --------- ------------------ AUTH; COOKIE-ECHO ----------------> <-------------------- COOKIE-ACK --------------------- RFC4895, section 6.3. Receiving Authenticated Chunks says: The receiver MUST use the HMAC algorithm indicated in the HMAC Identifier field. If this algorithm was not specified by the receiver in the HMAC-ALGO parameter in the INIT or INIT-ACK chunk during association setup, the AUTH chunk and all the chunks after it MUST be discarded and an ERROR chunk SHOULD be sent with the error cause defined in Section 4.1. [...] If no endpoint pair shared key has been configured for that Shared Key Identifier, all authenticated chunks MUST be silently discarded. [...] When an endpoint requires COOKIE-ECHO chunks to be authenticated, some special procedures have to be followed because the reception of a COOKIE-ECHO chunk might result in the creation of an SCTP association. If a packet arrives containing an AUTH chunk as a first chunk, a COOKIE-ECHO chunk as the second chunk, and possibly more chunks after them, and the receiver does not have an STCB for that packet, then authentication is based on the contents of the COOKIE-ECHO chunk. In this situation, the receiver MUST authenticate the chunks in the packet by using the RANDOM parameters, CHUNKS parameters and HMAC_ALGO parameters obtained from the COOKIE-ECHO chunk, and possibly a local shared secret as inputs to the authentication procedure specified in Section 6.3. If authentication fails, then the packet is discarded. If the authentication is successful, the COOKIE-ECHO and all the chunks after the COOKIE-ECHO MUST be processed. If the receiver has an STCB, it MUST process the AUTH chunk as described above using the STCB from the existing association to authenticate the COOKIE-ECHO chunk and all the chunks after it. [...] Commit bbd0d59809f9 introduced the possibility to receive and verification of AUTH chunk, including the edge case for authenticated COOKIE-ECHO. On reception of COOKIE-ECHO, the function sctp_sf_do_5_1D_ce() handles processing, unpacks and creates a new association if it passed sanity checks and also tests for authentication chunks being present. After a new association has been processed, it invokes sctp_process_init() on the new association and walks through the parameter list it received from the INIT chunk. It checks SCTP_PARAM_RANDOM, SCTP_PARAM_HMAC_ALGO and SCTP_PARAM_CHUNKS, and copies them into asoc->peer meta data (peer_random, peer_hmacs, peer_chunks) in case sysctl -w net.sctp.auth_enable=1 is set. If in INIT's SCTP_PARAM_SUPPORTED_EXT parameter SCTP_CID_AUTH is set, peer_random != NULL and peer_hmacs != NULL the peer is to be assumed asoc->peer.auth_capable=1, in any other case asoc->peer.auth_capable=0. Now, if in sctp_sf_do_5_1D_ce() chunk->auth_chunk is available, we set up a fake auth chunk and pass that on to sctp_sf_authenticate(), which at latest in sctp_auth_calculate_hmac() reliably dereferences a NULL pointer at position 0..0008 when setting up the crypto key in crypto_hash_setkey() by using asoc->asoc_shared_key that is NULL as condition key_id == asoc->active_key_id is true if the AUTH chunk was injected correctly from remote. This happens no matter what net.sctp.auth_enable sysctl says. The fix is to check for net->sctp.auth_enable and for asoc->peer.auth_capable before doing any operations like sctp_sf_authenticate() as no key is activated in sctp_auth_asoc_init_active_key() for each case. Now as RFC4895 section 6.3 states that if the used HMAC-ALGO passed from the INIT chunk was not used in the AUTH chunk, we SHOULD send an error; however in this case it would be better to just silently discard such a maliciously prepared handshake as we didn't even receive a parameter at all. Also, as our endpoint has no shared key configured, section 6.3 says that MUST silently discard, which we are doing from now onwards. Before calling sctp_sf_pdiscard(), we need not only to free the association, but also the chunk->auth_chunk skb, as commit bbd0d59809f9 created a skb clone in that case. I have tested this locally by using netfilter's nfqueue and re-injecting packets into the local stack after maliciously modifying the INIT chunk (removing RANDOM; HMAC-ALGO param) and the SCTP packet containing the COOKIE_ECHO (injecting AUTH chunk before COOKIE_ECHO). Fixed with this patch applied. Fixes: bbd0d59809f9 ("[SCTP]: Implement the receive and verification of AUTH chunk") Signed-off-by: Daniel Borkmann Cc: Vlad Yasevich Cc: Neil Horman Acked-by: Vlad Yasevich Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit c21303a5e972380414aa202c927cd25190e943c2 Author: Michael Chan Date: Fri Feb 28 15:05:10 2014 -0800 tg3: Don't check undefined error bits in RXBD [ Upstream commit d7b95315cc7f441418845a165ee56df723941487 ] Redefine the RXD_ERR_MASK to include only relevant error bits. This fixes a customer reported issue of randomly dropping packets on the 5719. Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 6421a2a4c25e94eab1f93628090e833acf4a8b43 Author: Jason Wang Date: Fri Feb 21 13:08:04 2014 +0800 virtio-net: alloc big buffers also when guest can receive UFO [ Upstream commit 0e7ede80d929ff0f830c44a543daa1acd590c749 ] We should alloc big buffers also when guest can receive UFO packets to let the big packets fit into guest rx buffer. Fixes 5c5167515d80f78f6bb538492c423adcae31ad65 (virtio-net: Allow UFO feature to be set and advertised.) Cc: Rusty Russell Cc: Michael S. Tsirkin Cc: Sridhar Samudrala Signed-off-by: Jason Wang Acked-by: Michael S. Tsirkin Acked-by: Rusty Russell Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit dc66dd92bd8f9ebfea5e3f72a270c06f23670f92 Author: Daniel Borkmann Date: Mon Feb 17 12:11:11 2014 +0100 net: sctp: fix sctp_connectx abi for ia32 emulation/compat mode [ Upstream commit ffd5939381c609056b33b7585fb05a77b4c695f3 ] SCTP's sctp_connectx() abi breaks for 64bit kernels compiled with 32bit emulation (e.g. ia32 emulation or x86_x32). Due to internal usage of 'struct sctp_getaddrs_old' which includes a struct sockaddr pointer, sizeof(param) check will always fail in kernel as the structure in 64bit kernel space is 4bytes larger than for user binaries compiled in 32bit mode. Thus, applications making use of sctp_connectx() won't be able to run under such circumstances. Introduce a compat interface in the kernel to deal with such situations by using a 'struct compat_sctp_getaddrs_old' structure where user data is copied into it, and then sucessively transformed into a 'struct sctp_getaddrs_old' structure with the help of compat_ptr(). That fixes sctp_connectx() abi without any changes needed in user space, and lets the SCTP test suite pass when compiled in 32bit and run on 64bit kernels. Fixes: f9c67811ebc0 ("sctp: Fix regression introduced by new sctp_connectx api") Signed-off-by: Daniel Borkmann Acked-by: Neil Horman Acked-by: Vlad Yasevich Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 452009dc786507a5f76c6353e7e98e6a4f920292 Author: Jiri Bohac Date: Fri Feb 14 18:13:50 2014 +0100 bonding: 802.3ad: make aggregator_identifier bond-private [ Upstream commit 163c8ff30dbe473abfbb24a7eac5536c87f3baa9 ] aggregator_identifier is used to assign unique aggregator identifiers to aggregators of a bond during device enslaving. aggregator_identifier is currently a global variable that is zeroed in bond_3ad_initialize(). This sequence will lead to duplicate aggregator identifiers for eth1 and eth3: create bond0 change bond0 mode to 802.3ad enslave eth0 to bond0 //eth0 gets agg id 1 enslave eth1 to bond0 //eth1 gets agg id 2 create bond1 change bond1 mode to 802.3ad enslave eth2 to bond1 //aggregator_identifier is reset to 0 //eth2 gets agg id 1 enslave eth3 to bond0 //eth3 gets agg id 2 Fix this by making aggregator_identifier private to the bond. Signed-off-by: Jiri Bohac Acked-by: Veaceslav Falico Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit d7b6372b76ada3c89363db5b53b9c954860389eb Author: Nithin Sujir Date: Thu Feb 6 14:13:05 2014 -0800 tg3: Fix deadlock in tg3_change_mtu() [ Upstream commit c6993dfd7db9b0c6b7ca7503a56fda9236a4710f ] Quoting David Vrabel - "5780 cards cannot have jumbo frames and TSO enabled together. When jumbo frames are enabled by setting the MTU, the TSO feature must be cleared. This is done indirectly by calling netdev_update_features() which will call tg3_fix_features() to actually clear the flags. netdev_update_features() will also trigger a new netlink message for the feature change event which will result in a call to tg3_get_stats64() which deadlocks on the tg3 lock." tg3_set_mtu() does not need to be under the tg3 lock since converting the flags to use set_bit(). Move it out to after tg3_netif_stop(). Reported-by: David Vrabel Tested-by: David Vrabel Signed-off-by: Michael Chan Signed-off-by: Nithin Nayak Sujir Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit bfdd0948bd10f5208f032b92f5071b2eb9cf9f43 Author: Maciej Zenczykowski Date: Fri Feb 7 16:23:48 2014 -0800 net: fix 'ip rule' iif/oif device rename [ Upstream commit 946c032e5a53992ea45e062ecb08670ba39b99e3 ] ip rules with iif/oif references do not update: (detach/attach) across interface renames. Signed-off-by: Maciej Zenczykowski CC: Willem de Bruijn CC: Eric Dumazet CC: Chris Davis CC: Carlo Contavalli Google-Bug-Id: 12936021 Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit a5e83991e904ac4071efc6a4035a441f993ae45e Author: Neal Cardwell Date: Sun Feb 2 20:40:13 2014 -0500 inet_diag: fix inet_diag_dump_icsk() timewait socket state logic [ Based upon upstream commit 70315d22d3c7383f9a508d0aab21e2eb35b2303a ] Fix inet_diag_dump_icsk() to reflect the fact that both TIME_WAIT and FIN_WAIT2 connections are represented by inet_timewait_sock (not just TIME_WAIT). Thus: (a) We need to iterate through the time_wait buckets if the user wants either TIME_WAIT or FIN_WAIT2. (Before fixing this, "ss -nemoi state fin-wait-2" would not return any sockets, even if there were some in FIN_WAIT2.) (b) We need to check tw_substate to see if the user wants to dump sockets in the particular substate (TIME_WAIT or FIN_WAIT2) that a given connection is in. (Before fixing this, "ss -nemoi state time-wait" would actually return sockets in state FIN_WAIT2.) An analogous fix is in v3.13: 70315d22d3c7383f9a508d0aab21e2eb35b2303a ("inet_diag: fix inet_diag_dump_icsk() to use correct state for timewait sockets") but that patch is quite different because 3.13 code is very different in this area due to the unification of TCP hash tables in 05dbc7b ("tcp/dccp: remove twchain") in v3.13-rc1. I tested that this applies cleanly between v3.3 and v3.12, and tested that it works in both 3.3 and 3.12. It does not apply cleanly to 3.2 and earlier (though it makes semantic sense), and semantically is not the right fix for 3.13 and beyond (as mentioned above). Signed-off-by: Neal Cardwell Cc: Eric Dumazet Acked-by: Eric Dumazet Signed-off-by: Willy Tarreau commit 6a6c028da0e743587e84b3d4a7a6d8bb613adcdd Author: Daniel Borkmann Date: Mon Dec 30 23:40:50 2013 +0100 net: llc: fix use after free in llc_ui_recvmsg [ Upstream commit 4d231b76eef6c4a6bd9c96769e191517765942cb ] While commit 30a584d944fb fixes datagram interface in LLC, a use after free bug has been introduced for SOCK_STREAM sockets that do not make use of MSG_PEEK. The flow is as follow ... if (!(flags & MSG_PEEK)) { ... sk_eat_skb(sk, skb, false); ... } ... if (used + offset < skb->len) continue; ... where sk_eat_skb() calls __kfree_skb(). Therefore, cache original length and work on skb_len to check partial reads. Fixes: 30a584d944fb ("[LLX]: SOCK_DGRAM interface fixes") Signed-off-by: Daniel Borkmann Cc: Stephen Hemminger Cc: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit db091f2ff42280daa4a00dc8b4899ff959e0c24d Author: Florian Westphal Date: Mon Dec 23 00:32:31 2013 +0100 net: rose: restore old recvmsg behavior [ Upstream commit f81152e35001e91997ec74a7b4e040e6ab0acccf ] recvmsg handler in net/rose/af_rose.c performs size-check ->msg_namelen. After commit f3d3342602f8bcbf37d7c46641cb9bca7618eb1c (net: rework recvmsg handler msg_name and msg_namelen logic), we now always take the else branch due to namelen being initialized to 0. Digging in netdev-vger-cvs git repo shows that msg_namelen was initialized with a fixed-size since at least 1995, so the else branch was never taken. Compile tested only. Signed-off-by: Florian Westphal Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 8343966188413732d66f044ff9fac730808f4207 Author: Sasha Levin Date: Wed Dec 18 23:49:42 2013 -0500 rds: prevent dereference of a NULL device [ Upstream commit c2349758acf1874e4c2b93fe41d072336f1a31d0 ] Binding might result in a NULL device, which is dereferenced causing this BUG: [ 1317.260548] BUG: unable to handle kernel NULL pointer dereference at 000000000000097 4 [ 1317.261847] IP: [] rds_ib_laddr_check+0x82/0x110 [ 1317.263315] PGD 418bcb067 PUD 3ceb21067 PMD 0 [ 1317.263502] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 1317.264179] Dumping ftrace buffer: [ 1317.264774] (ftrace buffer empty) [ 1317.265220] Modules linked in: [ 1317.265824] CPU: 4 PID: 836 Comm: trinity-child46 Tainted: G W 3.13.0-rc4- next-20131218-sasha-00013-g2cebb9b-dirty #4159 [ 1317.267415] task: ffff8803ddf33000 ti: ffff8803cd31a000 task.ti: ffff8803cd31a000 [ 1317.268399] RIP: 0010:[] [] rds_ib_laddr_check+ 0x82/0x110 [ 1317.269670] RSP: 0000:ffff8803cd31bdf8 EFLAGS: 00010246 [ 1317.270230] RAX: 0000000000000000 RBX: ffff88020b0dd388 RCX: 0000000000000000 [ 1317.270230] RDX: ffffffff8439822e RSI: 00000000000c000a RDI: 0000000000000286 [ 1317.270230] RBP: ffff8803cd31be38 R08: 0000000000000000 R09: 0000000000000000 [ 1317.270230] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 [ 1317.270230] R13: 0000000054086700 R14: 0000000000a25de0 R15: 0000000000000031 [ 1317.270230] FS: 00007ff40251d700(0000) GS:ffff88022e200000(0000) knlGS:000000000000 0000 [ 1317.270230] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1317.270230] CR2: 0000000000000974 CR3: 00000003cd478000 CR4: 00000000000006e0 [ 1317.270230] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1317.270230] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000090602 [ 1317.270230] Stack: [ 1317.270230] 0000000054086700 5408670000a25de0 5408670000000002 0000000000000000 [ 1317.270230] ffffffff84223542 00000000ea54c767 0000000000000000 ffffffff86d26160 [ 1317.270230] ffff8803cd31be68 ffffffff84223556 ffff8803cd31beb8 ffff8800c6765280 [ 1317.270230] Call Trace: [ 1317.270230] [] ? rds_trans_get_preferred+0x42/0xa0 [ 1317.270230] [] rds_trans_get_preferred+0x56/0xa0 [ 1317.270230] [] rds_bind+0x73/0xf0 [ 1317.270230] [] SYSC_bind+0x92/0xf0 [ 1317.270230] [] ? context_tracking_user_exit+0xb8/0x1d0 [ 1317.270230] [] ? trace_hardirqs_on+0xd/0x10 [ 1317.270230] [] ? syscall_trace_enter+0x32/0x290 [ 1317.270230] [] SyS_bind+0xe/0x10 [ 1317.270230] [] tracesys+0xdd/0xe2 [ 1317.270230] Code: 00 8b 45 cc 48 8d 75 d0 48 c7 45 d8 00 00 00 00 66 c7 45 d0 02 00 89 45 d4 48 89 df e8 78 49 76 ff 41 89 c4 85 c0 75 0c 48 8b 03 <80> b8 74 09 00 00 01 7 4 06 41 bc 9d ff ff ff f6 05 2a b6 c2 02 [ 1317.270230] RIP [] rds_ib_laddr_check+0x82/0x110 [ 1317.270230] RSP [ 1317.270230] CR2: 0000000000000974 Signed-off-by: Sasha Levin Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 7dd522c85d315c25f5ea43e5117a3a4e4f977155 Author: Salva Peiró Date: Tue Dec 17 10:06:30 2013 +0100 hamradio/yam: fix info leak in ioctl [ Upstream commit 8e3fbf870481eb53b2d3a322d1fc395ad8b367ed ] The yam_ioctl() code fails to initialise the cmd field of the struct yamdrv_ioctl_cfg. Add an explicit memset(0) before filling the structure to avoid the 4-byte info leak. Signed-off-by: Salva Peiró Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 7985df5a4f2b8431a831c3d4b12dff14e0efa12e Author: Wenliang Fan Date: Tue Dec 17 11:25:28 2013 +0800 drivers/net/hamradio: Integer overflow in hdlcdrv_ioctl() [ Upstream commit e9db5c21d3646a6454fcd04938dd215ac3ab620a ] The local variable 'bi' comes from userspace. If userspace passed a large number to 'bi.data.calibrate', there would be an integer overflow in the following line: s->hdlctx.calibrate = bi.data.calibrate * s->par.bitrate / 16; Signed-off-by: Wenliang Fan Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit dc332885abd71f603e01d8b27636a935c313d1c2 Author: Sasha Levin Date: Fri Dec 13 10:54:22 2013 -0500 net: unix: allow bind to fail on mutex lock [ Upstream commit 37ab4fa7844a044dc21fde45e2a0fc2f3c3b6490 ] This is similar to the set_peek_off patch where calling bind while the socket is stuck in unix_dgram_recvmsg() will block and cause a hung task spew after a while. This is also the last place that did a straightforward mutex_lock(), so there shouldn't be any more of these patches. Signed-off-by: Sasha Levin Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 9f308b7eb7b44f93fc814dce31f58a200f73d19c Author: Changli Gao Date: Sun Dec 8 09:36:56 2013 -0500 net: drop_monitor: fix the value of maxattr [ Upstream commit d323e92cc3f4edd943610557c9ea1bb4bb5056e8 ] maxattr in genl_family should be used to save the max attribute type, but not the max command type. Drop monitor doesn't support any attributes, so we should leave it as zero. Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 8f8a85b82f37d02e845b988ef3b751dcc1f4b7f8 Author: fan.du Date: Sun Dec 1 16:28:48 2013 +0800 {pktgen, xfrm} Update IPv4 header total len and checksum after tranformation [ Upstream commit 3868204d6b89ea373a273e760609cb08020beb1a ] commit a553e4a6317b2cfc7659542c10fe43184ffe53da ("[PKTGEN]: IPSEC support") tried to support IPsec ESP transport transformation for pktgen, but acctually this doesn't work at all for two reasons(The orignal transformed packet has bad IPv4 checksum value, as well as wrong auth value, reported by wireshark) - After transpormation, IPv4 header total length needs update, because encrypted payload's length is NOT same as that of plain text. - After transformation, IPv4 checksum needs re-caculate because of payload has been changed. With this patch, armmed pktgen with below cofiguration, Wireshark is able to decrypted ESP packet generated by pktgen without any IPv4 checksum error or auth value error. pgset "flag IPSEC" pgset "flows 1" Signed-off-by: Fan Du Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit b9e0d1d183da04f802c6e58484e2fce535e3964b Author: Hannes Frederic Sowa Date: Fri Nov 29 06:39:44 2013 +0100 ipv6: fix possible seqlock deadlock in ip6_finish_output2 [ Upstream commit 7f88c6b23afbd31545c676dea77ba9593a1a14bf ] IPv6 stats are 64 bits and thus are protected with a seqlock. By not disabling bottom-half we could deadlock here if we don't disable bh and a softirq reentrantly updates the same mib. Cc: Eric Dumazet Signed-off-by: Hannes Frederic Sowa Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 1bbc9e9187f0cff97caaa9b9f85a020092e6ab0b Author: Eric Dumazet Date: Thu Nov 28 09:51:22 2013 -0800 inet: fix possible seqlock deadlocks [ Upstream commit f1d8cba61c3c4b1eb88e507249c4cb8d635d9a76 ] In commit c9e9042994d3 ("ipv4: fix possible seqlock deadlock") I left another places where IP_INC_STATS_BH() were improperly used. udp_sendmsg(), ping_v4_sendmsg() and tcp_v4_connect() are called from process context, not from softirq context. This was detected by lockdep seqlock support. Reported-by: jongman heo Fixes: 584bdf8cbdf6 ("[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP") Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind") Signed-off-by: Eric Dumazet Cc: Hannes Frederic Sowa Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit e02fadf846f7f7528fcda3269cde203a9a397a4b Author: Ding Tianhong Date: Sat Dec 7 22:12:05 2013 +0800 bridge: flush br's address entry in fdb when remove the bridge dev [ Upstream commit f873042093c0b418d2351fe142222b625c740149 ] When the following commands are executed: brctl addbr br0 ifconfig br0 hw ether rmmod bridge The calltrace will occur: [ 563.312114] device eth1 left promiscuous mode [ 563.312188] br0: port 1(eth1) entered disabled state [ 563.468190] kmem_cache_destroy bridge_fdb_cache: Slab cache still has objects [ 563.468197] CPU: 6 PID: 6982 Comm: rmmod Tainted: G O 3.12.0-0.7-default+ #9 [ 563.468199] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 563.468200] 0000000000000880 ffff88010f111e98 ffffffff814d1c92 ffff88010f111eb8 [ 563.468204] ffffffff81148efd ffff88010f111eb8 0000000000000000 ffff88010f111ec8 [ 563.468206] ffffffffa062a270 ffff88010f111ed8 ffffffffa063ac76 ffff88010f111f78 [ 563.468209] Call Trace: [ 563.468218] [] dump_stack+0x6a/0x78 [ 563.468234] [] kmem_cache_destroy+0xfd/0x100 [ 563.468242] [] br_fdb_fini+0x10/0x20 [bridge] [ 563.468247] [] br_deinit+0x4e/0x50 [bridge] [ 563.468254] [] SyS_delete_module+0x199/0x2b0 [ 563.468259] [] system_call_fastpath+0x16/0x1b [ 570.377958] Bridge firewalling registered --------------------------- cut here ------------------------------- The reason is that when the bridge dev's address is changed, the br_fdb_change_mac_address() will add new address in fdb, but when the bridge was removed, the address entry in the fdb did not free, the bridge_fdb_cache still has objects when destroy the cache, Fix this by flushing the bridge address entry when removing the bridge. v2: according to the Toshiaki Makita and Vlad's suggestion, I only delete the vlan0 entry, it still have a leak here if the vlan id is other number, so I need to call fdb_delete_by_port(br, NULL, 1) to flush all entries whose dst is NULL for the bridge. Suggested-by: Toshiaki Makita Suggested-by: Vlad Yasevich Signed-off-by: Ding Tianhong Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 27770ad06226eda3cbf16cb05eea5c11e08895d6 Author: Vlad Yasevich Date: Tue Nov 19 20:47:15 2013 -0500 net: core: Always propagate flag changes to interfaces [ Upstream commit d2615bf450694c1302d86b9cc8a8958edfe4c3a4 ] The following commit: b6c40d68ff6498b7f63ddf97cf0aa818d748dee7 net: only invoke dev->change_rx_flags when device is UP tried to fix a problem with VLAN devices and promiscuouse flag setting. The issue was that VLAN device was setting a flag on an interface that was down, thus resulting in bad promiscuity count. This commit blocked flag propagation to any device that is currently down. A later commit: deede2fabe24e00bd7e246eb81cd5767dc6fcfc7 vlan: Don't propagate flag changes on down interfaces fixed VLAN code to only propagate flags when the VLAN interface is up, thus fixing the same issue as above, only localized to VLAN. The problem we have now is that if we have create a complex stack involving multiple software devices like bridges, bonds, and vlans, then it is possible that the flags would not propagate properly to the physical devices. A simple examle of the scenario is the following: eth0----> bond0 ----> bridge0 ---> vlan50 If bond0 or eth0 happen to be down at the time bond0 is added to the bridge, then eth0 will never have promisc mode set which is currently required for operation as part of the bridge. As a result, packets with vlan50 will be dropped by the interface. The only 2 devices that implement the special flag handling are VLAN and DSA and they both have required code to prevent incorrect flag propagation. As a result we can remove the generic solution introduced in b6c40d68ff6498b7f63ddf97cf0aa818d748dee7 and leave it to the individual devices to decide whether they will block flag propagation or not. Reported-by: Stefan Priebe Suggested-by: Veaceslav Falico Signed-off-by: Vlad Yasevich Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit a74ae1680282a93e0d3953567992b06888833b5d Author: Ying Xue Date: Tue Nov 19 18:09:27 2013 +0800 atm: idt77252: fix dev refcnt leak [ Upstream commit b5de4a22f157ca345cdb3575207bf46402414bc1 ] init_card() calls dev_get_by_name() to get a network deceive. But it doesn't decrease network device reference count after the device is used. Signed-off-by: Ying Xue Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit e01013c7f968aa72d3981d3ab835dcde7451bef2 Author: Hannes Frederic Sowa Date: Sat Nov 23 07:22:33 2013 +0100 ipv6: fix leaking uninitialized port number of offender sockaddr [ Upstream commit 1fa4c710b6fe7b0aac9907240291b6fe6aafc3b8 ] Offenders don't have port numbers, so set it to 0. Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 2b232c68061e496e02f1fee12420f5b4c14b4860 Author: Dan Carpenter Date: Wed Nov 27 15:40:21 2013 +0300 net: clamp ->msg_namelen instead of returning an error [ Upstream commit db31c55a6fb245fdbb752a2ca4aefec89afabb06 ] If kmsg->msg_namelen > sizeof(struct sockaddr_storage) then in the original code that would lead to memory corruption in the kernel if you had audit configured. If you didn't have audit configured it was harmless. There are some programs such as beta versions of Ruby which use too large of a buffer and returning an error code breaks them. We should clamp the ->msg_namelen value instead. Fixes: 1661bf364ae9 ("net: heap overflow in __audit_sockaddr()") Reported-by: Eric Wong Signed-off-by: Dan Carpenter Tested-by: Eric Wong Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit dbf387a6491bf25a7aac8c17540376bff2a8a686 Author: Hannes Frederic Sowa Date: Sat Nov 23 00:46:12 2013 +0100 inet: fix addr_len/msg->msg_namelen assignment in recv_error and rxpmtu functions [ Upstream commit 85fbaa75037d0b6b786ff18658ddf0b4014ce2a4 ] Commit bceaa90240b6019ed73b49965eac7d167610be69 ("inet: prevent leakage of uninitialized memory to user in recv syscalls") conditionally updated addr_len if the msg_name is written to. The recv_error and rxpmtu functions relied on the recvmsg functions to set up addr_len before. As this does not happen any more we have to pass addr_len to those functions as well and set it to the size of the corresponding sockaddr length. This broke traceroute and such. Fixes: bceaa90240b6 ("inet: prevent leakage of uninitialized memory to user in recv syscalls") Reported-by: Brad Spengler Reported-by: Tom Labanowski Cc: mpb Cc: David S. Miller Cc: Eric Dumazet Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit b52a430ce161044be5e8c77a0e3275cc548e2558 Author: Hannes Frederic Sowa Date: Thu Nov 21 03:14:34 2013 +0100 net: add BUG_ON if kernel advertises msg_namelen > sizeof(struct sockaddr_storage) [ Upstream commit 68c6beb373955da0886d8f4f5995b3922ceda4be ] In that case it is probable that kernel code overwrote part of the stack. So we should bail out loudly here. The BUG_ON may be removed in future if we are sure all protocols are conformant. Suggested-by: Eric Dumazet Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 4485f23cb4c08b21f0a4434a65c8adf0026d589a Author: Hannes Frederic Sowa Date: Thu Nov 21 03:14:22 2013 +0100 net: rework recvmsg handler msg_name and msg_namelen logic CVE-2013-7266 BugLink: http://bugs.launchpad.net/bugs/1267081 This patch now always passes msg->msg_namelen as 0. recvmsg handlers must set msg_namelen to the proper size <= sizeof(struct sockaddr_storage) to return msg_name to the user. This prevents numerous uninitialized memory leaks we had in the recvmsg handlers and makes it harder for new code to accidentally leak uninitialized memory. Optimize for the case recvfrom is called with NULL as address. We don't need to copy the address at all, so set it to NULL before invoking the recvmsg handler. We can do so, because all the recvmsg handlers must cope with the case a plain read() is called on them. read() also sets msg_name to NULL. Also document these changes in include/linux/net.h as suggested by David Miller. Changes since RFC: Set msg->msg_name = NULL if user specified a NULL in msg_name but had a non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't affect sendto as it would bail out earlier while trying to copy-in the address. It also more naturally reflects the logic by the callers of verify_iovec. With this change in place I could remove " if (!uaddr || msg_sys->msg_namelen == 0) msg->msg_name = NULL ". This change does not alter the user visible error logic as we ignore msg_namelen as long as msg_name is NULL. Also remove two unnecessary curly brackets in ___sys_recvmsg and change comments to netdev style. Cc: David Miller Suggested-by: Eric Dumazet Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller (back ported from commit f3d3342602f8bcbf37d7c46641cb9bca7618eb1c) Signed-off-by: Luis Henriques Acked-by: Andy Whitcroft Acked-by: Stefan Bader Signed-off-by: Tim Gardner Signed-off-by: Willy Tarreau commit f8e4f8fc477a233429d60e7ed4acdc5e09337af9 Author: Hannes Frederic Sowa Date: Mon Nov 18 04:20:45 2013 +0100 inet: prevent leakage of uninitialized memory to user in recv syscalls [ Upstream commit bceaa90240b6019ed73b49965eac7d167610be69 ] Only update *addr_len when we actually fill in sockaddr, otherwise we can return uninitialized memory from the stack to the caller in the recvfrom, recvmmsg and recvmsg syscalls. Drop the the (addr_len == NULL) checks because we only get called with a valid addr_len pointer either from sock_common_recvmsg or inet_recvmsg. If a blocking read waits on a socket which is concurrently shut down we now return zero and set msg_msgnamelen to 0. Reported-by: mpb Suggested-by: Eric Dumazet Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller [wt: no ieee802154, ping nor l2tp in 2.6.32] Signed-off-by: Willy Tarreau commit c091517bfcde5e59ef4f8f1355761dbe2b04c6ff Author: Eric Dumazet Date: Thu Nov 14 13:37:54 2013 -0800 ipv4: fix possible seqlock deadlock [ Upstream commit c9e9042994d37cbc1ee538c500e9da1bb9d1bcdf ] ip4_datagram_connect() being called from process context, it should use IP_INC_STATS() instead of IP_INC_STATS_BH() otherwise we can deadlock on 32bit arches, or get corruptions of SNMP counters. Fixes: 584bdf8cbdf6 ("[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP") Signed-off-by: Eric Dumazet Reported-by: Dave Jones Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit ed488d772a73690dab2e3152d3a313ebef69be86 Author: Dan Carpenter Date: Thu Nov 14 11:21:10 2013 +0300 isdnloop: use strlcpy() instead of strcpy() [ Upstream commit f9a23c84486ed350cce7bb1b2828abd1f6658796 ] These strings come from a copy_from_user() and there is no way to be sure they are NUL terminated. Signed-off-by: Dan Carpenter Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit f1a8a3ffd1fd6921933b3dc8fc08f140e0c7b7e1 Author: Nikolay Aleksandrov Date: Wed Nov 13 17:07:46 2013 +0100 bonding: fix two race conditions in bond_store_updelay/downdelay [ Upstream commit b869ccfab1e324507fa3596e3e1308444fb68227 ] This patch fixes two race conditions between bond_store_updelay/downdelay and bond_store_miimon which could lead to division by zero as miimon can be set to 0 while either updelay/downdelay are being set and thus miss the zero check in the beginning, the zero div happens because updelay/downdelay are stored as new_value / bond->params.miimon. Use rtnl to synchronize with miimon setting. CC: Jay Vosburgh CC: Andy Gospodarek CC: Veaceslav Falico Signed-off-by: Nikolay Aleksandrov Acked-by: Veaceslav Falico Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 77ef71a8ed5711d57733289f2ed93ceccd0a9bd8 Author: Daniel Borkmann Date: Mon Nov 11 12:20:32 2013 +0100 random32: fix off-by-one in seeding requirement [ Upstream commit 51c37a70aaa3f95773af560e6db3073520513912 ] For properly initialising the Tausworthe generator [1], we have a strict seeding requirement, that is, s1 > 1, s2 > 7, s3 > 15. Commit 697f8d0348 ("random32: seeding improvement") introduced a __seed() function that imposes boundary checks proposed by the errata paper [2] to properly ensure above conditions. However, we're off by one, as the function is implemented as: "return (x < m) ? x + m : x;", and called with __seed(X, 1), __seed(X, 7), __seed(X, 15). Thus, an unwanted seed of 1, 7, 15 would be possible, whereas the lower boundary should actually be of at least 2, 8, 16, just as GSL does. Fix this, as otherwise an initialization with an unwanted seed could have the effect that Tausworthe's PRNG properties cannot not be ensured. Note that this PRNG is *not* used for cryptography in the kernel. [1] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme.ps [2] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme2.ps Joint work with Hannes Frederic Sowa. Fixes: 697f8d0348a6 ("random32: seeding improvement") Cc: Stephen Hemminger Cc: Florian Weimer Cc: Theodore Ts'o Signed-off-by: Daniel Borkmann Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit e2c48fd8fd3bce6a8ff06e940b525b587aca3f13 Author: Duan Jiong Date: Fri Nov 8 09:56:53 2013 +0800 ipv6: use rt6_get_dflt_router to get default router in rt6_route_rcv [ Upstream commit f104a567e673f382b09542a8dc3500aa689957b4 ] As the rfc 4191 said, the Router Preference and Lifetime values in a ::/0 Route Information Option should override the preference and lifetime values in the Router Advertisement header. But when the kernel deals with a ::/0 Route Information Option, the rt6_get_route_info() always return NULL, that means that overriding will not happen, because those default routers were added without flag RTF_ROUTEINFO in rt6_add_dflt_router(). In order to deal with that condition, we should call rt6_get_dflt_router when the prefix length is 0. Signed-off-by: Duan Jiong Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 062e4edd6a774cd1dd25661cbe1e148aa6811e79 Author: Andreas Henriksson Date: Thu Nov 7 18:26:38 2013 +0100 net: Fix "ip rule delete table 256" [ Upstream commit 13eb2ab2d33c57ebddc57437a7d341995fc9138c ] When trying to delete a table >= 256 using iproute2 the local table will be deleted. The table id is specified as a netlink attribute when it needs more then 8 bits and iproute2 then sets the table field to RT_TABLE_UNSPEC (0). Preconditions to matching the table id in the rule delete code doesn't seem to take the "table id in netlink attribute" into condition so the frh_get_table helper function never gets to do its job when matching against current rule. Use the helper function twice instead of peaking at the table value directly. Originally reported at: http://bugs.debian.org/724783 Reported-by: Nicolas HICHER Signed-off-by: Andreas Henriksson Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit b0027af4a0539a91df88483f06939d7bc2c997c5 Author: Ying Xue Date: Thu Aug 16 12:09:07 2012 +0000 tipc: fix lockdep warning during bearer initialization [ Upstream commit 4225a398c1352a7a5c14dc07277cb5cc4473983b ] When the lockdep validator is enabled, it will report the below warning when we enable a TIPC bearer: [ INFO: possible irq lock inversion dependency detected ] --------------------------------------------------------- Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(ptype_lock); local_irq_disable(); lock(tipc_net_lock); lock(ptype_lock); lock(tipc_net_lock); *** DEADLOCK *** the shortest dependencies between 2nd lock and 1st lock: -> (ptype_lock){+.+...} ops: 10 { [...] SOFTIRQ-ON-W at: [] __lock_acquire+0x528/0x13e0 [] lock_acquire+0x90/0x100 [] _raw_spin_lock+0x38/0x50 [] dev_add_pack+0x3a/0x60 [] arp_init+0x1a/0x48 [] inet_init+0x181/0x27e [] do_one_initcall+0x34/0x170 [] kernel_init+0x110/0x1b2 [] kernel_thread_helper+0x6/0x10 [...] ... key at: [] ptype_lock+0x10/0x20 ... acquired at: [] lock_acquire+0x90/0x100 [] _raw_spin_lock+0x38/0x50 [] dev_add_pack+0x3a/0x60 [] enable_bearer+0xf2/0x140 [tipc] [] tipc_enable_bearer+0x1ba/0x450 [tipc] [] tipc_cfg_do_cmd+0x5c4/0x830 [tipc] [] handle_cmd+0x42/0xd0 [tipc] [] genl_rcv_msg+0x232/0x280 [] netlink_rcv_skb+0x86/0xb0 [] genl_rcv+0x1c/0x30 [] netlink_unicast+0x174/0x1f0 [] netlink_sendmsg+0x1eb/0x2d0 [] sock_aio_write+0x161/0x170 [] do_sync_write+0xac/0xf0 [] vfs_write+0x156/0x170 [] sys_write+0x42/0x70 [] sysenter_do_call+0x12/0x38 [...] } -> (tipc_net_lock){+..-..} ops: 4 { [...] IN-SOFTIRQ-R at: [] __lock_acquire+0x64a/0x13e0 [] lock_acquire+0x90/0x100 [] _raw_read_lock_bh+0x3d/0x50 [] tipc_recv_msg+0x1d/0x830 [tipc] [] recv_msg+0x3f/0x50 [tipc] [] __netif_receive_skb+0x22a/0x590 [] netif_receive_skb+0x2b/0xf0 [] pcnet32_poll+0x292/0x780 [] net_rx_action+0xfa/0x1e0 [] __do_softirq+0xae/0x1e0 [...] } >From the log, we can see three different call chains between CPU0 and CPU1: Time 0 on CPU0: kernel_init()->inet_init()->dev_add_pack() At time 0, the ptype_lock is held by CPU0 in dev_add_pack(); Time 1 on CPU1: tipc_enable_bearer()->enable_bearer()->dev_add_pack() At time 1, tipc_enable_bearer() first holds tipc_net_lock, and then wants to take ptype_lock to register TIPC protocol handler into the networking stack. But the ptype_lock has been taken by dev_add_pack() on CPU0, so at this time the dev_add_pack() running on CPU1 has to be busy looping. Time 2 on CPU0: netif_receive_skb()->recv_msg()->tipc_recv_msg() At time 2, an incoming TIPC packet arrives at CPU0, hence tipc_recv_msg() will be invoked. In tipc_recv_msg(), it first wants to hold tipc_net_lock. At the moment, below scenario happens: On CPU0, below is our sequence of taking locks: lock(ptype_lock)->lock(tipc_net_lock) On CPU1, our sequence of taking locks looks like: lock(tipc_net_lock)->lock(ptype_lock) Obviously deadlock may happen in this case. But please note the deadlock possibly doesn't occur at all when the first TIPC bearer is enabled. Before enable_bearer() -- running on CPU1 does not hold ptype_lock, so the TIPC receive handler (i.e. recv_msg()) is not registered successfully via dev_add_pack(), so the tipc_recv_msg() cannot be called by recv_msg() even if a TIPC message comes to CPU0. But when the second TIPC bearer is registered, the deadlock can perhaps really happen. To fix it, we will push the work of registering TIPC protocol handler into workqueue context. After the change, both paths taking ptype_lock are always in process contexts, thus, the deadlock should never occur. Signed-off-by: Ying Xue Signed-off-by: Jon Maloy Signed-off-by: Paul Gortmaker Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 43350267a502957eb232dfbde7b4bf9948dbaa59 Author: Jiri Bohac Date: Fri Aug 30 11:18:45 2013 +0200 ICMPv6: treat dest unreachable codes 5 and 6 as EACCES, not EPROTO [ Upstream commit 61e76b178dbe7145e8d6afa84bb4ccea71918994 ] RFC 4443 has defined two additional codes for ICMPv6 type 1 (destination unreachable) messages: 5 - Source address failed ingress/egress policy 6 - Reject route to destination Now they are treated as protocol error and icmpv6_err_convert() converts them to EPROTO. RFC 4443 says: "Codes 5 and 6 are more informative subsets of code 1." Treat codes 5 and 6 as code 1 (EACCES) Btw, connect() returning -EPROTO confuses firefox, so that fallback to other/IPv4 addresses does not work: https://bugzilla.mozilla.org/show_bug.cgi?id=910773 Signed-off-by: Jiri Bohac Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 8eb7263e80167c089c6e80fda94bfbebbecee9b5 Author: Thomas Graf Date: Tue Sep 3 13:37:01 2013 +0200 ipv6: Don't depend on per socket memory for neighbour discovery messages [ Upstream commit 25a6e6b84fba601eff7c28d30da8ad7cfbef0d43 ] Allocating skbs when sending out neighbour discovery messages currently uses sock_alloc_send_skb() based on a per net namespace socket and thus share a socket wmem buffer space. If a netdevice is temporarily unable to transmit due to carrier loss or for other reasons, the queued up ndisc messages will cosnume all of the wmem space and will thus prevent from any more skbs to be allocated even for netdevices that are able to transmit packets. The number of neighbour discovery messages sent is very limited, use of alloc_skb() bypasses the socket wmem buffer size enforcement while the manual call to skb_set_owner_w() maintains the socket reference needed for the IPv6 output path. This patch has orginally been posted by Eric Dumazet in a modified form. Signed-off-by: Thomas Graf Cc: Eric Dumazet Cc: Hannes Frederic Sowa Cc: Stephen Warren Cc: Fabio Estevam Tested-by: Fabio Estevam Tested-by: Stephen Warren Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit ede46183818537a2da12ad7bed33d8f2d3cb35d0 Author: Hannes Frederic Sowa Date: Fri Aug 16 13:30:07 2013 +0200 ipv6: drop packets with multiple fragmentation headers [ Upstream commit f46078cfcd77fa5165bf849f5e568a7ac5fa569c ] It is not allowed for an ipv6 packet to contain multiple fragmentation headers. So discard packets which were already reassembled by fragmentation logic and send back a parameter problem icmp. The updates for RFC 6980 will come in later, I have to do a bit more research here. Cc: YOSHIFUJI Hideaki Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 789ee2aee7f7765e5bd4825541b16e957151c9d1 Author: Hannes Frederic Sowa Date: Fri Aug 16 13:02:27 2013 +0200 ipv6: remove max_addresses check from ipv6_create_tempaddr [ Upstream commit 4b08a8f1bd8cb4541c93ec170027b4d0782dab52 ] Because of the max_addresses check attackers were able to disable privacy extensions on an interface by creating enough autoconfigured addresses: But the check is not actually needed: max_addresses protects the kernel to install too many ipv6 addresses on an interface and guards addrconf_prefix_rcv to install further addresses as soon as this limit is reached. We only generate temporary addresses in direct response of a new address showing up. As soon as we filled up the maximum number of addresses of an interface, we stop installing more addresses and thus also stop generating more temp addresses. Even if the attacker tries to generate a lot of temporary addresses by announcing a prefix and removing it again (lifetime == 0) we won't install more temp addresses, because the temporary addresses do count to the maximum number of addresses, thus we would stop installing new autoconfigured addresses when the limit is reached. This patch fixes CVE-2013-0343 (but other layer-2 attacks are still possible). Thanks to Ding Tianhong to bring this topic up again. Cc: Ding Tianhong Cc: George Kargiotakis Cc: P J P Cc: YOSHIFUJI Hideaki Signed-off-by: Hannes Frederic Sowa Acked-by: Ding Tianhong Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 8d4e8ab0d1ff4c2ecc836419685dcfde61422235 Author: Hannes Frederic Sowa Date: Wed Aug 7 02:34:31 2013 +0200 ipv6: don't stop backtracking in fib6_lookup_1 if subtree does not match [ Upstream commit 3e3be275851bc6fc90bfdcd732cd95563acd982b ] In case a subtree did not match we currently stop backtracking and return NULL (root table from fib_lookup). This could yield in invalid routing table lookups when using subtrees. Instead continue to backtrack until a valid subtree or node is found and return this match. Also remove unneeded NULL check. Reported-by: Teco Boot Cc: YOSHIFUJI Hideaki Cc: David Lamparter Cc: Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 91b5dbd86fedea48e1808df5496617a87b6f320c Author: Eric Dumazet Date: Mon Aug 5 20:05:12 2013 -0700 tcp: cubic: fix bug in bictcp_acked() [ Upstream commit cd6b423afd3c08b27e1fed52db828ade0addbc6b ] While investigating about strange increase of retransmit rates on hosts ~24 days after boot, Van found hystart was disabled if ca->epoch_start was 0, as following condition is true when tcp_time_stamp high order bit is set. (s32)(tcp_time_stamp - ca->epoch_start) < HZ Quoting Van : At initialization & after every loss ca->epoch_start is set to zero so I believe that the above line will turn off hystart as soon as the 2^31 bit is set in tcp_time_stamp & hystart will stay off for 24 days. I think we've observed that cubic's restart is too aggressive without hystart so this might account for the higher drop rate we observe. Diagnosed-by: Van Jacobson Signed-off-by: Eric Dumazet Cc: Neal Cardwell Cc: Yuchung Cheng Acked-by: Neal Cardwell Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit eedcafdc0abf54a33b2c021b74c30c7a96eee050 Author: Roman Gushchin Date: Fri Aug 2 18:36:40 2013 +0400 net: check net.core.somaxconn sysctl values [ Upstream commit 5f671d6b4ec3e6d66c2a868738af2cdea09e7509 ] It's possible to assign an invalid value to the net.core.somaxconn sysctl variable, because there is no checks at all. The sk_max_ack_backlog field of the sock structure is defined as unsigned short. Therefore, the backlog argument in inet_listen() shouldn't exceed USHRT_MAX. The backlog argument in the listen() syscall is truncated to the somaxconn value. So, the somaxconn value shouldn't exceed 65535 (USHRT_MAX). Also, negative values of somaxconn are meaningless. before: $ sysctl -w net.core.somaxconn=256 net.core.somaxconn = 256 $ sysctl -w net.core.somaxconn=65536 net.core.somaxconn = 65536 $ sysctl -w net.core.somaxconn=-100 net.core.somaxconn = -100 after: $ sysctl -w net.core.somaxconn=256 net.core.somaxconn = 256 $ sysctl -w net.core.somaxconn=65536 error: "Invalid argument" setting key "net.core.somaxconn" $ sysctl -w net.core.somaxconn=-100 error: "Invalid argument" setting key "net.core.somaxconn" Based on a prior patch from Changli Gao. Signed-off-by: Roman Gushchin Reported-by: Changli Gao Suggested-by: Eric Dumazet Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 63251ed09df444c561f7dddf2d97539e1257a7b6 Author: stephen hemminger Date: Thu Aug 1 22:32:07 2013 -0700 htb: fix sign extension bug [ Upstream commit cbd375567f7e4811b1c721f75ec519828ac6583f ] When userspace passes a large priority value the assignment of the unsigned value hopt->prio to signed int cl->prio causes cl->prio to become negative and the comparison is with TC_HTB_NUMPRIO is always false. The result is that HTB crashes by referencing outside the array when processing packets. With this patch the large value wraps around like other values outside the normal range. See: https://bugzilla.kernel.org/show_bug.cgi?id=60669 Signed-off-by: Stephen Hemminger Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit f5a2119c90fba5150930082491bceee731f52c90 Author: Dan Carpenter Date: Tue Jul 30 13:23:39 2013 +0300 net_sched: info leak in atm_tc_dump_class() [ Upstream commit 8cb3b9c3642c0263d48f31d525bcee7170eedc20 ] The "pvc" struct has a hole after pvc.sap_family which is not cleared. Signed-off-by: Dan Carpenter Reviewed-by: Jiri Pirko Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit f084fd38510a018f611b2c630db43f266694524f Author: Dan Carpenter Date: Sun Jul 28 23:04:45 2013 +0300 af_key: more info leaks in pfkey messages [ Upstream commit ff862a4668dd6dba962b1d2d8bd344afa6375683 ] This is inspired by a5cc68f3d6 "af_key: fix info leaks in notify messages". There are some struct members which don't get initialized and could disclose small amounts of private information. Acked-by: Mathias Krause Signed-off-by: Dan Carpenter Acked-by: Steffen Klassert Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit dd0938fafe464ffddadf805850fc29967f9d393f Author: David S. Miller Date: Tue Jul 30 00:16:21 2013 -0700 net_sched: Fix stack info leak in cbq_dump_wrr(). [ Upstream commit a0db856a95a29efb1c23db55c02d9f0ff4f0db48 ] Make sure the reserved fields, and padding (if any), are fully initialized. Based upon a patch by Dan Carpenter and feedback from Joe Perches. Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit e409e24ad51215dc0e816aa42e2914fc7911e93d Author: Neil Horman Date: Wed Jun 12 14:26:44 2013 -0400 sctp: fully initialize sctp_outq in sctp_outq_init [ Upstream commit c5c7774d7eb4397891edca9ebdf750ba90977a69 ] In commit 2f94aabd9f6c925d77aecb3ff020f1cc12ed8f86 (refactor sctp_outq_teardown to insure proper re-initalization) we modified sctp_outq_teardown to use sctp_outq_init to fully re-initalize the outq structure. Steve West recently asked me why I removed the q->error = 0 initalization from sctp_outq_teardown. I did so because I was operating under the impression that sctp_outq_init would properly initalize that value for us, but it doesn't. sctp_outq_init operates under the assumption that the outq struct is all 0's (as it is when called from sctp_association_init), but using it in __sctp_outq_teardown violates that assumption. We should do a memset in sctp_outq_init to ensure that the entire structure is in a known state there instead. Signed-off-by: Neil Horman Reported-by: "West, Steve (NSN - US/Fort Worth)" CC: Vlad Yasevich CC: netdev@vger.kernel.org CC: davem@davemloft.net Acked-by: Vlad Yasevich Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit b7c9e4ee1068391f547f3eaf0b6646115e300bb5 Author: Michal Tesar Date: Fri Jul 19 14:09:01 2013 +0200 sysctl net: Keep tcp_syn_retries inside the boundary [ Upstream commit 651e92716aaae60fc41b9652f54cb6803896e0da ] Limit the min/max value passed to the /proc/sys/net/ipv4/tcp_syn_retries. Signed-off-by: Michal Tesar Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 8c46d377c799adb4c8dbfed9cd1156d228298ff9 Author: Dan Carpenter Date: Fri Jul 19 08:48:05 2013 +0300 arcnet: cleanup sizeof parameter [ Upstream commit 087d273caf4f7d3f2159256f255f1f432bc84a5b ] This patch doesn't change the compiled code because ARC_HDR_SIZE is 4 and sizeof(int) is 4, but the intent was to use the header size and not the sizeof the header size. Signed-off-by: Dan Carpenter Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 28a7606fe1d4cdae19bf6cea88f33df13f543e2a Author: Eric Dumazet Date: Thu Jul 18 09:35:10 2013 -0700 vlan: fix a race in egress prio management [ Upstream commit 3e3aac497513c669e1c62c71e1d552ea85c1d974 ] egress_priority_map[] hash table updates are protected by rtnl, and we never remove elements until device is dismantled. We have to make sure that before inserting an new element in hash table, all its fields are committed to memory or else another cpu could find corrupt values and crash. Signed-off-by: Eric Dumazet Cc: Patrick McHardy Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 5393c4b20b6fe78df930a0fe1821ec72ba735541 Author: dingtianhong Date: Thu Jul 11 19:04:06 2013 +0800 ifb: fix oops when loading the ifb failed [ Upstream commit f2966cd5691058b8674a20766525bedeaea9cbcf ] If __rtnl_link_register() return faild when loading the ifb, it will take the wrong path and get oops, so fix it just like dummy. Signed-off-by: Ding Tianhong Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 8665a8c475b12393bb8eed9b3c7401da9513d3c1 Author: dingtianhong Date: Thu Jul 11 19:04:02 2013 +0800 dummy: fix oops when loading the dummy failed [ Upstream commit 2c8a01894a12665d8059fad8f0a293c98a264121 ] We rename the dummy in modprobe.conf like this: install dummy0 /sbin/modprobe -o dummy0 --ignore-install dummy install dummy1 /sbin/modprobe -o dummy1 --ignore-install dummy We got oops when we run the command: modprobe dummy0 modprobe dummy1 ------------[ cut here ]------------ [ 3302.187584] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 3302.195411] IP: [] __rtnl_link_unregister+0x9a/0xd0 [ 3302.201844] PGD 85c94a067 PUD 8517bd067 PMD 0 [ 3302.206305] Oops: 0002 [#1] SMP [ 3302.299737] task: ffff88105ccea300 ti: ffff880eba4a0000 task.ti: ffff880eba4a0000 [ 3302.307186] RIP: 0010:[] [] __rtnl_link_unregister+0x9a/0xd0 [ 3302.316044] RSP: 0018:ffff880eba4a1dd8 EFLAGS: 00010246 [ 3302.321332] RAX: 0000000000000000 RBX: ffffffff81a9d738 RCX: 0000000000000002 [ 3302.328436] RDX: 0000000000000000 RSI: ffffffffa04d602c RDI: ffff880eba4a1dd8 [ 3302.335541] RBP: ffff880eba4a1e18 R08: dead000000200200 R09: dead000000100100 [ 3302.342644] R10: 0000000000000080 R11: 0000000000000003 R12: ffffffff81a9d788 [ 3302.349748] R13: ffffffffa04d7020 R14: ffffffff81a9d670 R15: ffff880eba4a1dd8 [ 3302.364910] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3302.370630] CR2: 0000000000000008 CR3: 000000085e15e000 CR4: 00000000000427e0 [ 3302.377734] DR0: 0000000000000003 DR1: 00000000000000b0 DR2: 0000000000000001 [ 3302.384838] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 3302.391940] Stack: [ 3302.393944] ffff880eba4a1dd8 ffff880eba4a1dd8 ffff880eba4a1e18 ffffffffa04d70c0 [ 3302.401350] 00000000ffffffef ffffffffa01a8000 0000000000000000 ffffffff816111c8 [ 3302.408758] ffff880eba4a1e48 ffffffffa01a80be ffff880eba4a1e48 ffffffffa04d70c0 [ 3302.416164] Call Trace: [ 3302.418605] [] ? 0xffffffffa01a7fff [ 3302.423727] [] dummy_init_module+0xbe/0x1000 [dummy0] [ 3302.430405] [] ? 0xffffffffa01a7fff [ 3302.435535] [] do_one_initcall+0x152/0x1b0 [ 3302.441263] [] do_init_module+0x7b/0x200 [ 3302.446824] [] load_module+0x4e2/0x530 [ 3302.452215] [] ? ddebug_dyndbg_boot_param_cb+0x60/0x60 [ 3302.458979] [] SyS_init_module+0xd1/0x130 [ 3302.464627] [] system_call_fastpath+0x16/0x1b [ 3302.490090] RIP [] __rtnl_link_unregister+0x9a/0xd0 [ 3302.496607] RSP [ 3302.500084] CR2: 0000000000000008 [ 3302.503466] ---[ end trace 8342d49cd49f78ed ]--- The reason is that when loading dummy, if __rtnl_link_register() return failed, the init_module should return and avoid take the wrong path. Signed-off-by: Tan Xiaojun Signed-off-by: Ding Tianhong Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 895f0ade5ca2823cf39971f1340eb2d440b463e5 Author: dingtianhong Date: Wed Jul 10 12:04:02 2013 +0800 ifb: fix rcu_sched self-detected stalls [ Upstream commit 440d57bc5ff55ec1efb3efc9cbe9420b4bbdfefa ] According to the commit 16b0dc29c1af9df341428f4c49ada4f626258082 (dummy: fix rcu_sched self-detected stalls) Eric Dumazet fix the problem in dummy, but the ifb will occur the same problem like the dummy modules. Trying to "modprobe ifb numifbs=30000" triggers : INFO: rcu_sched self-detected stall on CPU After this splat, RTNL is locked and reboot is needed. We must call cond_resched() to avoid this, even holding RTNL. Signed-off-by: Ding Tianhong Signed-off-by: David S. Miller [wt: 2.6.32: cond_resched() needs linux/sched.h] Signed-off-by: Willy Tarreau commit 63d632e5c4108652a3df37944b0a970e02439508 Author: Dave Kleikamp Date: Mon Jul 1 16:49:22 2013 -0500 sunvnet: vnet_port_remove must call unregister_netdev [ Upstream commit aabb9875d02559ab9b928cd6f259a5cc4c21a589 ] The missing call to unregister_netdev() leaves the interface active after the driver is unloaded by rmmod. Signed-off-by: Dave Kleikamp Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit b0d43763ffc46f894889a74aa05acd349a3e1ce6 Author: Changli Gao Date: Sat Jun 29 00:15:51 2013 +0800 net: Swap ver and type in pppoe_hdr [ Upstream commit b1a5a34bd0b8767ea689e68f8ea513e9710b671e ] Ver and type in pppoe_hdr should be swapped as defined by RFC2516 section-4. Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit af6739a5fe47e23aef27d8057db0edbccb13da52 Author: Eric Dumazet Date: Fri Jun 28 02:37:42 2013 -0700 neighbour: fix a race in neigh_destroy() [ Upstream commit c9ab4d85de222f3390c67aedc9c18a50e767531e ] There is a race in neighbour code, because neigh_destroy() uses skb_queue_purge(&neigh->arp_queue) without holding neighbour lock, while other parts of the code assume neighbour rwlock is what protects arp_queue Convert all skb_queue_purge() calls to the __skb_queue_purge() variant Use __skb_queue_head_init() instead of skb_queue_head_init() to make clear we do not use arp_queue.lock And hold neigh->lock in neigh_destroy() to close the race. Reported-by: Joe Jin Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 6597b256646d4ff022bedd364b59fa3de0c05acc Author: Daniel Borkmann Date: Wed Jun 12 16:02:27 2013 +0200 packet: packet_getname_spkt: make sure string is always 0-terminated [ Upstream commit 2dc85bf323515e59e15dfa858d1472bb25cad0fe ] uaddr->sa_data is exactly of size 14, which is hard-coded here and passed as a size argument to strncpy(). A device name can be of size IFNAMSIZ (== 16), meaning we might leave the destination string unterminated. Thus, use strlcpy() and also sizeof() while we're at it. We need to memset the data area beforehand, since strlcpy does not padd the remaining buffer with zeroes for user space, so that we do not possibly leak anything. Signed-off-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit de892243196ba7dbddc858a4f34600c6d6e77b65 Author: Daniel Borkmann Date: Thu Jun 6 15:53:47 2013 +0200 net: sctp: fix NULL pointer dereference in socket destruction [ Upstream commit 1abd165ed757db1afdefaac0a4bc8a70f97d258c ] While stress testing sctp sockets, I hit the following panic: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: [] sctp_endpoint_free+0xe/0x40 [sctp] PGD 7cead067 PUD 7ce76067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: sctp(F) libcrc32c(F) [...] CPU: 7 PID: 2950 Comm: acc Tainted: GF 3.10.0-rc2+ #1 Hardware name: Dell Inc. PowerEdge T410/0H19HD, BIOS 1.6.3 02/01/2011 task: ffff88007ce0e0c0 ti: ffff88007b568000 task.ti: ffff88007b568000 RIP: 0010:[] [] sctp_endpoint_free+0xe/0x40 [sctp] RSP: 0018:ffff88007b569e08 EFLAGS: 00010292 RAX: 0000000000000000 RBX: ffff88007db78a00 RCX: dead000000200200 RDX: ffffffffa049fdb0 RSI: ffff8800379baf38 RDI: 0000000000000000 RBP: ffff88007b569e18 R08: ffff88007c230da0 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff880077990d00 R14: 0000000000000084 R15: ffff88007db78a00 FS: 00007fc18ab61700(0000) GS:ffff88007fc60000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000020 CR3: 000000007cf9d000 CR4: 00000000000007e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffff88007b569e38 ffff88007db78a00 ffff88007b569e38 ffffffffa049fded ffffffff81abf0c0 ffff88007db78a00 ffff88007b569e58 ffffffff8145b60e 0000000000000000 0000000000000000 ffff88007b569eb8 ffffffff814df36e Call Trace: [] sctp_destroy_sock+0x3d/0x80 [sctp] [] sk_common_release+0x1e/0xf0 [] inet_create+0x2ae/0x350 [] __sock_create+0x11f/0x240 [] sock_create+0x30/0x40 [] SyS_socket+0x4c/0xc0 [] ? do_page_fault+0xe/0x10 [] ? page_fault+0x22/0x30 [] system_call_fastpath+0x16/0x1b Code: 0c c9 c3 66 2e 0f 1f 84 00 00 00 00 00 e8 fb fe ff ff c9 c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 <48> 8b 47 20 48 89 fb c6 47 1c 01 c6 40 12 07 e8 9e 68 01 00 48 RIP [] sctp_endpoint_free+0xe/0x40 [sctp] RSP CR2: 0000000000000020 ---[ end trace e0d71ec1108c1dd9 ]--- I did not hit this with the lksctp-tools functional tests, but with a small, multi-threaded test program, that heavily allocates, binds, listens and waits in accept on sctp sockets, and then randomly kills some of them (no need for an actual client in this case to hit this). Then, again, allocating, binding, etc, and then killing child processes. This panic then only occurs when ``echo 1 > /proc/sys/net/sctp/auth_enable'' is set. The cause for that is actually very simple: in sctp_endpoint_init() we enter the path of sctp_auth_init_hmacs(). There, we try to allocate our crypto transforms through crypto_alloc_hash(). In our scenario, it then can happen that crypto_alloc_hash() fails with -EINTR from crypto_larval_wait(), thus we bail out and release the socket via sk_common_release(), sctp_destroy_sock() and hit the NULL pointer dereference as soon as we try to access members in the endpoint during sctp_endpoint_free(), since endpoint at that time is still NULL. Now, if we have that case, we do not need to do any cleanup work and just leave the destruction handler. Signed-off-by: Daniel Borkmann Acked-by: Neil Horman Acked-by: Vlad Yasevich Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 09ef3754dffadf39076a402bc1ca7883c3fa2b73 Author: Eric Dumazet Date: Fri May 24 05:49:58 2013 +0000 ip_tunnel: fix kernel panic with icmp_dest_unreach [ Upstream commit a622260254ee481747cceaaa8609985b29a31565 ] Daniel Petre reported crashes in icmp_dst_unreach() with following call graph: Daniel found a similar problem mentioned in http://lkml.indiana.edu/hypermail/linux/kernel/1007.0/00961.html And indeed this is the root cause : skb->cb[] contains data fooling IP stack. We must clear IPCB in ip_tunnel_xmit() sooner in case dst_link_failure() is called. Or else skb->cb[] might contain garbage from GSO segmentation layer. A similar fix was tested on linux-3.9, but gre code was refactored in linux-3.10. I'll send patches for stable kernels as well. Many thanks to Daniel for providing reports, patches and testing ! Reported-by: Daniel Petre Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 46e280b6b2d0b0849c902289579e756ead08a6d8 Author: Eric Dumazet Date: Fri May 17 04:53:13 2013 +0000 ipv6: fix possible crashes in ip6_cork_release() [ Upstream commit 284041ef21fdf2e0d216ab6b787bc9072b4eb58a ] commit 0178b695fd6b4 ("ipv6: Copy cork options in ip6_append_data") added some code duplication and bad error recovery, leading to potential crash in ip6_cork_release() as kfree() could be called with garbage. use kzalloc() to make sure this wont happen. Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Cc: Herbert Xu Cc: Hideaki YOSHIFUJI Cc: Neal Cardwell Signed-off-by: Willy Tarreau commit 048c4543789eebb39d99648f48f39b6d96565103 Author: Eric Dumazet Date: Mon May 13 21:25:52 2013 +0000 tcp: fix tcp_md5_hash_skb_data() [ Upstream commit 54d27fcb338bd9c42d1dfc5a39e18f6f9d373c2e ] TCP md5 communications fail [1] for some devices, because sg/crypto code assume page offsets are below PAGE_SIZE. This was discovered using mlx4 driver [2], but I suspect loopback might trigger the same bug now we use order-3 pages in tcp_sendmsg() [1] Failure is giving following messages. huh, entered softirq 3 NET_RX ffffffff806ad230 preempt_count 00000100, exited with 00000101? [2] mlx4 driver uses order-2 pages to allocate RX frags Reported-by: Matt Schnall Signed-off-by: Eric Dumazet Cc: Bernhard Beck Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit dc070123222abfa8e84a09c05864843a4406b42c Author: Ricardo Ribalda Date: Tue Oct 1 08:17:10 2013 +0200 ll_temac: Reset dma descriptors indexes on ndo_open [ Upstream commit 7167cf0e8cd10287b7912b9ffcccd9616f382922 ] The dma descriptors indexes are only initialized on the probe function. If a packet is on the buffer when temac_stop is called, the dma descriptors indexes can be left on a incorrect state where no other package can be sent. So an interface could be left in an usable state after ifdow/ifup. This patch makes sure that the descriptors indexes are in a proper status when the device is open. Signed-off-by: Ricardo Ribalda Delgado Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit d619b073abf62ef8bd9360bae938dc47f995741b Author: Neil Horman Date: Fri Sep 27 12:22:15 2013 -0400 bonding: Fix broken promiscuity reference counting issue [ Upstream commit 5a0068deb611109c5ba77358be533f763f395ee4 ] Recently grabbed this report: https://bugzilla.redhat.com/show_bug.cgi?id=1005567 Of an issue in which the bonding driver, with an attached vlan encountered the following errors when bond0 was taken down and back up: dummy1: promiscuity touches roof, set promiscuity failed. promiscuity feature of device might be broken. The error occurs because, during __bond_release_one, if we release our last slave, we take on a random mac address and issue a NETDEV_CHANGEADDR notification. With an attached vlan, the vlan may see that the vlan and bond mac address were in sync, but no longer are. This triggers a call to dev_uc_add and dev_set_rx_mode, which enables IFF_PROMISC on the bond device. Then, when we complete __bond_release_one, we use the current state of the bond flags to determine if we should decrement the promiscuity of the releasing slave. But since the bond changed promiscuity state during the release operation, we incorrectly decrement the slave promisc count when it wasn't in promiscuous mode to begin with, causing the above error Fix is pretty simple, just cache the bonding flags at the start of the function and use those when determining the need to set promiscuity. This is also needed for the ALLMULTI flag CC: Jay Vosburgh CC: Andy Gospodarek CC: Mark Wu CC: "David S. Miller" Reported-by: Mark Wu Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit cc2a9147e5c8be68d3a08637d512593c79b6b158 Author: Peter Korsgaard Date: Mon Sep 30 23:28:20 2013 +0200 dm9601: fix IFF_ALLMULTI handling [ Upstream commit bf0ea6380724beb64f27a722dfc4b0edabff816e ] Pass-all-multicast is controlled by bit 3 in RX control, not bit 2 (pass undersized frames). Reported-by: Joseph Chang Signed-off-by: Peter Korsgaard Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 5d71a2b371009a57b90b325012e0150b2c65076d Author: Salam Noureddine Date: Sun Sep 29 13:39:42 2013 -0700 ipv4 igmp: use in_dev_put in timer handlers instead of __in_dev_put [ Upstream commit e2401654dd0f5f3fb7a8d80dad9554d73d7ca394 ] It is possible for the timer handlers to run after the call to ip_mc_down so use in_dev_put instead of __in_dev_put in the handler function in order to do proper cleanup when the refcnt reaches 0. Otherwise, the refcnt can reach zero without the in_device being destroyed and we end up leaking a reference to the net_device and see messages like the following, unregister_netdevice: waiting for eth0 to become free. Usage count = 1 Tested on linux-3.4.43. Signed-off-by: Salam Noureddine Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 3b34788b41e4a2810ddc97a4c2b4b34aaf39346a Author: Salam Noureddine Date: Sun Sep 29 13:41:34 2013 -0700 ipv6 mcast: use in6_dev_put in timer handlers instead of __in6_dev_put [ Upstream commit 9260d3e1013701aa814d10c8fc6a9f92bd17d643 ] It is possible for the timer handlers to run after the call to ipv6_mc_down so use in6_dev_put instead of __in6_dev_put in the handler function in order to do proper cleanup when the refcnt reaches 0. Otherwise, the refcnt can reach zero without the inet6_dev being destroyed and we end up leaking a reference to the net_device and see messages like the following, unregister_netdevice: waiting for eth0 to become free. Usage count = 1 Tested on linux-3.4.43. Signed-off-by: Salam Noureddine Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 82d399e06e8364dcafc8dee1d615f87c160ea1d9 Author: Chris Healy Date: Wed Sep 11 21:37:47 2013 -0700 resubmit bridge: fix message_age_timer calculation [ Upstream commit 9a0620133ccce9dd35c00a96405c8d80938c2cc0 ] This changes the message_age_timer calculation to use the BPDU's max age as opposed to the local bridge's max age. This is in accordance with section 8.6.2.3.2 Step 2 of the 802.1D-1998 sprecification. With the current implementation, when running with very large bridge diameters, convergance will not always occur even if a root bridge is configured to have a longer max age. Tested successfully on bridge diameters of ~200. Signed-off-by: Chris Healy Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 81df64dff0e1f6ad9f2fa8b9a9bdea74f8cbdd85 Author: Mariusz Ceier Date: Mon Oct 21 19:45:04 2013 +0200 davinci_emac.c: Fix IFF_ALLMULTI setup [ Upstream commit d69e0f7ea95fef8059251325a79c004bac01f018 ] When IFF_ALLMULTI flag is set on interface and IFF_PROMISC isn't, emac_dev_mcast_set should only enable RX of multicasts and reset MACHASH registers. It does this, but afterwards it either sets up multicast MACs filtering or disables RX of multicasts and resets MACHASH registers again, rendering IFF_ALLMULTI flag useless. This patch fixes emac_dev_mcast_set, so that multicast MACs filtering and disabling of RX of multicasts are skipped when IFF_ALLMULTI flag is set. Tested with kernel 2.6.37. Signed-off-by: Mariusz Ceier Acked-by: Mugunthan V N Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 5b5c1947f692f8b6a8568a27a270cbb9413233ac Author: Salva Peiró Date: Wed Oct 16 12:46:50 2013 +0200 wanxl: fix info leak in ioctl [ Upstream commit 2b13d06c9584b4eb773f1e80bbaedab9a1c344e1 ] The wanxl_ioctl() code fails to initialize the two padding bytes of struct sync_serial_settings after the ->loopback member. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Salva Peiró Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 618664aefb3819cf77720b17b894df55e4ca4d93 Author: Vlad Yasevich Date: Tue Oct 15 22:01:31 2013 -0400 sctp: Perform software checksum if packet has to be fragmented. [ Upstream commit d2dbbba77e95dff4b4f901fee236fef6d9552072 ] IP/IPv6 fragmentation knows how to compute only TCP/UDP checksum. This causes problems if SCTP packets has to be fragmented and ipsummed has been set to PARTIAL due to checksum offload support. This condition can happen when retransmitting after MTU discover, or when INIT or other control chunks are larger then MTU. Check for the rare fragmentation condition in SCTP and use software checksum calculation in this case. CC: Fan Du Signed-off-by: Vlad Yasevich Acked-by: Neil Horman Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit dd42885d9bdefdbed554a0f11a055c4f1be2df3a Author: Fan Du Date: Tue Oct 15 22:01:30 2013 -0400 sctp: Use software crc32 checksum when xfrm transform will happen. [ Upstream commit 27127a82561a2a3ed955ce207048e1b066a80a2a ] igb/ixgbe have hardware sctp checksum support, when this feature is enabled and also IPsec is armed to protect sctp traffic, ugly things happened as xfrm_output checks CHECKSUM_PARTIAL to do checksum operation(sum every thing up and pack the 16bits result in the checksum field). The result is fail establishment of sctp communication. Cc: Neil Horman Cc: Steffen Klassert Signed-off-by: Fan Du Signed-off-by: Vlad Yasevich Acked-by: Neil Horman Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 3731098269c396270702425616d7209ba20924f3 Author: Vlad Yasevich Date: Tue Oct 15 22:01:29 2013 -0400 net: dst: provide accessor function to dst->xfrm [ Upstream commit e87b3998d795123b4139bc3f25490dd236f68212 ] dst->xfrm is conditionally defined. Provide accessor funtion that is always available. Signed-off-by: Vlad Yasevich Acked-by: Neil Horman Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 3c1382ec8a0ae705e14985c7e0ba7781ce5ddf45 Author: Mathias Krause Date: Mon Sep 30 22:03:07 2013 +0200 connector: use nlmsg_len() to check message length [ Upstream commit 162b2bedc084d2d908a04c93383ba02348b648b0 ] The current code tests the length of the whole netlink message to be at least as long to fit a cn_msg. This is wrong as nlmsg_len includes the length of the netlink message header. Use nlmsg_len() instead to fix this "off-by-NLMSG_HDRLEN" size check. Cc: stable@vger.kernel.org # v2.6.14+ Signed-off-by: Mathias Krause Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit fa2b034479f59fcca4a218f357d37fe00f35554c Author: Salva Peiró Date: Fri Oct 11 12:50:03 2013 +0300 farsync: fix info leak in ioctl [ Upstream commit 96b340406724d87e4621284ebac5e059d67b2194 ] The fst_get_iface() code fails to initialize the two padding bytes of struct sync_serial_settings after the ->loopback member. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Dan Carpenter Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 42a11ea6999d8b9f65fd7a2c472d6c447f359405 Author: Marc Kleine-Budde Date: Mon Oct 7 23:19:58 2013 +0200 net: vlan: fix nlmsg size calculation in vlan_get_size() [ Upstream commit c33a39c575068c2ea9bffb22fd6de2df19c74b89 ] This patch fixes the calculation of the nlmsg size, by adding the missing nla_total_size(). Cc: Patrick McHardy Signed-off-by: Marc Kleine-Budde Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit c8f1f981492cebde72c8ba8fdf1dd160a8dea908 Author: Marc Kleine-Budde Date: Sat Oct 5 21:25:17 2013 +0200 can: dev: fix nlmsg size calculation in can_get_size() [ Upstream commit fe119a05f8ca481623a8d02efcc984332e612528 ] This patch fixes the calculation of the nlmsg size, by adding the missing nla_total_size(). Signed-off-by: Marc Kleine-Budde Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit eee99b6ad8923ff58d54a0cf72b87ff434f2d28f Author: Mathias Krause Date: Mon Sep 30 22:03:06 2013 +0200 proc connector: fix info leaks [ Upstream commit e727ca82e0e9616ab4844301e6bae60ca7327682 ] Initialize event_data for all possible message types to prevent leaking kernel stack contents to userland (up to 20 bytes). Also set the flags member of the connector message to 0 to prevent leaking two more stack bytes this way. Cc: stable@vger.kernel.org # v2.6.15+ Signed-off-by: Mathias Krause Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit afd86e972d729cc1722f942368a05e2ab12f3449 Author: Dan Carpenter Date: Thu Oct 3 00:27:20 2013 +0300 net: heap overflow in __audit_sockaddr() [ Upstream commit 1661bf364ae9c506bc8795fef70d1532931be1e8 ] We need to cap ->msg_namelen or it leads to a buffer overflow when we to the memcpy() in __audit_sockaddr(). It requires CAP_AUDIT_CONTROL to exploit this bug. The call tree is: ___sys_recvmsg() move_addr_to_user() audit_sockaddr() __audit_sockaddr() Reported-by: Jüri Aedla Signed-off-by: Dan Carpenter Signed-off-by: David S. Miller [wt: 2.6.32: msg_sys is a struct, not a pointer] Signed-off-by: Willy Tarreau commit 3e644055922658dd12c48c519f4e58b4294fd5ab Author: Eric Dumazet Date: Tue Oct 1 21:04:11 2013 -0700 net: do not call sock_put() on TIMEWAIT sockets [ Upstream commit 80ad1d61e72d626e30ebe8529a0455e660ca4693 ] commit 3ab5aee7fe84 ("net: Convert TCP & DCCP hash tables to use RCU / hlist_nulls") incorrectly used sock_put() on TIMEWAIT sockets. We should instead use inet_twsk_put() Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 49fefd74ad953802cee70f2a807b316d779ec32b Author: Eric Dumazet Date: Tue Oct 15 11:54:30 2013 -0700 tcp: must unclone packets before mangling them [ Upstream commit c52e2421f7368fd36cbe330d2cf41b10452e39a9 ] TCP stack should make sure it owns skbs before mangling them. We had various crashes using bnx2x, and it turned out gso_size was cleared right before bnx2x driver was populating TC descriptor of the _previous_ packet send. TCP stack can sometime retransmit packets that are still in Qdisc. Of course we could make bnx2x driver more robust (using ACCESS_ONCE(shinfo->gso_size) for example), but the bug is TCP stack. We have identified two points where skb_unclone() was needed. This patch adds a WARN_ON_ONCE() to warn us if we missed another fix of this kind. Kudos to Neal for finding the root cause of this bug. Its visible using small MSS. Signed-off-by: Eric Dumazet Signed-off-by: Neal Cardwell Cc: Yuchung Cheng Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 0277a0c45208946eccad721a3970765f687f4145 Author: Eric Dumazet Date: Wed Nov 23 15:49:31 2011 -0500 ipv6: tcp: fix panic in SYN processing commit 72a3effaf633bc ([NET]: Size listen hash tables using backlog hint) added a bug allowing inet6_synq_hash() to return an out of bound array index, because of u16 overflow. Bug can happen if system admins set net.core.somaxconn & net.ipv4.tcp_max_syn_backlog sysctls to values greater than 65536 Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller (cherry picked from commit c16a98ed91597b40b22b540c6517103497ef8e74) Signed-off-by: Willy Tarreau commit 8ec40cab58d3b334077ff0f7bbc0f667c43f1583 Author: Nikola Pajkovsky Date: Fri Oct 11 08:48:34 2013 +0200 crypto: api - Fix race condition in larval lookup https://bugzilla.redhat.com/1016108 64z is missing rhel6 commit 3af031a395c0 ("[crypto] algboss: Hold ref count on larval") which is causing cosmetic fuzz, because crypto_alg_get was move from crypto/api.c to crypto/internal.h. From: Herbert Xu [ upstream commit 77dbd7a95e4a4f15264c333a9e9ab97ee27dc2aa ] crypto_larval_lookup should only return a larval if it created one. Any larval created by another entity must be processed through crypto_larval_wait before being returned. Otherwise this will lead to a larval being killed twice, which will most likely lead to a crash. Cc: stable@vger.kernel.org Reported-by: Kees Cook Tested-by: Kees Cook Signed-off-by: Herbert Xu Signed-off-by: Nikola Pajkovsky Signed-off-by: Willy Tarreau commit 70630fa3b7148ee07828587fedbf0ad3f895c8e8 Author: Kees Cook Date: Wed Sep 11 19:56:50 2013 +0000 HID: provide a helper for validating hid reports commit 331415ff16a12147d57d5c953f3a961b7ede348b upstream Many drivers need to validate the characteristics of their HID report during initialization to avoid misusing the reports. This adds a common helper to perform validation of the report exisitng, the field existing, and the expected number of values within the field. Signed-off-by: Kees Cook Cc: stable@vger.kernel.org Reviewed-by: Benjamin Tissoires Signed-off-by: Jiri Kosina [jmm: backported to 2.6.32] [wt: dev_err() in 2.6.32 instead of hid_err()] Signed-off-by: Willy Tarreau commit 69117d947845a3065062d98c46c7a7452709cc6c Author: Kees Cook Date: Wed Aug 28 20:32:01 2013 +0000 HID: check for NULL field when setting values commit be67b68d52fa28b9b721c47bb42068f0c1214855 upstream Defensively check that the field to be worked on is not NULL. Signed-off-by: Kees Cook Cc: stable@kernel.org Signed-off-by: Jiri Kosina Signed-off-by: Willy Tarreau commit 36d004160cbbe860c42feb24b02905fd50621cf6 Author: Kees Cook Date: Wed Sep 11 19:56:54 2013 +0000 HID: LG: validate HID output report details commit 0fb6bd06e06792469acc15bbe427361b56ada528 upstream A HID device could send a malicious output report that would cause the lg, lg3, and lg4 HID drivers to write beyond the output report allocation during an event, causing a heap overflow: [ 325.245240] usb 1-1: New USB device found, idVendor=046d, idProduct=c287 ... [ 414.518960] BUG kmalloc-4096 (Not tainted): Redzone overwritten Additionally, while lg2 did correctly validate the report details, it was cleaned up and shortened. CVE-2013-2893 Signed-off-by: Kees Cook Cc: stable@vger.kernel.org Reviewed-by: Benjamin Tissoires Signed-off-by: Jiri Kosina [jmm: backported to 2.6.32] Signed-off-by: Willy Tarreau commit b7f1dbb911bfc70e750c02a573d7387e727224e6 Author: Kees Cook Date: Wed Aug 28 22:30:49 2013 +0200 HID: pantherlord: validate output report details commit 412f30105ec6735224535791eed5cdc02888ecb4 upstream A HID device could send a malicious output report that would cause the pantherlord HID driver to write beyond the output report allocation during initialization, causing a heap overflow: [ 310.939483] usb 1-1: New USB device found, idVendor=0e8f, idProduct=0003 ... [ 315.980774] BUG kmalloc-192 (Tainted: G W ): Redzone overwritten CVE-2013-2892 Signed-off-by: Kees Cook Cc: stable@kernel.org Signed-off-by: Jiri Kosina Signed-off-by: Willy Tarreau commit 267450c9146ff0e8c0aeb988d276f66f67af9500 Author: Kees Cook Date: Wed Sep 11 19:56:51 2013 +0000 HID: zeroplus: validate output report details commit 78214e81a1bf43740ce89bb5efda78eac2f8ef83 upstream The zeroplus HID driver was not checking the size of allocated values in fields it used. A HID device could send a malicious output report that would cause the driver to write beyond the output report allocation during initialization, causing a heap overflow: [ 1442.728680] usb 1-1: New USB device found, idVendor=0c12, idProduct=0005 ... [ 1466.243173] BUG kmalloc-192 (Tainted: G W ): Redzone overwritten CVE-2013-2889 Signed-off-by: Kees Cook Cc: stable@vger.kernel.org Reviewed-by: Benjamin Tissoires Signed-off-by: Jiri Kosina [jmm: backport to 2.6.32] Signed-off-by: Willy Tarreau commit dc4108776e19ac3f653c288138b1f23da655eb61 Author: Kees Cook Date: Wed Aug 28 20:29:55 2013 +0000 HID: validate HID report id size commit 43622021d2e2b82ea03d883926605bdd0525e1d1 upstream The "Report ID" field of a HID report is used to build indexes of reports. The kernel's index of these is limited to 256 entries, so any malicious device that sets a Report ID greater than 255 will trigger memory corruption on the host: [ 1347.156239] BUG: unable to handle kernel paging request at ffff88094958a878 [ 1347.156261] IP: [] hid_register_report+0x2a/0x8b CVE-2013-2888 Signed-off-by: Kees Cook Cc: stable@kernel.org Signed-off-by: Jiri Kosina [jmm: backport to 2.6.32] Signed-off-by: Willy Tarreau commit f9653a5ab3bba8a3ace8174df52776ee98239c4a Author: Kees Cook Date: Fri May 10 21:48:21 2013 +0000 b43: stop format string leaking into error msgs commit e0e29b683d6784ef59bbc914eac85a04b650e63c upstream The module parameter "fwpostfix" is userspace controllable, unfiltered, and is used to define the firmware filename. b43_do_request_fw() populates ctx->errors[] on error, containing the firmware filename. b43err() parses its arguments as a format string. For systems with b43 hardware, this could lead to a uid-0 to ring-0 escalation. CVE-2013-2852 Signed-off-by: Kees Cook Cc: stable@vger.kernel.org Signed-off-by: John W. Linville Signed-off-by: Willy Tarreau commit 134cbf552295204ea9b17882580acf800b30eaff Author: Kees Cook Date: Wed Jul 3 22:01:14 2013 +0000 block: do not pass disk names as format strings commit ffc8b30866879ed9ba62bd0a86fecdbd51cd3d19 upstream Disk names may contain arbitrary strings, so they must not be interpreted as format strings. It seems that only md allows arbitrary strings to be used for disk names, but this could allow for a local memory corruption from uid 0 into ring 0. CVE-2013-2851 Signed-off-by: Kees Cook Cc: Jens Axboe Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds [jmm: Backport to 2.6.32] Signed-off-by: Willy Tarreau commit bc627cf07b1fa9e61cfe6b14ea27825ed100a126 Author: Nicolas Dichtel Date: Mon Feb 18 15:24:20 2013 +0000 af_key: initialize satype in key_notify_policy_flush() commit 85dfb745ee40232876663ae206cba35f24ab2a40 upstream This field was left uninitialized. Some user daemons perform check against this field. Signed-off-by: Nicolas Dichtel Signed-off-by: Steffen Klassert Signed-off-by: Willy Tarreau commit acd3e86d5d5f1d3635e4a11b1d4da2e7f5c75048 Author: Mathias Krause Date: Wed Jun 26 21:52:30 2013 +0000 af_key: fix info leaks in notify messages commit a5cc68f3d63306d0d288f31edfc2ae6ef8ecd887 upstream key_notify_sa_flush() and key_notify_policy_flush() miss to initialize the sadb_msg_reserved member of the broadcasted message and thereby leak 2 bytes of heap memory to listeners. Fix that. Signed-off-by: Mathias Krause Cc: Steffen Klassert Cc: "David S. Miller" Cc: Herbert Xu Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 42b682e8e122f4509823e97b5d9936a684225a2c Author: Eric Dumazet Date: Wed Jun 26 11:15:07 2013 +0000 ipv6: ip6_sk_dst_check() must not assume ipv6 dst commit a963a37d384d71ad43b3e9e79d68d42fbe0901f3 upstream It's possible to use AF_INET6 sockets and to connect to an IPv4 destination. After this, socket dst cache is a pointer to a rtable, not rt6_info. ip6_sk_dst_check() should check the socket dst cache is IPv6, or else various corruptions/crashes can happen. Dave Jones can reproduce immediate crash with trinity -q -l off -n -c sendmsg -c connect With help from Hannes Frederic Sowa Reported-by: Dave Jones Reported-by: Hannes Frederic Sowa Signed-off-by: Eric Dumazet Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 5f6eddcbb90c3e98738e50759fe20776ac24fd33 Author: Vlad Yasevich Date: Tue Mar 12 15:53:23 2013 +0000 sctp: Use correct sideffect command in duplicate cookie handling commit f2815633504b442ca0b0605c16bf3d88a3a0fcea upstream When SCTP is done processing a duplicate cookie chunk, it tries to delete a newly created association. For that, it has to set the right association for the side-effect processing to work. However, when it uses the SCTP_CMD_NEW_ASOC command, that performs more work then really needed (like hashing the associationa and assigning it an id) and there is no point to do that only to delete the association as a next step. In fact, it also creates an impossible condition where an association may be found by the getsockopt() call, and that association is empty. This causes a crash in some sctp getsockopts. The solution is rather simple. We simply use SCTP_CMD_SET_ASOC command that doesn't have all the overhead and does exactly what we need. Reported-by: Karl Heiss Tested-by: Karl Heiss CC: Neil Horman Signed-off-by: Vlad Yasevich Acked-by: Neil Horman Signed-off-by: David S. Miller Signed-off-by: Willy Tarreau commit 88a5a5fa2a0f5ab9995d54f605f42a295202d042 Author: Max Matveev Date: Mon Aug 29 21:02:24 2011 +0000 sctp: deal with multiple COOKIE_ECHO chunks commit d5ccd496601b8776a516d167a6485754575dc38f upstream Attempt to reduce the number of IP packets emitted in response to single SCTP packet (2e3216cd) introduced a complication - if a packet contains two COOKIE_ECHO chunks and nothing else then SCTP state machine corks the socket while processing first COOKIE_ECHO and then loses the association and forgets to uncork the socket. To deal with the issue add new SCTP command which can be used to set association explictly. Use this new command when processing second COOKIE_ECHO chunk to restore the context for SCTP state machine. Signed-off-by: Max Matveev Signed-off-by: David S. Miller [dannf: backported to Debian's 2.6.32] Signed-off-by: Willy Tarreau commit 6c8fbb40df75444547dd4a6ee4dbc67af7e76c1b Author: Jonathan Salwan Date: Wed Jul 3 22:01:13 2013 +0000 drivers/cdrom/cdrom.c: use kzalloc() for failing hardware commit 542db01579fbb7ea7d1f7bb9ddcef1559df660b2 upstream In drivers/cdrom/cdrom.c mmc_ioctl_cdrom_read_data() allocates a memory area with kmalloc in line 2885. 2885 cgc->buffer = kmalloc(blocksize, GFP_KERNEL); 2886 if (cgc->buffer == NULL) 2887 return -ENOMEM; In line 2908 we can find the copy_to_user function: 2908 if (!ret && copy_to_user(arg, cgc->buffer, blocksize)) The cgc->buffer is never cleaned and initialized before this function. If ret = 0 with the previous basic block, it's possible to display some memory bytes in kernel space from userspace. When we read a block from the disk it normally fills the ->buffer but if the drive is malfunctioning there is a chance that it would only be partially filled. The result is an leak information to userspace. Signed-off-by: Dan Carpenter Cc: Jens Axboe Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit 81fd7a2965dc182bb0bc2de2f4e1b076aad5ab8c Author: Dan Carpenter Date: Tue Sep 24 22:27:44 2013 +0000 cpqarray: fix info leak in ida_locked_ioctl() commit 627aad1c01da6f881e7f98d71fd928ca0c316b1a upstream The pciinfo struct has a two byte hole after ->dev_fn so stack information could be leaked to the user. This was assigned CVE-2013-2147. Signed-off-by: Dan Carpenter Acked-by: Mike Miller Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit b141d47e87a4cb5724f24e81ffcacbb102c28394 Author: Dan Carpenter Date: Tue Sep 24 22:27:45 2013 +0000 cciss: fix info leak in cciss_ioctl32_passthru() commit 58f09e00ae095e46ef9edfcf3a5fd9ccdfad065e upstream. The arg64 struct has a hole after ->buf_size which isn't cleared. Or if any of the calls to copy_from_user() fail then that would cause an information leak as well. This was assigned CVE-2013-2147. Signed-off-by: Dan Carpenter Acked-by: Mike Miller Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Willy Tarreau commit 023ae3b03883fab864e2ca8d801cdae83938713a Author: Tetsuo Handa Date: Mon Sep 30 13:45:08 2013 -0700 kernel/kmod.c: check for NULL in call_usermodehelper_exec() If /proc/sys/kernel/core_pattern contains only "|", a NULL pointer dereference happens upon core dump because argv_split("") returns argv[0] == NULL. This bug was once fixed by commit 264b83c07a84 ("usermodehelper: check subprocess_info->path != NULL") but was by error reintroduced by commit 7f57cfa4e2aa ("usermodehelper: kill the sub_info->path[0] check"). This bug seems to exist since 2.6.19 (the version which core dump to pipe was added). Depending on kernel version and config, some side effect might happen immediately after this oops (e.g. kernel panic with 2.6.32-358.18.1.el6). Signed-off-by: Tetsuo Handa Acked-by: Oleg Nesterov Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds (cherry picked from commit 4c1c7be95c345cf2ad537a0c48e9aeadc7304527) Signed-off-by: Willy Tarreau commit 2ccae1f0a24955aba0730317385901169aed62db Author: Ian Abbott Date: Thu Oct 10 10:55:58 2013 +0100 staging: comedi: ni_65xx: (bug fix) confine insn_bits to one subdevice Commit 677a31565692d596ef42ea589b53ba289abf4713 upstream. The `insn_bits` handler `ni_65xx_dio_insn_bits()` has a `for` loop that currently writes (optionally) and reads back up to 5 "ports" consisting of 8 channels each. It reads up to 32 1-bit channels but can only read and write a whole port at once - it needs to handle up to 5 ports as the first channel it reads might not be aligned on a port boundary. It breaks out of the loop early if the next port it handles is beyond the final port on the card. It also breaks out early on the 5th port in the loop if the first channel was aligned. Unfortunately, it doesn't check that the current port it is dealing with belongs to the comedi subdevice the `insn_bits` handler is acting on. That's a bug. Redo the `for` loop to terminate after the final port belonging to the subdevice, changing the loop variable in the process to simplify things a bit. The `for` loop could now try and handle more than 5 ports if the subdevice has more than 40 channels, but the test `if (bitshift >= 32)` ensures it will break out early after 4 or 5 ports (depending on whether the first channel is aligned on a port boundary). (`bitshift` will be between -7 and 7 inclusive on the first iteration, increasing by 8 for each subsequent operation.) Signed-off-by: Ian Abbott Signed-off-by: Willy Tarreau commit 92e440881b51eebcb2ffaca0742cfc874d7cc891 Author: Jitendra Bhivare Date: Fri Sep 6 23:17:51 2013 +0530 intel-iommu: Flush unmaps at domain_exit Backported Alex Williamson's commit to 2.6.32.y http://git.kernel.org/linus/7b668357810ecb5fdda4418689d50f5d95aea6a8 It resolves the following assert when module is immediately reloaded. kernel BUG at drivers/pci/iova.c:155! Call Trace: [] intel_alloc_iova+0xb5/0xe0 [] __intel_map_single+0xbe/0x210 [] intel_alloc_coherent+0xae/0x120 [] be_queue_alloc+0xb9/0x140 [be2net] [] be_rx_qs_create+0xca/0x370 [be2net] The issue is reproducible in 2.6.32.60 and also gets resolved by passing intel-iommu=strict to kernel. Signed-off-by: Jitendra Bhivare Signed-off-by: Willy Tarreau commit 030edd6218d2752c65bf2055e1caeb2a17e6485a Author: Julian Anastasov Date: Sun Oct 17 16:14:31 2010 +0300 ipvs: fix CHECKSUM_PARTIAL for TCP, UDP Fix CHECKSUM_PARTIAL handling. Tested for IPv4 TCP, UDP not tested because it needs network card with HW CSUM support. May be fixes problem where IPVS can not be used in virtual boxes. Problem appears with DNAT to local address when the local stack sends reply in CHECKSUM_PARTIAL mode. Fix tcp_dnat_handler and udp_dnat_handler to provide vaddr and daddr in right order (old and new IP) when calling tcp_partial_csum_update/udp_partial_csum_update (CHECKSUM_PARTIAL). Signed-off-by: Julian Anastasov Signed-off-by: Simon Horman (cherry picked from commit 5bc9068e9d962ca6b8bec3f0eb6f60ab4dee1d04) Signed-off-by: Willy Tarreau commit 40c74e0d834df3379b8c3208ea0f08afc9f39aae Author: Willy Tarreau Date: Thu Jun 13 19:36:35 2013 +0200 x86, ptrace: fix build breakage with gcc 4.7 (second try) syscall_trace_enter() and syscall_trace_leave() are only called from within asm code and do not need to be declared in the .c at all. Removing their reference fixes the build issue that was happening with gcc 4.7. Both Sven-Haegar Koch and Christoph Biedl confirmed this patch addresses their respective build issues. Cc: Sven-Haegar Koch Cc: Christoph Biedl Signed-off-by: Willy Tarreau commit 4d6afb4fb5f74221314110719efe334bbf810e7e Author: Willy Tarreau Date: Thu Jun 13 16:40:05 2013 +0200 Revert "x86, ptrace: fix build breakage with gcc 4.7" This reverts commit 4ed3bb08f1698c62685278051c19f474fbf961d2. As reported by Sven-Haegar Koch, this patch breaks make headers_check : CHECK include (0 files) CHECK include/asm (54 files) /home/haegar/src/2.6.32/linux/usr/include/asm/ptrace.h:5: included file 'linux/linkage.h' is not exported Signed-off-by: Willy Tarreau commit beca8411aa7b81c7233c9814acb3f34bb98e4976 Author: Ben Greear Date: Thu Jun 6 14:29:49 2013 -0700 Fix lockup related to stop_machine being stuck in __do_softirq. The stop machine logic can lock up if all but one of the migration threads make it through the disable-irq step and the one remaining thread gets stuck in __do_softirq. The reason __do_softirq can hang is that it has a bail-out based on jiffies timeout, but in the lockup case, jiffies itself is not incremented. To work around this, re-add the max_restart counter in __do_irq and stop processing irqs after 10 restarts. Thanks to Tejun Heo and Rusty Russell and others for helping me track this down. This was introduced in 3.9 by commit c10d73671ad3 ("softirq: reduce latencies"). It may be worth looking into ath9k to see if it has issues with its irq handler at a later date. The hang stack traces look something like this: ------------[ cut here ]------------ WARNING: at kernel/watchdog.c:245 watchdog_overflow_callback+0x9c/0xa7() Watchdog detected hard LOCKUP on cpu 2 Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc] Pid: 23, comm: migration/2 Tainted: G C 3.9.4+ #11 Call Trace: warn_slowpath_common+0x85/0x9f warn_slowpath_fmt+0x46/0x48 watchdog_overflow_callback+0x9c/0xa7 __perf_event_overflow+0x137/0x1cb perf_event_overflow+0x14/0x16 intel_pmu_handle_irq+0x2dc/0x359 perf_event_nmi_handler+0x19/0x1b nmi_handle+0x7f/0xc2 do_nmi+0xbc/0x304 end_repeat_nmi+0x1e/0x2e <> cpu_stopper_thread+0xae/0x162 smpboot_thread_fn+0x258/0x260 kthread+0xc7/0xcf ret_from_fork+0x7c/0xb0 ---[ end trace 4947dfa9b0a4cec3 ]--- BUG: soft lockup - CPU#1 stuck for 22s! [migration/1:17] Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc] irq event stamp: 835637905 hardirqs last enabled at (835637904): __do_softirq+0x9f/0x257 hardirqs last disabled at (835637905): apic_timer_interrupt+0x6d/0x80 softirqs last enabled at (5654720): __do_softirq+0x1ff/0x257 softirqs last disabled at (5654725): irq_exit+0x5f/0xbb CPU 1 Pid: 17, comm: migration/1 Tainted: G WC 3.9.4+ #11 To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M. RIP: tasklet_hi_action+0xf0/0xf0 Process migration/1 Call Trace: __do_softirq+0x117/0x257 irq_exit+0x5f/0xbb smp_apic_timer_interrupt+0x8a/0x98 apic_timer_interrupt+0x72/0x80 printk+0x4d/0x4f stop_machine_cpu_stop+0x22c/0x274 cpu_stopper_thread+0xae/0x162 smpboot_thread_fn+0x258/0x260 kthread+0xc7/0xcf ret_from_fork+0x7c/0xb0 Signed-off-by: Ben Greear Acked-by: Tejun Heo Acked-by: Pekka Riikonen Cc: Eric Dumazet Cc: stable@kernel.org Cc: Ben Hutchings Signed-off-by: Linus Torvalds (cherry picked from commit 34376a50fb1fa095b9d0636fa41ed2e73125f214) Signed-off-by: Willy Tarreau commit 9bf823ecfb0c88e7ad4d2d38abc57f5ff6ef437c Author: Thomas Bork Date: Sun Jun 16 19:47:23 2013 +0200 scsi: fix missing include linux/types.h in scsi_netlink.h Thomas Bork reported that commit c6203cd ("scsi: use __uX types for headers exported to user space") caused a regression as now he's getting this warning : > /usr/src/linux-2.6.32-eisfair-1/usr/include/scsi/scsi_netlink.h:108: > found __[us]{8,16,32,64} type without #include This issue was addressed later by commit 10db4e1 ("headers: include linux/types.h where appropriate"), so let's just pick the relevant part from it. Cc: Thomas Bork Signed-off-by: Willy Tarreau