commit eee1550b3e89217321b63efba64f03b2546180d6 Author: Greg Kroah-Hartman Date: Sat Feb 18 15:11:56 2017 +0100 Linux 4.9.11 commit 724aedaa5ca6dfa31e54864f03215cce7ed663a0 Author: Yu-cheng Yu Date: Mon Jan 23 14:54:44 2017 -0800 x86/fpu/xstate: Fix xcomp_bv in XSAVES header commit dffba9a31c7769be3231c420d4b364c92ba3f1ac upstream. The compacted-format XSAVES area is determined at boot time and never changed after. The field xsave.header.xcomp_bv indicates which components are in the fixed XSAVES format. In fpstate_init() we did not set xcomp_bv to reflect the XSAVES format since at the time there is no valid data. However, after we do copy_init_fpstate_to_fpregs() in fpu__clear(), as in commit: b22cbe404a9c x86/fpu: Fix invalid FPU ptrace state after execve() and when __fpu_restore_sig() does fpu__restore() for a COMPAT-mode app, a #GP occurs. This can be easily triggered by doing valgrind on a COMPAT-mode "Hello World," as reported by Joakim Tjernlund and others: https://bugzilla.kernel.org/show_bug.cgi?id=190061 Fix it by setting xcomp_bv correctly. This patch also moves the xcomp_bv initialization to the proper place, which was in copyin_to_xsaves() as of: 4c833368f0bf x86/fpu: Set the xcomp_bv when we fake up a XSAVES area which fixed the bug too, but it's more efficient and cleaner to initialize things once per boot, not for every signal handling operation. Reported-by: Kevin Hao Reported-by: Joakim Tjernlund Signed-off-by: Yu-cheng Yu Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Dave Hansen Cc: Fenghua Yu Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Ravi V. Shankar Cc: Thomas Gleixner Cc: haokexin@gmail.com Link: http://lkml.kernel.org/r/1485212084-4418-1-git-send-email-yu-cheng.yu@intel.com [ Combined it with 4c833368f0bf. ] Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 0d4c19ee68c91f46905cbd393939d89237e6189c Author: Pablo Neira Date: Thu Jan 26 22:56:21 2017 +0100 tcp: don't annotate mark on control socket from tcp_v6_send_response() commit 92e55f412cffd016cc245a74278cb4d7b89bb3bc upstream. Unlike ipv4, this control socket is shared by all cpus so we cannot use it as scratchpad area to annotate the mark that we pass to ip6_xmit(). Add a new parameter to ip6_xmit() to indicate the mark. The SCTP socket family caches the flowi6 structure in the sctp_transport structure, so we cannot use to carry the mark unless we later on reset it back, which I discarded since it looks ugly to me. Fixes: bf99b4ded5f8 ("tcp: fix mark propagation with fwmark_reflect enabled") Suggested-by: Eric Dumazet Signed-off-by: Pablo Neira Ayuso Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 0e0751cdfa466923218ff59a8624fa8d1050b6fb Author: Mark Bloch Date: Mon Sep 5 10:58:04 2016 +0000 net/mlx5: Don't unlock fte while still using it commit 0fd758d6112f867b2cc6df0f6a856048ff99b211 upstream. When adding a new rule to an fte, we need to hold the fte lock until we add that rule to the fte and increase the fte ref count. Fixes: 0c56b97503fd ("net/mlx5_core: Introduce flow steering API") Signed-off-by: Mark Bloch Signed-off-by: Saeed Mahameed Signed-off-by: Leon Romanovsky Signed-off-by: Greg Kroah-Hartman commit 7c4c32a2976e061f3cc9cecf54ad20dd8861a212 Author: Pau Espin Pedrol Date: Fri Jan 6 20:33:28 2017 +0100 tcp: fix mark propagation with fwmark_reflect enabled commit bf99b4ded5f8a4767dbb9d180626f06c51f9881f upstream. Otherwise, RST packets generated by the TCP stack for non-existing sockets always have mark 0. The mark from the original packet is assigned to the netns_ipv4/6 socket used to send the response so that it can get copied into the response skb when the socket sends it. Fixes: e110861f8609 ("net: add a sysctl to reflect the fwmark on replies") Cc: Lorenzo Colitti Signed-off-by: Pau Espin Pedrol Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 16a3fbe5239a5ca054b0544abdd661076ec4f1c5 Author: Hangbin Liu Date: Wed Feb 8 21:16:45 2017 +0800 igmp, mld: Fix memory leak in igmpv3/mld_del_delrec() [ Upstream commit 9c8bb163ae784be4f79ae504e78c862806087c54 ] In function igmpv3/mld_add_delrec() we allocate pmc and put it in idev->mc_tomb, so we should free it when we don't need it in del_delrec(). But I removed kfree(pmc) incorrectly in latest two patches. Now fix it. Fixes: 24803f38a5c0 ("igmp: do not remove igmp souce list info when ...") Fixes: 1666d49e1d41 ("mld: do not remove mld souce list info when ...") Reported-by: Daniel Borkmann Signed-off-by: Hangbin Liu Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 53a76d633b860f47f82f3ad821bc264306f9be69 Author: Hangbin Liu Date: Thu Jan 12 21:19:37 2017 +0800 mld: do not remove mld souce list info when set link down [ Upstream commit 1666d49e1d416fcc2cce708242a52fe3317ea8ba ] This is an IPv6 version of commit 24803f38a5c0 ("igmp: do not remove igmp souce list..."). In mld_del_delrec(), we will restore back all source filter info instead of flush them. Move mld_clear_delrec() from ipv6_mc_down() to ipv6_mc_destroy_dev() since we should not remove source list info when set link down. Remove igmp6_group_dropped() in ipv6_mc_destroy_dev() since we have called it in ipv6_mc_down(). Also clear all source info after igmp6_group_dropped() instead of in it because ipv6_mc_down() will call igmp6_group_dropped(). Signed-off-by: Hangbin Liu Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 5b1bb4cbd7ec562d1964d37d642ca836f2b83a2c Author: Eric Dumazet Date: Thu Feb 9 16:15:52 2017 -0800 l2tp: do not use udp_ioctl() [ Upstream commit 72fb96e7bdbbdd4421b0726992496531060f3636 ] udp_ioctl(), as its name suggests, is used by UDP protocols, but is also used by L2TP :( L2TP should use its own handler, because it really does not look the same. SIOCINQ for instance should not assume UDP checksum or headers. Thanks to Andrey and syzkaller team for providing the report and a nice reproducer. While crashes only happen on recent kernels (after commit 7c13f97ffde6 ("udp: do fwd memory scheduling on dequeue")), this probably needs to be backported to older kernels. Fixes: 7c13f97ffde6 ("udp: do fwd memory scheduling on dequeue") Fixes: 85584672012e ("udp: Fix udp_poll() and ioctl()") Signed-off-by: Eric Dumazet Reported-by: Andrey Konovalov Acked-by: Paolo Abeni Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 12758a282435c0ab4878ef163e82420a8e250b73 Author: Florian Fainelli Date: Tue Feb 7 23:10:13 2017 -0800 net: dsa: Do not destroy invalid network devices [ Upstream commit 382e1eea2d983cd2343482c6a638f497bb44a636 ] dsa_slave_create() can fail, and dsa_user_port_unapply() will properly check for the network device not being NULL before attempting to destroy it. We were not setting the slave network device as NULL if dsa_slave_create() failed, so we would later on be calling dsa_slave_destroy() on a now free'd and unitialized network device, causing crashes in dsa_slave_destroy(). Fixes: 83c0afaec7b7 ("net: dsa: Add new binding implementation") Signed-off-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a700cf26a3be881e32573cb0e6373278fac2348a Author: WANG Cong Date: Tue Feb 7 12:59:46 2017 -0800 ping: fix a null pointer dereference [ Upstream commit 73d2c6678e6c3af7e7a42b1e78cd0211782ade32 ] Andrey reported a kernel crash: general protection fault: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 2 PID: 3880 Comm: syz-executor1 Not tainted 4.10.0-rc6+ #124 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff880060048040 task.stack: ffff880069be8000 RIP: 0010:ping_v4_push_pending_frames net/ipv4/ping.c:647 [inline] RIP: 0010:ping_v4_sendmsg+0x1acd/0x23f0 net/ipv4/ping.c:837 RSP: 0018:ffff880069bef8b8 EFLAGS: 00010206 RAX: dffffc0000000000 RBX: ffff880069befb90 RCX: 0000000000000000 RDX: 0000000000000018 RSI: ffff880069befa30 RDI: 00000000000000c2 RBP: ffff880069befbb8 R08: 0000000000000008 R09: 0000000000000000 R10: 0000000000000002 R11: 0000000000000000 R12: ffff880069befab0 R13: ffff88006c624a80 R14: ffff880069befa70 R15: 0000000000000000 FS: 00007f6f7c716700(0000) GS:ffff88006de00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004a6f28 CR3: 000000003a134000 CR4: 00000000000006e0 Call Trace: inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744 sock_sendmsg_nosec net/socket.c:635 [inline] sock_sendmsg+0xca/0x110 net/socket.c:645 SYSC_sendto+0x660/0x810 net/socket.c:1687 SyS_sendto+0x40/0x50 net/socket.c:1655 entry_SYSCALL_64_fastpath+0x1f/0xc2 This is because we miss a check for NULL pointer for skb_peek() when the queue is empty. Other places already have the same check. Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind") Reported-by: Andrey Konovalov Tested-by: Andrey Konovalov Signed-off-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 82849541895fc355de23aad2ab1615969e50896f Author: Willem de Bruijn Date: Tue Feb 7 15:57:21 2017 -0500 packet: round up linear to header len [ Upstream commit 57031eb794906eea4e1c7b31dc1e2429c0af0c66 ] Link layer protocols may unconditionally pull headers, as Ethernet does in eth_type_trans. Ensure that the entire link layer header always lies in the skb linear segment. tpacket_snd has such a check. Extend this to packet_snd. Variable length link layer headers complicate the computation somewhat. Here skb->len may be smaller than dev->hard_header_len. Round up the linear length to be at least as long as the smallest of the two. Reported-by: Dmitry Vyukov Signed-off-by: Willem de Bruijn Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 6ebde312a8ed469172fc9694ca0c8411994d47ff Author: Willem de Bruijn Date: Tue Feb 7 15:57:20 2017 -0500 net: introduce device min_header_len [ Upstream commit 217e6fa24ce28ec87fca8da93c9016cb78028612 ] The stack must not pass packets to device drivers that are shorter than the minimum link layer header length. Previously, packet sockets would drop packets smaller than or equal to dev->hard_header_len, but this has false positives. Zero length payload is used over Ethernet. Other link layer protocols support variable length headers. Support for validation of these protocols removed the min length check for all protocols. Introduce an explicit dev->min_header_len parameter and drop all packets below this value. Initially, set it to non-zero only for Ethernet and loopback. Other protocols can follow in a patch to net-next. Fixes: 9ed988cd5915 ("packet: validate variable length ll headers") Reported-by: Sowmini Varadhan Signed-off-by: Willem de Bruijn Acked-by: Eric Dumazet Acked-by: Sowmini Varadhan Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 4cd0362114c826ef1b5ca12c04a3c288dd4d9ecd Author: WANG Cong Date: Wed Feb 8 10:02:13 2017 -0800 sit: fix a double free on error path [ Upstream commit d7426c69a1942b2b9b709bf66b944ff09f561484 ] Dmitry reported a double free in sit_init_net(): kernel BUG at mm/percpu.c:689! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 0 PID: 15692 Comm: syz-executor1 Not tainted 4.10.0-rc6-next-20170206 #1 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 task: ffff8801c9cc27c0 task.stack: ffff88017d1d8000 RIP: 0010:pcpu_free_area+0x68b/0x810 mm/percpu.c:689 RSP: 0018:ffff88017d1df488 EFLAGS: 00010046 RAX: 0000000000010000 RBX: 00000000000007c0 RCX: ffffc90002829000 RDX: 0000000000010000 RSI: ffffffff81940efb RDI: ffff8801db841d94 RBP: ffff88017d1df590 R08: dffffc0000000000 R09: 1ffffffff0bb3bdd R10: dffffc0000000000 R11: 00000000000135dd R12: ffff8801db841d80 R13: 0000000000038e40 R14: 00000000000007c0 R15: 00000000000007c0 FS: 00007f6ea608f700(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000002000aff8 CR3: 00000001c8d44000 CR4: 00000000001426f0 DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600 Call Trace: free_percpu+0x212/0x520 mm/percpu.c:1264 ipip6_dev_free+0x43/0x60 net/ipv6/sit.c:1335 sit_init_net+0x3cb/0xa10 net/ipv6/sit.c:1831 ops_init+0x10a/0x530 net/core/net_namespace.c:115 setup_net+0x2ed/0x690 net/core/net_namespace.c:291 copy_net_ns+0x26c/0x530 net/core/net_namespace.c:396 create_new_namespaces+0x409/0x860 kernel/nsproxy.c:106 unshare_nsproxy_namespaces+0xae/0x1e0 kernel/nsproxy.c:205 SYSC_unshare kernel/fork.c:2281 [inline] SyS_unshare+0x64e/0xfc0 kernel/fork.c:2231 entry_SYSCALL_64_fastpath+0x1f/0xc2 This is because when tunnel->dst_cache init fails, we free dev->tstats once in ipip6_tunnel_init() and twice in sit_init_net(). This looks redundant but its ndo_uinit() does not seem enough to clean up everything here. So avoid this by setting dev->tstats to NULL after the first free, at least for -net. Reported-by: Dmitry Vyukov Signed-off-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2b7f50d67f5dd7b4f97e413dd8d4ecbf83d723ce Author: David Ahern Date: Wed Feb 8 09:29:00 2017 -0800 lwtunnel: valid encap attr check should return 0 when lwtunnel is disabled [ Upstream commit 2bd137de531367fb573d90150d1872cb2a2095f7 ] An error was reported upgrading to 4.9.8: root@Typhoon:~# ip route add default table 210 nexthop dev eth0 via 10.68.64.1 weight 1 nexthop dev eth0 via 10.68.64.2 weight 1 RTNETLINK answers: Operation not supported The problem occurs when CONFIG_LWTUNNEL is not enabled and a multipath route is submitted. The point of lwtunnel_valid_encap_type_attr is catch modules that need to be loaded before any references are taken with rntl held. With CONFIG_LWTUNNEL disabled, there will be no modules to load so the lwtunnel_valid_encap_type_attr stub should just return 0. Fixes: 9ed59592e3e3 ("lwtunnel: fix autoload of lwt modules") Reported-by: pupilla@libero.it Signed-off-by: David Ahern Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 00eff2ebbd229758e90659907724c14dd5a18339 Author: Marcelo Ricardo Leitner Date: Mon Feb 6 18:10:31 2017 -0200 sctp: avoid BUG_ON on sctp_wait_for_sndbuf [ Upstream commit 2dcab598484185dea7ec22219c76dcdd59e3cb90 ] Alexander Popov reported that an application may trigger a BUG_ON in sctp_wait_for_sndbuf if the socket tx buffer is full, a thread is waiting on it to queue more data and meanwhile another thread peels off the association being used by the first thread. This patch replaces the BUG_ON call with a proper error handling. It will return -EPIPE to the original sendmsg call, similarly to what would have been done if the association wasn't found in the first place. Acked-by: Alexander Popov Signed-off-by: Marcelo Ricardo Leitner Reviewed-by: Xin Long Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 4400acce6881bc40380afa4c0559348a9feb7329 Author: Benjamin Poirier Date: Mon Feb 6 10:14:31 2017 -0800 mlx4: Invoke softirqs after napi_reschedule [ Upstream commit bd4ce941c8d5b862b2f83364be5dbe8fc8ab48f8 ] mlx4 may schedule napi from a workqueue. Afterwards, softirqs are not run in a deterministic time frame and the following message may be logged: NOHZ: local_softirq_pending 08 The problem is the same as what was described in commit ec13ee80145c ("virtio_net: invoke softirqs after __napi_schedule") and this patch applies the same fix to mlx4. Fixes: 07841f9d94c1 ("net/mlx4_en: Schedule napi when RX buffers allocation fails") Cc: Eric Dumazet Signed-off-by: Benjamin Poirier Acked-by: Eric Dumazet Reviewed-by: Tariq Toukan Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 970390fd5d53de0817b538350131edd2514a8321 Author: Ben Hutchings Date: Sat Feb 4 16:57:04 2017 +0000 catc: Use heap buffer for memory size test [ Upstream commit 2d6a0e9de03ee658a9adc3bfb2f0ca55dff1e478 ] Allocating USB buffers on the stack is not portable, and no longer works on x86_64 (with VMAP_STACK enabled as per default). Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Ben Hutchings Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 61bf9f381c38821004bd514015487852021053a1 Author: Ben Hutchings Date: Sat Feb 4 16:56:56 2017 +0000 catc: Combine failure cleanup code in catc_probe() [ Upstream commit d41149145f98fe26dcd0bfd1d6cc095e6e041418 ] Signed-off-by: Ben Hutchings Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e898f6f008aa91c154c9c8fb7be3fb9ec4d333ec Author: Ben Hutchings Date: Sat Feb 4 16:56:32 2017 +0000 rtl8150: Use heap buffers for all register access [ Upstream commit 7926aff5c57b577ab0f43364ff0c59d968f6a414 ] Allocating USB buffers on the stack is not portable, and no longer works on x86_64 (with VMAP_STACK enabled as per default). Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Ben Hutchings Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 878b015bcc726560b13be2d906caf6923428f05d Author: Ben Hutchings Date: Sat Feb 4 16:56:03 2017 +0000 pegasus: Use heap buffers for all register access [ Upstream commit 5593523f968bc86d42a035c6df47d5e0979b5ace ] Allocating USB buffers on the stack is not portable, and no longer works on x86_64 (with VMAP_STACK enabled as per default). Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") References: https://bugs.debian.org/852556 Reported-by: Lisandro Damián Nicanor Pérez Meyer Tested-by: Lisandro Damián Nicanor Pérez Meyer Signed-off-by: Ben Hutchings Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit b90cb484c068b3fd30aee322506d38bcc2f43838 Author: Willem de Bruijn Date: Fri Feb 3 18:20:49 2017 -0500 macvtap: read vnet_hdr_size once [ Upstream commit 837585a5375c38d40361cfe64e6fd11e1addb936 ] When IFF_VNET_HDR is enabled, a virtio_net header must precede data. Data length is verified to be greater than or equal to expected header length tun->vnet_hdr_sz before copying. Macvtap functions read the value once, but unless READ_ONCE is used, the compiler may ignore this and read multiple times. Enforce a single read and locally cached value to avoid updates between test and use. Signed-off-by: Willem de Bruijn Suggested-by: Eric Dumazet Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 26989c9d9904e3626443336bcefab0b6e7077d99 Author: Willem de Bruijn Date: Fri Feb 3 18:20:48 2017 -0500 tun: read vnet_hdr_sz once [ Upstream commit e1edab87faf6ca30cd137e0795bc73aa9a9a22ec ] When IFF_VNET_HDR is enabled, a virtio_net header must precede data. Data length is verified to be greater than or equal to expected header length tun->vnet_hdr_sz before copying. Read this value once and cache locally, as it can be updated between the test and use (TOCTOU). Signed-off-by: Willem de Bruijn Reported-by: Dmitry Vyukov CC: Eric Dumazet Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 0f895f51a831d73ce24158534784aba5b2a72a9e Author: Eric Dumazet Date: Fri Feb 3 14:59:38 2017 -0800 tcp: avoid infinite loop in tcp_splice_read() [ Upstream commit ccf7abb93af09ad0868ae9033d1ca8108bdaec82 ] Splicing from TCP socket is vulnerable when a packet with URG flag is received and stored into receive queue. __tcp_splice_read() returns 0, and sk_wait_data() immediately returns since there is the problematic skb in queue. This is a nice way to burn cpu (aka infinite loop) and trigger soft lockups. Again, this gem was found by syzkaller tool. Fixes: 9c55e01c0cc8 ("[TCP]: Splice receive support.") Signed-off-by: Eric Dumazet Reported-by: Dmitry Vyukov Cc: Willy Tarreau Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 1e340bb22af3afc881db78c17b73eaf3d26fb8fe Author: Eric Dumazet Date: Sun Feb 5 20:23:22 2017 -0800 ipv6: tcp: add a missing tcp_v6_restore_cb() [ Upstream commit ebf6c9cb23d7e56eec8575a88071dec97ad5c6e2 ] Dmitry reported use-after-free in ip6_datagram_recv_specific_ctl() A similar bug was fixed in commit 8ce48623f0cf ("ipv6: tcp: restore IP6CB for pktoptions skbs"), but I missed another spot. tcp_v6_syn_recv_sock() can indeed set np->pktoptions from ireq->pktopts Fixes: 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses") Signed-off-by: Eric Dumazet Reported-by: Dmitry Vyukov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ae1768bbbc469b75662c6714957fe5886cc960c4 Author: Eric Dumazet Date: Sat Feb 4 23:18:55 2017 -0800 ip6_gre: fix ip6gre_err() invalid reads [ Upstream commit 7892032cfe67f4bde6fc2ee967e45a8fbaf33756 ] Andrey Konovalov reported out of bound accesses in ip6gre_err() If GRE flags contains GRE_KEY, the following expression *(((__be32 *)p) + (grehlen / 4) - 1) accesses data ~40 bytes after the expected point, since grehlen includes the size of IPv6 headers. Let's use a "struct gre_base_hdr *greh" pointer to make this code more readable. p[1] becomes greh->protocol. grhlen is the GRE header length. Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Signed-off-by: Eric Dumazet Reported-by: Andrey Konovalov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 66cdd4347573027f95b4c7b50a7b20079ce66919 Author: Eric Dumazet Date: Fri Feb 3 00:03:26 2017 -0800 netlabel: out of bound access in cipso_v4_validate() [ Upstream commit d71b7896886345c53ef1d84bda2bc758554f5d61 ] syzkaller found another out of bound access in ip_options_compile(), or more exactly in cipso_v4_validate() Fixes: 20e2a8648596 ("cipso: handle CIPSO options correctly when NetLabel is disabled") Fixes: 446fda4f2682 ("[NetLabel]: CIPSOv4 engine") Signed-off-by: Eric Dumazet Reported-by: Dmitry Vyukov Cc: Paul Moore Acked-by: Paul Moore Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit f5b54446630a973e1f27b68599366bbd0ac53066 Author: Eric Dumazet Date: Sat Feb 4 11:16:52 2017 -0800 ipv4: keep skb->dst around in presence of IP options [ Upstream commit 34b2cef20f19c87999fff3da4071e66937db9644 ] Andrey Konovalov got crashes in __ip_options_echo() when a NULL skb->dst is accessed. ipv4_pktinfo_prepare() should not drop the dst if (evil) IP options are present. We could refine the test to the presence of ts_needtime or srr, but IP options are not often used, so let's be conservative. Thanks to syzkaller team for finding this bug. Fixes: d826eb14ecef ("ipv4: PKTINFO doesnt need dst reference") Signed-off-by: Eric Dumazet Reported-by: Andrey Konovalov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d5b6fd77519df03feae24c1409eafb95f347ee88 Author: Eric Dumazet Date: Thu Feb 2 10:31:35 2017 -0800 net: use a work queue to defer net_disable_timestamp() work [ Upstream commit 5fa8bbda38c668e56b0c6cdecced2eac2fe36dec ] Dmitry reported a warning [1] showing that we were calling net_disable_timestamp() -> static_key_slow_dec() from a non process context. Grabbing a mutex while holding a spinlock or rcu_read_lock() is not allowed. As Cong suggested, we now use a work queue. It is possible netstamp_clear() exits while netstamp_needed_deferred is not zero, but it is probably not worth trying to do better than that. netstamp_needed_deferred atomic tracks the exact number of deferred decrements. [1] [ INFO: suspicious RCU usage. ] 4.10.0-rc5+ #192 Not tainted ------------------------------- ./include/linux/rcupdate.h:561 Illegal context switch in RCU read-side critical section! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 0 2 locks held by syz-executor14/23111: #0: (sk_lock-AF_INET6){+.+.+.}, at: [] lock_sock include/net/sock.h:1454 [inline] #0: (sk_lock-AF_INET6){+.+.+.}, at: [] rawv6_sendmsg+0x1e65/0x3ec0 net/ipv6/raw.c:919 #1: (rcu_read_lock){......}, at: [] nf_hook include/linux/netfilter.h:201 [inline] #1: (rcu_read_lock){......}, at: [] __ip6_local_out+0x258/0x840 net/ipv6/output_core.c:160 stack backtrace: CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:15 [inline] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51 lockdep_rcu_suspicious+0x139/0x180 kernel/locking/lockdep.c:4452 rcu_preempt_sleep_check include/linux/rcupdate.h:560 [inline] ___might_sleep+0x560/0x650 kernel/sched/core.c:7748 __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739 mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752 atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060 __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149 static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174 net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728 sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403 __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441 sk_destruct+0x47/0x80 net/core/sock.c:1460 __sk_free+0x57/0x230 net/core/sock.c:1468 sock_wfree+0xae/0x120 net/core/sock.c:1645 skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655 skb_release_all+0x15/0x60 net/core/skbuff.c:668 __kfree_skb+0x15/0x20 net/core/skbuff.c:684 kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705 inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304 inet_frag_put include/net/inet_frag.h:133 [inline] nf_ct_frag6_gather+0x1106/0x3840 net/ipv6/netfilter/nf_conntrack_reasm.c:617 ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68 nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline] nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310 nf_hook include/linux/netfilter.h:212 [inline] __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160 ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170 ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722 ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742 rawv6_push_pending_frames net/ipv6/raw.c:613 [inline] rawv6_sendmsg+0x2d1a/0x3ec0 net/ipv6/raw.c:927 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744 sock_sendmsg_nosec net/socket.c:635 [inline] sock_sendmsg+0xca/0x110 net/socket.c:645 sock_write_iter+0x326/0x600 net/socket.c:848 do_iter_readv_writev+0x2e3/0x5b0 fs/read_write.c:695 do_readv_writev+0x42c/0x9b0 fs/read_write.c:872 vfs_writev+0x87/0xc0 fs/read_write.c:911 do_writev+0x110/0x2c0 fs/read_write.c:944 SYSC_writev fs/read_write.c:1017 [inline] SyS_writev+0x27/0x30 fs/read_write.c:1014 entry_SYSCALL_64_fastpath+0x1f/0xc2 RIP: 0033:0x445559 RSP: 002b:00007f6f46fceb58 EFLAGS: 00000292 ORIG_RAX: 0000000000000014 RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 0000000000445559 RDX: 0000000000000001 RSI: 0000000020f1eff0 RDI: 0000000000000005 RBP: 00000000006e19c0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000292 R12: 0000000000700000 R13: 0000000020f59000 R14: 0000000000000015 R15: 0000000000020400 BUG: sleeping function called from invalid context at kernel/locking/mutex.c:752 in_atomic(): 1, irqs_disabled(): 0, pid: 23111, name: syz-executor14 INFO: lockdep is turned off. CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:15 [inline] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51 ___might_sleep+0x47e/0x650 kernel/sched/core.c:7780 __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739 mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752 atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060 __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149 static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174 net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728 sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403 __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441 sk_destruct+0x47/0x80 net/core/sock.c:1460 __sk_free+0x57/0x230 net/core/sock.c:1468 sock_wfree+0xae/0x120 net/core/sock.c:1645 skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655 skb_release_all+0x15/0x60 net/core/skbuff.c:668 __kfree_skb+0x15/0x20 net/core/skbuff.c:684 kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705 inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304 inet_frag_put include/net/inet_frag.h:133 [inline] nf_ct_frag6_gather+0x1106/0x3840 net/ipv6/netfilter/nf_conntrack_reasm.c:617 ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68 nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline] nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310 nf_hook include/linux/netfilter.h:212 [inline] __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160 ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170 ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722 ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742 rawv6_push_pending_frames net/ipv6/raw.c:613 [inline] rawv6_sendmsg+0x2d1a/0x3ec0 net/ipv6/raw.c:927 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744 sock_sendmsg_nosec net/socket.c:635 [inline] sock_sendmsg+0xca/0x110 net/socket.c:645 sock_write_iter+0x326/0x600 net/socket.c:848 do_iter_readv_writev+0x2e3/0x5b0 fs/read_write.c:695 do_readv_writev+0x42c/0x9b0 fs/read_write.c:872 vfs_writev+0x87/0xc0 fs/read_write.c:911 do_writev+0x110/0x2c0 fs/read_write.c:944 SYSC_writev fs/read_write.c:1017 [inline] SyS_writev+0x27/0x30 fs/read_write.c:1014 entry_SYSCALL_64_fastpath+0x1f/0xc2 RIP: 0033:0x445559 Fixes: b90e5794c5bd ("net: dont call jump_label_dec from irq context") Suggested-by: Cong Wang Reported-by: Dmitry Vyukov Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 455a457780b64e96189f715b1ef04dbcced7a284 Author: Alexey Brodkin Date: Fri Jan 27 15:24:43 2017 +0300 stmmac: Discard masked flags in interrupt status register [ Upstream commit 0a764db103376cf69d04449b10688f3516cc0b88 ] DW GMAC databook says the following about bits in "Register 15 (Interrupt Mask Register)": --------------------------->8------------------------- When set, this bit __disables_the_assertion_of_the_interrupt_signal__ because of the setting of XXX bit in Register 14 (Interrupt Status Register). --------------------------->8------------------------- In fact even if we mask one bit in the mask register it doesn't prevent corresponding bit to appear in the status register, it only disables interrupt generation for corresponding event. But currently we expect a bit different behavior: status bits to be in sync with their masks, i.e. if mask for bit A is set in the mask register then bit A won't appear in the interrupt status register. This was proven to be incorrect assumption, see discussion here [1]. That misunderstanding causes unexpected behaviour of the GMAC, for example we were happy enough to just see bogus messages about link state changes. So from now on we'll be only checking bits that really may trigger an interrupt. [1] https://lkml.org/lkml/2016/11/3/413 Signed-off-by: Alexey Brodkin Cc: Giuseppe Cavallaro Cc: Fabrice Gasnier Cc: Joachim Eastwood Cc: Phil Reid Cc: David Miller Cc: Alexandre Torgue Cc: Vineet Gupta Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ca876dff1e8c04871d019c57016514df4cf04a25 Author: Eric Dumazet Date: Wed Feb 1 08:33:53 2017 -0800 tcp: fix 0 divide in __tcp_select_window() [ Upstream commit 06425c308b92eaf60767bc71d359f4cbc7a561f8 ] syszkaller fuzzer was able to trigger a divide by zero, when TCP window scaling is not enabled. SO_RCVBUF can be used not only to increase sk_rcvbuf, also to decrease it below current receive buffers utilization. If mss is negative or 0, just return a zero TCP window. Signed-off-by: Eric Dumazet Reported-by: Dmitry Vyukov Acked-by: Neal Cardwell Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e6fbace87c7b57ff6143df2bc18a9ca9ae919bbe Author: Dan Carpenter Date: Wed Feb 1 11:46:32 2017 +0300 ipv6: pointer math error in ip6_tnl_parse_tlv_enc_lim() [ Upstream commit 63117f09c768be05a0bf465911297dc76394f686 ] Casting is a high precedence operation but "off" and "i" are in terms of bytes so we need to have some parenthesis here. Fixes: fbfa743a9d2a ("ipv6: fix ip6_tnl_parse_tlv_enc_lim()") Signed-off-by: Dan Carpenter Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a7fe4e5d06338e1a82b1977eca37400951f99730 Author: Eric Dumazet Date: Mon Jan 23 16:43:06 2017 -0800 ipv6: fix ip6_tnl_parse_tlv_enc_lim() [ Upstream commit fbfa743a9d2a0ffa24251764f10afc13eb21e739 ] This function suffers from multiple issues. First one is that pskb_may_pull() may reallocate skb->head, so the 'raw' pointer needs either to be reloaded or not used at all. Second issue is that NEXTHDR_DEST handling does not validate that the options are present in skb->data, so we might read garbage or access non existent memory. With help from Willem de Bruijn. Signed-off-by: Eric Dumazet Reported-by: Dmitry Vyukov Cc: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 6c8556f6e11441c52f9593381c8be50f922f911f Author: Yotam Gigi Date: Tue Jan 31 15:14:29 2017 +0200 net/sched: matchall: Fix configuration race [ Upstream commit fd62d9f5c575f0792f150109f1fd24a0d4b3f854 ] In the current version, the matchall internal state is split into two structs: cls_matchall_head and cls_matchall_filter. This makes little sense, as matchall instance supports only one filter, and there is no situation where one exists and the other does not. In addition, that led to some races when filter was deleted while packet was processed. Unify that two structs into one, thus simplifying the process of matchall creation and deletion. As a result, the new, delete and get callbacks have a dummy implementation where all the work is done in destroy and change callbacks, as was done in cls_cgroup. Fixes: bf3994d2ed31 ("net/sched: introduce Match-all classifier") Reported-by: Daniel Borkmann Signed-off-by: Yotam Gigi Acked-by: Jiri Pirko Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 64cc7ef5cf1d53ee084231b451d4c73c0cab3012 Author: Gal Pressman Date: Thu Jan 12 16:25:46 2017 +0200 net/mlx5e: Fix update of hash function/key via ethtool [ Upstream commit a100ff3eef193d2d79daf98dcd97a54776ffeb78 ] Modifying TIR hash should change selected fields bitmask in addition to the function and key. Formerly, Only on ethool mlx5e_set_rxfh "ethtoo -X" we would not set this field resulting in zeroing of its value, which means no packet fields are used for RX RSS hash calculation thus causing all traffic to arrive in RQ[0]. On driver load out of the box we don't have this issue, since the TIR hash is fully created from scratch. Tested: ethtool -X ethX hkey ethtool -X ethX hfunc ethtool -X ethX equal All cases are verified with TCP Multi-Stream traffic over IPv4 & IPv6. Fixes: bdfc028de1b3 ("net/mlx5e: Fix ethtool RX hash func configuration change") Signed-off-by: Gal Pressman Signed-off-by: Saeed Mahameed Signed-off-by: Greg Kroah-Hartman commit adf86d59bb9b08d9eb67054251d29484c5ec102c Author: Eric Dumazet Date: Fri Jan 27 08:11:44 2017 -0800 can: Fix kernel panic at security_sock_rcv_skb [ Upstream commit f1712c73714088a7252d276a57126d56c7d37e64 ] Zhang Yanmin reported crashes [1] and provided a patch adding a synchronize_rcu() call in can_rx_unregister() The main problem seems that the sockets themselves are not RCU protected. If CAN uses RCU for delivery, then sockets should be freed only after one RCU grace period. Recent kernels could use sock_set_flag(sk, SOCK_RCU_FREE), but let's ease stable backports with the following fix instead. [1] BUG: unable to handle kernel NULL pointer dereference at (null) IP: [] selinux_socket_sock_rcv_skb+0x65/0x2a0 Call Trace: [] security_sock_rcv_skb+0x4c/0x60 [] sk_filter+0x41/0x210 [] sock_queue_rcv_skb+0x53/0x3a0 [] raw_rcv+0x2a3/0x3c0 [] can_rcv_filter+0x12b/0x370 [] can_receive+0xd9/0x120 [] can_rcv+0xab/0x100 [] __netif_receive_skb_core+0xd8c/0x11f0 [] __netif_receive_skb+0x24/0xb0 [] process_backlog+0x127/0x280 [] net_rx_action+0x33b/0x4f0 [] __do_softirq+0x184/0x440 [] do_softirq_own_stack+0x1c/0x30 [] do_softirq.part.18+0x3b/0x40 [] do_softirq+0x1d/0x20 [] netif_rx_ni+0xe5/0x110 [] slcan_receive_buf+0x507/0x520 [] flush_to_ldisc+0x21c/0x230 [] process_one_work+0x24f/0x670 [] worker_thread+0x9d/0x6f0 [] ? rescuer_thread+0x480/0x480 [] kthread+0x12c/0x150 [] ret_from_fork+0x3f/0x70 Reported-by: Zhang Yanmin Signed-off-by: Eric Dumazet Acked-by: Oliver Hartkopp Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman