commit 4b9c8e9bd1f5c549fb581f7edae250d4d9ebc922 Author: Greg Kroah-Hartman Date: Wed Jan 15 15:27:19 2014 -0800 Linux 3.4.77 commit 9af6b695ac764184cdbdd42b77da3e600d337d14 Author: Paul Turner Date: Wed Oct 16 11:16:27 2013 -0700 sched: Guarantee new group-entities always have weight commit 0ac9b1c21874d2490331233b3242085f8151e166 upstream. Currently, group entity load-weights are initialized to zero. This admits some races with respect to the first time they are re-weighted in earlty use. ( Let g[x] denote the se for "g" on cpu "x". ) Suppose that we have root->a and that a enters a throttled state, immediately followed by a[0]->t1 (the only task running on cpu[0]) blocking: put_prev_task(group_cfs_rq(a[0]), t1) put_prev_entity(..., t1) check_cfs_rq_runtime(group_cfs_rq(a[0])) throttle_cfs_rq(group_cfs_rq(a[0])) Then, before unthrottling occurs, let a[0]->b[0]->t2 wake for the first time: enqueue_task_fair(rq[0], t2) enqueue_entity(group_cfs_rq(b[0]), t2) enqueue_entity_load_avg(group_cfs_rq(b[0]), t2) account_entity_enqueue(group_cfs_ra(b[0]), t2) update_cfs_shares(group_cfs_rq(b[0])) < skipped because b is part of a throttled hierarchy > enqueue_entity(group_cfs_rq(a[0]), b[0]) ... We now have b[0] enqueued, yet group_cfs_rq(a[0])->load.weight == 0 which violates invariants in several code-paths. Eliminate the possibility of this by initializing group entity weight. Signed-off-by: Paul Turner Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/20131016181627.22647.47543.stgit@sword-of-the-dawn.mtv.corp.google.com Signed-off-by: Ingo Molnar Cc: Chris J Arges Signed-off-by: Greg Kroah-Hartman commit 03d35a39f4de9f2e9d7fbc5c2d03dbcc5b882df7 Author: Ben Segall Date: Wed Oct 16 11:16:22 2013 -0700 sched: Fix hrtimer_cancel()/rq->lock deadlock commit 927b54fccbf04207ec92f669dce6806848cbec7d upstream. __start_cfs_bandwidth calls hrtimer_cancel while holding rq->lock, waiting for the hrtimer to finish. However, if sched_cfs_period_timer runs for another loop iteration, the hrtimer can attempt to take rq->lock, resulting in deadlock. Fix this by ensuring that cfs_b->timer_active is cleared only if the _latest_ call to do_sched_cfs_period_timer is returning as idle. Then __start_cfs_bandwidth can just call hrtimer_try_to_cancel and wait for that to succeed or timer_active == 1. Signed-off-by: Ben Segall Signed-off-by: Peter Zijlstra Cc: pjt@google.com Link: http://lkml.kernel.org/r/20131016181622.22647.16643.stgit@sword-of-the-dawn.mtv.corp.google.com Signed-off-by: Ingo Molnar Cc: Chris J Arges Signed-off-by: Greg Kroah-Hartman commit 9b318052acd24780f4dba1349ef3a30a7aef52ad Author: Ben Segall Date: Wed Oct 16 11:16:17 2013 -0700 sched: Fix cfs_bandwidth misuse of hrtimer_expires_remaining commit db06e78cc13d70f10877e0557becc88ab3ad2be8 upstream. hrtimer_expires_remaining does not take internal hrtimer locks and thus must be guarded against concurrent __hrtimer_start_range_ns (but returning HRTIMER_RESTART is safe). Use cfs_b->lock to make it safe. Signed-off-by: Ben Segall Signed-off-by: Peter Zijlstra Cc: pjt@google.com Link: http://lkml.kernel.org/r/20131016181617.22647.73829.stgit@sword-of-the-dawn.mtv.corp.google.com Signed-off-by: Ingo Molnar Cc: Chris J Arges Signed-off-by: Greg Kroah-Hartman commit 16e7480c23d33d9475d45be92c1ddf218575a647 Author: Ben Segall Date: Wed Oct 16 11:16:12 2013 -0700 sched: Fix race on toggling cfs_bandwidth_used commit 1ee14e6c8cddeeb8a490d7b54cd9016e4bb900b4 upstream. When we transition cfs_bandwidth_used to false, any currently throttled groups will incorrectly return false from cfs_rq_throttled. While tg_set_cfs_bandwidth will unthrottle them eventually, currently running code (including at least dequeue_task_fair and distribute_cfs_runtime) will cause errors. Fix this by turning off cfs_bandwidth_used only after unthrottling all cfs_rqs. Tested: toggle bandwidth back and forth on a loaded cgroup. Caused crashes in minutes without the patch, hasn't crashed with it. Signed-off-by: Ben Segall Signed-off-by: Peter Zijlstra Cc: pjt@google.com Link: http://lkml.kernel.org/r/20131016181611.22647.80365.stgit@sword-of-the-dawn.mtv.corp.google.com Signed-off-by: Ingo Molnar Cc: Chris J Arges Signed-off-by: Greg Kroah-Hartman commit a63f31f1f2cdb459b121f644ccbd07ae84f45d4e Author: Linus Torvalds Date: Sat Jan 11 19:15:52 2014 -0800 x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround commit 26bef1318adc1b3a530ecc807ef99346db2aa8b0 upstream. Before we do an EMMS in the AMD FXSAVE information leak workaround we need to clear any pending exceptions, otherwise we trap with a floating-point exception inside this code. Reported-by: halfdog Tested-by: Borislav Petkov Link: http://lkml.kernel.org/r/CA%2B55aFxQnY_PCG_n4=0w-VG=YLXL-yr7oMxyy0WU2gCBAf3ydg@mail.gmail.com Signed-off-by: H. Peter Anvin Signed-off-by: Greg Kroah-Hartman commit 925ece07657be96df688f010dbfec851a0306341 Author: Laurent Pinchart Date: Mon Dec 16 19:16:09 2013 +0100 ARM: shmobile: mackerel: Fix coherent DMA mask commit b6328a6b7ba57fc84c38248f6f0e387e1170f1a8 upstream. Commit 4dcfa60071b3d23f0181f27d8519f12e37cefbb9 ("ARM: DMA-API: better handing of DMA masks for coherent allocations") added an additional check to the coherent DMA mask that results in an error when the mask is larger than what dma_addr_t can address. Set the LCDC coherent DMA mask to DMA_BIT_MASK(32) instead of ~0 to fix the problem. Signed-off-by: Laurent Pinchart Signed-off-by: Simon Horman Signed-off-by: Greg Kroah-Hartman commit 10252aa2278e1d53c420bacdf9ee239784712541 Author: Russell King Date: Fri Jan 3 15:01:39 2014 +0000 ARM: fix "bad mode in ... handler" message for undefined instructions commit 29c350bf28da333e41e30497b649fe335712a2ab upstream. The array was missing the final entry for the undefined instruction exception handler; this commit adds it. Signed-off-by: Russell King Signed-off-by: Greg Kroah-Hartman commit 09e333b064c0f570051ec055d0098c6b6ce00850 Author: Curt Brune Date: Mon Jan 6 11:00:32 2014 -0800 bridge: use spin_lock_bh() in br_multicast_set_hash_max [ Upstream commit fe0d692bbc645786bce1a98439e548ae619269f5 ] br_multicast_set_hash_max() is called from process context in net/bridge/br_sysfs_br.c by the sysfs store_hash_max() function. br_multicast_set_hash_max() calls spin_lock(&br->multicast_lock), which can deadlock the CPU if a softirq that also tries to take the same lock interrupts br_multicast_set_hash_max() while the lock is held . This can happen quite easily when any of the bridge multicast timers expire, which try to take the same lock. The fix here is to use spin_lock_bh(), preventing other softirqs from executing on this CPU. Steps to reproduce: 1. Create a bridge with several interfaces (I used 4). 2. Set the "multicast query interval" to a low number, like 2. 3. Enable the bridge as a multicast querier. 4. Repeatedly set the bridge hash_max parameter via sysfs. # brctl addbr br0 # brctl addif br0 eth1 eth2 eth3 eth4 # brctl setmcqi br0 2 # brctl setmcquerier br0 1 # while true ; do echo 4096 > /sys/class/net/br0/bridge/hash_max; done Signed-off-by: Curt Brune Signed-off-by: Scott Feldman Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 5a5bf44e2b40d72ed94d031ebf43e3670b1d7646 Author: Daniel Borkmann Date: Mon Dec 30 23:40:50 2013 +0100 net: llc: fix use after free in llc_ui_recvmsg [ Upstream commit 4d231b76eef6c4a6bd9c96769e191517765942cb ] While commit 30a584d944fb fixes datagram interface in LLC, a use after free bug has been introduced for SOCK_STREAM sockets that do not make use of MSG_PEEK. The flow is as follow ... if (!(flags & MSG_PEEK)) { ... sk_eat_skb(sk, skb, false); ... } ... if (used + offset < skb->len) continue; ... where sk_eat_skb() calls __kfree_skb(). Therefore, cache original length and work on skb_len to check partial reads. Fixes: 30a584d944fb ("[LLX]: SOCK_DGRAM interface fixes") Signed-off-by: Daniel Borkmann Cc: Stephen Hemminger Cc: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 86dc6b93ee413a3997b699476fc4dd78d8f35df7 Author: David S. Miller Date: Tue Dec 31 16:23:35 2013 -0500 vlan: Fix header ops passthru when doing TX VLAN offload. [ Upstream commit 2205369a314e12fcec4781cc73ac9c08fc2b47de ] When the vlan code detects that the real device can do TX VLAN offloads in hardware, it tries to arrange for the real device's header_ops to be invoked directly. But it does so illegally, by simply hooking the real device's header_ops up to the VLAN device. This doesn't work because we will end up invoking a set of header_ops routines which expect a device type which matches the real device, but will see a VLAN device instead. Fix this by providing a pass-thru set of header_ops which will arrange to pass the proper real device instead. To facilitate this add a dev_rebuild_header(). There are implementations which provide a ->cache and ->create but not a ->rebuild (f.e. PLIP). So we need a helper function just like dev_hard_header() to avoid crashes. Use this helper in the one existing place where the header_ops->rebuild was being invoked, the neighbour code. With lots of help from Florian Westphal. Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e25027c9bf3fb2dcdd893f13a5651401332941cb Author: Florian Westphal Date: Mon Dec 23 00:32:31 2013 +0100 net: rose: restore old recvmsg behavior [ Upstream commit f81152e35001e91997ec74a7b4e040e6ab0acccf ] recvmsg handler in net/rose/af_rose.c performs size-check ->msg_namelen. After commit f3d3342602f8bcbf37d7c46641cb9bca7618eb1c (net: rework recvmsg handler msg_name and msg_namelen logic), we now always take the else branch due to namelen being initialized to 0. Digging in netdev-vger-cvs git repo shows that msg_namelen was initialized with a fixed-size since at least 1995, so the else branch was never taken. Compile tested only. Signed-off-by: Florian Westphal Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 7918313da2c7d58dd1380ff5cebf0524268ae523 Author: Sasha Levin Date: Wed Dec 18 23:49:42 2013 -0500 rds: prevent dereference of a NULL device [ Upstream commit c2349758acf1874e4c2b93fe41d072336f1a31d0 ] Binding might result in a NULL device, which is dereferenced causing this BUG: [ 1317.260548] BUG: unable to handle kernel NULL pointer dereference at 000000000000097 4 [ 1317.261847] IP: [] rds_ib_laddr_check+0x82/0x110 [ 1317.263315] PGD 418bcb067 PUD 3ceb21067 PMD 0 [ 1317.263502] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 1317.264179] Dumping ftrace buffer: [ 1317.264774] (ftrace buffer empty) [ 1317.265220] Modules linked in: [ 1317.265824] CPU: 4 PID: 836 Comm: trinity-child46 Tainted: G W 3.13.0-rc4- next-20131218-sasha-00013-g2cebb9b-dirty #4159 [ 1317.267415] task: ffff8803ddf33000 ti: ffff8803cd31a000 task.ti: ffff8803cd31a000 [ 1317.268399] RIP: 0010:[] [] rds_ib_laddr_check+ 0x82/0x110 [ 1317.269670] RSP: 0000:ffff8803cd31bdf8 EFLAGS: 00010246 [ 1317.270230] RAX: 0000000000000000 RBX: ffff88020b0dd388 RCX: 0000000000000000 [ 1317.270230] RDX: ffffffff8439822e RSI: 00000000000c000a RDI: 0000000000000286 [ 1317.270230] RBP: ffff8803cd31be38 R08: 0000000000000000 R09: 0000000000000000 [ 1317.270230] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 [ 1317.270230] R13: 0000000054086700 R14: 0000000000a25de0 R15: 0000000000000031 [ 1317.270230] FS: 00007ff40251d700(0000) GS:ffff88022e200000(0000) knlGS:000000000000 0000 [ 1317.270230] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1317.270230] CR2: 0000000000000974 CR3: 00000003cd478000 CR4: 00000000000006e0 [ 1317.270230] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1317.270230] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000090602 [ 1317.270230] Stack: [ 1317.270230] 0000000054086700 5408670000a25de0 5408670000000002 0000000000000000 [ 1317.270230] ffffffff84223542 00000000ea54c767 0000000000000000 ffffffff86d26160 [ 1317.270230] ffff8803cd31be68 ffffffff84223556 ffff8803cd31beb8 ffff8800c6765280 [ 1317.270230] Call Trace: [ 1317.270230] [] ? rds_trans_get_preferred+0x42/0xa0 [ 1317.270230] [] rds_trans_get_preferred+0x56/0xa0 [ 1317.270230] [] rds_bind+0x73/0xf0 [ 1317.270230] [] SYSC_bind+0x92/0xf0 [ 1317.270230] [] ? context_tracking_user_exit+0xb8/0x1d0 [ 1317.270230] [] ? trace_hardirqs_on+0xd/0x10 [ 1317.270230] [] ? syscall_trace_enter+0x32/0x290 [ 1317.270230] [] SyS_bind+0xe/0x10 [ 1317.270230] [] tracesys+0xdd/0xe2 [ 1317.270230] Code: 00 8b 45 cc 48 8d 75 d0 48 c7 45 d8 00 00 00 00 66 c7 45 d0 02 00 89 45 d4 48 89 df e8 78 49 76 ff 41 89 c4 85 c0 75 0c 48 8b 03 <80> b8 74 09 00 00 01 7 4 06 41 bc 9d ff ff ff f6 05 2a b6 c2 02 [ 1317.270230] RIP [] rds_ib_laddr_check+0x82/0x110 [ 1317.270230] RSP [ 1317.270230] CR2: 0000000000000974 Signed-off-by: Sasha Levin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit eb2da112485dc59834cf1a87036b3a7a43fba3c4 Author: Salva Peiró Date: Tue Dec 17 10:06:30 2013 +0100 hamradio/yam: fix info leak in ioctl [ Upstream commit 8e3fbf870481eb53b2d3a322d1fc395ad8b367ed ] The yam_ioctl() code fails to initialise the cmd field of the struct yamdrv_ioctl_cfg. Add an explicit memset(0) before filling the structure to avoid the 4-byte info leak. Signed-off-by: Salva Peiró Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit b7e3762c40e1d7f7a147f7d5be9584e1603bea01 Author: Wenliang Fan Date: Tue Dec 17 11:25:28 2013 +0800 drivers/net/hamradio: Integer overflow in hdlcdrv_ioctl() [ Upstream commit e9db5c21d3646a6454fcd04938dd215ac3ab620a ] The local variable 'bi' comes from userspace. If userspace passed a large number to 'bi.data.calibrate', there would be an integer overflow in the following line: s->hdlctx.calibrate = bi.data.calibrate * s->par.bitrate / 16; Signed-off-by: Wenliang Fan Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 931a701d92ac7449efc5bb7e579f377f5eb565fd Author: Daniel Borkmann Date: Tue Dec 17 00:38:39 2013 +0100 net: inet_diag: zero out uninitialized idiag_{src,dst} fields [ Upstream commit b1aac815c0891fe4a55a6b0b715910142227700f ] Jakub reported while working with nlmon netlink sniffer that parts of the inet_diag_sockid are not initialized when r->idiag_family != AF_INET6. That is, fields of r->id.idiag_src[1 ... 3], r->id.idiag_dst[1 ... 3]. In fact, it seems that we can leak 6 * sizeof(u32) byte of kernel [slab] memory through this. At least, in udp_dump_one(), we allocate a skb in ... rep = nlmsg_new(sizeof(struct inet_diag_msg) + ..., GFP_KERNEL); ... and then pass that to inet_sk_diag_fill() that puts the whole struct inet_diag_msg into the skb, where we only fill out r->id.idiag_src[0], r->id.idiag_dst[0] and leave the rest untouched: r->id.idiag_src[0] = inet->inet_rcv_saddr; r->id.idiag_dst[0] = inet->inet_daddr; struct inet_diag_msg embeds struct inet_diag_sockid that is correctly / fully filled out in IPv6 case, but for IPv4 not. So just zero them out by using plain memset (for this little amount of bytes it's probably not worth the extra check for idiag_family == AF_INET). Similarly, fix also other places where we fill that out. Reported-by: Jakub Zawadzki Signed-off-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit c0be5de16bf9b4350802e28c3dd243ca2367c5ef Author: Sasha Levin Date: Fri Dec 13 10:54:22 2013 -0500 net: unix: allow bind to fail on mutex lock [ Upstream commit 37ab4fa7844a044dc21fde45e2a0fc2f3c3b6490 ] This is similar to the set_peek_off patch where calling bind while the socket is stuck in unix_dgram_recvmsg() will block and cause a hung task spew after a while. This is also the last place that did a straightforward mutex_lock(), so there shouldn't be any more of these patches. Signed-off-by: Sasha Levin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 0248d4b4d5bed909c52877e16db569361648c952 Author: Jason Wang Date: Fri Dec 13 17:21:27 2013 +0800 netvsc: don't flush peers notifying work during setting mtu [ Upstream commit 50dc875f2e6e2e04aed3b3033eb0ac99192d6d02 ] There's a possible deadlock if we flush the peers notifying work during setting mtu: [ 22.991149] ====================================================== [ 22.991173] [ INFO: possible circular locking dependency detected ] [ 22.991198] 3.10.0-54.0.1.el7.x86_64.debug #1 Not tainted [ 22.991219] ------------------------------------------------------- [ 22.991243] ip/974 is trying to acquire lock: [ 22.991261] ((&(&net_device_ctx->dwork)->work)){+.+.+.}, at: [] flush_work+0x5/0x2e0 [ 22.991307] but task is already holding lock: [ 22.991330] (rtnl_mutex){+.+.+.}, at: [] rtnetlink_rcv+0x1b/0x40 [ 22.991367] which lock already depends on the new lock. [ 22.991398] the existing dependency chain (in reverse order) is: [ 22.991426] -> #1 (rtnl_mutex){+.+.+.}: [ 22.991449] [] __lock_acquire+0xb19/0x1260 [ 22.991477] [] lock_acquire+0xa2/0x1f0 [ 22.991501] [] mutex_lock_nested+0x89/0x4f0 [ 22.991529] [] rtnl_lock+0x17/0x20 [ 22.991552] [] netdev_notify_peers+0x12/0x30 [ 22.991579] [] netvsc_send_garp+0x22/0x30 [hv_netvsc] [ 22.991610] [] process_one_work+0x211/0x6e0 [ 22.991637] [] worker_thread+0x11b/0x3a0 [ 22.991663] [] kthread+0xed/0x100 [ 22.991686] [] ret_from_fork+0x7c/0xb0 [ 22.991715] -> #0 ((&(&net_device_ctx->dwork)->work)){+.+.+.}: [ 22.991715] [] check_prevs_add+0x967/0x970 [ 22.991715] [] __lock_acquire+0xb19/0x1260 [ 22.991715] [] lock_acquire+0xa2/0x1f0 [ 22.991715] [] flush_work+0x4e/0x2e0 [ 22.991715] [] __cancel_work_timer+0x95/0x130 [ 22.991715] [] cancel_delayed_work_sync+0x13/0x20 [ 22.991715] [] netvsc_change_mtu+0x84/0x200 [hv_netvsc] [ 22.991715] [] dev_set_mtu+0x34/0x80 [ 22.991715] [] do_setlink+0x23a/0xa00 [ 22.991715] [] rtnl_newlink+0x394/0x5e0 [ 22.991715] [] rtnetlink_rcv_msg+0x9c/0x260 [ 22.991715] [] netlink_rcv_skb+0xa9/0xc0 [ 22.991715] [] rtnetlink_rcv+0x2a/0x40 [ 22.991715] [] netlink_unicast+0xdd/0x190 [ 22.991715] [] netlink_sendmsg+0x337/0x750 [ 22.991715] [] sock_sendmsg+0x99/0xd0 [ 22.991715] [] ___sys_sendmsg+0x39e/0x3b0 [ 22.991715] [] __sys_sendmsg+0x42/0x80 [ 22.991715] [] SyS_sendmsg+0x12/0x20 [ 22.991715] [] system_call_fastpath+0x16/0x1b This is because we hold the rtnl_lock() before ndo_change_mtu() and try to flush the work in netvsc_change_mtu(), in the mean time, netdev_notify_peers() may be called from worker and also trying to hold the rtnl_lock. This will lead the flush won't succeed forever. Solve this by not canceling and flushing the work, this is safe because the transmission done by NETDEV_NOTIFY_PEERS was synchronized with the netif_tx_disable() called by netvsc_change_mtu(). Reported-by: Yaju Cao Tested-by: Yaju Cao Cc: K. Y. Srinivasan Cc: Haiyang Zhang Signed-off-by: Jason Wang Acked-by: Haiyang Zhang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e5893b25ab6bc7a99161b25b07f39c8506038f93 Author: Nat Gurumoorthy Date: Mon Dec 9 10:43:21 2013 -0800 tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0 [ Upstream commit 388d3335575f4c056dcf7138a30f1454e2145cd8 ] The new tg3 driver leaves REG_BASE_ADDR (PCI config offset 120) uninitialized. From power on reset this register may have garbage in it. The Register Base Address register defines the device local address of a register. The data pointed to by this location is read or written using the Register Data register (PCI config offset 128). When REG_BASE_ADDR has garbage any read or write of Register Data Register (PCI 128) will cause the PCI bus to lock up. The TCO watchdog will fire and bring down the system. Signed-off-by: Nat Gurumoorthy Acked-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e387172fe3c3f5d256f183c0f88c9f0dc5434a88 Author: Sasha Levin Date: Sat Dec 7 17:26:27 2013 -0500 net: unix: allow set_peek_off to fail [ Upstream commit 12663bfc97c8b3fdb292428105dd92d563164050 ] unix_dgram_recvmsg() will hold the readlock of the socket until recv is complete. In the same time, we may try to setsockopt(SO_PEEK_OFF) which will hang until unix_dgram_recvmsg() will complete (which can take a while) without allowing us to break out of it, triggering a hung task spew. Instead, allow set_peek_off to fail, this way userspace will not hang. Signed-off-by: Sasha Levin Acked-by: Pavel Emelyanov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit fcbb1132558f68da2ce37a883be165129aa1eb31 Author: Changli Gao Date: Sun Dec 8 09:36:56 2013 -0500 net: drop_monitor: fix the value of maxattr [ Upstream commit d323e92cc3f4edd943610557c9ea1bb4bb5056e8 ] maxattr in genl_family should be used to save the max attribute type, but not the max command type. Drop monitor doesn't support any attributes, so we should leave it as zero. Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 30e65cd5055baa827b0f827c5c9415046b2c377d Author: Hannes Frederic Sowa Date: Sat Dec 7 03:33:45 2013 +0100 ipv6: don't count addrconf generated routes against gc limit [ Upstream commit a3300ef4bbb1f1e33ff0400e1e6cf7733d988f4f ] Brett Ciphery reported that new ipv6 addresses failed to get installed because the addrconf generated dsts where counted against the dst gc limit. We don't need to count those routes like we currently don't count administratively added routes. Because the max_addresses check enforces a limit on unbounded address generation first in case someone plays with router advertisments, we are still safe here. Reported-by: Brett Ciphery Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ae8d56de3072a8a812e1400686668315fb27c21b Author: Jason Wang Date: Wed Dec 11 13:08:34 2013 +0800 macvtap: signal truncated packets [ Upstream commit ce232ce01d61b184202bb185103d119820e1260c ] macvtap_put_user() never return a value grater than iov length, this in fact bypasses the truncated checking in macvtap_recvmsg(). Fix this by always returning the size of packet plus the possible vlan header to let the trunca checking work. Cc: Vlad Yasevich Cc: Zhi Yong Wu Cc: Michael S. Tsirkin Signed-off-by: Jason Wang Acked-by: Vlad Yasevich Acked-by: Michael S. Tsirkin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 693d5d2765cdd89a9f8f113d6a7426ddfe996378 Author: Zhi Yong Wu Date: Fri Dec 6 14:16:51 2013 +0800 tun: update file current position [ Upstream commit d0b7da8afa079ffe018ab3e92879b7138977fc8f ] Signed-off-by: Zhi Yong Wu Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d717ca464061e93f0e125e81ae16993da66094f2 Author: Zhi Yong Wu Date: Fri Dec 6 14:16:50 2013 +0800 macvtap: update file current position [ Upstream commit e6ebc7f16ca1434a334647aa56399c546be4e64b ] Signed-off-by: Zhi Yong Wu Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit b4c3eac4fe4e12260614f5d724a3ab8ce21e486a Author: Vlad Yasevich Date: Tue Nov 26 12:37:12 2013 -0500 macvtap: Do not double-count received packets [ Upstream commit 006da7b07bc4d3a7ffabad17cf639eec6849c9dc ] Currently macvlan will count received packets after calling each vlans receive handler. Macvtap attempts to count the packet yet again when the user reads the packet from the tap socket. This code doesn't do this consistently either. Remove the counting from macvtap and let only macvlan count received packets. Signed-off-by: Vlad Yasevich Acked-by: Michael S. Tsirkin Acked-by: Jason Wang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 468c82ca655147afc38382e2a184df6c026d2ec5 Author: Venkat Venkatsubra Date: Mon Dec 2 15:41:39 2013 -0800 rds: prevent BUG_ON triggered on congestion update to loopback [ Upstream commit 18fc25c94eadc52a42c025125af24657a93638c0 ] After congestion update on a local connection, when rds_ib_xmit returns less bytes than that are there in the message, rds_send_xmit calls back rds_ib_xmit with an offset that causes BUG_ON(off & RDS_FRAG_SIZE) to trigger. For a 4Kb PAGE_SIZE rds_ib_xmit returns min(8240,4096)=4096 when actually the message contains 8240 bytes. rds_send_xmit thinks there is more to send and calls rds_ib_xmit again with a data offset "off" of 4096-48(rds header) =4048 bytes thus hitting the BUG_ON(off & RDS_FRAG_SIZE) [RDS_FRAG_SIZE=4k]. The commit 6094628bfd94323fc1cea05ec2c6affd98c18f7f "rds: prevent BUG_ON triggering on congestion map updates" introduced this regression. That change was addressing the triggering of a different BUG_ON in rds_send_xmit() on PowerPC architecture with 64Kbytes PAGE_SIZE: BUG_ON(ret != 0 && conn->c_xmit_sg == rm->data.op_nents); This was the sequence it was going through: (rds_ib_xmit) /* Do not send cong updates to IB loopback */ if (conn->c_loopback && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) { rds_cong_map_updated(conn->c_fcong, ~(u64) 0); return sizeof(struct rds_header) + RDS_CONG_MAP_BYTES; } rds_ib_xmit returns 8240 rds_send_xmit: c_xmit_data_off = 0 + 8240 - 48 (rds header accounted only the first time) = 8192 c_xmit_data_off < 65536 (sg->length), so calls rds_ib_xmit again rds_ib_xmit returns 8240 rds_send_xmit: c_xmit_data_off = 8192 + 8240 = 16432, calls rds_ib_xmit again and so on (c_xmit_data_off 24672,32912,41152,49392,57632) rds_ib_xmit returns 8240 On this iteration this sequence causes the BUG_ON in rds_send_xmit: while (ret) { tmp = min_t(int, ret, sg->length - conn->c_xmit_data_off); [tmp = 65536 - 57632 = 7904] conn->c_xmit_data_off += tmp; [c_xmit_data_off = 57632 + 7904 = 65536] ret -= tmp; [ret = 8240 - 7904 = 336] if (conn->c_xmit_data_off == sg->length) { conn->c_xmit_data_off = 0; sg++; conn->c_xmit_sg++; BUG_ON(ret != 0 && conn->c_xmit_sg == rm->data.op_nents); [c_xmit_sg = 1, rm->data.op_nents = 1] What the current fix does: Since the congestion update over loopback is not actually transmitted as a message, all that rds_ib_xmit needs to do is let the caller think the full message has been transmitted and not return partial bytes. It will return 8240 (RDS_CONG_MAP_BYTES+48) when PAGE_SIZE is 4Kb. And 64Kb+48 when page size is 64Kb. Reported-by: Josh Hunt Tested-by: Honggang Li Acked-by: Bang Nguyen Signed-off-by: Venkat Venkatsubra Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit acd4b2b8fb7a43acf7efb24ecb751e6d593f7d2f Author: Eric Dumazet Date: Mon Dec 2 08:51:13 2013 -0800 net: do not pretend FRAGLIST support [ Upstream commit 28e24c62ab3062e965ef1b3bcc244d50aee7fa85 ] Few network drivers really supports frag_list : virtual drivers. Some drivers wrongly advertise NETIF_F_FRAGLIST feature. If skb with a frag_list is given to them, packet on the wire will be corrupt. Remove this flag, as core networking stack will make sure to provide packets that can be sent without corruption. Signed-off-by: Eric Dumazet Cc: Thadeu Lima de Souza Cascardo Cc: Anirudha Sarangi Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman