aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2014-11-08dccp: Convert DCCP_WARN to net_warn_ratelimitedHEADmasterJoe Perches1-2/+2
Remove the dependency on the "warning" sysctl (net_msg_warn) which is only used by the LIMIT_NETDEBUG macro. Convert the LIMIT_NETDEBUG use in DCCP_WARN to the more common net_warn_ratelimited mechanism. This still ratelimits based on the net_ratelimit() function, but removes the check for the sysctl. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-07udp: Increment UDP_MIB_IGNOREDMULTI for arriving unmatched multicastsRick Jones5-6/+20
As NIC multicast filtering isn't perfect, and some platforms are quite content to spew broadcasts, we should not trigger an event for skb:kfree_skb when we do not have a match for such an incoming datagram. We do though want to avoid sweeping the matter under the rug entirely, so increment a suitable statistic. This incorporates feedback from David L. Stevens, Karl Neiss and Eric Dumazet. V3 - use bool per David Miller Signed-off-by: Rick Jones <rick.jones2@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-07cdc-ether: implement MULTICAST flag on the deviceOliver Neukum1-9/+10
Olivier having laid the groundwork this patch transmits the multicast flag to the device to save some bus traffic. Signed-off-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-07uapi: resort Kbuild entriesstephen hemminger1-44/+44
The entries in the Kbuild files are incorrectly sorted. Matters for aesthetics only. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-07stmmac: platform: fix sparse warningsAndy Shevchenko7-6/+39
This patch fixes the following sparse warnings. One is fixed by casting return value to a return type of the function. The others by creating a specific stmmac_platform.h which provides the bits related to the platform driver. drivers/net/ethernet/stmicro/stmmac/dwmac-meson.c:59:29: warning: incorrect type in return expression (different address spaces) drivers/net/ethernet/stmicro/stmmac/dwmac-meson.c:59:29: expected void * drivers/net/ethernet/stmicro/stmmac/dwmac-meson.c:59:29: got void [noderef] <asn:2>*reg drivers/net/ethernet/stmicro/stmmac/dwmac-meson.c:64:29: warning: symbol 'meson6_dwmac_data' was not declared. Should it be static? drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c:354:29: warning: symbol 'stih4xx_dwmac_data' was not declared. Should it be static? drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c:361:29: warning: symbol 'stid127_dwmac_data' was not declared. Should it be static? drivers/net/ethernet/stmicro/stmmac/dwmac-sunxi.c:133:29: warning: symbol 'sun7i_gmac_data' was not declared. Should it be static? Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-07stmmac: remove custom implementation of print_hex_dump()Andy Shevchenko1-8/+2
There is a kernel helper to dump buffers in a hexdecimal format. This patch substitutes the open coded function by calling that helper. The output is slightly changed: - no lead space - ASCII part will be printed along with the dump - offset is longer than 3 characters (now 8) Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-07Merge branch 'iov_iter'David S. Miller4-92/+81
Herbert Xu says: ==================== Replace skb_copy_datagram_const_iovec with iterator version This patch series adds the helper skb_copy_datagram_iter, which is meant to replace both skb_copy_datagram_iovec and its evil twin skb_copy_datagram_const_iovec. It then converts tun and macvtap over to the new helper and finally removes skb_copy_datagram_const_iovec which is only used by tun and macvtap. The copy_to_iter return value issue pointed out by Al has now been fixed. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-07net: Kill skb_copy_datagram_const_iovecHerbert Xu2-92/+0
Now that both macvtap and tun are using skb_copy_datagram_iter, we can kill the abomination that is skb_copy_datagram_const_iovec. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-07macvtap: Use iovec iteratorsHerbert Xu1-25/+21
This patch removes the use of skb_copy_datagram_const_iovec in favour of the iovec iterator-based skb_copy_datagram_iter. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-07tun: Use iovec iteratorsHerbert Xu1-35/+30
This patch removes the use of skb_copy_datagram_const_iovec in favour of the iovec iterator-based skb_copy_datagram_iter. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-07inet: Add skb_copy_datagram_iterHerbert Xu2-0/+90
This patch adds skb_copy_datagram_iter, which is identical to skb_copy_datagram_iovec except that it operates on iov_iter instead of iovec. Eventually all users of skb_copy_datagram_iovec should switch over to iov_iter and then we can remove skb_copy_datagram_iovec. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller34-166/+244
2014-11-06vxlan: Fix to enable UDP checksums on interfaceTom Herbert1-0/+3
Add definition to vxlan nla_policy for UDP checksum. This is necessary to enable UDP checksums on VXLAN. In some instances, enabling UDP checksums can improve performance on receive for devices that return legacy checksum-unnecessary for UDP/IP. Also, UDP checksum provides some protection against VNI corruption. Testing: Ran 200 instances of TCP_STREAM and TCP_RR on bnx2x. TCP_STREAM IPv4, without UDP checksums 14.41% TX CPU utilization 25.71% RX CPU utilization 9083.4 Mbps IPv4, with UDP checksums 13.99% TX CPU utilization 13.40% RX CPU utilization 9095.65 Mbps TCP_RR IPv4, without UDP checksums 94.08% TX CPU utilization 156/248/462 90/95/99% latencies 1.12743e+06 tps IPv4, with UDP checksums 94.43% TX CPU utilization 158/250/462 90/95/99% latencies 1.13345e+06 tps Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06Merge branch 'amd-xgbe-next'David S. Miller1-5/+5
Tom Lendacky says: ==================== amd-xgbe: AMD XGBE driver updates 2014-11-06 The following series of patches fixes a couple of bugs that slipped through my last series. - Free channel structure after freeing the per channel interrupts - If an skb error allocation occurs during receive processing check whether more descriptors are associated with the packet or whether to start on a new packet This patch series is based on net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06amd-xgbe: Check for complete packet on skb allocation errorLendacky, Thomas1-2/+2
If the skb allocation fails during receive processing, the driver would continue reading descriptors without first determining if there were any more descriptors for the current packet. Update the code to check whether more descriptors are associated with the current packet or whether to move on to the next descriptor as a new packet. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06amd-xgbe: Free channel/ring structures laterLendacky, Thomas1-3/+3
The channel structure is freed before freeing the per channel interrupts resulting in a kernel oops. Move the call to free the channel structure to after the freeing of the per channel interrupts. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06netxen: Fix link event handling.Manish Chopra1-1/+2
o Poll for the link events only if firmware doesn't have capability to notify the driver for the link events. Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06enic: update desc properly in rx_copybreakGovindarajulu Varadarajan1-12/+2
When we reuse the rx buffer, we need to update the desc. If not hardware sees stale value. In the following crash, when mtu is changed, hardware sees old rx buffer value and crashes on skb_put. Fix this by using enic_queue_rq_desc helper function which updates the necessary desc. [ 64.657376] skbuff: skb_over_panic: text:ffffffffa041f55d len:9010 put:9010 head:ffff8800d3ca9fc0 data:ffff8800d3caa000 tail:0x2372 end:0x640 dev:enp0s3 [ 64.659965] ------------[ cut here ]------------ [ 64.661322] kernel BUG at net/core/skbuff.c:100! [ 64.662644] invalid opcode: 0000 [#1] PREEMPT SMP [ 64.664001] Modules linked in: rpcsec_gss_krb5 auth_rpcgss oid_registry nfsv4 cirrus ttm drm_kms_helper drm enic psmouse microcode evdev serio_raw syscopyarea sysfillrect sysimgblt i2c_piix4 i2c_core pcspkr nfs lockd grace sunrpc fscache ext4 crc16 mbcache jbd2 sd_mod ata_generic virtio_balloon ata_piix libata uhci_hcd virtio_pci virtio_ring usbcore usb_common virtio scsi_mod [ 64.664834] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.17.0-netnext-10335-g942396b-dirty #273 [ 64.664834] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 64.664834] task: ffffffff81a1d580 ti: ffffffff81a00000 task.ti: ffffffff81a00000 [ 64.664834] RIP: 0010:[<ffffffff81392cf1>] [<ffffffff81392cf1>] skb_panic+0x61/0x70 [ 64.664834] RSP: 0018:ffff880210603d48 EFLAGS: 00010292 [ 64.664834] RAX: 000000000000008c RBX: ffff88020b0f6930 RCX: 0000000000000000 [ 64.664834] RDX: 000000000000008c RSI: ffffffff8178b288 RDI: 00000000ffffffff [ 64.664834] RBP: ffff880210603d68 R08: 0000000000000001 R09: 0000000000000001 [ 64.664834] R10: 00000000000005ce R11: 0000000000000001 R12: ffff88020b1f0b40 [ 64.664834] R13: 000000000000a332 R14: ffff880209a1a000 R15: 0000000000000001 [ 64.664834] FS: 0000000000000000(0000) GS:ffff880210600000(0000) knlGS:0000000000000000 [ 64.664834] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 64.664834] CR2: 00007f6752935e48 CR3: 0000000035743000 CR4: 00000000000006f0 [ 64.664834] Stack: [ 64.664834] ffff8800d3caa000 0000000000002372 0000000000000640 ffff88020b1f0000 [ 64.664834] ffff880210603d78 ffffffff81392d54 ffff880210603e08 ffffffffa041f55d [ 64.664834] 0000000000000296 ffffffff00000000 00008e7e00008e7e ffff880200002332 [ 64.664834] Call Trace: [ 64.664834] <IRQ> [ 64.664834] [ 64.664834] [<ffffffff81392d54>] skb_put+0x54/0x60 [ 64.664834] [<ffffffffa041f55d>] enic_rq_service.constprop.47+0x3ad/0x730 [enic] [ 64.664834] [<ffffffffa041fa79>] enic_poll_msix_rq+0x199/0x370 [enic] [ 64.664834] [<ffffffff813a5499>] net_rx_action+0x139/0x210 [ 64.664834] [<ffffffff81290db3>] ? __this_cpu_preempt_check+0x13/0x20 [ 64.664834] [<ffffffff8106110e>] __do_softirq+0x14e/0x280 [ 64.664834] [<ffffffff8106152e>] irq_exit+0x8e/0xb0 [ 64.664834] [<ffffffff8100fd21>] do_IRQ+0x61/0x100 [ 64.664834] [<ffffffff814a2bf2>] common_interrupt+0x72/0x72 fixes: a03bb56e67c357980dae886683733dab5583dc14 ("enic: implement rx_copybreak") Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06enic: handle error condition properly in enic_rq_indicate_bufGovindarajulu Varadarajan1-0/+6
In case of error in rx path, we free the buf->os_buf but we do not make it NULL. In next iteration we use the skb which is already freed. This causes the following crash. [ 886.154772] general protection fault: 0000 [#1] PREEMPT SMP [ 886.154851] Modules linked in: rpcsec_gss_krb5 auth_rpcgss oid_registry nfsv4 microcode evdev cirrus ttm drm_kms_helper drm enic syscopyarea sysfillrect sysimgblt psmouse i2c_piix4 serio_raw pcspkr i2c_core nfs lockd grace sunrpc fscache ext4 crc16 mbcache jbd2 sd_mod crc_t10dif crct10dif_common ata_generic ata_piix virtio_balloon libata scsi_mod uhci_hcd usbcore virtio_pci virtio_ring virtio usb_common [ 886.155199] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.17.0-netnext-05668-g876bc7f #272 [ 886.155263] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 886.155304] task: ffffffff81a1d580 ti: ffffffff81a00000 task.ti: ffffffff81a00000 [ 886.155356] RIP: 0010:[<ffffffff81384030>] [<ffffffff81384030>] kfree_skb_list+0x10/0x30 [ 886.155418] RSP: 0018:ffff880210603d48 EFLAGS: 00010206 [ 886.155456] RAX: 0000000000000020 RBX: 0000000000000000 RCX: 0000000000000000 [ 886.155504] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 004500084e000017 [ 886.155553] RBP: ffff880210603d50 R08: 00000000fe13d1b6 R09: 0000000000000001 [ 886.155601] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880209ff2f00 [ 886.155650] R13: ffff88020ac0fe40 R14: ffff880209ff2f00 R15: ffff8800da8e3a80 [ 886.155699] FS: 0000000000000000(0000) GS:ffff880210600000(0000) knlGS:0000000000000000 [ 886.155774] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 886.155814] CR2: 00007f0e0c925000 CR3: 0000000035e8b000 CR4: 00000000000006f0 [ 886.155865] Stack: [ 886.155882] 0000000000000000 ffff880210603d78 ffffffff81383f79 ffff880209ff2f00 [ 886.155942] ffff88020b0c0b40 000000000000c000 ffff880210603d90 ffffffff81383faf [ 886.156001] ffff880209ff2f00 ffff880210603da8 ffffffff8138406d ffff88020b1b08c0 [ 886.156061] Call Trace: [ 886.156080] <IRQ> [ 886.156095] [ 886.156112] [<ffffffff81383f79>] skb_release_data+0xa9/0xc0 [ 886.157656] [<ffffffff81383faf>] skb_release_all+0x1f/0x30 [ 886.159195] [<ffffffff8138406d>] consume_skb+0x1d/0x40 [ 886.160719] [<ffffffff813942e5>] __dev_kfree_skb_any+0x35/0x40 [ 886.162224] [<ffffffffa02dc1d5>] enic_rq_service.constprop.47+0xe5/0x5a0 [enic] [ 886.163756] [<ffffffffa02dc829>] enic_poll_msix_rq+0x199/0x370 [enic] [ 886.164730] [<ffffffff81397e29>] net_rx_action+0x139/0x210 [ 886.164730] [<ffffffff8105fb2e>] __do_softirq+0x14e/0x280 [ 886.164730] [<ffffffff8105ff2e>] irq_exit+0x8e/0xb0 [ 886.164730] [<ffffffff8100fc1d>] do_IRQ+0x5d/0x100 [ 886.164730] [<ffffffff81496832>] common_interrupt+0x72/0x72 fixes: a03bb56e67c357980dae886683733dab5583dc14 ("enic: implement rx_copybreak") Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06Merge branch 'mlx5-net'David S. Miller2-6/+5
Eli Cohen says: ==================== mlx5_core fixes for 3.18 the following two patches fix races to could lead to kernel panic in some cases. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06net/mlx5_core: Fix race on driver loadEli Cohen1-2/+2
When events arrive at driver load, the event handler gets called even before the spinlock and list are initialized. Fix this by moving the initialization before EQs creation. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06net/mlx5_core: Fix race in create EQEli Cohen1-4/+3
After the EQ is created, it can possibly generate interrupts and the interrupt handler is referencing eq->dev. It is therefore required to set eq->dev before calling request_irq() so if an event is generated before request_irq() returns, we will have a valid eq->dev field. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06Merge branch 'net_next_ovs' of ↵David S. Miller24-245/+606
git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvswitch Pravin B Shelar says: ==================== Open vSwitch First two patches are related to OVS MPLS support. Rest of patches are mostly refactoring and minor improvements to openvswitch. v1-v2: - Fix conflicts due to "gue: Remote checksum offload" ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06Merge branch 'sunvnet-next'David S. Miller1-4/+3
Sowmini Varadhan says: ==================== sunvnet: bug fixes This patch series has a coding-style fix and a bug fix. The coding style fix (patch 1) is the extra indentation flagged by Ben Hutchings: http://marc.info/?l=linux-netdev&m=141529243409594&w=2 The bugfix (patch 2) is the following: when vnet_event_napi() is called as part of napi_resume (i.e., continuation of a previous NAPI read that was truncated due to budget constraints), and then finds no more packets to read, the code was trying to avoid an additional trip through ldc_rx as an optimization. However, when this corner case happens, we would need to reset a number of dring state bits such as rcv_nxt carefully, which quickly becomes complex and hacky. The cleaner solution is to just roll back to vnet_poll, re-enable interrupts and set up dring state as was done in the pre-NAPI version of the driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06sunvnet: Return from vnet_napi_event() if no packets to readSowmini Varadhan1-3/+2
vnet_event_napi() may be called as part of the NAPI ->poll, to resume reading descriptor rings. When no data is available, descriptor ring state (e.g., rcv_nxt) needs to be reset carefully to stay in lock-step with ldc_read(). In the interest of simplicity, the best way to do this is to return from vnet_event_napi() when there are no more packets to read. The next trip through ldc_rx will correctly set up the dring state. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Tested-by: David Stevens <david.stevens@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06sunvnet: Fix indentation in maybe_tx_wakeup()Sowmini Varadhan1-1/+1
remove redundant tab. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Reported-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06Merge branch 'r8152-next'David S. Miller1-60/+32
Hayes Wang says: ==================== r8152: rtl_ops_init modify Initialize the ops through tp->version. This could skip checking each VID/PID. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06r8152: remove the definitions of the PIDhayeswang1-7/+3
The PIDs are only used in the id table, so the definitions are unnacessary. Remove them wouldn't have confusion. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06r8152: modify rtl_ops_inithayeswang1-51/+28
Replace using VID/PID with using tp->version to initialize the ops. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06r8152: move r8152b_get_versionhayeswang1-2/+1
Move r8152b_get_version() to the location before rtl_ops_init(). Then, the rtl_ops_init() could use tp->version. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06sock.h: Remove unused NETDEBUG macroJoe Perches1-3/+0
It's unused now, just delete it. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06net: esp: Convert NETDEBUG to pr_infoJoe Perches2-10/+10
Commit 64ce207306de ("[NET]: Make NETDEBUG pure printk wrappers") originally had these NETDEBUG printks as always emitting. Commit a2a316fd068c ("[NET]: Replace CONFIG_NET_DEBUG with sysctl") added a net_msg_warn sysctl to these NETDEBUG uses. Convert these NETDEBUG uses to normal pr_info calls. This changes the output prefix from "ESP: " to include "IPSec: " for the ipv4 case and "IPv6: " for the ipv6 case. These output lines are now like the other messages in the files. Other miscellanea: Neaten the arithmetic spacing to be consistent with other arithmetic spacing in the files. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06net; ipv[46] - Remove 2 unnecessary NETDEBUG OOM messagesJoe Perches2-11/+7
These messages aren't useful as there's a generic dump_stack() on OOM. Neaten the comment and if test above the OOM by separating the assign in if into an allocation then if test. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06dsa: mv88e6171: Add support for mv88e6172Andrew Lunn2-4/+7
The mv88e6172 is very similar to the mv88e6171. So extend the mv88e6171 driver to support it. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06net: dsa: slave: Fix autoneg for phys on switch MDIO busAndrew Lunn1-2/+5
When the ports phys are connected to the switches internal MDIO bus, we need to connect the phy to the slave netdev, otherwise auto-negotiation etc, does not work. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06sched: fix act file names in header commentJiri Pirko6-6/+6
Fixes: 4bba3925 ("[PKT_SCHED]: Prefix tc actions with act_") Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06net/9p: remove a comment about pref member which doesn't existRyo Munakata1-1/+0
Signed-off-by: Ryo Munakata <ryomnktml@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06drivers: net: cpsw: remove cpsw_ale_stop from cpsw_ale_destroyMugunthan V N1-1/+0
when cpsw is build as modulea and simple insert and removal of module creates a deadlock, due to delete timer. the timer is created and destroyed in cpsw_ale_start and cpsw_ale_stop which are from device open and close. root@am437x-evm:~# modprobe -r ti_cpsw [ 158.505333] INFO: trying to register non-static key. [ 158.510623] the code is fine but needs lockdep annotation. [ 158.516448] turning off the locking correctness validator. [ 158.522282] CPU: 0 PID: 1339 Comm: modprobe Not tainted 3.14.23-00445-gd41c88f #44 [ 158.530359] [<c0015380>] (unwind_backtrace) from [<c0012088>] (show_stack+0x10/0x14) [ 158.538603] [<c0012088>] (show_stack) from [<c054ad70>] (dump_stack+0x78/0x94) [ 158.546295] [<c054ad70>] (dump_stack) from [<c0088008>] (__lock_acquire+0x176c/0x1b74) [ 158.554711] [<c0088008>] (__lock_acquire) from [<c0088944>] (lock_acquire+0x9c/0x104) [ 158.563043] [<c0088944>] (lock_acquire) from [<c004e520>] (del_timer_sync+0x44/0xd8) [ 158.571289] [<c004e520>] (del_timer_sync) from [<bf2eac1c>] (cpsw_ale_destroy+0x10/0x3c [ti_cpsw]) [ 158.580821] [<bf2eac1c>] (cpsw_ale_destroy [ti_cpsw]) from [<bf2eb268>] (cpsw_remove+0x30/0xa0 [ti_cpsw]) [ 158.591000] [<bf2eb268>] (cpsw_remove [ti_cpsw]) from [<c035ef44>] (platform_drv_remove+0x18/0x1c) [ 158.600527] [<c035ef44>] (platform_drv_remove) from [<c035d8bc>] (__device_release_driver+0x70/0xc8) [ 158.610236] [<c035d8bc>] (__device_release_driver) from [<c035e0d4>] (driver_detach+0xb4/0xb8) [ 158.619386] [<c035e0d4>] (driver_detach) from [<c035d6e4>] (bus_remove_driver+0x4c/0x90) [ 158.627988] [<c035d6e4>] (bus_remove_driver) from [<c00af2a8>] (SyS_delete_module+0x10c/0x198) [ 158.637144] [<c00af2a8>] (SyS_delete_module) from [<c000e580>] (ret_fast_syscall+0x0/0x48) [ 179.524727] INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 0, t=2102 jiffies, g=1487, c=1486, q=6) [ 179.535741] INFO: Stall ended before state dump start Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06net: mv643xx_eth: reclaim TX skbs only when released by the HWKarl Beldan1-8/+10
ATM, txq_reclaim will dequeue and free an skb for each tx desc released by the hw that has TX_LAST_DESC set. However, in case of TSO, each hw desc embedding the last part of a segment has TX_LAST_DESC set, losing the one-to-one 'last skb frag'/'TX_LAST_DESC set' correspondance, which causes data corruption. Fix this by checking TX_ENABLE_INTERRUPT instead of TX_LAST_DESC, and warn when trying to dequeue from an empty txq (which can be symptomatic of releasing skbs prematurely). Fixes: 3ae8f4e0b98 ('net: mv643xx_eth: Implement software TSO') Reported-by: Slawomir Gajzner <slawomir.gajzner@gmail.com> Reported-by: Julien D'Ascenzio <jdascenzio@yahoo.fr> Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com> Cc: Ian Campbell <ijc@hellion.org.uk> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06Merge branch 'sfc-next'David S. Miller9-193/+281
Shradha Shah says: ==================== sfc: Clean up Siena SR-IOV support in preparation for EF10 SR-IOV support This patch series provides a base and clean up for the upcoming EF10 SRIOV patches. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06sfc: Add NIC type operations to replace direct calls from efx.c into ↵Shradha Shah7-8/+56
siena_sriov.c Also add dummy functions where required to avoid NULL pointer dereference. Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06sfc: Rename implementations in siena_sriov.c to have a 'siena' prefixShradha Shah6-141/+149
Patch in preparation for the upcoming EF10 sriov support. Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06sfc: Move the current VF state from efx_nic into siena_nic_dataShradha Shah5-52/+84
This patch series provides a base and cleanup for the upcoming EF10 SRIOV support. This patch moves the VF state into siena_nic_data as a basis to save the VF state based on nic type. Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06xen-netback: remove unconditional __pskb_pull_tail() in guest Tx pathMalcolm Crossley1-14/+12
Unconditionally pulling 128 bytes into the linear area is not required for: - security: Every protocol demux starts with pskb_may_pull() to pull frag data into the linear area, if necessary, before looking at headers. - performance: Netback has already grant copied up-to 128 bytes from the first slot of a packet into the linear area. The first slot normally contain all the IPv4/IPv6 and TCP/UDP headers. The unconditional pull would often copy frag data unnecessarily. This is a performance problem when running on a version of Xen where grant unmap avoids TLB flushes for pages which are not accessed. TLB flushes can now be avoided for > 99% of unmaps (it was 0% before). Grant unmap TLB flush avoidance will be available in a future version of Xen (probably 4.6). Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06Merge branch 'stmmac-next'David S. Miller1-54/+26
Andy Shevchenko says: ==================== stmmac: pci: various cleanups and fixes There are few cleanups and fixes regarding to stmmac PCI driver. This has been tested on Intel Galileo board with recent net-next tree. Since v2: - drop patch 5/5 since it will be part of a big change across entire subsystem Since v1: - remove already applied patch - append patch 1/5 - rework patch 3/5 to be functional compatible with original code ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06stmmac: pci: convert to use dev_* macrosAndy Shevchenko1-4/+4
Instead of pr_* macros let's use dev_* macros which provide device name. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06stmmac: pci: use managed resourcesAndy Shevchenko1-33/+10
Migrate pci driver to managed resources to reduce boilerplate error handling code. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06stmmac: pci: convert to use dev_pm_opsAndy Shevchenko1-16/+11
Convert system PM callbacks to use dev_pm_ops. In addition remove the PCI calls related to a power state since the bus code cares about this already. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06stmmac: pci: use defined constant instead of magic numberAndy Shevchenko1-1/+1
The last standard PCI resource is defined as PCI_STD_RESOURCE_END. Thus, we could use it instead of plain integer. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06stmmac: fix sparse warningsAndy Shevchenko2-3/+6
This patch fixes the following sparse warnings. drivers/net/ethernet/stmicro/stmmac/enh_desc.c:381:30: warning: symbol 'enh_desc_ops' was not declared. Should it be static? drivers/net/ethernet/stmicro/stmmac/norm_desc.c:253:30: warning: symbol 'ndesc_ops' was not declared. Should it be static? drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c:141:33: warning: symbol 'stmmac_ptp' was not declared. Should it be static? There is no functional change. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Giuseppe CAVALLARO <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06ip6_tunnel: Add support for wildcard tunnel endpoints.Steffen Klassert1-1/+40
This patch adds support for tunnels with local or remote wildcard endpoints. With this we get a NBMA tunnel mode like we have it for ipv4 and sit tunnels. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06ipv6: Allow sending packets through tunnels with wildcard endpointsSteffen Klassert4-12/+26
Currently we need the IP6_TNL_F_CAP_XMIT capabiltiy to transmit packets through an ipv6 tunnel. This capability is set when the tunnel gets configured, based on the tunnel endpoint addresses. On tunnels with wildcard tunnel endpoints, we need to do the capabiltiy checking on a per packet basis like it is done in the receive path. This patch extends ip6_tnl_xmit_ctl() to take local and remote addresses as parameters to allow for per packet capabiltiy checking. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05openvswitch: Avoid NULL mask check while building maskPravin B Shelar1-54/+53
OVS does mask validation even if it does not need to convert netlink mask attributes to mask structure. ovs_nla_get_match() caller can pass NULL mask structure pointer if the caller does not need mask. Therefore NULL check is required in SW_FLOW_KEY* macros. Following patch does not convert mask netlink attributes if mask pointer is NULL, so we do not need these checks in SW_FLOW_KEY* macro. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Daniele Di Proietto <ddiproietto@vmware.com> Acked-by: Andy Zhou <azhou@nicira.com>
2014-11-05openvswitch: Refactor action alloc and copy api.Pravin B Shelar3-29/+21
There are two separate API to allocate and copy actions list. Anytime OVS needs to copy action list, it needs to call both functions. Following patch moves action allocation to copy function to avoid code duplication. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
2014-11-05openvswitch: Move key_attr_size() to flow_netlink.h.Joe Stringer3-28/+37
flow-netlink has netlink related code. Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-11-05openvswitch: Remove flow member from struct ovs_skb_cbLorand Jakab3-12/+9
The 'flow' memeber was chosen for removal because it's only used in ovs_execute_actions() we can pass it as argument to this function. Signed-off-by: Lorand Jakab <lojakab@cisco.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-11-05openvswitch: Fix the type of struct ovs_key_nd nd_target field.Jarno Rajahalme1-3/+3
Should be the same as other IPv6 address fields. Current master produces sparse warnings without this change. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-11-05openvswitch: Drop packets when interdev is not upChunhe Li1-0/+5
If the internal device is not up, it should drop received packets. Sometimes it receive the broadcast or multicast packets, and the ip protocol stack will casue more cpu usage wasted. Signed-off-by: Chunhe Li <lichunhe@huawei.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-11-05openvswitch: Refactor get_dp() function into multiple access APIs.Andy Zhou1-10/+21
Avoid recursive read_rcu_lock() by using the lighter weight get_dp_rcu() API. Add proper locking assertions to get_dp(). Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-11-05openvswitch: Refactor ovs_flow_cmd_fill_info().Joe Stringer1-27/+66
Split up ovs_flow_cmd_fill_info() to make it easier to cache parts of a dump reply. This will be used to streamline flow_dump in a future patch. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-11-05openvswitch: refactor do_output() to move NULL check out of fast pathAndy Zhou1-14/+11
skb_clone() NULL check is implemented in do_output(), as past of the common (fast) path. Refactoring so that NULL check is done in the slow path, immediately after skb_clone() is called. Besides optimization, this change also improves code readability by making the skb_clone() NULL check consistent within OVS datapath module. Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-11-05openvswitch: Additional logging for -EINVAL on flow setups.Jesse Gross2-7/+22
There are many possible ways that a flow can be invalid so we've added logging for most of them. This adds logs for the remaining possible cases so there isn't any ambiguity while debugging. CC: Federico Iezzi <fiezzi@enter.it> Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-11-05openvswitch: Remove redundant tcp_flags code.Joe Stringer1-10/+3
These two cases used to be treated differently for IPv4/IPv6, but they are now identical. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-11-05openvswitch: Move table destroy to dp-rcu callback.Pravin B Shelar3-8/+10
Ths simplifies flow-table-destroy API. No need to pass explicit parameter about context. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Thomas Graf <tgraf@redhat.com>
2014-11-05openvswitch: Add basic MPLS support to kernelSimon Horman10-30/+345
Allow datapath to recognize and extract MPLS labels into flow keys and execute actions which push, pop, and set labels on packets. Based heavily on work by Leo Alterman, Ravi K, Isaku Yamahata and Joe Stringer. Cc: Ravi K <rkerur@gmail.com> Cc: Leo Alterman <lalterman@nicira.com> Cc: Isaku Yamahata <yamahata@valinux.co.jp> Cc: Joe Stringer <joe@wand.net.nz> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-11-05net: Remove MPLS GSO feature.Pravin B Shelar10-18/+5
Device can export MPLS GSO support in dev->mpls_features same way it export vlan features in dev->vlan_features. So it is safe to remove NETIF_F_GSO_MPLS redundant flag. Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-11-05fou: Fix typo in returning flags in netlinkTom Herbert3-3/+3
When filling netlink info, dport is being returned as flags. Fix instances to return correct value. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05r8152: disable the tasklet by defaulthayeswang1-2/+6
Let the tasklet only be enabled after open(), and be disabled for the other situation. The tasklet is only necessary after open() for tx/rx, so it could be disabled by default. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUsDaniel Borkmann2-10/+10
It has been reported that generating an MLD listener report on devices with large MTUs (e.g. 9000) and a high number of IPv6 addresses can trigger a skb_over_panic(): skbuff: skb_over_panic: text:ffffffff80612a5d len:3776 put:20 head:ffff88046d751000 data:ffff88046d751010 tail:0xed0 end:0xec0 dev:port1 ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:100! invalid opcode: 0000 [#1] SMP Modules linked in: ixgbe(O) CPU: 3 PID: 0 Comm: swapper/3 Tainted: G O 3.14.23+ #4 [...] Call Trace: <IRQ> [<ffffffff80578226>] ? skb_put+0x3a/0x3b [<ffffffff80612a5d>] ? add_grhead+0x45/0x8e [<ffffffff80612e3a>] ? add_grec+0x394/0x3d4 [<ffffffff80613222>] ? mld_ifc_timer_expire+0x195/0x20d [<ffffffff8061308d>] ? mld_dad_timer_expire+0x45/0x45 [<ffffffff80255b5d>] ? call_timer_fn.isra.29+0x12/0x68 [<ffffffff80255d16>] ? run_timer_softirq+0x163/0x182 [<ffffffff80250e6f>] ? __do_softirq+0xe0/0x21d [<ffffffff8025112b>] ? irq_exit+0x4e/0xd3 [<ffffffff802214bb>] ? smp_apic_timer_interrupt+0x3b/0x46 [<ffffffff8063f10a>] ? apic_timer_interrupt+0x6a/0x70 mld_newpack() skb allocations are usually requested with dev->mtu in size, since commit 72e09ad107e7 ("ipv6: avoid high order allocations") we have changed the limit in order to be less likely to fail. However, in MLD/IGMP code, we have some rather ugly AVAILABLE(skb) macros, which determine if we may end up doing an skb_put() for adding another record. To avoid possible fragmentation, we check the skb's tailroom as skb->dev->mtu - skb->len, which is a wrong assumption as the actual max allocation size can be much smaller. The IGMP case doesn't have this issue as commit 57e1ab6eaddc ("igmp: refine skb allocations") stores the allocation size in the cb[]. Set a reserved_tailroom to make it fit into the MTU and use skb_availroom() helper instead. This also allows to get rid of igmp_skb_size(). Reported-by: Wei Liu <lw1a2.jing@gmail.com> Fixes: 72e09ad107e7 ("ipv6: avoid high order allocations") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: David L Stevens <david.stevens@oracle.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05net: Convert SEQ_START_TOKEN/seq_printf to seq_putsJoe Perches3-12/+3
Using a single fixed string is smaller code size than using a format and many string arguments. Reduces overall code size a little. $ size net/ipv4/igmp.o* net/ipv6/mcast.o* net/ipv6/ip6_flowlabel.o* text data bss dec hex filename 34269 7012 14824 56105 db29 net/ipv4/igmp.o.new 34315 7012 14824 56151 db57 net/ipv4/igmp.o.old 30078 7869 13200 51147 c7cb net/ipv6/mcast.o.new 30105 7869 13200 51174 c7e6 net/ipv6/mcast.o.old 11434 3748 8580 23762 5cd2 net/ipv6/ip6_flowlabel.o.new 11491 3748 8580 23819 5d0b net/ipv6/ip6_flowlabel.o.old Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05fast_hash: avoid indirect function callsHannes Frederic Sowa6-93/+98
By default the arch_fast_hash hashing function pointers are initialized to jhash(2). If during boot-up a CPU with SSE4.2 is detected they get updated to the CRC32 ones. This dispatching scheme incurs a function pointer lookup and indirect call for every hashing operation. rhashtable as a user of arch_fast_hash e.g. stores pointers to hashing functions in its structure, too, causing two indirect branches per hashing operation. Using alternative_call we can get away with one of those indirect branches. Acked-by: Daniel Borkmann <dborkman@redhat.com> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05Merge branch 'amd-xgbe-next'David S. Miller11-277/+955
Tom Lendacky says: ==================== amd-xgbe: AMD XGBE driver updates 2014-11-04 The following series of patches includes functional updates to the driver as well as some trivial changes for function renaming and spelling fixes. - Move channel and ring structure allocation into the device open path - Rename the pre_xmit function to dev_xmit - Explicitly use the u32 data type for the device descriptors - Use page allocation for the receive buffers - Add support for split header/payload receive - Add support for per DMA channel interrupts - Add support for receive side scaling (RSS) - Add support for ethtool receive side scaling commands - Fix the spelling of descriptors - After a PCS reset, sync the PCS and PHY modes - Add dependency on HAS_IOMEM to both the amd-xgbe and amd-xgbe-phy drivers This patch series is based on net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05amd-xgbe-phy: Let AMD_XGBE_PHY depend on HAS_IOMEMLendacky, Thomas1-1/+1
The amd-xgbe-phy driver needs to perform ioremap calls, so add HAS_IOMEM to its build dependency. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05amd-xgbe: Let AMD_XGBE depend on HAS_IOMEMLendacky, Thomas1-1/+1
The amd-xgbe driver needs to perform ioremap calls, so add HAS_IOMEM to its build dependency. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05amd-xgbe-phy: Sync PCS and PHY modes after resetLendacky, Thomas1-1/+2
This patch adds support to sync the states of the PCS and the PHY after a reset is performed. If the PCS and the PHY are not in the same state after reset an extra mode change would be performed. This extra mode change might not be needed if the PCS and the PHY are synced up after reset. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05amd-xgbe: Fix a spelling errorLendacky, Thomas1-2/+2
This patch fixes the spelling of the word "descriptor" in a couple of locations. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05amd-xgbe: Add receive side scaling ethtool supportLendacky, Thomas3-0/+96
This patch adds support for ethtool receive side scaling (RSS) commands. Support is added to get/set the RSS hash key and the RSS lookup table. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05amd-xgbe: Provide support for receive side scalingLendacky, Thomas5-2/+233
This patch provides support for receive side scaling (RSS). RSS allows for spreading incoming network packets across the Rx queues. When used in conjunction with the per DMA channel interrupt support, this allows the receive processing to be spread across multiple processors. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05amd-xgbe: Add support for per DMA channel interruptsLendacky, Thomas5-51/+219
This patch provides support for interrupts that are generated by the Tx/Rx DMA channel pairs of the device. This allows for Tx and Rx processing to run across multiple processsors. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05amd-xgbe: Implement split header receive supportLendacky, Thomas5-111/+201
Provide support for splitting IP packets so that the header and payload can be sent to different DMA addresses. This will allow the IP header to be put into the linear part of the skb while the payload can be added as frags. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05amd-xgbe: Use page allocations for Rx buffersLendacky, Thomas4-127/+196
Use page allocations for Rx buffers instead of pre-allocating skbs of a set size. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05amd-xgbe: Use the u32 data type for descriptorsLendacky, Thomas1-4/+4
The Tx and Rx descriptors are unsigned 32 bit values. Use the u32 type, rather than unsigned int, to map these descriptors. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05amd-xgbe: Rename pre_xmit function to dev_xmitLendacky, Thomas3-6/+6
The pre_xmit function name implies that it performs operations prior to transmitting the packet when in fact it is responsible for setting up the descriptors and initiating the transmit. Rename this to function from pre_xmit to dev_xmit, which is consistent with the name used during receive processing - dev_read. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05amd-xgbe: Move ring allocation to device openLendacky, Thomas3-70/+93
Move the channel and ring tracking structures allocation to device open. This will allow for future support to vary the number of Tx/Rx queues without unloading the module. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05bridge: include in6.h in if_bridge.h for struct in6_addrGregory Fong1-0/+1
if_bridge.h uses struct in6_addr ip6, but wasn't including the in6.h header. Thomas Backlund originally sent a patch to do this, but this revealed a redefinition issue: https://lkml.org/lkml/2013/1/13/116 The redefinition issue should have been fixed by the following Linux commits: ee262ad827f89e2dc7851ec2986953b5b125c6bc inet: defines IPPROTO_* needed for module alias generation cfd280c91253cc28e4919e349fa7a813b63e71e8 net: sync some IP headers with glibc and the following glibc commit: 6c82a2f8d7c8e21e39237225c819f182ae438db3 Coordinate IPv6 definitions for Linux and glibc so actually include the header now. Reported-by: Colin Guthrie <colin@mageia.org> Reported-by: Christiaan Welvaart <cjw@daneel.dyndns.org> Reported-by: Thomas Backlund <tmb@mageia.org> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Gregory Fong <gregory.0xf0@gmail.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05tcp: zero retrans_stamp if all retrans were ackedMarcelo Leitner1-29/+31
Ueki Kohei reported that when we are using NewReno with connections that have a very low traffic, we may timeout the connection too early if a second loss occurs after the first one was successfully acked but no data was transfered later. Below is his description of it: When SACK is disabled, and a socket suffers multiple separate TCP retransmissions, that socket's ETIMEDOUT value is calculated from the time of the *first* retransmission instead of the *latest* retransmission. This happens because the tcp_sock's retrans_stamp is set once then never cleared. Take the following connection: Linux remote-machine | | send#1---->(*1)|--------> data#1 --------->| | | | RTO : : | | | ---(*2)|----> data#1(retrans) ---->| | (*3)|<---------- ACK <----------| | | | | : : | : : | : : 16 minutes (or more) : | : : | : : | : : | | | send#2---->(*4)|--------> data#2 --------->| | | | RTO : : | | | ---(*5)|----> data#2(retrans) ---->| | | | | | | RTO*2 : : | | | | | | ETIMEDOUT<----(*6)| | (*1) One data packet sent. (*2) Because no ACK packet is received, the packet is retransmitted. (*3) The ACK packet is received. The transmitted packet is acknowledged. At this point the first "retransmission event" has passed and been recovered from. Any future retransmission is a completely new "event". (*4) After 16 minutes (to correspond with retries2=15), a new data packet is sent. Note: No data is transmitted between (*3) and (*4). The socket's timeout SHOULD be calculated from this point in time, but instead it's calculated from the prior "event" 16 minutes ago. (*5) Because no ACK packet is received, the packet is retransmitted. (*6) At the time of the 2nd retransmission, the socket returns ETIMEDOUT. Therefore, now we clear retrans_stamp as soon as all data during the loss window is fully acked. Reported-by: Ueki Kohei Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Acked-by: Neal Cardwell <ncardwell@google.com> Tested-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05ipv6: move INET6_MATCH() to include/net/inet6_hashtables.hWANG Cong2-10/+10
It is only used in net/ipv6/inet6_hashtables.c. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05net: Add and use skb_copy_datagram_msg() helper.David S. Miller43-57/+58
This encapsulates all of the skb_copy_datagram_iovec() callers with call argument signature "skb, offset, msghdr->msg_iov, length". When we move to iov_iters in the networking, the iov_iter object will sit in the msghdr. Having a helper like this means there will be less places to touch during that transformation. Based upon descriptions and patch from Al Viro. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05Merge branch 'gue-next'David S. Miller15-118/+565
Tom Herbert says: ==================== gue: Remote checksum offload This patch set implements remote checksum offload for GUE, which is a mechanism that provides checksum offload of encapsulated packets using rudimentary offload capabilities found in most Network Interface Card (NIC) devices. The outer header checksum for UDP is enabled in packets and, with some additional meta information in the GUE header, a receiver is able to deduce the checksum to be set for an inner encapsulated packet. Effectively this offloads the computation of the inner checksum. Enabling the outer checksum in encapsulation has the additional advantage that it covers more of the packet than the inner checksum including the encapsulation headers. Remote checksum offload is described in: http://tools.ietf.org/html/draft-herbert-remotecsumoffload-01 The GUE transmit and receive paths are modified to support the remote checksum offload option. The option contains a checksum offset and checksum start which are directly derived from values set in stack when doing CHECKSUM_PARTIAL. On receipt of the option, the operation is to calculate the packet checksum from "start" to end of the packet (normally derived for checksum complete), and then set the resultant value at checksum "offset" (the checksum field has already been primed with the pseudo header). This emulates a NIC that implements NETIF_F_HW_CSUM. The primary purpose of this feature is to eliminate cost of performing checksum calculation over a packet when encpasulating. In this patch set: - Move fou_build_header into fou.c and split it into a couple of functions - Enable offloading of outer UDP checksum in encapsulation - Change udp_offload to support remote checksum offload, includes new GSO type and ensuring encapsulated layers (TCP) doesn't try to set a checksum covered by RCO - TX support for RCO with GUE. This is configured through ip_tunnel and set the option on transmit when packet being encapsulated is CHECKSUM_PARTIAL - RX support for RCO with GUE for normal and GRO paths. Includes resolving the offloaded checksum v2: Address comments from davem: Move accounting for private option field in gue_encap_hlen to patch in which we add the remote checksum offload option. Testing: I ran performance numbers using netperf TCP_STREAM and TCP_RR with 200 streams, comparing GUE with and without remote checksum offload (doing checksum-unnecessary to complete conversion in both cases). These were run on mlnx4 and bnx2x. Some mlnx4 results are below. GRE/GUE TCP_STREAM IPv4, with remote checksum offload 9.71% TX CPU utilization 7.42% RX CPU utilization 36380 Mbps IPv4, without remote checksum offload 12.40% TX CPU utilization 7.36% RX CPU utilization 36591 Mbps TCP_RR IPv4, with remote checksum offload 77.79% CPU utilization 91/144/216 90/95/99% latencies 1.95127e+06 tps IPv4, without remote checksum offload 78.70% CPU utilization 89/152/297 90/95/99% latencies 1.95458e+06 tps IPIP/GUE TCP_STREAM With remote checksum offload 10.30% TX CPU utilization 7.43% RX CPU utilization 36486 Mbps Without remote checksum offload 12.47% TX CPU utilization 7.49% RX CPU utilization 36694 Mbps TCP_RR With remote checksum offload 77.80% CPU utilization 87/153/270 90/95/99% latencies 1.98735e+06 tps Without remote checksum offload 77.98% CPU utilization 87/150/287 90/95/99% latencies 1.98737e+06 tps SIT/GUE TCP_STREAM With remote checksum offload 9.68% TX CPU utilization 7.36% RX CPU utilization 35971 Mbps Without remote checksum offload 12.95% TX CPU utilization 8.04% RX CPU utilization 36177 Mbps TCP_RR With remote checksum offload 79.32% CPU utilization 94/158/295 90/95/99% latencies 1.88842e+06 tps Without remote checksum offload 80.23% CPU utilization 94/149/226 90/95/99% latencies 1.90338e+06 tps VXLAN TCP_STREAM 35.03% TX CPU utilization 20.85% RX CPU utilization 36230 Mbps TCP_RR 77.36% CPU utilization 84/146/270 90/95/99% latencies 2.08063e+06 tps We can also look at CPU time in csum_partial using perf (with bnx2x setup). For GRE with TCP_STREAM I see: With remote checksum offload 0.33% TX 1.81% RX Without remote checksum offload 6.00% TX 0.51% RX I suspect the fact that time in csum_partial noticably increases with remote checksum offload for RX is due to taking the cache miss on the encapsulated header in that function. By similar reasoning, if on the TX side the packet were not in cache (say we did a splice from a file whose data was never touched by the CPU) the CPU savings for TX would probably be more pronounced. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05gue: Receive side of remote checksum offloadTom Herbert1-9/+161
Add processing of the remote checksum offload option in both the normal path as well as the GRO path. The implements patching the affected checksum to derive the offloaded checksum. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05gue: TX support for using remote checksum offload optionTom Herbert3-4/+46
Add if_tunnel flag TUNNEL_ENCAP_FLAG_REMCSUM to configure remote checksum offload on an IP tunnel. Add logic in gue_build_header to insert remote checksum offload option. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05gue: Protocol constants for remote checksum offloadTom Herbert1-1/+4
Define a private flag for remote checksun offload as well as a length for the option. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05udp: Changes to udp_offload to support remote checksum offloadTom Herbert9-6/+29
Add a new GSO type, SKB_GSO_TUNNEL_REMCSUM, which indicates remote checksum offload being done (in this case inner checksum must not be offloaded to the NIC). Added logic in __skb_udp_tunnel_segment to handle remote checksum offload case. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05gue: Add infrastructure for flags and optionsTom Herbert2-53/+189
Add functions and basic definitions for processing standard flags, private flags, and control messages. This includes definitions to compute length of optional fields corresponding to a set of flags. Flag validation is in validate_gue_flags function. This checks for unknown flags, and that length of optional fields is <= length in guehdr hlen. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05udp: Offload outer UDP tunnel csum if availableTom Herbert1-16/+36
In __skb_udp_tunnel_segment if outer UDP checksums are enabled and ip_summed is not already CHECKSUM_PARTIAL, set up checksum offload if device features allow it. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05net: Move fou_build_header into fou.c and refactorTom Herbert4-49/+120
Move fou_build_header out of ip_tunnel.c and into fou.c splitting it up into fou_build_header, gue_build_header, and fou_build_udp. This allows for other users for TX of FOU or GUE. Change ip_tunnel_encap to call fou_build_header or gue_build_header based on the tunnel encapsulation type. Similarly, added fou_encap_hlen and gue_encap_hlen functions which are called by ip_encap_hlen. New net/fou.h has prototypes and defines for this. Added NET_FOU_IP_TUNNELS configuration. When this is set, IP tunnels can use FOU/GUE and fou module is also selected. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05Merge branch 'stmmac-net'David S. Miller1-23/+29
Giuseppe Cavallaro says: ==================== stmmac: review and fix lock and atomicity Recently some issues have been reported for the driver for locking mechanism and atomicity. In fact, enabling DEBUG support to prove lock and to verify if sleeping while atomic context some warnings occur at runtime. I have reproduced all on STi platforms. Concerning the tx path, I had provided a patch time ago but I discarded the idea to completely remove locks; in this patch-set we can have some useful fixes instead of. This patch-set is to fix the atomicity in the PM stuff where I tried to collect all the points and advice reported in the past weeks. As final result, on my side no warnings and no problem when suspend/resume the driver on STi boxes. I also added a patch that fixes the locks for the EEE. As pointed in some thread there was a design problem behind the eee initialization and I have tried to fix that before. As final result no issues when proving locks too. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05stmmac: fix atomicity in pm routinesGiuseppe CAVALLARO1-13/+16
This patch is to fix the atomicity when suspend and resume the driver. The clk api have been changed (as reported by Hao Liang) and the skb allocation is done out of the hw setup function and taking care about the GFP flags. Reported-by: Hao Liang <hliang1025@gmail.com> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexey Khoroshilov <khoroshilov@ispras.ru> Cc: Hao Liang <hliang1025@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05stmmac: fix concurrency in eee initialization.Giuseppe CAVALLARO1-6/+9
This patch aims to fix the concurrency in eee initialization inside the stmmac driver and related warnings when enable DEBUG_ATOMIC_SLEEP. Prior this patch, the stmmac_eee_init could be called in several places as shown below: stmmac_open stmmac_resume PHY Layer | | | stmmac_hw_setup stmmac_adjust_link | | stmmac ethtool |__________________________|______________| | stmmac_eee_init The patch removes the stmmac_eee_init call inside the stmmac_hw_setup that is unnecessary. It is sufficient to call it in the adjust_link to always guarantee that EEE is always configured at mac level too. Fixing the lock protection now it is covered another case (not considered before). The stmmac_eee_init could be called by the ethtool so critical sections must be protected inside this function too. Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05stmmac: fix lock in stmmac_set_rx_modeGiuseppe CAVALLARO1-2/+0
When compile with CONFIG_PROVE_LOCKING the following warnings happen: [snip] HARDIRQ-ON-W at: [<c0480c1c>] _raw_spin_lock+0x3c/0x4c [<c02c2828>] stmmac_set_rx_mode+0x18/0x3c [<c038b2cc>] dev_set_rx_mode+0x1c/0x28 [<c038b38c>] __dev_open+0xb4/0xf8 [<c038b5a8>] __dev_change_flags+0x94/0x128 [<c038b6a8>] dev_change_flags+0x10/0x48 [<c062afe0>] ip_auto_config+0x1d4/0x1084 [<c000873c>] do_one_initcall+0x108/0x15c [<c060ec50>] kernel_init_freeable+0x1a8/0x248 [<c0472cc0>] kernel_init+0x8/0x160 [<c000dfc8>] ret_from_fork+0x14/0x2c INITIAL USE at: [<c0480c1c>] _raw_spin_lock+0x3c/0x4c [<c02c2828>] stmmac_set_rx_mode+0x18/0x3c [<c038b2cc>] dev_set_rx_mode+0x1c/0x28 [<c038b38c>] __dev_open+0xb4/0xf8 [<c038b5a8>] __dev_change_flags+0x94/0x128 [<c038b6a8>] dev_change_flags+0x10/0x48 [<c062afe0>] ip_auto_config+0x1d4/0x1084 [<c000873c>] do_one_initcall+0x108/0x15c [<c060ec50>] kernel_init_freeable+0x1a8/0x248 [<c0472cc0>] kernel_init+0x8/0x160 [<c000dfc8>] ret_from_fork+0x14/0x2c so the patch just removes the lock protection in the stmmac_set_rx_mode Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Emilio Lopez <emilio@elopez.com.ar> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05stmmac: release tx lock, in case of dma mapping error.Fabrice Gasnier1-0/+1
Add missing spin_unlock when tx frames gets dropped. Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05stmmac: fix stmmac_tx_avail should be called with TX lockedFabrice Gasnier1-2/+3
stmmac_tx_avail() may lie if used unprotected. It's using cur_tx and dirty_tx index. These index may be already in use by tx_clean when entering xmit routine. So, this should be called locked. This can cause transmit queue to be stuck, with following message: NETDEV WATCHDOG: eth0 (stmmaceth): transmit queue 0 timed out Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05Merge branch 'stmmac-next'David S. Miller6-97/+16
Giuseppe Cavallaro says: ==================== stmmac: review driver Koptions Recently many Koption options have been added to have new glue logic on several platforms. The main goal behind this work is to guarantee that the driver built fine on all the branches where it is present independently of which glue logic is selected. IMHO, it is better to remove all the not necessary Koption(s) that can hide build problems when something changes in the driver and especially when the DT compatibility allows us to manage all the platform data. I compiled the driver w/o any issue on net-next Git for: x86, arm and sh4. In case of there are build problems on some repos now it will be easy to catch them and cherry-pick patches from mainstream. For sure, do not hesitate to contact me in case of issue. Also this set removes STMMAC_DEBUG_FS and BUS_MODE_DA. The latter is useless and the former can be replaced by DEBUG_FS (always to make safe the build). V2: patch-set re-based on top of the latest updates for net-next ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05stmmac: remove BUS_MODE_DAGiuseppe CAVALLARO2-14/+0
This is a very old and often unused option to configure a bit in a register inside the DMA. This support should not stay under Koption and should be extended for new chips too. This will be do later maybe via device-tree parameters. Also no performance impact when remove this setting on STi platforms. Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05stmmac: remove STMMAC_DEBUG_FSGiuseppe CAVALLARO2-15/+7
the STMMAC_DEBUG_FS Koption is now removed from the driver configuration and this support will be built by default when DEBUG_FS is present. This can also be useful on building driver verification. Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05stmmac: remove specific SoC Koption from platform.Giuseppe CAVALLARO4-68/+9
This patch removes all the Koptions added to build the glue-logic files for all different architectures: DWMAC_MESON, DWMAC_SUNXI, DWMAC_STI ... Nowadays the stmmac needs to be compiled on several platforms; in some case it very convenient to guarantee that its build is always completed with success on all the branches where the driver is present. Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05drivers: net: ethernet: xilinx: xilinx_emaclite: revert the original commit ↵Chen Gang1-1/+0
"1db3ddff1602edf2390b7667dcbaa0f71512e3ea" Microblaze is a fpga soft core, it can be customized easily, which may cause many various hardware version strings. So the original fix patch based on hard-coded compatible version strings is not a good idea (although it is correct for current issue). For it, there will be a new solving way soon (which based on the device tree). The original issue is related with qemu, so can only change the hardware version string in qemu for it, then keep the original driver no touch ( qemu is for virtualization which has much easier life than real world). Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Acked-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05include/linux/socket.h: Fix commentRasmus Villemoes1-1/+1
File descriptors are always closed on exit :-) Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05net: Add missing descriptions for fwmark_reflect for ipv4 and ipv6.Loganaden Velvindron1-0/+14
It was initially sent by Lorenzo Colitti, but was subsequently lost in the final diff he submitted. Signed-off-by: Loganaden Velvindron <logan@elandsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05geneve: Unregister pernet subsys on module unload.Jesse Gross1-0/+1
The pernet ops aren't ever unregistered, which causes a memory leak and an OOPs if the module is ever reinserted. Fixes: 0b5e8b8eeae4 ("net: Add Geneve tunneling protocol driver") CC: Andy Zhou <azhou@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05geneve: Set GSO type on transmit.Jesse Gross1-0/+2
Geneve does not currently set the inner protocol type when transmitting packets. This causes GSO segmentation to fail on NICs that do not support Geneve offloading. CC: Andy Zhou <azhou@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04net: phy: spi_ks8995: remove sysfs bin file by registered attributeVladimir Zapolskiy1-1/+3
When a sysfs binary file is asked to be removed, it is found by attribute name, so strictly speaking this change is not a fix, but just in case when attribute name is changed in the driver or sysfs internals are changed, it might be better to remove the previously created file using right the same binary attribute. Signed-off-by: Vladimir Zapolskiy <vz@mleia.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: David S. Miller <davem@davemloft.net> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04udp: remove blank line between set and testFabian Frederick1-1/+0
Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04ipv6: trivial, add bracket for the if blockFlorent Fourcot1-2/+2
The "else" block is on several lines and use bracket. Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04Merge branch 'xgene-net'David S. Miller7-13/+49
Iyappan Subramanian says: ==================== drivers: net: xgene: Fix crash for backward compatibility This patch set fixes the following issues that were reported during regression. Patch 1,2 : Adds backward compatibility with the older firmware (<= 1.13.28). Patch 3 : Use separate hardware resources (descriptor ring, prefetch buffer) that are not shared with the firmware ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04drivers: net: xgene: fix: Use separate resourcesIyappan Subramanian2-3/+6
This patch fixes the following kernel crash during SGMII based 1GbE probe. BUG: Bad page state in process swapper/0 pfn:40fe6ad page:ffffffbee37a75d8 count:-1 mapcount:0 mapping: (null) index:0x0 flags: 0x0() page dumped because: nonzero _count Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.17.0+ #7 Call trace: [<ffffffc000087fa0>] dump_backtrace+0x0/0x12c [<ffffffc0000880dc>] show_stack+0x10/0x1c [<ffffffc0004d981c>] dump_stack+0x74/0xc4 [<ffffffc00012fe70>] bad_page+0xd8/0x128 [<ffffffc000133000>] get_page_from_freelist+0x4b8/0x640 [<ffffffc000133260>] __alloc_pages_nodemask+0xd8/0x834 [<ffffffc0004194f8>] __netdev_alloc_frag+0x124/0x1b8 [<ffffffc00041bfdc>] __netdev_alloc_skb+0x90/0x10c [<ffffffc00039ff30>] xgene_enet_refill_bufpool+0x11c/0x280 [<ffffffc0003a11a4>] xgene_enet_process_ring+0x168/0x340 [<ffffffc0003a1498>] xgene_enet_napi+0x1c/0x50 [<ffffffc00042b454>] net_rx_action+0xc8/0x18c [<ffffffc0000b0880>] __do_softirq+0x114/0x24c [<ffffffc0000b0c34>] irq_exit+0x94/0xc8 [<ffffffc0000e68a0>] __handle_domain_irq+0x8c/0xf4 [<ffffffc000081288>] gic_handle_irq+0x30/0x7c This was due to hardware resource sharing conflict with the firmware. This patch fixes this crash by using resources (descriptor ring, prefetch buffer) that are not shared. Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Signed-off-by: Keyur Chudgar <kchudgar@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04drivers: net: xgene: Backward compatibility with older firmwareIyappan Subramanian6-5/+38
This patch adds support when used with older firmware (<= 1.13.28). - Added xgene_ring_mgr_init() to check whether ring manager is initialized - Calling xgene_ring_mgr_init() from xgene_port_ops.reset() - To handle errors, changed the return type of xgene_port_ops.reset() Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Signed-off-by: Keyur Chudgar <kchudgar@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04dtb: xgene: fix: Backward compatibility with older firmwareIyappan Subramanian1-5/+5
The following kernel crash was reported when using older firmware (<= 1.13.28). [ 0.980000] libphy: APM X-Gene MDIO bus: probed [ 1.130000] Unhandled fault: synchronous external abort (0x96000010) at 0xffffff800009a17c [ 1.140000] Internal error: : 96000010 [#1] SMP [ 1.140000] Modules linked in: [ 1.140000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.17.0+ #21 [ 1.140000] task: ffffffc3f0110000 ti: ffffffc3f0064000 task.ti: ffffffc3f0064000 [ 1.140000] PC is at ioread32+0x58/0x68 [ 1.140000] LR is at xgene_enet_setup_ring+0x18c/0x1cc [ 1.140000] pc : [<ffffffc0003cec68>] lr : [<ffffffc00053dad8>] pstate: a0000045 [ 1.140000] sp : ffffffc3f0067b20 [ 1.140000] x29: ffffffc3f0067b20 x28: ffffffc000aa8ea0 [ 1.140000] x27: ffffffc000bb2000 x26: ffffffc000a64270 [ 1.140000] x25: ffffffc000b05ad8 x24: ffffffc0ff99ba58 [ 1.140000] x23: 0000000000004000 x22: 0000000000004000 [ 1.140000] x21: 0000000000000200 x20: 0000000000200000 [ 1.140000] x19: ffffffc0ff99ba18 x18: ffffffc0007a6000 [ 1.140000] x17: 0000000000000007 x16: 000000000000000e [ 1.140000] x15: 0000000000000001 x14: 0000000000000000 [ 1.140000] x13: ffffffbeedb71320 x12: 00000000ffffff80 [ 1.140000] x11: 0000000000000002 x10: 0000000000000000 [ 1.140000] x9 : 0000000000000000 x8 : ffffffc3eb2a4000 [ 1.140000] x7 : 0000000000000000 x6 : 0000000000000000 [ 1.140000] x5 : 0000000001080000 x4 : 000000007d654010 [ 1.140000] x3 : ffffffffffffffff x2 : 000000000003ffff [ 1.140000] x1 : ffffff800009a17c x0 : ffffff800009a17c The issue was that the older firmware does not support 10GbE and SGMII based 1GBE interfaces. This patch changes the address length of the reg property of sgmii0 and xgmii nodes and serves as preparatory patch for the fix. Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Signed-off-by: Keyur Chudgar <kchudgar@apm.com> Reported-by: Dann Frazier <dann.frazier@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04esp4: remove assignment in if conditionFabian Frederick1-1/+3
Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04Merge branch 'ecn_via_routing_table'David S. Miller5-44/+87
Florian Westphal says: ==================== net: allow setting ecn via routing table Here is v4 of the patchset, its exactly the same as v3 except in patch3/3 where I added the missing 'const' qualifier to a function argument that Eric spotted during review. I preserved Erics Acks so that he doesn't have to resend them. v3 cover letter: When using syn cookies, then do not simply trust that the echoed timestamp was not modified to make sure that ecn is not turned on magically when it is disabled on the host. The first two patches, which were not part of earlier series, prepare the cookie code for the ecn route metrics change by allowing is to more easily use the existing dst object for ecn validation. The 3rd patch adds the ecn route metric feature support. It is almost the same as in v2, except that we'll now also test the dst_features when decoding a syn cookie timestamp that indicates ecn support. These three patches then allow turning on explicit congestion notification based on the destination network. For example, assuming the default tcp_ecn sysctl '2', the following will enable ecn (tcp_ecn=1 behaviour, i.e. request ecn to be enabled for a tcp connection) for all connections to hosts inside the 192.168.2/24 network: ip route change 192.168.2.0/24 dev eth0 features ecn Having a more fine-grained per-route setting can be beneficial for various reasons, for example 1) within data centers, or 2) local ISPs may deploy ECN support for their own video/streaming services [1], etc. Joint work with Daniel Borkmann, feature suggested by Hannes Frederic Sowa. The patch to enable this in iproute2 will be posted shortly, it is currently also available here: http://git.breakpoint.cc/cgit/fw/iproute2.git/commit/?h=iproute_features&id=8843d2d8973fb81c78a7efe6d42e3a17d739003e [1] http://www.ietf.org/proceedings/89/slides/slides-89-tsvarea-1.pdf, p.15 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04net: allow setting ecn via routing tableFlorian Westphal5-17/+31
This patch allows to set ECN on a per-route basis in case the sysctl tcp_ecn is not set to 1. In other words, when ECN is set for specific routes, it provides a tcp_ecn=1 behaviour for that route while the rest of the stack acts according to the global settings. One can use 'ip route change dev $dev $net features ecn' to toggle this. Having a more fine-grained per-route setting can be beneficial for various reasons, for example, 1) within data centers, or 2) local ISPs may deploy ECN support for their own video/streaming services [1], etc. There was a recent measurement study/paper [2] which scanned the Alexa's publicly available top million websites list from a vantage point in US, Europe and Asia: Half of the Alexa list will now happily use ECN (tcp_ecn=2, most likely blamed to commit 255cac91c3 ("tcp: extend ECN sysctl to allow server-side only ECN") ;)); the break in connectivity on-path was found is about 1 in 10,000 cases. Timeouts rather than receiving back RSTs were much more common in the negotiation phase (and mostly seen in the Alexa middle band, ranks around 50k-150k): from 12-thousand hosts on which there _may_ be ECN-linked connection failures, only 79 failed with RST when _not_ failing with RST when ECN is not requested. It's unclear though, how much equipment in the wild actually marks CE when buffers start to fill up. We thought about a fallback to non-ECN for retransmitted SYNs as another global option (which could perhaps one day be made default), but as Eric points out, there's much more work needed to detect broken middleboxes. Two examples Eric mentioned are buggy firewalls that accept only a single SYN per flow, and middleboxes that successfully let an ECN flow establish, but later mark CE for all packets (so cwnd converges to 1). [1] http://www.ietf.org/proceedings/89/slides/slides-89-tsvarea-1.pdf, p.15 [2] http://ecn.ethz.ch/ Joint work with Daniel Borkmann. Reference: http://thread.gmane.org/gmane.linux.network/335797 Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04syncookies: split cookie_check_timestamp() into two functionsFlorian Westphal3-18/+27
The function cookie_check_timestamp(), both called from IPv4/6 context, is being used to decode the echoed timestamp from the SYN/ACK into TCP options used for follow-up communication with the peer. We can remove ECN handling from that function, split it into a separate one, and simply rename the original function into cookie_decode_options(). cookie_decode_options() just fills in tcp_option struct based on the echoed timestamp received from the peer. Anything that fails in this function will actually discard the request socket. While this is the natural place for decoding options such as ECN which commit 172d69e63c7f ("syncookies: add support for ECN") added, we argue that in particular for ECN handling, it can be checked at a later point in time as the request sock would actually not need to be dropped from this, but just ECN support turned off. Therefore, we split this functionality into cookie_ecn_ok(), which tells us if the timestamp indicates ECN support AND the tcp_ecn sysctl is enabled. This prepares for per-route ECN support: just looking at the tcp_ecn sysctl won't be enough anymore at that point; if the timestamp indicates ECN and sysctl tcp_ecn == 0, we will also need to check the ECN dst metric. This would mean adding a route lookup to cookie_check_timestamp(), which we definitely want to avoid. As we already do a route lookup at a later point in cookie_{v4,v6}_check(), we can simply make use of that as well for the new cookie_ecn_ok() function w/o any additional cost. Joint work with Daniel Borkmann. Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04syncookies: avoid magic values and document which-bit-is-what-optionFlorian Westphal1-15/+35
Was a bit more difficult to read than needed due to magic shifts; add defines and document the used encoding scheme. Joint work with Daniel Borkmann. Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04igmp: remove camel case definitionsFabian Frederick1-14/+14
use standard uppercase for definitions Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04udp: remove else after returnFabian Frederick1-6/+6
else is unnecessary after return 0 in __udp4_lib_rcv() Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04inet: frags: remove inline on static in c fileFabian Frederick1-8/+8
remove __inline__ / inline and let compiler decide what to do with static functions Inspired-by: "David S. Miller" <davem@davemloft.net> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04ipv4: remove 0/NULL assignment on staticFabian Frederick1-8/+8
static values are automatically initialized to 0 Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04ipv4: use seq_puts instead of seq_printf where possibleFabian Frederick1-3/+3
Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04tcp: spelling s/plugable/pluggableFabian Frederick1-1/+1
Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04cipso: remove NULL assignment on staticFabian Frederick1-1/+3
Also add blank line after structure declarations Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04ipv4: include linux/bug.h instead of asm/bug.hFabian Frederick1-1/+1
Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04cipso: kerneldoc warning fixFabian Frederick1-1/+1
Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03net: add rbnode to struct sk_buffEric Dumazet2-27/+20
Yaogong replaces TCP out of order receive queue by an RB tree. As netem already does a private skb->{next/prev/tstamp} union with a 'struct rb_node', lets do this in a cleaner way. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yaogong Wang <wygivan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03Merge branch 'master' of ↵David S. Miller6-117/+51
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2014-11-03 This series contains updates to i40e and i40evf. Akeem adds a check for i40e so that flow director flush and reinit are not done when flow director is not enabled. Mitch fixes the i40evf driver to properly handle multiple admin queue messages, by reinit the msg_size field each time we go through the loop. Without this, we may receive truncated messages due to the firmware thinking we have insufficient buffer size. Also fixes the link checking logic to only check the carrier state if the interface is actually open, which allows link changes to be reported correctly without spamming the VFs. Updates i40e to inset the VSI ID in the QTX_CTL register when configuring queues for VMDq VSIs. Paul adds support for 10G-base-T in i40evf. Jesse fixes i40e where the call to irq_dynamic_disable() was turning off the interrupt completely when trying to set ITR to 0 (for lowest moderation). Shannon removes debugfs dump stats function, since it was not being kept up-to-date and was redundant with the ethtool output. Also, scales back the LAN MSIx usage to force queue/vector sharing and leave some vectors for Flow Director, VMDq, etc. when there are more cores than vectors available to the PF. Cleans up the error reporting for get_lump() resource tracking errors. Also adds a check for the debug module parameter earlier to be able to catch the early configuration phase admin queue messages. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03sfc: don't BUG_ON efx->max_channels == 0 in probeEdward Cree1-1/+2
efx_ef10_probe() was BUGging out if the BAR2 size was 0. This is unnecessarily violent; instead we should just fail to probe the device. Kept a WARN_ON as this problem indicates a broken or misconfigured NIC. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03Merge branch 'ipv6_tunnel_iflink_init'David S. Miller4-30/+11
Steffen Klassert says: ==================== ipv6: Fix iflink setting for ipv6 tunnels The ipv6 tunnels do the dev->iflink setting too early, it gets overwritten by register_netdev(). So set dev->iflink from within a ndo_init function to keep the configured setting. This patchset fixes this for ip6_tunnel, vti6, sit and gre6. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03gre6: Move the setting of dev->iflink into the ndo_init functions.Steffen Klassert1-2/+3
Otherwise it gets overwritten by register_netdev(). Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03sit: Use ipip6_tunnel_init as the ndo_init function.Steffen Klassert1-9/+6
ipip6_tunnel_init() sets the dev->iflink via a call to ipip6_tunnel_bind_dev(). After that, register_netdevice() sets dev->iflink = -1. So we loose the iflink configuration for ipv6 tunnels. Fix this by using ipip6_tunnel_init() as the ndo_init function. Then ipip6_tunnel_init() is called after dev->iflink is set to -1 from register_netdevice(). Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03vti6: Use vti6_dev_init as the ndo_init function.Steffen Klassert1-10/+1
vti6_dev_init() sets the dev->iflink via a call to vti6_link_config(). After that, register_netdevice() sets dev->iflink = -1. So we loose the iflink configuration for vti6 tunnels. Fix this by using vti6_dev_init() as the ndo_init function. Then vti6_dev_init() is called after dev->iflink is set to -1 from register_netdevice(). Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03ip6_tunnel: Use ip6_tnl_dev_init as the ndo_init function.Steffen Klassert1-9/+1
ip6_tnl_dev_init() sets the dev->iflink via a call to ip6_tnl_link_config(). After that, register_netdevice() sets dev->iflink = -1. So we loose the iflink configuration for ipv6 tunnels. Fix this by using ip6_tnl_dev_init() as the ndo_init function. Then ip6_tnl_dev_init() is called after dev->iflink is set to -1 from register_netdevice(). Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03hamradio: 6pack: remove unnecessary checkSudip Mukherjee1-2/+1
this is check for dev is unnecessary, as we are already checking dev after allocating it via alloc_netdev, and jumping to label: out if it is NULL. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03PPC: bpf_jit_comp: add SKF_AD_PKTTYPE instructionDenis Kirjanov3-0/+13
Add BPF extension SKF_AD_PKTTYPE to ppc JIT to load skb->pkt_type field. Before: [ 88.262622] test_bpf: #11 LD_IND_NET 86 97 99 PASS [ 88.265740] test_bpf: #12 LD_PKTTYPE 109 107 PASS After: [ 80.605964] test_bpf: #11 LD_IND_NET 44 40 39 PASS [ 80.607370] test_bpf: #12 LD_PKTTYPE 9 9 PASS CC: Alexei Starovoitov<alexei.starovoitov@gmail.com> CC: Michael Ellerman<mpe@ellerman.id.au> Cc: Matt Evans <matt@ozlabs.org> Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> v2: Added test rusults Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03macvtap: Fix csum_start when VLAN tags are presentHerbert Xu1-0/+2
When VLAN is in use in macvtap_put_user, we end up setting csum_start to the wrong place. The result is that the whoever ends up doing the checksum setting will corrupt the packet instead of writing the checksum to the expected location, usually this means writing the checksum with an offset of -4. This patch fixes this by adjusting csum_start when VLAN tags are detected. Fixes: f09e2249c4f5 ("macvtap: restore vlan header on user read") Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Cheers, Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03net: fec: fix suspend broken on multiple MACs silliconsNimrod Andy1-8/+8
On i.MX6SX sdb platform, there has two same enet MACs, after system up, just eth0 is up, and then do suspend/resume test: [ 50.437967] PM: Syncing filesystems ... done. [ 50.476924] Freezing user space processes ... (elapsed 0.005 seconds) done. [ 50.490093] Freezing remaining freezable tasks ... (elapsed 0.004 seconds) done. [ 50.559771] ------------[ cut here ]------------ [ 50.564453] WARNING: CPU: 0 PID: 575 at drivers/clk/clk.c:851 __clk_disable+0x60/0x6c() [ 50.572475] Modules linked in: [ 50.575578] CPU: 0 PID: 575 Comm: sh Not tainted 3.18.0-rc2-next-20141031-00007-gf61135b #21 [ 50.584031] Backtrace: [ 50.586550] [<80011ecc>] (dump_backtrace) from [<8001206c>] (show_stack+0x18/0x1c) [ 50.594136] r6:808a7a54 r5:00000000 r4:00000000 r3:00000000 [ 50.599920] [<80012054>] (show_stack) from [<806ab3c0>] (dump_stack+0x80/0x9c) [ 50.607187] [<806ab340>] (dump_stack) from [<8002a3e8>] (warn_slowpath_common+0x6c/0x8c) [ 50.615294] r5:00000353 r4:00000000 [ 50.618940] [<8002a37c>] (warn_slowpath_common) from [<8002a42c>] (warn_slowpath_null+0x24/0x2c) [ 50.627738] r8:00000000 r7:be144c44 r6:be015600 r5:80070013 r4:be015600 [ 50.634573] [<8002a408>] (warn_slowpath_null) from [<804f8d4c>] (__clk_disable+0x60/0x6c) [ 50.642777] [<804f8cec>] (__clk_disable) from [<804f8e5c>] (clk_disable+0x2c/0x38) [ 50.650359] r4:be015600 r3:00000000 [ 50.654006] [<804f8e30>] (clk_disable) from [<80420ab4>] (fec_enet_clk_enable+0xc4/0x258) [ 50.662196] r5:be3cb620 r4:be3cb000 [ 50.665838] [<804209f0>] (fec_enet_clk_enable) from [<80421178>] (fec_suspend+0x30/0x180) [ 50.674026] r7:be144c44 r6:be144c10 r5:8037f5a4 r4:be3cb000 [ 50.679802] [<80421148>] (fec_suspend) from [<8037f5d8>] (platform_pm_suspend+0x34/0x64) [ 50.687906] r10:00000000 r9:00000000 r8:00000000 r7:be144c44 r6:be144c10 r5:8037f5a4 [ 50.695852] r4:be144c10 r3:80421148 [ 50.699511] [<8037f5a4>] (platform_pm_suspend) from [<8038784c>] (dpm_run_callback.isra.14+0x34/0x6c) [ 50.708764] [<80387818>] (dpm_run_callback.isra.14) from [<80387f00>] (__device_suspend+0x12c/0x2a4) [ 50.717909] r9:8098ec8c r8:80973bec r6:00000002 r5:811c7038 r4:be144c10 [ 50.724746] [<80387dd4>] (__device_suspend) from [<803894fc>] (dpm_suspend+0x64/0x224) [ 50.732675] r8:80973bec r7:be144c10 r6:8098ec24 r5:811c7038 r4:be144cc4 [ 50.739509] [<80389498>] (dpm_suspend) from [<8038999c>] (dpm_suspend_start+0x60/0x68) [ 50.747438] r10:8082fa24 r9:00000000 r8:00000004 r7:00000003 r6:00000000 r5:8116ec80 [ 50.755386] r4:00000002 [ 50.757969] [<8038993c>] (dpm_suspend_start) from [<800679d8>] (suspend_devices_and_enter+0x90/0x3ec) [ 50.767202] r4:00000003 r3:8116eca0 [ 50.770843] [<80067948>] (suspend_devices_and_enter) from [<80067f40>] (pm_suspend+0x20c/0x2a4) [ 50.779553] r8:00000004 r7:00000003 r6:00000000 r5:8116ec8c r4:00000003 [ 50.786394] [<80067d34>] (pm_suspend) from [<80066858>] (state_store+0x70/0xc0) [ 50.793718] r6:8116ec90 r5:00000003 r4:bd88a800 r3:0000006d [ 50.799496] [<800667e8>] (state_store) from [<802b0384>] (kobj_attr_store+0x1c/0x28) [ 50.807251] r10:bd399f78 r8:00000000 r7:bd88a800 r6:bd88a800 r5:00000004 r4:bd085680 [ 50.815219] [<802b0368>] (kobj_attr_store) from [<80153090>] (sysfs_kf_write+0x54/0x58) [ 50.823252] [<8015303c>] (sysfs_kf_write) from [<80151fd8>] (kernfs_fop_write+0xd0/0x194) [ 50.831441] r6:00000004 r5:bd08568c r4:bd085680 r3:8015303c [ 50.837220] [<80151f08>] (kernfs_fop_write) from [<800eddb4>] (vfs_write+0xb8/0x1a8) [ 50.844975] r10:00000000 r9:00000000 r8:00000000 r7:bd399f78 r6:01336408 r5:00000004 [ 50.852924] r4:bc584dc0 [ 50.855505] [<800edcfc>] (vfs_write) from [<800ee0b8>] (SyS_write+0x48/0x88) [ 50.862567] r10:00000000 r8:00000000 r7:01336408 r6:00000004 r5:bc584dc0 r4:bc584dc0 [ 50.870537] [<800ee070>] (SyS_write) from [<8000eb00>] (ret_fast_syscall+0x0/0x48) [ 50.878120] r9:bd398000 r8:8000ecc4 r7:00000004 r6:76f42b48 r5:01336408 r4:00000004 [ 50.885983] ---[ end trace 7545115d752a316a ]--- [ 50.890765] ------------[ cut here ]------------ The root cause is that eth1 is not opened and clock is not enabled, and .suspend() still call .fec_enet_clk_enable() to disable clock. To avoid the broken, let it check network device up status by calling .netif_running() before disable/enable clocks. Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03Merge branch 'tun-net'David S. Miller1-11/+17
Herbert Xu says: ==================== tun: Fix csum_start and TUN_PKT_STRIP The first patch fixes a serious problem that breaks checksum offload in VMs while the second patch fixes a problem that probably affects no one. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03tun: Fix TUN_PKT_STRIP settingHerbert Xu1-4/+8
We set the flag TUN_PKT_STRIP if the user buffer provided is too small to contain the entire packet plus meta-data. However, this has been broken ever since we added GSO meta-data. VLAN acceleration also has the same problem. This patch fixes this by taking both into account when setting the TUN_PKT_STRIP flag. The fact that this has been broken for six years without anyone realising means that nobody actually uses this flag. Fixes: f43798c27684 ("tun: Allow GSO using virtio_net_hdr") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03tun: Fix csum_start with VLAN accelerationHerbert Xu1-7/+9
When VLAN acceleration is in use on the xmit path, we end up setting csum_start to the wrong place. The result is that the whoever ends up doing the checksum setting will corrupt the packet instead of writing the checksum to the expected location, usually this means writing the checksum with an offset of -4. This patch fixes this by adjusting csum_start when VLAN acceleration is detected. Fixes: 6680ec68eff4 ("tuntap: hardware vlan tx support") Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03uapi: add missing network related headers to kbuildstephen hemminger1-0/+4
The makefile for sanitizing kernel headers uses the kbuild file to determine which files to do. Several networking related headers were missing. Without these headers iproute2 build would break. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03Merge branch 'mlx4-next'David S. Miller9-26/+190
Or Gerlitz says: ==================== Mellanox ethernet driver update Oct-30-2014 The 1st patch from Saeed fixes a bug in the last net-next batch where a VF could get access to set port configuration, the next patch from Amir fixes a race in the port VPI logic. Next are two performance patches from Ido. The patch to add checksum complete status on GRE and such packets was preceded with a patch that converted the driver to only use napi_gro_receive vs. the current code which goes through napi_gro_frags on it's usual track. Eric D. has some thoughts and suggestions on that change for which we want to take the time and consider, so for the time being dropped that patch and the ones that depend on it. Changes from V0: - have the caller to provide the __GFP_COLD hint to the service function - dropped the patch that changes the GRO logic and the subsequent dependent patches. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03net/mlx4_core: Add retrieval of CONFIG_DEV parametersMatan Barak6-7/+139
Add code to issue CONFIG_DEV "get" firmware command. This command is used in order to obtain certain parameters used for supporting various RX checksumming options and vxlan UDP port. The GET operation is allowed for VFs too. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03net/mlx4_en: Add __GFP_COLD gfp flags in alloc_pagesIdo Shamay1-3/+4
Needed in order to get cache cold pages (L3 flushed) for HW scatter. Otherwise memory may flush those entries when the packet comes from PCI, causing back pressure resulting in BW decrease. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03net/mlx4_en: Remove RX buffers alignment to IP_ALIGNIdo Shamay2-13/+4
When IP_ALIGN has a non zero value, hardware will write to a non aligned address. The only reader from this address is when copying the header from the first frag into the linear buffer (further access to the IP address will be from the linear buffer, in which the headers are aligned). Since the penalty of non align access by the hardware is greater than the software memcpy, changing the frag_align to always be 0. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03net/mlx4_core: Protect port type setting by mutexAmir Vadai1-1/+8
We need to protect set_port_type() for concurrency, as the sysfs code could call it from mutliple contexts in parallel. The port_mutex is not enough because we need to protect from concurrent modification of 'info' and stopping of the port sensing work. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03net/mlx4_core: Prevent VF from changing port configurationSaeed Mahameed3-2/+35
Added wrapper to the ACCESS_REG command for handling guest HW registers access, preventing write operations, but do allow reads. This will prevent SRIOV guests to change port PTYS configuration, such as speed/advertised link modes. Fixes: adbc7ac5c15e ('net/mlx4_core: Introduce ACCESS_REG CMD [...]') Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03net: less interrupt masking in NAPIEric Dumazet1-25/+43
net_rx_action() can mask irqs a single time to transfert sd->poll_list into a private list, for a very short duration. Then, napi_complete() can avoid masking irqs again, and net_rx_action() only needs to mask irq again in slow path. This patch removes 2 couples of irq mask/unmask per typical NAPI run, more if multiple napi were triggered. Note this also allows to give control back to caller (do_softirq()) more often, so that other softirq handlers can be called a bit earlier, or ksoftirqd can be wakeup earlier under pressure. This was developed while testing an alternative to RX interrupt mitigation to reduce latencies while keeping or improving GRO aggregation on fast NIC. Idea is to test napi->gro_list at the end of a napi->poll() and reschedule one NAPI poll, but after servicing a full round of softirqs (timers, TX, rcu, ...). This will be allowed only if softirq is currently serviced by idle task or ksoftirqd, and resched not needed. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03net: shrink struct softnet_dataEric Dumazet1-7/+8
flow_limit in struct softnet_data is only read from local cpu and can be moved to fill a hole, reducing softnet_data size by 64 bytes on x86_64 While we are at it, move output_queue, output_queue_tailp and completion_queue, so that rx / tx paths touch a single cache line. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03netfilter: nft_reject_bridge: Fix powerpc build errorGuenter Roeck1-0/+1
Fix: net/bridge/netfilter/nft_reject_bridge.c: In function 'nft_reject_br_send_v6_unreach': net/bridge/netfilter/nft_reject_bridge.c:240:3: error: implicit declaration of function 'csum_ipv6_magic' csum_ipv6_magic(&nip6h->saddr, &nip6h->daddr, ^ make[3]: *** [net/bridge/netfilter/nft_reject_bridge.o] Error 1 Seen with powerpc:allmodconfig. Fixes: 523b929d5446 ("netfilter: nft_reject_bridge: don't use IP stack to reject traffic") Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03i40e: properly parse MDET registersMitch Williams1-5/+7
Fix a few problems with our parsing of the MDET registers: * Queue IDs are longer than 8 bits * Queue IDs are absolute for the device and the base queue must be subtracted out. * VF IDs are longer than 8 bits * Use the MASK define to mask the event value, instead of the SHIFT define. Change-ID: I3dc7237f480c02e1192a2a8ea782f8a02ab2a8b7 Reported-by: Marc Neustadter <marc.neustadter@intel.com> Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Patrick Lu <patrick.lu@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-11-03i40e: configure VM ID in qtx_ctlMitch Williams1-2/+6
We must insert the VSI ID in the QTX_CTL register when configuring queues for VMDQ VSIs. Change-ID: Iedfe36bd42ca0adc90a7cc2b7cf04795a98f4761 Reported-by: Marc Neustadter <marc.neustadter@intel.com> Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Patrick Lu <patrick.lu@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-11-03i40e: enable debug earlierShannon Nelson1-0/+5
Check the debug module parameter earlier to be able to catch the early configuration phase adminq messages. Change-ID: Ic84fabd72393489bbf96042de770790a80fd8468 Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Patrick Lu <patrick.lu@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-11-03i40e: better wording for resource tracking errorsShannon Nelson1-6/+8
Tweak and homogenize the error reporting for get_lump() resource tracking errors. Change-ID: I11330161cc6ad8d04371c499c63071c816171c3b Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Patrick Lu <patrick.lu@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-11-03i40e: scale msix vector use when more cores than vectorsShannon Nelson1-4/+7
When there are more cores than vectors available to the PF, scale back the LAN msix usage to force queue/vector sharing and leave some vectors for Flow Director, VMDq, etc. Change-ID: Ie0317732eb85ad8d851d7da7d9af86b1bf8c21ad Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Patrick Lu <patrick.lu@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-11-03i40e: remove debugfs dump statsShannon Nelson1-91/+2
The debugfs dump stats wasn't being kept up-to-date, was redundant with the ethtool output, and didn't offer any useful additional info. Rather than continue trying to keep them aligned, just remove the debugfs command. Change-ID: Id130ed9aef01c6369ab662c7b4c5ec5b1dbc5b40 Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Patrick Lu <patrick.lu@intel.com> Tested-by: Jim Young <Jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-11-03i40e: avoid disable of interrupt when changing ITRJesse Brandeburg1-2/+0
The call to irq_dynamic_disable was turning off the interrupt completely when trying to set ITR to 0 (for lowest moderation). Just remove the call as setting the values to 0 later in this function will suffice. Change-ID: I47caf1ecbe65653cf63ec833db93094cd83fd84d Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Patrick Lu <patrick.lu@intel.com> Tested-By: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-11-03i40evf: Add support for 10G base T partsPaul M Stillwell Jr2-0/+2
Add 10G-Base-T support in i40evf. Change-ID: I98a1c3138d7d6572fe7903a7c1c4692cae3260d5 Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Signed-off-by: Patrick Lu <patrick.lu@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-11-03i40e: fix link checking logicMitch Williams1-4/+7
If the interface is closed, but VFs exist, current code will spam all the VFs with link messages every second. This is because the link event code was looking at netif_carrier_ok() without checking to see if the interface was actually open. Refactor the logic to only check the carrier state if the interface is actually open. This allows link changes to be reported correctly without spamming the VFs. Change-ID: If136e79bb3820d21ea4e39e332e8a9604efc2b2a Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Patrick Lu <patrick.lu@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-11-03i40evf: properly handle multiple AQ messagesMitch Williams1-3/+1
When we receive an admin queue message, the msg_size field in the event struct gets overwritten. Because of this, we need to reinit the field each time we go through the loop. Without this we may receive truncated messages due to the firmware thinking we have insufficient buffer size. Change-ID: I21dcca5114d91365d731169965ce3ffec0e4a190 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Patrick Lu <patrick.lu@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-11-03i40e: Add condition to enter fdir flush and reinitAkeem G Abodunrin1-0/+6
When FD_SB/ATR are not enabled, do not allow flow director flush and reinit. Change-ID: Iafe261c1862992981615815551abd1ed9fada0a8 Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Patrick Lu <patrick.lu@intel.com> Tested-by: <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-11-01smc91x: retrieve IRQ and trigger flags in a modern wayLinus Walleij1-8/+12
The SMC91x is written to explicitly look up the IRQ resource from the platform device and extract the IRQ and flags, however the platform_get_irq() does additional things, like call of_irq_get() in the device tree case, which will translate the IRQ using the irqdomain and defer the probe if the IRQ host cannot be found. As we're not looking up the resource, this will not retrieve the IRQ flags, but that is better done using irqd_get_trigger_type(), as the trigger is what the driver wants to modify. We take care to preserve the semantics that will make the trigger type provided from the resource override any local specifier. Tested on the Nomadik NHK15 which has its SMC91x IRQ line connected to a STMPE2401 GPIO expander on I2C. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-01drivers: net: ethernet: xilinx: xilinx_emaclite: Compatible with 'xlnx, ↵Chen Gang1-0/+1
xps-ethernetlite-2.00.b' for QEMU using When use current latest upstream qemu (current version: 2.1.2), need let driver compatible with 'xlnx,xps-ethernetlite-2.00.b', or can not find net device in microblaze qemu. Related QEMU commands under fedora 20: yum install libvirt yum install tunctl tunctl -b ip link set tap0 up brctl addif virbr0 tap0 ./microblaze-softmmu/qemu-system-microblaze -M petalogix-s3adsp1800 \ -kernel ../linux-stable.microblaze/arch/microblaze/boot/linux.bin \ -no-reboot -append "console=ttyUL0,115200 doreboot" -nographic \ -net nic,vlan=0,model=xlnx.xps-ethernetlite,macaddr=00:16:35:AF:94:00 \ -net tap,vlan=0,ifname=tap0,script=no,downscript=no in microblaze qemu bash (guest machine): ifconfig eth0 add 192.168.122.2 netmask 255.255.255.0 ifconfig eth0 up After add this patch, can find the device, and can be used by 'telnetd' (need cross-build busybox with glibc for it), then outside can telnet to it without password. Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-01Merge branch 'systemport-net'David S. Miller1-2/+11
Florian Fainelli says: ==================== net: systemport: TX dma fixes This patch series contains two fixes for our transmit path, first one is a pretty nasty one since we were not allocating a large enough dma coherent pool for our transmit descriptors, which would work most of the time, since allocations are contiguous and we could have. Second patch fixes a less frequent, though highly likley crash when using CMA allocations. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-01net: systemport: do not crash freeing an unitialized TX ringFlorian Fainelli1-0/+7
Callers of bcm_sysport_init_tx_ring() can currently fail, and will always call bcm_sysport_fini_tx_ring() in a loop ending at the number of TX queues (32) without checking if the TX ring was successfully initialized or not. Update bcm_sysport_fini_tx_ring() to return early and avoid a crash de-referencing ring->cbs if the TX ring was not initialized, since ring->cbs is the last part of the initialization done by bcm_sysport_init_tx_ring() that could fail. Fixes: 80105befdb4b ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver") Reported-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-01net: systemport: fix DMA allocation/freeing sizesFlorian Fainelli1-2/+4
We should not be allocating a single byte of DMA coherent memory, but instead a full-sized struct dma_desc (8 bytes). Fixes: 80105befdb4b ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-01net: mvpp2: fix possible memory leakSudip Mukherjee1-9/+18
we are allocating memory using kzalloc for struct mvpp2_prs_entry, but later when we are getting error we were just returning the error value without releasing the memory. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller884-8410/+18904
Conflicts: drivers/net/phy/marvell.c Simple overlapping changes in drivers/net/phy/marvell.c Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-01sunhme: Add DMA mapping error checks.David S. Miller1-5/+57
Reported-by: Meelis Roos <mroos@linux.ee> Tested-by: Meelis Roos <mroos@linux.ee> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-31Merge branch 'for-linus' of ↵Linus Torvalds12-29/+302
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: "A bunch of fixes for minor defects reported by Coverity, a few driver fixups and revert of i8042.nomux change so that we are once again enable active MUX mode if box claims to support it" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Revert "Input: i8042 - disable active multiplexing by default" Input: altera_ps2 - use correct type for irq return value Input: altera_ps2 - write to correct register when disabling interrupts Input: max77693-haptic - fix potential overflow Input: psmouse - remove unneeded check in psmouse_reconnect() Input: vsxxxaa - fix code dropping bytes from queue Input: ims-pcu - fix dead code in ims_pcu_ofn_reg_addr_store() Input: opencores-kbd - fix error handling Input: wm97xx - adapt parameters to tosa touchscreen. Input: i8042 - quirks for Fujitsu Lifebook A544 and Lifebook AH544 Input: stmpe-keypad - fix valid key line bitmask Input: soc_button_array - update calls to gpiod_get*()
2014-10-31Merge tag 'pm+acpi-3.18-rc3' of ↵Linus Torvalds5-53/+126
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management fixes from Rafael Wysocki: "These are fixes received after my previous pull request plus one that has been in the works for quite a while, but its previous version caused problems to happen, so it's been deferred till now. Fixed are two recent regressions (MFD enumeration and cpufreq-dt), ACPI EC regression introduced in 3.17, system suspend error code path regression introduced in 3.15, an older bug related to recovery from failing resume from hibernation and a cpufreq-dt driver issue related to operation performance points. Specifics: - Fix a crash on r8a7791/koelsch during resume from system suspend caused by a recent cpufreq-dt commit (Geert Uytterhoeven). - Fix an MFD enumeration problem introduced by a recent commit adding ACPI support to the MFD subsystem that exposed a weakness in the ACPI core causing ACPI enumeration to be applied to all devices associated with one ACPI companion object, although it should be used for one of them only (Mika Westerberg). - Fix an ACPI EC regression introduced during the 3.17 cycle causing some Samsung laptops to misbehave as a result of a workaround targeted at some Acer machines. That includes a revert of a commit that went too far and a quirk for the Acer machines in question. From Lv Zheng. - Fix a regression in the system suspend error code path introduced during the 3.15 cycle that causes it to fail to take errors from asychronous execution of "late" suspend callbacks into account (Imre Deak). - Fix a long-standing bug in the hibernation resume error code path that fails to roll back everything correcty on "freeze" callback errors and leaves some devices in a "suspended" state causing more breakage to happen subsequently (Imre Deak). - Make the cpufreq-dt driver disable operation performance points that are not supported by the VR connected to the CPU voltage plane with acceptable tolerance instead of constantly failing voltage scaling later on (Lucas Stach)" * tag 'pm+acpi-3.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / EC: Fix regression due to conflicting firmware behavior between Samsung and Acer. Revert "ACPI / EC: Add support to disallow QR_EC to be issued before completing previous QR_EC" cpufreq: cpufreq-dt: Restore default cpumask_setall(policy->cpus) PM / Sleep: fix recovery during resuming from hibernation PM / Sleep: fix async suspend_late/freeze_late error handling ACPI: Use ACPI companion to match only the first physical device cpufreq: cpufreq-dt: disable unsupported OPPs
2014-10-31Merge tag 'pci-v3.18-fixes-1' of ↵Linus Torvalds3-14/+14
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: "These changes, intended for v3.18, fix: Sysfs - Fix "enable" filename change (Greg Kroah-Hartman) An unintentional sysfs filename change in commit 5136b2da770d ("PCI: convert bus code to use dev_groups"), which appeared in v3.13, changed "enable" to "enabled", and this changes it back. Old users of "enable" are currently broken and will be helped by this change. Anything that started to use "enabled" after v3.13 will be broken by this change. If necessary, we can add a symlink to make both work, but this patch doesn't do that. PCI device hotplug - Revert duplicate merge (Kamal Mostafa) A mistaken duplicate merge that added a check twice. Nothing's broken; this just removes the unnecessary code. Freescale i.MX6 - Wait for clocks to stabilize after ref_en (Richard Zhu) An i.MX6 clock problem that prevents mx6 nitrogen boards from booting" * tag 'pci-v3.18-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Rename sysfs 'enabled' file back to 'enable' PCI: imx6: Wait for clocks to stabilize after ref_en Revert duplicate "PCI: pciehp: Prevent NULL dereference during probe"
2014-10-31x86_64, entry: Fix out of bounds read on sysenterAndy Lutomirski1-1/+1
Rusty noticed a Really Bad Bug (tm) in my NT fix. The entry code reads out of bounds, causing the NT fix to be unreliable. But, and this is much, much worse, if your stack is somehow just below the top of the direct map (or a hole), you read out of bounds and crash. Excerpt from the crash: [ 1.129513] RSP: 0018:ffff88001da4bf88 EFLAGS: 00010296 2b:* f7 84 24 90 00 00 00 testl $0x4000,0x90(%rsp) That read is deterministically above the top of the stack. I thought I even single-stepped through this code when I wrote it to check the offset, but I clearly screwed it up. Fixes: 8c7aa698baca ("x86_64, entry: Filter RFLAGS.NT on entry from userspace") Reported-by: Rusty Russell <rusty@ozlabs.org> Cc: stable@vger.kernel.org Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-31Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds8-28/+51
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bugfixes from Ted Ts'o: "A set of miscellaneous ext4 bug fixes for 3.18" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: make ext4_ext_convert_to_initialized() return proper number of blocks ext4: bail early when clearing inode journal flag fails ext4: bail out from make_indexed_dir() on first error jbd2: use a better hash function for the revoke table ext4: prevent bugon on race between write/fcntl ext4: remove extent status procfs files if journal load fails ext4: disallow changing journal_csum option during remount ext4: enable journal checksum when metadata checksum feature enabled ext4: fix oops when loading block bitmap failed ext4: fix overflow when updating superblock backups after resize
2014-10-31Merge branch 'for_linus' of ↵Linus Torvalds3-13/+3
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota and ext3 fixes from Jan Kara. * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fs, jbd: use a more generic hash function quota: Properly return errors from dquot_writeback_dquots() ext3: Don't check quota format when there are no quota files
2014-10-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds164-789/+1943
Pull networking fixes from David Miller: "A bit has accumulated, but it's been a week or so since my last batch of post-merge-window fixes, so... 1) Missing module license in netfilter reject module, from Pablo. Lots of people ran into this. 2) Off by one in mac80211 baserate calculation, from Karl Beldan. 3) Fix incorrect return value from ax88179_178a driver's set_mac_addr op, which broke use of it with bonding. From Ian Morgan. 4) Checking of skb_gso_segment()'s return value was not all encompassing, it can return an SKB pointer, a pointer error, or NULL. Fix from Florian Westphal. This is crummy, and longer term will be fixed to just return error pointers or a real SKB. 6) Encapsulation offloads not being handled by skb_gso_transport_seglen(). From Florian Westphal. 7) Fix deadlock in TIPC stack, from Ying Xue. 8) Fix performance regression from using rhashtable for netlink sockets. The problem was the synchronize_net() invoked for every socket destroy. From Thomas Graf. 9) Fix bug in eBPF verifier, and remove the strong dependency of BPF on NET. From Alexei Starovoitov. 10) In qdisc_create(), use the correct interface to allocate ->cpu_bstats, otherwise the u64_stats_sync member isn't initialized properly. From Sabrina Dubroca. 11) Off by one in ip_set_nfnl_get_byindex(), from Dan Carpenter. 12) nf_tables_newchain() was erroneously expecting error pointers from netdev_alloc_pcpu_stats(). It only returna a valid pointer or NULL. From Sabrina Dubroca. 13) Fix use-after-free in _decode_session6(), from Li RongQing. 14) When we set the TX flow hash on a socket, we mistakenly do so before we've nailed down the final source port. Move the setting deeper to fix this. From Sathya Perla. 15) NAPI budget accounting in amd-xgbe driver was counting descriptors instead of full packets, fix from Thomas Lendacky. 16) Fix total_data_buflen calculation in hyperv driver, from Haiyang Zhang. 17) Fix bcma driver build with OF_ADDRESS disabled, from Hauke Mehrtens. 18) Fix mis-use of per-cpu memory in TCP md5 code. The problem is that something that ends up being vmalloc memory can't be passed to the crypto hash routines via scatter-gather lists. From Eric Dumazet. 19) Fix regression in promiscuous mode enabling in cdc-ether, from Olivier Blin. 20) Bucket eviction and frag entry killing can race with eachother, causing an unlink of the object from the wrong list. Fix from Nikolay Aleksandrov. 21) Missing initialization of spinlock in cxgb4 driver, from Anish Bhatt. 22) Do not cache ipv4 routing failures, otherwise if the sysctl for forwarding is subsequently enabled this won't be seen. From Nicolas Cavallari" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (131 commits) drivers: net: cpsw: Support ALLMULTI and fix IFF_PROMISC in switch mode drivers: net: cpsw: Fix broken loop condition in switch mode net: ethtool: Return -EOPNOTSUPP if user space tries to read EEPROM with lengh 0 stmmac: pci: set default of the filter bins net: smc91x: Fix gpios for device tree based booting mpls: Allow mpls_gso to be built as module mpls: Fix mpls_gso handler. r8152: stop submitting intr for -EPROTO netfilter: nft_reject_bridge: restrict reject to prerouting and input netfilter: nft_reject_bridge: don't use IP stack to reject traffic netfilter: nf_reject_ipv6: split nf_send_reset6() in smaller functions netfilter: nf_reject_ipv4: split nf_send_reset() in smaller functions netfilter: nf_tables_bridge: update hook_mask to allow {pre,post}routing drivers/net: macvtap and tun depend on INET drivers/net, ipv6: Select IPv6 fragment idents for virtio UFO packets drivers/net: Disable UFO through virtio net: skb_fclone_busy() needs to detect orphaned skb gre: Use inner mac length when computing tunnel length mlx4: Avoid leaking steering rules on flow creation error flow net/mlx4_en: Don't attempt to TX offload the outer UDP checksum for VXLAN ...
2014-10-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds4-13/+5
Pull sparc update from David Miller: "Two changes: 1) It makes no sense to execute a VTOC partition table request in the Sun virtual block device driver and fail to load if it doesn't succeed because a) we don't use the result at all and b) it won't succeed if there is an EFI partition on the disk, for example. We read the partition table via the normal means in the block layer anyways, so this is really completely useless, so just remove it. From Dwight Engen. 2) Hook up new bpf system call" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sunvdc: don't call VD_OP_GET_VTOC sparc: Hook up bpf system call.
2014-10-31Merge tag 'microblaze-3.18-rc3' of git://git.monstr.eu/linux-2.6-microblazeLinus Torvalds5-3/+14
Pull Microblaze updates from Michal Simek: - wire-up new bpf syscall - fix PCI bug - fix Kconfig warning * tag 'microblaze-3.18-rc3' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: Wire up bpf syscall microblaze: Fix IO space breakage after of_pci_range_to_resource() change microblaze: Fix missing NR_CPUS in menuconfig
2014-10-31Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds12-29/+34
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Fixes from all around the place: - hyper-V 32-bit PAE guest kernel fix - two IRQ allocation fixes on certain x86 boards - intel-mid boot crash fix - intel-quark quirk - /proc/interrupts duplicate irq chip name fix - cma boot crash fix - syscall audit fix - boot crash fix with certain TSC configurations (seen on Qemu) - smpboot.c build warning fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, pageattr: Prevent overflow in slow_virt_to_phys() for X86_PAE ACPI, irq, x86: Return IRQ instead of GSI in mp_register_gsi() x86, intel-mid: Create IRQs for APB timers and RTC timers x86: Don't enable F00F workaround on Intel Quark processors x86/irq: Fix XT-PIC-XT-PIC in /proc/interrupts x86, cma: Reserve DMA contiguous area after initmem_init() i386/audit: stop scribbling on the stack frame x86, apic: Handle a bad TSC more gracefully x86: ACPI: Do not translate GSI number if IOAPIC is disabled x86/smpboot: Move data structure to its primary usage scope
2014-10-31Merge branches 'pm-cpufreq' and 'pm-sleep'Rafael J. Wysocki3-27/+51
* pm-cpufreq: cpufreq: cpufreq-dt: Restore default cpumask_setall(policy->cpus) cpufreq: cpufreq-dt: disable unsupported OPPs * pm-sleep: PM / Sleep: fix recovery during resuming from hibernation PM / Sleep: fix async suspend_late/freeze_late error handling
2014-10-31Merge branches 'acpi-scan' and 'acpi-ec'Rafael J. Wysocki2-26/+75
* acpi-scan: ACPI: Use ACPI companion to match only the first physical device * acpi-ec: ACPI / EC: Fix regression due to conflicting firmware behavior between Samsung and Acer. Revert "ACPI / EC: Add support to disallow QR_EC to be issued before completing previous QR_EC"
2014-10-31Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds7-56/+99
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Various scheduler fixes all over the place: three SCHED_DL fixes, three sched/numa fixes, two generic race fixes and a comment fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/dl: Fix preemption checks sched: Update comments for CLONE_NEWNS sched: stop the unbound recursion in preempt_schedule_context() sched/fair: Fix division by zero sysctl_numa_balancing_scan_size sched/fair: Care divide error in update_task_scan_period() sched/numa: Fix unsafe get_task_struct() in task_numa_assign() sched/deadline: Fix races between rt_mutex_setprio() and dl_task_timer() sched/deadline: Don't replenish from a !SCHED_DEADLINE entity sched: Fix race between task_group and sched_task_group
2014-10-31Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds19-279/+158
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Mostly tooling fixes, plus on the kernel side: - a revert for a newly introduced PMU driver which isn't complete yet and where we ran out of time with fixes (to be tried again in v3.19) - this makes up for a large chunk of the diffstat. - compilation warning fixes - a printk message fix - event_idx usage fixes/cleanups" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf probe: Trivial typo fix for --demangle perf tools: Fix report -F dso_from for data without branch info perf tools: Fix report -F dso_to for data without branch info perf tools: Fix report -F symbol_from for data without branch info perf tools: Fix report -F symbol_to for data without branch info perf tools: Fix report -F mispredict for data without branch info perf tools: Fix report -F in_tx for data without branch info perf tools: Fix report -F abort for data without branch info perf tools: Make CPUINFO_PROC an array to support different kernel versions perf callchain: Use global caching provided by libunwind perf/x86/intel: Revert incomplete and undocumented Broadwell client support perf/x86: Fix compile warnings for intel_uncore perf: Fix typos in sample code in the perf_event.h header perf: Fix and clean up initialization of pmu::event_idx perf: Fix bogus kernel printk perf diff: Add missing hists__init() call at tool start
2014-10-31Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds1-15/+21
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull futex fixes from Ingo Molnar: "This contains two futex fixes: one fixes a race condition, the other clarifies shared/private futex comments" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Fix a race condition between REQUEUE_PI and task death futex: Mention key referencing differences between shared and private futexes
2014-10-31Merge tag 'master-2014-10-30' of ↵David S. Miller14-32/+67
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-10-31 Please pull this small batch of spooky fixes intended for the 3.18 stream...boo! Cyril Brulebois adds an rt2x00 device ID. Dan Carpenter provides a one-line masking fix for an ath9k debugfs entry. Larry Finger gives us a package of small rtlwifi fixes which add some bits that were left out of some feature updates that were included in the merge window. Hopefully this isn't a sign that the rtlwifi base is getting too big... Marc Yang brings a fix for a temporary mwifiex stall when doing 11n RX reordering. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-31drivers: net: cpsw: Support ALLMULTI and fix IFF_PROMISC in switch modeLennart Sorensen3-2/+49
The cpsw driver did not support the IFF_ALLMULTI flag which makes dynamic multicast routing not work. Related to this, when enabling IFF_PROMISC in switch mode, all registered multicast addresses are flushed, resulting in only broadcast and unicast traffic being received. A new cpsw_ale_set_allmulti function now scans through the ALE entry table and adds/removes the host port from the unregistered multicast port mask of each vlan entry depending on the state of IFF_ALLMULTI. In promiscious mode, cpsw_ale_set_allmulti is used to force reception of all multicast traffic in addition to the unicast and broadcast traffic. With this change dynamic multicast and promiscious mode both work in switch mode. Signed-off-by: Len Sorensen <lsorense@csclub.uwaterloo.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-31drivers: net: cpsw: Fix broken loop condition in switch modeLennart Sorensen1-5/+5
0d961b3b52f566f823070ce2366511a7f64b928c (drivers: net: cpsw: fix buggy loop condition) accidentally fixed a loop comparison in too many places while fixing a real bug. It was correct to fix the dual_emac mode section since there 'i' is used as an index into priv->slaves which is a 0 based array. However the other two changes (which are only used in switch mode) are wrong since there 'i' is actually the ALE port number, and port 0 is the host port, while port 1 and up are the slave ports. Putting the loop condition back in the switch mode section fixes it. A comment has been added to point out the intent clearly to avoid future confusion. Also a comment is fixed that said the opposite of what was actually happening. Signed-off-by: Len Sorensen <lsorense@csclub.uwaterloo.ca> Acked-by: Heiko Schocher <hs@denx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-31drivers: net: cpsw: Support ALLMULTI and fix IFF_PROMISC in switch modeLennart Sorensen3-2/+49
The cpsw driver did not support the IFF_ALLMULTI flag which makes dynamic multicast routing not work. Related to this, when enabling IFF_PROMISC in switch mode, all registered multicast addresses are flushed, resulting in only broadcast and unicast traffic being received. A new cpsw_ale_set_allmulti function now scans through the ALE entry table and adds/removes the host port from the unregistered multicast port mask of each vlan entry depending on the state of IFF_ALLMULTI. In promiscious mode, cpsw_ale_set_allmulti is used to force reception of all multicast traffic in addition to the unicast and broadcast traffic. With this change dynamic multicast and promiscious mode both work in switch mode. Signed-off-by: Len Sorensen <lsorense@csclub.uwaterloo.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-31drivers: net: cpsw: Fix broken loop condition in switch modeLennart Sorensen1-5/+5
0d961b3b52f566f823070ce2366511a7f64b928c (drivers: net: cpsw: fix buggy loop condition) accidentally fixed a loop comparison in too many places while fixing a real bug. It was correct to fix the dual_emac mode section since there 'i' is used as an index into priv->slaves which is a 0 based array. However the other two changes (which are only used in switch mode) are wrong since there 'i' is actually the ALE port number, and port 0 is the host port, while port 1 and up are the slave ports. Putting the loop condition back in the switch mode section fixes it. A comment has been added to point out the intent clearly to avoid future confusion. Also a comment is fixed that said the opposite of what was actually happening. Signed-off-by: Len Sorensen <lsorense@csclub.uwaterloo.ca> Acked-by: Heiko Schocher <hs@denx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-31net: ethtool: Return -EOPNOTSUPP if user space tries to read EEPROM with lengh 0Guenter Roeck1-2/+4
If a driver supports reading EEPROM but no EEPROM is installed in the system, the driver's get_eeprom_len function returns 0. ethtool will subsequently try to read that zero-length EEPROM anyway. If the driver does not support EEPROM access at all, this operation will return -EOPNOTSUPP. If the driver does support EEPROM access but no EEPROM is installed, the operation will return -EINVAL. Return -EOPNOTSUPP in both cases for consistency. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-31ethernet: mvneta: Use PHY status standard messageEzequiel Garcia1-2/+1
Use phy_print_status() to report a change in the PHY status. The current message is not verbose enough, so this commit improves it by using the generic status message. After this change, the kernel reports PHY status down and up events as: mvneta f1070000.ethernet eth0: Link is Down mvneta f1070000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-31stmmac: pci: set default of the filter binsAndy Shevchenko1-0/+7
The commit 3b57de958e2a brought the support for a different amount of the filter bins, but didn't update the PCI driver accordingly. This patch appends the default values when the device is enumerated via PCI bus. Fixes: 3b57de958e2a (net: stmmac: Support devicetree configs for mcast and ucast filter entries) Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-31bonding: add bond_tx_drop() helperEric Dumazet3-9/+14
Because bonding stats are usually sum of slave stats, it was not easy to account for tx drops at bonding layer. We can use dev->tx_dropped for this, as this counter is later added to the device stats (in dev_get_stats()) This extends the idea we had in commit ee6377147409a ("bonding: Simplify the xmit function for modes that use xmit_hash") for bond_3ad_xor_xmit() to other bonding modes. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>