commit fcba09f2b0bf27eeaa1d4d439edb649585f35040
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Sat Oct 3 13:52:18 2015 +0200

    Linux 4.2.3

commit b2b2c7be0fc8e9b0f6f32215cd23b54b07ec4b31
Author: Kyle Evans <kvans32@gmail.com>
Date:   Fri Sep 11 10:40:17 2015 -0500

    hp-wmi: limit hotkey enable
    
    commit 8a1513b49321e503fd6c8b6793e3b1f9a8a3285b upstream.
    
    Do not write initialize magic on systems that do not have
    feature query 0xb. Fixes Bug #82451.
    
    Redefine FEATURE_QUERY to align with 0xb and FEATURE2 with 0xd
    for code clarity.
    
    Add a new test function, hp_wmi_bios_2008_later() & simplify
    hp_wmi_bios_2009_later(), which fixes a bug in cases where
    an improper value is returned. Probably also fixes Bug #69131.
    
    Add missing __init tag.
    
    Signed-off-by: Kyle Evans <kvans32@gmail.com>
    Signed-off-by: Darren Hart <dvhart@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6abf903c8eb352a3705353789ac200d188466f16
Author: Luis Henriques <luis.henriques@canonical.com>
Date:   Thu Sep 17 16:01:40 2015 -0700

    zram: fix possible use after free in zcomp_create()
    
    commit 3aaf14da807a4e9931a37f21e4251abb8a67021b upstream.
    
    zcomp_create() verifies the success of zcomp_strm_{multi,single}_create()
    through comp->stream, which can potentially be pointing to memory that
    was freed if these functions returned an error.
    
    While at it, replace a 'ERR_PTR(-ENOMEM)' by a more generic
    'ERR_PTR(error)' as in the future zcomp_strm_{multi,single}_create()
    could return other error codes.  Function documentation updated
    accordingly.
    
    Fixes: beca3ec71fe5 ("zram: add multi stream functionality")
    Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
    Acked-by: Minchan Kim <minchan@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
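
The hazard described in the zram message above can be sketched in plain user-space C. This is only an illustration, not the zram code: `struct comp_sketch`, `stream_create()` and `comp_create()` are hypothetical names, and an `int *err` out-parameter stands in for the kernel's `ERR_PTR()` convention.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the compression backend object. */
struct comp_sketch {
    void *stream;
};

/* On failure the stream is freed, but comp->stream still holds the old address. */
static int stream_create(struct comp_sketch *comp, int fail)
{
    comp->stream = malloc(16);
    if (!comp->stream)
        return -ENOMEM;
    if (fail) {
        free(comp->stream);     /* comp->stream is now a dangling pointer */
        return -EINVAL;
    }
    return 0;
}

/* Decide success from the return code, never from comp->stream. */
static struct comp_sketch *comp_create(int fail, int *err)
{
    struct comp_sketch *comp = calloc(1, sizeof(*comp));

    assert(err != NULL);
    if (!comp) {
        *err = -ENOMEM;
        return NULL;
    }
    *err = stream_create(comp, fail);
    if (*err) {                 /* propagate the callee's own error code */
        free(comp);
        return NULL;
    }
    return comp;
}
```

Testing `comp->stream` for NULL after a failed `stream_create()` would wrongly report success here, because the pointer is stale rather than NULL.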

commit 92b52680751f95fa275c5a2dfb274de5c320d358
Author: Carol L Soto <clsoto@linux.vnet.ibm.com>
Date:   Thu Aug 27 14:43:25 2015 -0500

    net/mlx4_core: Capping number of requested MSIXs to MAX_MSIX
    
    [ Upstream commit 9293267a3e2a7a2555d8ddc8f9301525e5b03b1b ]
    
    We currently manage IRQs in pool_bm which is a bit field
    of MAX_MSIX bits. Thus, allocating more than MAX_MSIX
    interrupts can't be managed in pool_bm.
    Fix this by capping the number of requested MSI-X vectors to
    MAX_MSIX.
    
    Signed-off-by: Matan Barak <matanb@mellanox.com>
    Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
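
The capping described in the mlx4_core message above amounts to a clamp against the size of the bitmap. A minimal sketch, assuming an illustrative `MAX_MSIX` of 64 (the driver's actual value may differ) and a hypothetical helper name `cap_msix_request()`:

```c
#include <assert.h>

/* Assumption: illustrative pool size; the driver's actual MAX_MSIX may differ. */
#define MAX_MSIX 64

/* Clamp a requested vector count so that every vector has a bit in the pool. */
static int cap_msix_request(int requested)
{
    assert(requested >= 0);
    return requested > MAX_MSIX ? MAX_MSIX : requested;
}
```

With this clamp a request for 200 vectors is reduced to 64, so no vector index can fall outside a `MAX_MSIX`-bit bitmap.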

commit 0d106a6a020b5605c8a4748b9862af62ef2f8e59
Author: Stas Sergeev <stsp@list.ru>
Date:   Mon Jul 20 17:49:58 2015 -0700

    mvneta: use inband status only when explicitly enabled
    
    [ Upstream commit f8af8e6eb95093d5ce5ebcc52bd1929b0433e172 in net-next tree,
      will be pushed to Linus very soon. ]
    
    The commit 898b2970e2c9 ("mvneta: implement SGMII-based in-band link state
    signaling") implemented the link parameters auto-negotiation unconditionally.
    Unfortunately it appears that some HW that implements SGMII protocol,
    doesn't generate the inband status, so it is not possible to auto-negotiate
    anything with such HW.
    
    This patch enables the auto-negotiation only if explicitly requested with
    the 'managed' DT property.
    
    This patch fixes the following regression:
    https://lkml.org/lkml/2015/7/8/865
    
    Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
    
    CC: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
    CC: netdev@vger.kernel.org
    CC: linux-kernel@vger.kernel.org
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 40448fc0043995e83336befcca83642f6f158c03
Author: Stas Sergeev <stsp@list.ru>
Date:   Mon Jul 20 17:49:57 2015 -0700

    of_mdio: add new DT property 'managed' to specify the PHY management type
    
    [ Upstream commit 4cba5c2103657d43d0886e4cff8004d95a3d0def in net-next tree,
      will be pushed to Linus very soon. ]
    
    Currently the PHY management type is selected arbitrarily by the MAC driver.
    The decision is based on the presence of the "fixed-link" node and on the
    preference of the driver's authors.
    This caused a regression recently, when mvneta driver suddenly started
    to use the in-band status for auto-negotiation on fixed links.
    It appears the auto-negotiation may not work when expected by the MAC driver.
    Sebastien Rannou explains:
    << Yes, I confirm that my HW does not generate an in-band status. AFAIK, it's
    a PHY that aggregates 4xSGMIIs to 1xQSGMII ; the MAC side of the PHY (with
    inband status) is connected to the switch through QSGMII, and in this context
    we are on the media side of the PHY. >>
    https://lkml.org/lkml/2015/7/10/206
    
    This patch introduces the new string property 'managed' that allows
    the user to set the management type explicitly.
    The supported values are:
    "auto" - default. Uses either MDIO or nothing, depending on the presence
    of the fixed-link node
    "in-band-status" - use in-band status
    
    Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
    
    CC: Rob Herring <robh+dt@kernel.org>
    CC: Pawel Moll <pawel.moll@arm.com>
    CC: Mark Rutland <mark.rutland@arm.com>
    CC: Ian Campbell <ijc+devicetree@hellion.org.uk>
    CC: Kumar Gala <galak@codeaurora.org>
    CC: Florian Fainelli <f.fainelli@gmail.com>
    CC: Grant Likely <grant.likely@linaro.org>
    CC: devicetree@vger.kernel.org
    CC: linux-kernel@vger.kernel.org
    CC: netdev@vger.kernel.org
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
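
The selection rule listed in the of_mdio message above can be sketched as a small decision function. This is a user-space illustration only: `enum phy_mgmt` and `select_phy_mgmt()` are hypothetical names, the mapping of "auto" to "nothing" when a fixed-link node exists and to MDIO otherwise is my reading of the message, and treating unknown strings like "auto" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical management types, mirroring the values named in the message. */
enum phy_mgmt {
    PHY_MGMT_NONE,      /* fixed link, nothing to manage */
    PHY_MGMT_MDIO,      /* PHY managed over MDIO */
    PHY_MGMT_IN_BAND    /* in-band status */
};

/*
 * managed:        value of the 'managed' string property, or NULL if absent
 * has_fixed_link: whether a fixed-link node is present
 */
static enum phy_mgmt select_phy_mgmt(const char *managed, bool has_fixed_link)
{
    if (managed != NULL && strcmp(managed, "in-band-status") == 0)
        return PHY_MGMT_IN_BAND;

    /* "auto", absent, or (by assumption) any unrecognised value */
    return has_fixed_link ? PHY_MGMT_NONE : PHY_MGMT_MDIO;
}
```

For example, `assert(select_phy_mgmt(NULL, true) == PHY_MGMT_NONE)` holds: without an explicit request, a fixed link never switches to in-band status.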

commit bfba942d0d287d2765612629826b87a0749bf6bd
Author: Stas Sergeev <stsp@list.ru>
Date:   Mon Jul 20 17:49:56 2015 -0700

    net: phy: fixed_phy: handle link-down case
    
    [ Upstream commit 868a4215be9a6d80548ccb74763b883dc99d32a2 in net-next tree,
      will be pushed to Linus very soon. ]
    
    fixed_phy_register() currently hardcodes the fixed PHY link to 1, and
    expects to find a "speed" parameter to provide correct information
    towards the fixed PHY consumer.
    
    In a subsequent change, where we allow "managed" (e.g: (RS)GMII in-band
    status auto-negotiation) fixed PHYs, none of these parameters can be
    provided since they will be auto-negotiated, hence, we just provide a
    zero-initialized fixed_phy_status to fixed_phy_register() which makes it
    fail when we call fixed_phy_update_regs() since status.speed = 0 which
    makes us hit the "default" label and error out.
    
    Without this change, we would also see potentially inconsistent
    speed/duplex parameters for fixed PHYs when the link is DOWN.
    
    CC: netdev@vger.kernel.org
    CC: linux-kernel@vger.kernel.org
    Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
    [florian: add more background to why this is correct and desirable]
    Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
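
The link-down handling described in the fixed_phy message above can be sketched as follows. This is not the kernel's `fixed_phy_update_regs()`: `struct fixed_status_sketch` and `fixed_status_check()` are hypothetical, and the accepted speeds (10, 100, 1000) are chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced version of a fixed PHY status. */
struct fixed_status_sketch {
    bool link;
    int speed;      /* Mb/s; 0 when not provided */
    bool duplex;
};

/* Only decode speed when the link is up; a down link has no meaningful speed. */
static int fixed_status_check(const struct fixed_status_sketch *st)
{
    assert(st != NULL);
    if (!st->link)
        return 0;

    switch (st->speed) {
    case 10:
    case 100:
    case 1000:
        return 0;
    default:
        return -EINVAL;     /* the "default" label mentioned above */
    }
}
```

A zero-initialized status (link down, speed 0) is accepted, while a status that claims link up with speed 0 is still rejected.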

commit b11c94db52901ca6b5167a2089f14e679cbe0cee
Author: Florian Fainelli <f.fainelli@gmail.com>
Date:   Mon Jul 20 17:49:55 2015 -0700

    net: dsa: bcm_sf2: Do not override speed settings
    
    [ Upstream commit d2eac98f7d1b950b762a7eca05a9ce0ea1d878d2 in net-next tree,
      will be pushed to Linus very soon. ]
    
    The SF2 driver currently overrides speed settings for its port
    configured using a fixed PHY, this is both unnecessary and incorrect,
    because we keep feedback to the hardware parameters that we read from
    the PHY device, which in the case of a fixed PHY cannot possibly change
    speed.
    
    This is a required change to let the fixed PHY code register
    a PHY with a link configured as DOWN by default and avoid
    some sort of circular dependency where we require the link_update
    callback to run to program the hardware, and we then utilize the fixed
    PHY parameters to program the hardware with the same settings.
    
    Fixes: 246d7f773c13 ("net: dsa: add Broadcom SF2 switch driver")
    Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 4c8f9d6cf799cf77d6e0a2ad0d26e066be630bf9
Author: Guillaume Nault <g.nault@alphalink.fr>
Date:   Thu Sep 24 12:54:01 2015 +0200

    ppp: fix lockdep splat in ppp_dev_uninit()
    
    [ Upstream commit 58a89ecaca53736aa465170530acea4f8be34ab4 ]
    
    ppp_dev_uninit() locks all_ppp_mutex while under rtnl mutex protection.
    ppp_create_interface() must then lock these mutexes in that same order
    to avoid possible deadlock.
    
    [  120.880011] ======================================================
    [  120.880011] [ INFO: possible circular locking dependency detected ]
    [  120.880011] 4.2.0 #1 Not tainted
    [  120.880011] -------------------------------------------------------
    [  120.880011] ppp-apitest/15827 is trying to acquire lock:
    [  120.880011]  (&pn->all_ppp_mutex){+.+.+.}, at: [<ffffffffa0145f56>] ppp_dev_uninit+0x64/0xb0 [ppp_generic]
    [  120.880011]
    [  120.880011] but task is already holding lock:
    [  120.880011]  (rtnl_mutex){+.+.+.}, at: [<ffffffff812e4255>] rtnl_lock+0x12/0x14
    [  120.880011]
    [  120.880011] which lock already depends on the new lock.
    [  120.880011]
    [  120.880011]
    [  120.880011] the existing dependency chain (in reverse order) is:
    [  120.880011]
    [  120.880011] -> #1 (rtnl_mutex){+.+.+.}:
    [  120.880011]        [<ffffffff81073a6f>] lock_acquire+0xcf/0x10e
    [  120.880011]        [<ffffffff813ab18a>] mutex_lock_nested+0x56/0x341
    [  120.880011]        [<ffffffff812e4255>] rtnl_lock+0x12/0x14
    [  120.880011]        [<ffffffff812d9d94>] register_netdev+0x11/0x27
    [  120.880011]        [<ffffffffa0147b17>] ppp_ioctl+0x289/0xc98 [ppp_generic]
    [  120.880011]        [<ffffffff8113b367>] do_vfs_ioctl+0x4ea/0x532
    [  120.880011]        [<ffffffff8113b3fd>] SyS_ioctl+0x4e/0x7d
    [  120.880011]        [<ffffffff813ad7d7>] entry_SYSCALL_64_fastpath+0x12/0x6f
    [  120.880011]
    [  120.880011] -> #0 (&pn->all_ppp_mutex){+.+.+.}:
    [  120.880011]        [<ffffffff8107334e>] __lock_acquire+0xb07/0xe76
    [  120.880011]        [<ffffffff81073a6f>] lock_acquire+0xcf/0x10e
    [  120.880011]        [<ffffffff813ab18a>] mutex_lock_nested+0x56/0x341
    [  120.880011]        [<ffffffffa0145f56>] ppp_dev_uninit+0x64/0xb0 [ppp_generic]
    [  120.880011]        [<ffffffff812d5263>] rollback_registered_many+0x19e/0x252
    [  120.880011]        [<ffffffff812d5381>] rollback_registered+0x29/0x38
    [  120.880011]        [<ffffffff812d53fa>] unregister_netdevice_queue+0x6a/0x77
    [  120.880011]        [<ffffffffa0146a94>] ppp_release+0x42/0x79 [ppp_generic]
    [  120.880011]        [<ffffffff8112d9f6>] __fput+0xec/0x192
    [  120.880011]        [<ffffffff8112dacc>] ____fput+0x9/0xb
    [  120.880011]        [<ffffffff8105447a>] task_work_run+0x66/0x80
    [  120.880011]        [<ffffffff81001801>] prepare_exit_to_usermode+0x8c/0xa7
    [  120.880011]        [<ffffffff81001900>] syscall_return_slowpath+0xe4/0x104
    [  120.880011]        [<ffffffff813ad931>] int_ret_from_sys_call+0x25/0x9f
    [  120.880011]
    [  120.880011] other info that might help us debug this:
    [  120.880011]
    [  120.880011]  Possible unsafe locking scenario:
    [  120.880011]
    [  120.880011]        CPU0                    CPU1
    [  120.880011]        ----                    ----
    [  120.880011]   lock(rtnl_mutex);
    [  120.880011]                                lock(&pn->all_ppp_mutex);
    [  120.880011]                                lock(rtnl_mutex);
    [  120.880011]   lock(&pn->all_ppp_mutex);
    [  120.880011]
    [  120.880011]  *** DEADLOCK ***
    
    Fixes: 8cb775bc0a34 ("ppp: fix device unregistration upon netns deletion")
    Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
    Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
    Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 45c191bb3aabf6df3db4ba5e94dd24b96edf6ab5
Author: Wilson Kok <wkok@cumulusnetworks.com>
Date:   Tue Sep 22 21:40:22 2015 -0700

    fib_rules: fix fib rule dumps across multiple skbs
    
    [ Upstream commit 41fc014332d91ee90c32840bf161f9685b7fbf2b ]
    
    dump_rules returns skb length and not error.
    But when family == AF_UNSPEC, the caller of dump_rules
    assumes that it returns an error. Hence, when family == AF_UNSPEC,
    we continue trying to dump on -EMSGSIZE errors resulting in
    incorrect dump idx carried between skbs belonging to the same dump.
    This results in fib rule dump always only dumping rules that fit
    into the first skb.
    
    This patch fixes dump_rules to return error so that we exit correctly
    and idx is correctly maintained between skbs that are part of the
    same dump.
    
    Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
    Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
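
The resumable dump described in the fib_rules message above can be sketched in user-space C. A fixed-size `struct dump_buf` stands in for an skb, and `dump_one()` and `dump_rules_sketch()` are hypothetical names; the point is only that the dumper returns an error when the buffer is full, so the caller stops and the index stays on the first rule that did not fit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DUMP_BUF_SLOTS 4    /* assumption: tiny buffer to force a multi-part dump */

/* Stand-in for one skb of a multi-part dump. */
struct dump_buf {
    int items[DUMP_BUF_SLOTS];
    size_t used;
};

static int dump_one(struct dump_buf *buf, int rule)
{
    if (buf->used == DUMP_BUF_SLOTS)
        return -EMSGSIZE;           /* report an error, not a length */
    buf->items[buf->used++] = rule;
    return 0;
}

/*
 * Dump rules[*idx .. n) into buf. *idx advances only past rules that were
 * stored, so a later call with a fresh buffer resumes at the right rule.
 */
static int dump_rules_sketch(struct dump_buf *buf, const int *rules,
                             size_t n, size_t *idx)
{
    assert(buf != NULL && rules != NULL && idx != NULL);
    while (*idx < n) {
        int err = dump_one(buf, rules[*idx]);

        if (err < 0)
            return err;
        (*idx)++;
    }
    return 0;
}
```

Dumping six rules with this sketch takes two buffers: the first call stores four rules and returns `-EMSGSIZE`, and the second call continues with the fifth rule.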

commit d8abd0589da3bc4fabe1450a5e98099b28874a30
Author: WANG Cong <xiyou.wangcong@gmail.com>
Date:   Tue Sep 22 17:01:11 2015 -0700

    net: revert "net_sched: move tp->root allocation into fw_init()"
    
    [ Upstream commit d8aecb10115497f6cdf841df8c88ebb3ba25fa28 ]
    
    fw filter uses tp->root==NULL to check if it is the old method,
    so it doesn't need allocation at all in this case. This patch
    reverts the offending commit and adds some comments for old
    method to make it obvious.
    
    Fixes: 33f8b9ecdb15 ("net_sched: move tp->root allocation into fw_init()")
    Reported-by: Akshat Kakkar <akshat.1984@gmail.com>
    Cc: Jamal Hadi Salim <jhs@mojatatu.com>
    Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
    Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit dd9eb1b17ca8fdbcfa496b61a7a5a2a34445a3da
Author: David Woodhouse <dwmw2@infradead.org>
Date:   Wed Sep 23 19:45:08 2015 +0100

    Fix AF_PACKET ABI breakage in 4.2
    
    [ Upstream commit d3869efe7a8a2298516d9af4f91487cf486ca945 ]
    
    Commit 7d82410950aa ("virtio: add explicit big-endian support to memory
    accessors") accidentally changed the virtio_net header used by
    AF_PACKET with PACKET_VNET_HDR from host-endian to big-endian.
    
    Since virtio_legacy_is_little_endian() is a very long identifier,
    define a vio_le macro and use that throughout the code instead of the
    hard-coded 'false' for little-endian.
    
    This restores the ABI to match 4.1 and earlier kernels, and makes my
    test program work again.
    
    Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 9d0af4ef230500589fec21785e66b24af81d8ca7
Author: Eric Dumazet <edumazet@google.com>
Date:   Wed Sep 23 14:00:21 2015 -0700

    tcp: add proper TS val into RST packets
    
    [ Upstream commit 675ee231d960af2af3606b4480324e26797eb010 ]
    
    RST packets sent on behalf of TCP connections with TS option (RFC 7323
    TCP timestamps) have incorrect TS val (set to 0), but correct TS ecr.
    
    A > B: Flags [S], seq 0, win 65535, options [mss 1000,nop,nop,TS val 100
    ecr 0], length 0
    B > A: Flags [S.], seq 2444755794, ack 1, win 28960, options [mss
    1460,nop,nop,TS val 7264344 ecr 100], length 0
    A > B: Flags [.], ack 1, win 65535, options [nop,nop,TS val 110 ecr
    7264344], length 0
    
    B > A: Flags [R.], seq 1, ack 1, win 28960, options [nop,nop,TS val 0
    ecr 110], length 0
    
    We need to call skb_mstamp_get() to get proper TS val,
    derived from skb->skb_mstamp
    
    Note that RFC 1323 was advocating to not send TS option in RST segment,
    but RFC 7323 recommends the opposite :
    
      Once TSopt has been successfully negotiated, that is both <SYN> and
      <SYN,ACK> contain TSopt, the TSopt MUST be sent in every non-<RST>
      segment for the duration of the connection, and SHOULD be sent in an
      <RST> segment (see Section 5.2 for details)
    
    Note this RFC recommends to send TS val = 0, but we believe it is
    premature : We do not know if all TCP stacks are properly
    handling the receive side :
    
       When an <RST> segment is
       received, it MUST NOT be subjected to the PAWS check by verifying an
       acceptable value in SEG.TSval, and information from the Timestamps
       option MUST NOT be used to update connection state information.
       SEG.TSecr MAY be used to provide stricter <RST> acceptance checks.
    
    In 5 years, if/when all TCP stacks are RFC 7323 ready, we might consider
    sending TS val = 0, if it buys something.
    
    Fixes: 7faee5c0d514 ("tcp: remove TCP_SKB_CB(skb)->when")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Yuchung Cheng <ycheng@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 1e0c9d37719e535b948b506638d87125ff266373
Author: Jesse Gross <jesse@nicira.com>
Date:   Mon Sep 21 20:21:20 2015 -0700

    openvswitch: Zero flows on allocation.
    
    [ Upstream commit ae5f2fb1d51fa128a460bcfbe3c56d7ab8bf6a43 ]
    
    When support for megaflows was introduced, OVS needed to start
    installing flows with a mask applied to them. Since masking is an
    expensive operation, OVS also had an optimization that would only
    take the parts of the flow keys that were covered by a non-zero
    mask. The values stored in the remaining pieces should not matter
    because they are masked out.
    
    While this works fine for the purposes of matching (which must always
    look at the mask), serialization to netlink can be problematic. Since
    the flow and the mask are serialized separately, the uninitialized
    portions of the flow can be encoded with whatever values happen to be
    present.
    
    In terms of functionality, this has little effect since these fields
    will be masked out by definition. However, it leaks kernel memory to
    userspace, which is a potential security vulnerability. It is also
    possible that other code paths could look at the masked key and get
    uninitialized data, although this does not currently appear to be an
    issue in practice.
    
    This removes the mask optimization for flows that are being installed.
    This was always intended to be the case as the mask optimizations were
    really targeting per-packet flow operations.
    
    Fixes: 03f0d916 ("openvswitch: Mega flow implementation")
    Signed-off-by: Jesse Gross <jesse@nicira.com>
    Acked-by: Pravin B Shelar <pshelar@nicira.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
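
The leak described in the openvswitch message above comes from writing only the masked range of a key and then serializing the whole key. A minimal user-space sketch of the safe pattern, assuming a hypothetical 16-byte key and a hypothetical helper `masked_key_alloc()`; it illustrates zeroing on allocation in general and is not the actual OVS patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define KEY_LEN 16  /* assumption: illustrative key size */

/*
 * Allocate a zeroed key, then apply the mask over bytes [start, end).
 * Bytes outside that range stay zero instead of holding stale heap data,
 * so copying all KEY_LEN bytes out later cannot expose old memory.
 */
static unsigned char *masked_key_alloc(const unsigned char *key,
                                       const unsigned char *mask,
                                       size_t start, size_t end)
{
    unsigned char *dst;
    size_t i;

    assert(start <= end);
    dst = calloc(KEY_LEN, 1);
    if (!dst)
        return NULL;
    for (i = start; i < end && i < KEY_LEN; i++)
        dst[i] = key[i] & mask[i];
    return dst;
}
```

Had the buffer come from `malloc()` instead of `calloc()`, the bytes outside `[start, end)` would be indeterminate, which is the user-space analogue of the kernel memory leak the message describes.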

commit ccbe6aba49fde168abba5649976da6370453cbf8
Author: Russell King <rmk+kernel@arm.linux.org.uk>
Date:   Mon Sep 21 21:42:59 2015 +0100

    net: dsa: actually force the speed on the CPU port
    
    [ Upstream commit 53adc9e83028d9e35b6408231ebaf62a94a16e4d ]
    
    Commit 54d792f257c6 ("net: dsa: Centralise global and port setup
    code into mv88e6xxx.") merged in the 4.2 merge window broke the link
    speed forcing for the CPU port of Marvell DSA switches.  The original
    code was:
    
            /* MAC Forcing register: don't force link, speed, duplex
             * or flow control state to any particular values on physical
             * ports, but force the CPU port and all DSA ports to 1000 Mb/s
             * full duplex.
             */
            if (dsa_is_cpu_port(ds, p) || ds->dsa_port_mask & (1 << p))
                    REG_WRITE(addr, 0x01, 0x003e);
            else
                    REG_WRITE(addr, 0x01, 0x0003);
    
    but the new code does a read-modify-write:
    
                    reg = _mv88e6xxx_reg_read(ds, REG_PORT(port), PORT_PCS_CTRL);
                    if (dsa_is_cpu_port(ds, port) ||
                        ds->dsa_port_mask & (1 << port)) {
                            reg |= PORT_PCS_CTRL_FORCE_LINK |
                                    PORT_PCS_CTRL_LINK_UP |
                                    PORT_PCS_CTRL_DUPLEX_FULL |
                                    PORT_PCS_CTRL_FORCE_DUPLEX;
                            if (mv88e6xxx_6065_family(ds))
                                    reg |= PORT_PCS_CTRL_100;
                            else
                                    reg |= PORT_PCS_CTRL_1000;
    
    The link speed in the PCS control register is a two bit field.  Forcing
    the link speed in this way doesn't ensure that the bit field is set to
    the correct value - on the hardware I have here, the speed bitfield
    remains set to 0x03, resulting in the speed not being forced to gigabit.
    
    We must clear both bits before forcing the link speed.
    
    Fixes: 54d792f257c6 ("net: dsa: Centralise global and port setup code into mv88e6xxx.")
    Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    Acked-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
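
The two-bit field problem described in the dsa message above can be shown with plain integer arithmetic. The bit layout here is an assumption inferred from the quoted register values (0x003e for forced gigabit, 0x0003 otherwise): speed in bits 1:0, with 0x2 meaning 1000 Mb/s. `PCS_SPEED_MASK`, `PCS_SPEED_1000` and both helpers are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: two-bit speed field in bits 1:0, 0x2 selects 1000 Mb/s. */
#define PCS_SPEED_MASK 0x0003u
#define PCS_SPEED_1000 0x0002u

/* OR-only update: bits already set in the field survive. */
static uint16_t force_speed_or_only(uint16_t reg)
{
    return (uint16_t)(reg | PCS_SPEED_1000);
}

/* Read-modify-write that clears the whole field before setting it. */
static uint16_t force_speed_rmw(uint16_t reg)
{
    uint16_t v = (uint16_t)((reg & ~PCS_SPEED_MASK) | PCS_SPEED_1000);

    assert((v & PCS_SPEED_MASK) == PCS_SPEED_1000);
    return v;
}
```

Starting from a register whose speed field reads 0x3, the OR-only version leaves the field at 0x3, whereas the masked version yields 0x2 and preserves every bit outside the field.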

commit e2a3131de43c6e8072ed618330c49f14d87dba6e
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Tue Sep 22 11:38:56 2015 +0800

    netlink: Replace rhash_portid with bound
    
    [ Upstream commit da314c9923fed553a007785a901fd395b7eb6c19 ]
    
    On Mon, Sep 21, 2015 at 02:20:22PM -0400, Tejun Heo wrote:
    >
    > store_release and load_acquire are different from the usual memory
    > barriers and can't be paired this way.  You have to pair store_release
    > and load_acquire.  Besides, it isn't a particularly good idea to
    
    OK I've decided to drop the acquire/release helpers as they don't
    help us at all and simply pessimises the code by using full memory
    barriers (on some architectures) where only a write or read barrier
    is needed.
    
    > depend on memory barriers embedded in other data structures like the
    > above.  Here, especially, rhashtable_insert() would have write barrier
    > *before* the entry is hashed not necessarily *after*, which means that
    > in the above case, a socket which appears to have set bound to a
    > reader might not be visible when the reader tries to look up the socket
    > on the hashtable.
    
    But you are right we do need an explicit write barrier here to
    ensure that the hashing is visible.
    
    > There's no reason to be overly smart here.  This isn't a crazy hot
    > path, write barriers tend to be very cheap, store_release more so.
    > Please just do smp_store_release() and note what it's paired with.
    
    It's not about being overly smart.  It's about actually understanding
    what's going on with the code.  I've seen too many instances of
    people simply sprinkling synchronisation primitives around without
    any knowledge of what is happening underneath, which is just a recipe
    for creating hard-to-debug races.
    
    > > @@ -1539,7 +1546,7 @@ static int netlink_bind(struct socket *sock, struct sockaddr *addr,
    > >  		}
    > >  	}
    > >
    > > -	if (!nlk->portid) {
    > > +	if (!nlk->bound) {
    >
    > I don't think you can skip load_acquire here just because this is the
    > second deref of the variable.  That doesn't change anything.  Race
    > condition could still happen between the first and second tests and
    > skipping the second would lead to the same kind of bug.
    
    The reason this one is OK is because we do not use nlk->portid or
    try to get nlk from the hash table before we return to user-space.
    
    However, there is a real bug here that none of these acquire/release
    helpers discovered.  The two bound tests here used to be a single
    one.  Now that they are separate it is entirely possible for another
    thread to come in the middle and bind the socket.  So we need to
    repeat the portid check in order to maintain consistency.
    
    > > @@ -1587,7 +1594,7 @@ static int netlink_connect(struct socket *sock, struct sockaddr *addr,
    > >  	    !netlink_allowed(sock, NL_CFG_F_NONROOT_SEND))
    > >  		return -EPERM;
    > >
    > > -	if (!nlk->portid)
    > > +	if (!nlk->bound)
    >
    > Don't we need load_acquire here too?  Is this path holding a lock
    > which makes that unnecessary?
    
    Ditto.
    
    ---8<---
    The commit 1f770c0a09da855a2b51af6d19de97fb955eca85 ("netlink:
    Fix autobind race condition that leads to zero port ID") created
    some new races that can occur due to inconsistencies between the
    two port IDs.
    
    Tejun is right that a barrier is unavoidable.  Therefore I am
    reverting to the original patch that used a boolean to indicate
    that a user netlink socket has been bound.
    
    Barriers have been added where necessary to ensure that a valid
    portid and the hashed socket is visible.
    
    I have also changed netlink_insert to only return EBUSY if the
    socket is bound to a portid different to the requested one.  This
    combined with only reading nlk->bound once in netlink_bind fixes
    a race where two threads that bind the socket at the same time
    with different port IDs may both succeed.
    
    Fixes: 1f770c0a09da ("netlink: Fix autobind race condition that leads to zero port ID")
    Reported-by: Tejun Heo <tj@kernel.org>
    Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Nacked-by: Tejun Heo <tj@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6e32e731184134db406c428f491a9811cf58252a
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Fri Sep 18 19:16:50 2015 +0800

    netlink: Fix autobind race condition that leads to zero port ID
    
    [ Upstream commit 1f770c0a09da855a2b51af6d19de97fb955eca85 ]
    
    The commit c0bb07df7d981e4091432754e30c9c720e2c0c78 ("netlink:
    Reset portid after netlink_insert failure") introduced a race
    condition where if two threads try to autobind the same socket
    one of them may end up with a zero port ID.  This led to kernel
    deadlocks that were observed by multiple people.
    
    This patch reverts that commit and instead fixes it by introducing
    a separate rhash_portid variable so that the real portid is only set
    after the socket has been successfully hashed.
    
    Fixes: c0bb07df7d98 ("netlink: Reset portid after netlink_insert failure")
    Reported-by: Tejun Heo <tj@kernel.org>
    Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 3463bb420c2c8ac9ecabc907575ae2297f83f45c
Author: Michael S. Tsirkin <mst@redhat.com>
Date:   Fri Sep 18 13:41:09 2015 +0300

    macvtap: fix TUNSETSNDBUF values > 64k
    
    [ Upstream commit 3ea79249e81e5ed051f2e6480cbde896d99046e8 ]
    
    Upon TUNSETSNDBUF,  macvtap reads the requested sndbuf size into
    a local variable u.
    commit 39ec7de7092b ("macvtap: fix uninitialized access on
    TUNSETIFF") changed its type to u16 (which is the right thing to
    do for all other macvtap ioctls), breaking all values > 64k.
    
    The value of TUNSETSNDBUF is actually a signed 32 bit integer, so
    the right thing to do is to read it into an int.
    
    Cc: David S. Miller <davem@davemloft.net>
    Fixes: 39ec7de7092b ("macvtap: fix uninitialized access on TUNSETIFF")
    Reported-by: Mark A. Peloquin
    Bisected-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
    Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    Tested-by:  Matthew Rosato <mjrosato@linux.vnet.ibm.com>
    Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
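
The breakage described in the macvtap message above is ordinary integer truncation. The sketch below models it in a simplified way with a cast rather than a user-space copy; `sndbuf_via_u16()` and `sndbuf_via_int()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Storing the request in a 16-bit variable keeps only the value modulo 65536. */
static int sndbuf_via_u16(int requested)
{
    uint16_t u = (uint16_t)requested;

    return u;
}

/* Storing it in an int preserves the full signed 32-bit value. */
static int sndbuf_via_int(int requested)
{
    int s = requested;

    return s;
}
```

For instance, `assert(sndbuf_via_u16(69632) == 4096)` holds, since 69632 is 65536 + 4096, while `sndbuf_via_int(69632)` returns 69632 unchanged.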

commit d192596179a62abe787ed7184fc1cd1d1ec41920
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Sep 15 18:29:47 2015 -0700

    net/mlx4_en: really allow to change RSS key
    
    [ Upstream commit 4671fc6d47e0a0108fe24a4d830347d6a6ef4aa7 ]
    
    When changing rss key, we do not want to overwrite user provided key
    by the one provided by netdev_rss_key_fill(), which is the host random
    key generated at boot time.
    
    Fixes: 947cbb0ac242 ("net/mlx4_en: Support for configurable RSS hash function")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Eyal Perry <eyalpe@mellanox.com>
    CC: Amir Vadai <amirv@mellanox.com>
    Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 065b4761929a2315ddcc2f4356a9efbd6d105d0e
Author: Roopa Prabhu <roopa@cumulusnetworks.com>
Date:   Tue Sep 15 14:44:29 2015 -0700

    rtnetlink: catch -EOPNOTSUPP errors from ndo_bridge_getlink
    
    [ Upstream commit d64f69b0373a7d0bcec8b5da7712977518a8f42b ]
    
    problem reported:
    	kernel 4.1.3
    	------------
    	# bridge vlan
    	port	vlan ids
    	eth0	 1 PVID Egress Untagged
    	 	90
    	 	91
    	 	92
    	 	93
    	 	94
    	 	95
    	 	96
    	 	97
    	 	98
    	 	99
    	 	100
    
    	vmbr0	 1 PVID Egress Untagged
    	 	94
    
    	kernel 4.2
    	-----------
    	# bridge vlan
    	port	vlan ids
    
    ndo_bridge_getlink can return -EOPNOTSUPP when an interface's
    ndo_bridge_getlink op is set to switchdev_port_bridge_getlink
    and CONFIG_SWITCHDEV is not defined. Today this can happen to
    bond, rocker and team devices. This patch adds -EOPNOTSUPP
    checks after calls to ndo_bridge_getlink.
    
    Fixes: 85fdb956726ff2a ("switchdev: cut over to new switchdev_port_bridge_getlink")
    Reported-by: Alexandre DERUMIER <aderumier@odiso.com>
    Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
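
The error handling described in the rtnetlink message above can be sketched as a loop that skips devices reporting "not supported" instead of aborting the whole dump. All names here are hypothetical (`getlink_fn`, `getlink_ok()`, `getlink_unsupported()`, `dump_links()`); the sketch shows the pattern, not the kernel's dump loop.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-device callback: 0 on success, negative errno on failure. */
typedef int (*getlink_fn)(int dev_id);

static int getlink_ok(int dev_id)
{
    (void)dev_id;
    return 0;
}

static int getlink_unsupported(int dev_id)
{
    (void)dev_id;
    return -EOPNOTSUPP;
}

/* Dump every device; -EOPNOTSUPP skips that device, other errors stop the dump. */
static int dump_links(const getlink_fn *ops, size_t n, int *dumped)
{
    size_t i;

    assert(ops != NULL && dumped != NULL);
    *dumped = 0;
    for (i = 0; i < n; i++) {
        int err;

        if (!ops[i])
            continue;               /* device has no getlink op */
        err = ops[i]((int)i);
        if (err == -EOPNOTSUPP)
            continue;               /* skip, keep dumping the rest */
        if (err < 0)
            return err;
        (*dumped)++;
    }
    return 0;
}
```

With one unsupported device ahead of two working ones, the dump still reports the two working devices, which matches the behaviour the patch restores for `bridge vlan`.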

commit abb7a0340081651d037faa2149163f475f2c44ea
Author: Simon Guinot <simon.guinot@sequanux.org>
Date:   Tue Sep 15 22:41:21 2015 +0200

    net: mvneta: fix DMA buffer unmapping in mvneta_rx()
    
    [ Upstream commit daf158d0d544cec80b7b30deff8cfc59a6e17610 ]
    
    This patch fixes a regression introduced by the commit a84e32894191
    ("net: mvneta: fix refilling for Rx DMA buffers"). Due to this commit
    the newly allocated Rx buffers are DMA-unmapped in place of those passed
    to the networking stack. Obviously, this causes data corruption.
    
    This patch fixes the issue by ensuring that the right Rx buffers are
    DMA-unmapped.
    
    Reported-by: Oren Laskin <oren@igneous.io>
    Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
    Fixes: a84e32894191 ("net: mvneta: fix refilling for Rx DMA buffers")
    Cc: <stable@vger.kernel.org> # v3.8+
    Tested-by: Oren Laskin <oren@igneous.io>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c5de8f88c0177f3ac724cd4ad4caa685fee11945
Author: Linus Lüssing <linus.luessing@c0d3.blue>
Date:   Fri Sep 11 18:39:48 2015 +0200

    bridge: fix igmpv3 / mldv2 report parsing
    
    [ Upstream commit c2d4fbd2163e607915cc05798ce7fb7f31117cc1 ]
    
    With the newly introduced helper functions the skb pulling is hidden in
    the checksumming function - and undone before returning to the caller.
    
    The IGMPv3 and MLDv2 report parsing functions in the bridge still
    assumed that the skb is pointing to the beginning of the IGMP/MLD
    message while it is now kept at the beginning of the IPv4/6 header,
    breaking the message parsing and creating packet loss.
    
    Fix this by taking the offset between the IP and IGMP/MLD headers
    into account, too.
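    
    To make the offset handling concrete, here is a small userspace
    sketch for the IPv4/IGMPv3 case only. It is not the bridge code:
    igmpv3_report_num_groups() is a hypothetical helper that receives a
    pointer to the start of the IPv4 header and adds the IP header
    length before reading the IGMPv3 report fields.
    
    ```c
    #include <stddef.h>
    #include <stdint.h>
    
    #define IGMPV3_REPORT_TYPE 0x22u
    
    /* pkt points at the IPv4 header, not at the IGMP message.
     * Returns the number of group records, or -1 on a short or
     * non-report packet. */
    int igmpv3_report_num_groups(const uint8_t *pkt, size_t len)
    {
        size_t ip_hlen;
        const uint8_t *igmp;
    
        if (len < 20)
            return -1;
    
        /* IHL is the low nibble of byte 0, in 32-bit words. */
        ip_hlen = (size_t)(pkt[0] & 0x0fu) * 4;
        if (ip_hlen < 20 || len < ip_hlen + 8)
            return -1;
    
        /* Offset between the IP header and the IGMP header. */
        igmp = pkt + ip_hlen;
        if (igmp[0] != IGMPV3_REPORT_TYPE)
            return -1;
    
        /* Number of group records: bytes 6..7, network byte order. */
        return (igmp[6] << 8) | igmp[7];
    }
    ```
    
    Reading byte 6 relative to pkt instead of igmp would land inside the
    IP header, which is the kind of misparse described above.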
    
    Fixes: 9afd85c9e455 ("net: Export IGMP/MLD message validation code")
    Reported-by: Tobias Powalowski <tobias.powalowski@googlemail.com>
    Tested-by: Tobias Powalowski <tobias.powalowski@googlemail.com>
    Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 9a04c65c6bcd5fea9e892d338f3d3da8df46c9a1
Author: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Date:   Thu Sep 10 17:31:15 2015 -0300

    sctp: fix race on protocol/netns initialization
    
    [ Upstream commit 8e2d61e0aed2b7c4ecb35844fe07e0b2b762dee4 ]
    
    Consider the case where the sctp module is not loaded and is being
    requested because a user is creating an SCTP socket.
    
    During initialization, sctp will add the new protocol type and then
    initialize pernet subsys:
    
            status = sctp_v4_protosw_init();
            if (status)
                    goto err_protosw_init;
    
            status = sctp_v6_protosw_init();
            if (status)
                    goto err_v6_protosw_init;
    
            status = register_pernet_subsys(&sctp_net_ops);
    
    The problem is that after those calls to sctp_v{4,6}_protosw_init(), it
    is possible for userspace to create SCTP sockets as if the module were
    already fully loaded. If that happens, one possible effect is that
    net->sctp.local_addr_list gets readers earlier than expected.
    sctp_net_init() does not take precautions while dealing with that list,
    which can lead to a panic. The damage is not limited to that, as
    sctp_sock_init() will copy a bunch of blank or partially initialized
    values from net->sctp.
    
    The race happens like this:
    
         CPU 0                           |  CPU 1
      socket()                           |
       __sock_create                     | socket()
        inet_create                      |  __sock_create
         list_for_each_entry_rcu(        |
            answer, &inetsw[sock->type], |
            list) {                      |   inet_create
          /* no hits */                  |
         if (unlikely(err)) {            |
          ...                            |
          request_module()               |
          /* socket creation blocks      |
           * until module fully loaded   |
           */                            |
           sctp_init                     |
            sctp_v4_protosw_init         |
             inet_register_protosw       |
              list_add_rcu(&p->list,     |
                           last_perm);   |
                                         |  list_for_each_entry_rcu(
                                         |     answer, &inetsw[sock->type],
            sctp_v6_protosw_init         |     list) {
                                         |     /* hit, so assumes protocol
                                         |      * is already loaded
                                         |      */
                                         |  /* socket creation continues
                                         |   * before netns is initialized
                                         |   */
            register_pernet_subsys       |
    
    Simply inverting the initialization order between
    register_pernet_subsys() and sctp_v4_protosw_init() is not possible
    because register_pernet_subsys() will create a control sctp socket, so
    the protocol must already be visible by then. Deferring the socket
    creation to a work-queue is not good either, especially because we
    lose the ability to handle its errors.
    
    So, as suggested by Vlad, the fix is to split netns initialization into
    two stages, defaults and control socket, so that the defaults are
    already loaded by the time we register the protocol, while control
    socket initialization is kept at the same point where it is today.
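    
    The resulting ordering can be sketched as a tiny userspace model.
    This is not the sctp code: net_model and the four functions below
    are hypothetical, and each stage simply refuses to run if the stage
    it depends on has not completed.
    
    ```c
    #include <stdbool.h>
    
    /* Hypothetical per-netns state, reduced to three flags. */
    struct net_model {
        bool defaults_ready;   /* stage 1: defaults loaded */
        bool proto_visible;    /* protocol registered, sockets possible */
        bool ctrl_sock_ready;  /* stage 2: control socket created */
    };
    
    int init_defaults(struct net_model *n)
    {
        n->defaults_ready = true;
        return 0;
    }
    
    int register_proto(struct net_model *n)
    {
        /* Sockets may appear right after this, so defaults must exist. */
        if (!n->defaults_ready)
            return -1;
        n->proto_visible = true;
        return 0;
    }
    
    int init_ctrl_sock(struct net_model *n)
    {
        /* Creating the control socket needs a visible protocol. */
        if (!n->proto_visible)
            return -1;
        n->ctrl_sock_ready = true;
        return 0;
    }
    
    /* Defaults first, then the protocol, then the control socket. */
    int module_init_model(struct net_model *n)
    {
        int err = init_defaults(n);
    
        if (!err)
            err = register_proto(n);
        if (!err)
            err = init_ctrl_sock(n);
        return err;
    }
    ```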
    
    Fixes: 4db67e808640 ("sctp: Make the address lists per network namespace")
    Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
    Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 62f43b58d2b2c4f0200b9ca2b997f4c484f0272f
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Thu Sep 10 20:05:46 2015 +0200

    netlink, mmap: transform mmap skb into full skb on taps
    
    [ Upstream commit 1853c949646005b5959c483becde86608f548f24 ]
    
    Ken-ichirou reported that running netlink in mmap mode for receive in
    combination with nlmon will throw a NULL pointer dereference in
    __kfree_skb() on nlmon_xmit(). In my case I can also trigger an "unable
    to handle kernel paging request". The problem is the skb_clone() in
    __netlink_deliver_tap_skb() for skbs that are mmapped.
    
    I.e. the cloned skb doesn't have a destructor, whereas the mmap netlink
    skb has it pointed to netlink_skb_destructor(), set in the handler
    netlink_ring_setup_skb(). The destructor sets skb->head to NULL so
    that __kfree_skb() doesn't perform a skb_release_data() via
    skb_release_all(), which would otherwise free skb->head through
    kfree(head) into the slab allocator even though the netlink mmap
    skb->head points to the mmap buffer. The same has to be done for large
    netlink skbs whose data area is vmalloced. Therefore, as discussed,
    make a copy for these rather rare cases for now. This fixes the issue
    in my and Ken-ichirou's test cases.
    
    Reference: http://thread.gmane.org/gmane.linux.network/371129
    Fixes: bcbde0d449ed ("net: netlink: virtual tap device management")
    Reported-by: Ken-ichirou MATSUZAWA <chamaken@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Tested-by: Ken-ichirou MATSUZAWA <chamaken@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 12e082bc14a2c95787a79228ab8a1f9300cc8667
Author: Florian Fainelli <f.fainelli@gmail.com>
Date:   Tue Sep 8 20:06:41 2015 -0700

    net: dsa: bcm_sf2: Fix 64-bits register writes
    
    [ Upstream commit 03679a14739a0d4c14b52ba65a69ff553bfba73b ]
    
    The macro that writes 64-bit quantities to the 32-bit registers swapped
    the value and offset arguments. We want to preserve the argument
    ordering used by writel(), for instance: value first, offset/base
    second.
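    
    A minimal userspace sketch of the "value first, offset second"
    convention follows. It is not the bcm_sf2 macro: model_writel() and
    model_writeq() are hypothetical, the register file is a plain array,
    and placing the upper half at off + 4 is an assumption made only for
    this example.
    
    ```c
    #include <stdint.h>
    
    /* Same argument order as writel(): value first, address second. */
    static inline void model_writel(uint32_t val, volatile uint32_t *addr)
    {
        *addr = val;
    }
    
    /* 64-bit write built from two 32-bit writes. The value comes before
     * the byte offset, mirroring the 32-bit helper above. */
    void model_writeq(volatile uint32_t *base, uint64_t val, uint32_t off)
    {
        model_writel((uint32_t)val, base + off / 4);
        model_writel((uint32_t)(val >> 32), base + off / 4 + 1);
    }
    ```
    
    Keeping both helpers in the same order makes a swapped call site
    easier to spot during review.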
    
    Fixes: 246d7f773c13 ("net: dsa: add Broadcom SF2 switch driver")
    Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
    Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e60f4a39c2173ad637d5c6541404b7847acac246
Author: Roopa Prabhu <roopa@cumulusnetworks.com>
Date:   Tue Sep 8 10:53:04 2015 -0700

    ipv6: fix multipath route replace error recovery
    
    [ Upstream commit 6b9ea5a64ed5eeb3f68f2e6fcce0ed1179801d1e ]
    
    Problem:
    The ECMP route replace support for IPv6 in the kernel deletes the
    existing ECMP route too early, i.e. when it installs the first nexthop.
    If there is an error installing the subsequent nexthops, it is too late
    to recover the already deleted route, leaving the fib in an
    inconsistent state.
    
    This patch reduces the possibility of this by doing the following:
    a) Changes the existing multipath route add code to a two stage process:
      build rt6_infos + insert them. The ip6_route_add rt6_info creation
      code is moved into ip6_route_info_create.
    b) This ensures that most errors are caught while building rt6_infos,
      so we fail early.
    c) Separates multipath add and del code, because add needs the special
      two stage mode in a) and delete essentially does not care.
    d) If the code still fails while inserting a route, a warning is
      printed (this should be unlikely).
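    
    The two stage idea in a) and b) can be sketched in userspace as
    follows. This is a simplified model, not the ipv6 code: nexthops are
    plain integers, route_table, build_nexthops() and replace_route()
    are hypothetical names, and rejecting duplicates during the build
    stage is an assumption chosen to match the example below.
    
    ```c
    #include <errno.h>
    #include <stddef.h>
    #include <string.h>
    
    #define MAX_NH 8
    
    /* Hypothetical route: just a list of nexthop ids. */
    struct route_table {
        int nh[MAX_NH];
        size_t count;
    };
    
    /* Stage 1: build and validate everything without touching the
     * installed route. */
    static int build_nexthops(const int *req, size_t n, int *staged)
    {
        if (n == 0 || n > MAX_NH)
            return -EINVAL;
    
        for (size_t i = 0; i < n; i++) {
            for (size_t j = 0; j < i; j++)
                if (req[i] == req[j])
                    return -EEXIST;
            staged[i] = req[i];
        }
        return 0;
    }
    
    /* Stage 2: replace the installed route only if stage 1 succeeded. */
    int replace_route(struct route_table *t, const int *req, size_t n)
    {
        int staged[MAX_NH];
        int err = build_nexthops(req, n, staged);
    
        if (err)
            return err;  /* existing route is left untouched */
    
        memcpy(t->nh, staged, n * sizeof(staged[0]));
        t->count = n;
        return 0;
    }
    ```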
    
    Before the patch:
    $ip -6 route show
    3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
    3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
    3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
    
    /* Try replacing the route with a duplicate nexthop */
    $ip -6 route change 3000:1000:1000:1000::2/128 nexthop via
    fe80::202:ff:fe00:b dev swp49s0 nexthop via fe80::202:ff:fe00:d dev
    swp49s1 nexthop via fe80::202:ff:fe00:d dev swp49s1
    RTNETLINK answers: File exists
    
    $ip -6 route show
    /* previously added ecmp route 3000:1000:1000:1000::2 disappears from
     * kernel */
    
    After the patch:
    $ip -6 route show
    3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
    3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
    3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
    
    /* Try replacing the route with a duplicate nexthop */
    $ip -6 route change 3000:1000:1000:1000::2/128 nexthop via
    fe80::202:ff:fe00:b dev swp49s0 nexthop via fe80::202:ff:fe00:d dev
    swp49s1 nexthop via fe80::202:ff:fe00:d dev swp49s1
    RTNETLINK answers: File exists
    
    $ip -6 route show
    3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
    3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
    3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
    
    Fixes: 27596472473a ("ipv6: fix ECMP route replacement")
    Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
    Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
    Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 5548af0c5fc799dd4e165e9755a909e5cb60e4a0
Author: Florian Fainelli <f.fainelli@gmail.com>
Date:   Sat Sep 5 13:07:27 2015 -0700

    net: dsa: bcm_sf2: Fix ageing conditions and operation
    
    [ Upstream commit 39797a279d62972cd914ef580fdfacb13e508bf8 ]
    
    The comparison check between cur_hw_state and hw_state is currently
    invalid because cur_hw_state is right shifted by G_MISTP_SHIFT, while
    hw_state is not, so we end up comparing bits 2:0 with bits 7:5, which is
    going to cause an additional ageing to occur. Fix this by not shifting
    cur_hw_state while reading it, but instead masking the value with the
    appropriately shifted bitmask.
    
    The other problem with the fast-ageing process is that we did not set
    the EN_AGE_DYNAMIC bit to request the ageing to occur for dynamically
    learned MAC addresses. Finally, write back 0 to the FAST_AGE_CTRL
    register to avoid leaving spurious bits set from one operation to the
    next.
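    
    The shifted versus unshifted comparison can be shown with a short
    userspace sketch. STATE_SHIFT, STATE_MASK and both functions are
    hypothetical names; the shift of 5 is taken from the "bits 7:5"
    wording above, and hw_state is assumed to be kept in register
    position.
    
    ```c
    #include <stdbool.h>
    #include <stdint.h>
    
    #define STATE_SHIFT 5u
    #define STATE_MASK  (0x7u << STATE_SHIFT)  /* bits 7:5 */
    
    /* Buggy variant: the register field is shifted down to bits 2:0
     * and then compared with a value that still sits in bits 7:5. */
    bool hw_state_differs_buggy(uint32_t reg, uint32_t hw_state)
    {
        uint32_t cur_hw_state = (reg >> STATE_SHIFT) & 0x7u;
    
        return cur_hw_state != hw_state;
    }
    
    /* Variant described above: mask in place, no shift, so both
     * operands use the same bit positions. */
    bool hw_state_differs(uint32_t reg, uint32_t hw_state)
    {
        uint32_t cur_hw_state = reg & STATE_MASK;
    
        return cur_hw_state != hw_state;
    }
    ```
    
    With a register that already holds the requested state, the buggy
    variant still reports a change, which in this model stands for the
    additional ageing mentioned above.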
    
    Fixes: 12f460f23423 ("net: dsa: bcm_sf2: add HW bridging support")
    Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f5f10834321f31c4b08f4f9760e0857cfa90add4
Author: Richard Laing <richard.laing@alliedtelesis.co.nz>
Date:   Thu Sep 3 13:52:31 2015 +1200

    net/ipv6: Correct PIM6 mrt_lock handling
    
    [ Upstream commit 25b4a44c19c83d98e8c0807a7ede07c1f28eab8b ]
    
    In the IPv6 multicast routing code the mrt_lock was not being released
    correctly in the MFC iterator. As a result, adding or deleting a MIF
    would cause a hang because the mrt_lock could not be acquired.
    
    This fix is a copy of the code for the IPv4 case and ensures that the lock
    is released correctly.
    
    Signed-off-by: Richard Laing <richard.laing@alliedtelesis.co.nz>
    Acked-by: Cong Wang <cwang@twopensource.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c8bf2008b31f0c522290b32a246e48a551001128
Author: Atsushi Nemoto <nemoto@toshiba-tops.co.jp>
Date:   Wed Sep 2 17:49:29 2015 +0900

    net: eth: altera: fix napi poll_list corruption
    
    [ Upstream commit 4548a697e4969d695047cebd6d9af5e2f6cc728e ]
    
    tse_poll() calls __napi_complete() with irqs enabled. This leads to
    napi poll_list corruption and may stop all napi drivers from working.
    Use napi_complete() instead of __napi_complete().
    
    Signed-off-by: Atsushi Nemoto <nemoto@toshiba-tops.co.jp>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 496e7b36b54f554a314bc218c4f02d51b81a2d81
Author: Russell King <rmk+kernel@arm.linux.org.uk>
Date:   Wed Sep 2 17:24:14 2015 +0800

    net: fec: clear receive interrupts before processing a packet
    
    [ Upstream commit ed63f1dcd5788d36f942fbcce350742385e3e18c ]
    
    This patch re-submits the patch "db3421c114cfa6326", because the
    patch "4d494cdc92b3b9a0" removed the change.
    
    Clear any pending receive interrupt before we process a pending packet.
    This helps to avoid any spurious interrupts being raised after we have
    fully cleaned the receive ring, while still allowing an interrupt to be
    raised if we receive another packet.
    
    The position of this is critical: we must do this prior to reading the
    next packet status to avoid potentially dropping an interrupt when a
    packet is still pending.
    
    Acked-by: Fugang Duan <B38611@freescale.com>
    Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit dd35e5b8ad3ddcb3dd13e076ba87e16fd4bd2e99
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Thu Sep 3 00:29:07 2015 +0200

    ipv6: fix exthdrs offload registration in out_rt path
    
    [ Upstream commit e41b0bedba0293b9e1e8d1e8ed553104b9693656 ]
    
    We previously registered the IPPROTO_ROUTING offload with
    inet6_add_offload(), but in the error path we try to unregister it with
    inet_del_offload(). This doesn't seem correct; it should actually be
    inet6_del_offload(). ipv6_exthdrs_offload_exit() from that commit also
    seems rather incorrect (it uses rthdr_offload twice), but it got
    removed entirely later on.
    
    Fixes: 3336288a9fea ("ipv6: Switch to using new offload infrastructure.")
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 4230591d474281936dc03c37d0177f6015aedaea
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Wed Sep 2 14:00:36 2015 +0200

    sock, diag: fix panic in sock_diag_put_filterinfo
    
    [ Upstream commit b382c08656000c12a146723a153b85b13a855b49 ]
    
    diag socket's sock_diag_put_filterinfo() dumps classic BPF programs
    upon request to user space (ss -0 -b). However, native eBPF programs
    attached to sockets (SO_ATTACH_BPF) cannot be dumped with this method
    because their orig_prog is always NULL. Yet sock_diag_put_filterinfo()
    unconditionally tries to read the filter length and copy the filter
    insns from there. Internal cBPF to eBPF transformations attached to
    sockets don't have this issue, as orig_prog state is kept.
    
    It's currently only used by packet sockets. If we want to add
    native eBPF support in the future, this needs to be done through
    a different attribute than PACKET_DIAG_FILTER so as not to confuse
    possible user space disassemblers that work on diag data.
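    
    The guard this implies can be sketched in userspace as follows. It
    is a simplified model rather than the sock_diag code: the structs
    and dump_filter() are hypothetical, and returning a zero-length dump
    when orig_prog is NULL is an assumption about the desired behaviour.
    
    ```c
    #include <stddef.h>
    #include <string.h>
    
    /* Hypothetical stand-ins for the classic program kept alongside an
     * attached filter. */
    struct orig_prog_model {
        size_t len;
        const unsigned char *insns;
    };
    
    struct filter_model {
        const struct orig_prog_model *orig_prog;  /* NULL for native eBPF */
    };
    
    /* Copy the classic program into dst and return its length in bytes.
     * Returns 0 when there is no filter, no classic program to dump, or
     * not enough room in dst. */
    size_t dump_filter(const struct filter_model *f,
                       unsigned char *dst, size_t cap)
    {
        const struct orig_prog_model *prog;
    
        if (!f || !f->orig_prog)
            return 0;  /* check before touching len or insns */
    
        prog = f->orig_prog;
        if (prog->len > cap)
            return 0;
    
        memcpy(dst, prog->insns, prog->len);
        return prog->len;
    }
    ```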
    
    Fixes: 89aa075832b0 ("net: sock: allow eBPF programs to be attached to sockets")
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
    Acked-by: Alexei Starovoitov <ast@plumgrid.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 001fc2f5d7ee719cf698eee845bc95d468b16380
Author: Mark Salter <msalter@redhat.com>
Date:   Tue Sep 1 09:36:05 2015 -0400

    phylib: fix device deletion order in mdiobus_unregister()
    
    [ Upstream commit b6c6aedcbcbacd7b0cb4b64ed5ac835bc1c60a03 ]
    
    commit 8b63ec1837fa ("phylib: Make PHYs children of their MDIO bus, not
    the bus' parent.") uncovered a problem in mdiobus_unregister() which
    leads to this warning when I reboot an APM Mustang (arm64) platform:
    
      WARNING: CPU: 7 PID: 4239 at fs/sysfs/group.c:224 sysfs_remove_group+0xa0/0xa4()
      sysfs group fffffe0000e07a10 not found for kobject 'xgene-mii-eth0:03'
      ...
      CPU: 7 PID: 4239 Comm: reboot Tainted: G            E   4.2.0-0.18.el7.test15.aarch64 #1
      Hardware name: AppliedMicro Mustang/Mustang, BIOS 1.1.0 Aug 26 2015
      Call Trace:
      [<fffffe000009739c>] dump_backtrace+0x0/0x170
      [<fffffe000009752c>] show_stack+0x20/0x2c
      [<fffffe00007436f0>] dump_stack+0x78/0x9c
      [<fffffe00000c2cb4>] warn_slowpath_common+0xa0/0xd8
      [<fffffe00000c2d60>] warn_slowpath_fmt+0x74/0x88
      [<fffffe0000293d3c>] sysfs_remove_group+0x9c/0xa4
      [<fffffe00004a8bac>] dpm_sysfs_remove+0x5c/0x70
      [<fffffe000049b388>] device_del+0x44/0x208
      [<fffffe000049b578>] device_unregister+0x2c/0x7c
      [<fffffe000050dc68>] mdiobus_unregister+0x48/0x94
      [<fffffe000052afd0>] xgene_enet_mdio_remove+0x28/0x44
      [<fffffe000052d3f0>] xgene_enet_remove+0xd0/0xd8
      [<fffffe000052d424>] xgene_enet_shutdown+0x2c/0x3c
      [<fffffe00004a204c>] platform_drv_shutdown+0x24/0x40
      [<fffffe000049d4f4>] device_shutdown+0xf0/0x1b4
      [<fffffe00000e31ec>] kernel_restart_prepare+0x40/0x4c
      [<fffffe00000e32f8>] kernel_restart+0x1c/0x80
      [<fffffe00000e3670>] SyS_reboot+0x17c/0x250
    
    The problem is that mdiobus_unregister() deletes the bus device before
    unregistering the phy devices on the bus. This wasn't a problem before
    because the phys were not children of the bus:
    
      /sys/devices/platform/APMC0D05:00/net/eth0/xgene-mii-eth0:03
      /sys/devices/platform/APMC0D05:00/net/eth0/xgene-mii-eth0
    
    But now that they are:
    
      /sys/devices/platform/APMC0D05:00/net/eth0/xgene-mii-eth0/xgene-mii-eth0:03
    
    when mdiobus_unregister deletes the bus device, the phy subdirs are
    removed from sysfs also. So when the phys are unregistered afterward,
    we get the warning. This patch changes the order so that phys are
    unregistered before the bus device is deleted.
    
    Fixes: 8b63ec1837fa ("phylib: Make PHYs children of their MDIO bus, not the bus' parent.")
    Signed-off-by: Mark Salter <msalter@redhat.com>
    Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
    Tested-by: Mark Langsdorf <mlangsdo@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>