aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2018-07-25Merge tag 'clk-fixes-for-linus' of ↵HEADmasterLinus Torvalds7-18/+87
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "One more round of updates for problems seen this -rc series. Drivers fixes are: - Amlogic Meson audio divider fix and CPU clk critical marking - Qualcomm multimedia GDSC marked as 'always on' to keep display working - Aspeed fixes for critical clks, resets causing clks to stay disabled, and an incorrect HPLL frequency calculation - Marvell Armada 3700 cpu clks would undervolt when switching from low frequencies to high frequencies because the voltage didn't stabilize in time so now we switch to an intermediate frequency Plus we have a core framework thinko that messed up the debugfs flag printing logic to make it not very useful" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: aspeed: Support HPLL strapping on ast2400 clk: mvebu: armada-37xx-periph: Fix switching CPU rate from 300Mhz to 1.2GHz clk: aspeed: Mark bclk (PCIe) and dclk (VGA) as critical clk/mmcc-msm8996: Make mmagic_bimc_gdsc ALWAYS_ON clk: aspeed: Treat a gate in reset as disabled clk: Really show symbolic clock flags in debugfs clk: qcom: gcc-msm8996: Disable halt check on UFS tx clock clk: meson: audio-divider is one based clk: meson-gxbb: set fclk_div2 as CLK_IS_CRITICAL
2018-07-25Merge tag 'fscache-fixes-20180725' of ↵Linus Torvalds7-14/+25
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull fscache/cachefiles fixes from David Howells: - Allow cancelled operations to be queued so they can be cleaned up. - Fix a refcounting bug in the monitoring of reads on backend files whereby a race can occur between monitor objects being listed for work, the work processing being queued and the work processor running and destroying the monitor objects. - Fix a ref overput in object attachment, whereby a tentatively considered object is put in error handling without first being 'got'. - Fix a missing clear of the CACHEFILES_OBJECT_ACTIVE flag whereby an assertion occurs when we retry because it seems the object is now active. - Wait rather BUG'ing on an object collision in the depths of cachefiles as the active object should be being cleaned up - also depends on the one above. * tag 'fscache-fixes-20180725' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: cachefiles: Wait rather than BUG'ing on "Unexpected object collision" cachefiles: Fix missing clear of the CACHEFILES_OBJECT_ACTIVE flag fscache: Fix reference overput in fscache_attach_object() error handling cachefiles: Fix refcounting bug in backing-file read monitoring fscache: Allow cancelled operations to be enqueued
2018-07-25cachefiles: Wait rather than BUG'ing on "Unexpected object collision"Kiran Kumar Modukuri1-1/+0
If we meet a conflicting object that is marked FSCACHE_OBJECT_IS_LIVE in the active object tree, we have been emitting a BUG after logging information about it and the new object. Instead, we should wait for the CACHEFILES_OBJECT_ACTIVE flag to be cleared on the old object (or return an error). The ACTIVE flag should be cleared after it has been removed from the active object tree. A timeout of 60s is used in the wait, so we shouldn't be able to get stuck there. Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem") Signed-off-by: Kiran Kumar Modukuri <kiran.modukuri@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com>
2018-07-25cachefiles: Fix missing clear of the CACHEFILES_OBJECT_ACTIVE flagKiran Kumar Modukuri1-1/+1
In cachefiles_mark_object_active(), the new object is marked active and then we try to add it to the active object tree. If a conflicting object is already present, we want to wait for that to go away. After the wait, we go round again and try to re-mark the object as being active - but it's already marked active from the first time we went through and a BUG is issued. Fix this by clearing the CACHEFILES_OBJECT_ACTIVE flag before we try again. Analysis from Kiran Kumar Modukuri: [Impact] Oops during heavy NFS + FSCache + Cachefiles CacheFiles: Error: Overlong wait for old active object to go away. BUG: unable to handle kernel NULL pointer dereference at 0000000000000002 CacheFiles: Error: Object already active kernel BUG at fs/cachefiles/namei.c:163! [Cause] In a heavily loaded system with big files being read and truncated, an fscache object for a cookie is being dropped and a new object being looked. The new object being looked for has to wait for the old object to go away before the new object is moved to active state. [Fix] Clear the flag 'CACHEFILES_OBJECT_ACTIVE' for the new object when retrying the object lookup. [Testcase] Have run ~100 hours of NFS stress tests and have not seen this bug recur. [Regression Potential] - Limited to fscache/cachefiles. Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem") Signed-off-by: Kiran Kumar Modukuri <kiran.modukuri@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com>
2018-07-25fscache: Fix reference overput in fscache_attach_object() error handlingKiran Kumar Modukuri4-5/+8
When a cookie is allocated that causes fscache_object structs to be allocated, those objects are initialised with the cookie pointer, but aren't blessed with a ref on that cookie unless the attachment is successfully completed in fscache_attach_object(). If attachment fails because the parent object was dying or there was a collision, fscache_attach_object() returns without incrementing the cookie counter - but upon failure of this function, the object is released which then puts the cookie, whether or not a ref was taken on the cookie. Fix this by taking a ref on the cookie when it is assigned in fscache_object_init(), even when we're creating a root object. Analysis from Kiran Kumar: This bug has been seen in 4.4.0-124-generic #148-Ubuntu kernel BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1776277 fscache cookie ref count updated incorrectly during fscache object allocation resulting in following Oops. kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/internal.h:321! kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639! [Cause] Two threads are trying to do operate on a cookie and two objects. (1) One thread tries to unmount the filesystem and in process goes over a huge list of objects marking them dead and deleting the objects. cookie->usage is also decremented in following path: nfs_fscache_release_super_cookie -> __fscache_relinquish_cookie ->__fscache_cookie_put ->BUG_ON(atomic_read(&cookie->usage) <= 0); (2) A second thread tries to lookup an object for reading data in following path: fscache_alloc_object 1) cachefiles_alloc_object -> fscache_object_init -> assign cookie, but usage not bumped. 2) fscache_attach_object -> fails in cant_attach_object because the cookie's backing object or cookie's->parent object are going away 3) fscache_put_object -> cachefiles_put_object ->fscache_object_destroy ->fscache_cookie_put ->BUG_ON(atomic_read(&cookie->usage) <= 0); [NOTE from dhowells] It's unclear as to the circumstances in which (2) can take place, given that thread (1) is in nfs_kill_super(), however a conflicting NFS mount with slightly different parameters that creates a different superblock would do it. A backtrace from Kiran seems to show that this is a possibility: kernel BUG at/build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639! ... RIP: __fscache_cookie_put+0x3a/0x40 [fscache] Call Trace: __fscache_relinquish_cookie+0x87/0x120 [fscache] nfs_fscache_release_super_cookie+0x2d/0xb0 [nfs] nfs_kill_super+0x29/0x40 [nfs] deactivate_locked_super+0x48/0x80 deactivate_super+0x5c/0x60 cleanup_mnt+0x3f/0x90 __cleanup_mnt+0x12/0x20 task_work_run+0x86/0xb0 exit_to_usermode_loop+0xc2/0xd0 syscall_return_slowpath+0x4e/0x60 int_ret_from_sys_call+0x25/0x9f [Fix] Bump up the cookie usage in fscache_object_init, when it is first being assigned a cookie atomically such that the cookie is added and bumped up if its refcount is not zero. Remove the assignment in fscache_attach_object(). [Testcase] I have run ~100 hours of NFS stress tests and not seen this bug recur. [Regression Potential] - Limited to fscache/cachefiles. Fixes: ccc4fc3d11e9 ("FS-Cache: Implement the cookie management part of the netfs API") Signed-off-by: Kiran Kumar Modukuri <kiran.modukuri@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com>
2018-07-25cachefiles: Fix refcounting bug in backing-file read monitoringKiran Kumar Modukuri1-5/+12
cachefiles_read_waiter() has the right to access a 'monitor' object by virtue of being called under the waitqueue lock for one of the pages in its purview. However, it has no ref on that monitor object or on the associated operation. What it is allowed to do is to move the monitor object to the operation's to_do list, but once it drops the work_lock, it's actually no longer permitted to access that object. However, it is trying to enqueue the retrieval operation for processing - but it can only do this via a pointer in the monitor object, something it shouldn't be doing. If it doesn't enqueue the operation, the operation may not get processed. If the order is flipped so that the enqueue is first, then it's possible for the work processor to look at the to_do list before the monitor is enqueued upon it. Fix this by getting a ref on the operation so that we can trust that it will still be there once we've added the monitor to the to_do list and dropped the work_lock. The op can then be enqueued after the lock is dropped. The bug can manifest in one of a couple of ways. The first manifestation looks like: FS-Cache: FS-Cache: Assertion failed FS-Cache: 6 == 5 is false ------------[ cut here ]------------ kernel BUG at fs/fscache/operation.c:494! RIP: 0010:fscache_put_operation+0x1e3/0x1f0 ... fscache_op_work_func+0x26/0x50 process_one_work+0x131/0x290 worker_thread+0x45/0x360 kthread+0xf8/0x130 ? create_worker+0x190/0x190 ? kthread_cancel_work_sync+0x10/0x10 ret_from_fork+0x1f/0x30 This is due to the operation being in the DEAD state (6) rather than INITIALISED, COMPLETE or CANCELLED (5) because it's already passed through fscache_put_operation(). The bug can also manifest like the following: kernel BUG at fs/fscache/operation.c:69! ... [exception RIP: fscache_enqueue_operation+246] ... #7 [ffff883fff083c10] fscache_enqueue_operation at ffffffffa0b793c6 #8 [ffff883fff083c28] cachefiles_read_waiter at ffffffffa0b15a48 #9 [ffff883fff083c48] __wake_up_common at ffffffff810af028 I'm not entirely certain as to which is line 69 in Lei's kernel, so I'm not entirely clear which assertion failed. Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem") Reported-by: Lei Xue <carmark.dlut@gmail.com> Reported-by: Vegard Nossum <vegard.nossum@gmail.com> Reported-by: Anthony DeRobertis <aderobertis@metrics.net> Reported-by: NeilBrown <neilb@suse.com> Reported-by: Daniel Axtens <dja@axtens.net> Reported-by: Kiran Kumar Modukuri <kiran.modukuri@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Daniel Axtens <dja@axtens.net>
2018-07-25fscache: Allow cancelled operations to be enqueuedKiran Kumar Modukuri1-2/+4
Alter the state-check assertion in fscache_enqueue_operation() to allow cancelled operations to be given processing time so they can be cleaned up. Also fix a debugging statement that was requiring such operations to have an object assigned. Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem") Reported-by: Kiran Kumar Modukuri <kiran.modukuri@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com>
2018-07-24Merge tag 'mips_fixes_4.18_4' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Paul Burton: "A couple more MIPS fixes for 4.18: - Fix an off-by-one in reporting PCI resource sizes to userland which regressed in v3.12. - Fix writes to DDR controller registers used to flush write buffers, which regressed with some refactoring in v4.2" * tag 'mips_fixes_4.18_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: ath79: fix register address in ath79_ddr_wb_flush() MIPS: Fix off-by-one in pci_resource_to_user()
2018-07-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds75-562/+1029
Pull networking fixes from David Miller: 1) Handle stations tied to AP_VLANs properly during mac80211 hw reconfig. From Manikanta Pubbisetty. 2) Fix jump stack depth validation in nf_tables, from Taehee Yoo. 3) Fix quota handling in aRFS flow expiration of mlx5 driver, from Eran Ben Elisha. 4) Exit path handling fix in powerpc64 BPF JIT, from Daniel Borkmann. 5) Use ptr_ring_consume_bh() in page pool code, from Tariq Toukan. 6) Fix cached netdev name leak in nf_tables, from Florian Westphal. 7) Fix memory leaks on chain rename, also from Florian Westphal. 8) Several fixes to DCTCP congestion control ACK handling, from Yuchunk Cheng. 9) Missing rcu_read_unlock() in CAIF protocol code, from Yue Haibing. 10) Fix link local address handling with VRF, from David Ahern. 11) Don't clobber 'err' on a successful call to __skb_linearize() in skb_segment(). From Eric Dumazet. 12) Fix vxlan fdb notification races, from Roopa Prabhu. 13) Hash UDP fragments consistently, from Paolo Abeni. 14) If TCP receives lots of out of order tiny packets, we do really silly stuff. Make the out-of-order queue ending more robust to this kind of behavior, from Eric Dumazet. 15) Don't leak netlink dump state in nf_tables, from Florian Westphal. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits) net: axienet: Fix double deregister of mdio qmi_wwan: fix interface number for DW5821e production firmware ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull bnx2x: Fix invalid memory access in rss hash config path. net/mlx4_core: Save the qpn from the input modifier in RST2INIT wrapper r8169: restore previous behavior to accept BIOS WoL settings cfg80211: never ignore user regulatory hint sock: fix sg page frag coalescing in sk_alloc_sg netfilter: nf_tables: move dumper state allocation into ->start tcp: add tcp_ooo_try_coalesce() helper tcp: call tcp_drop() from tcp_data_queue_ofo() tcp: detect malicious patterns in tcp_collapse_ofo_queue() tcp: avoid collapses in tcp_prune_queue() if possible tcp: free batches of packets in tcp_prune_ofo_queue() ip: hash fragments consistently ipv6: use fib6_info_hold_safe() when necessary can: xilinx_can: fix power management handling can: xilinx_can: fix incorrect clear of non-processed interrupts can: xilinx_can: fix RX overflow interrupt not being enabled can: xilinx_can: keep only 1-2 frames in TX FIFO to fix TX accounting ...
2018-07-24net: axienet: Fix double deregister of mdioShubhrajyoti Datta1-0/+1
If the registration fails then mdio_unregister is called. However at unbind the unregister ia attempted again resulting in the below crash [ 73.544038] kernel BUG at drivers/net/phy/mdio_bus.c:415! [ 73.549362] Internal error: Oops - BUG: 0 [#1] SMP [ 73.554127] Modules linked in: [ 73.557168] CPU: 0 PID: 2249 Comm: sh Not tainted 4.14.0 #183 [ 73.562895] Hardware name: xlnx,zynqmp (DT) [ 73.567062] task: ffffffc879e41180 task.stack: ffffff800cbe0000 [ 73.572973] PC is at mdiobus_unregister+0x84/0x88 [ 73.577656] LR is at axienet_mdio_teardown+0x18/0x30 [ 73.582601] pc : [<ffffff80085fa4cc>] lr : [<ffffff8008616858>] pstate: 20000145 [ 73.589981] sp : ffffff800cbe3c30 [ 73.593277] x29: ffffff800cbe3c30 x28: ffffffc879e41180 [ 73.598573] x27: ffffff8008a21000 x26: 0000000000000040 [ 73.603868] x25: 0000000000000124 x24: ffffffc879efe920 [ 73.609164] x23: 0000000000000060 x22: ffffffc879e02000 [ 73.614459] x21: ffffffc879e02800 x20: ffffffc87b0b8870 [ 73.619754] x19: ffffffc879e02800 x18: 000000000000025d [ 73.625050] x17: 0000007f9a719ad0 x16: ffffff8008195bd8 [ 73.630345] x15: 0000007f9a6b3d00 x14: 0000000000000010 [ 73.635640] x13: 74656e7265687465 x12: 0000000000000030 [ 73.640935] x11: 0000000000000030 x10: 0101010101010101 [ 73.646231] x9 : 241f394f42533300 x8 : ffffffc8799f6e98 [ 73.651526] x7 : ffffffc8799f6f18 x6 : ffffffc87b0ba318 [ 73.656822] x5 : ffffffc87b0ba498 x4 : 0000000000000000 [ 73.662117] x3 : 0000000000000000 x2 : 0000000000000008 [ 73.667412] x1 : 0000000000000004 x0 : ffffffc8799f4000 [ 73.672708] Process sh (pid: 2249, stack limit = 0xffffff800cbe0000) Fix the same by making the bus NULL on unregister. Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24qmi_wwan: fix interface number for DW5821e production firmwareAleksander Morgado1-1/+1
The original mapping for the DW5821e was done using a development version of the firmware. Confirmed with the vendor that the final USB layout ends up exposing the QMI control/data ports in USB config #1, interface #0, not in interface #1 (which is now a HID interface). T: Bus=01 Lev=03 Prnt=04 Port=00 Cnt=01 Dev#= 16 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 2 P: Vendor=413c ProdID=81d7 Rev=03.18 S: Manufacturer=DELL S: Product=DW5821e Snapdragon X20 LTE S: SerialNumber=0123456789ABCDEF C: #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I: If#= 1 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=00 Driver=usbhid I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option Fixes: e7e197edd09c25 ("qmi_wwan: add support for the Dell Wireless 5821e module") Signed-off-by: Aleksander Morgado <aleksander@aleksander.es> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pullWillem de Bruijn2-4/+10
Syzbot reported a read beyond the end of the skb head when returning IPV6_ORIGDSTADDR: BUG: KMSAN: kernel-infoleak in put_cmsg+0x5ef/0x860 net/core/scm.c:242 CPU: 0 PID: 4501 Comm: syz-executor128 Not tainted 4.17.0+ #9 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:113 kmsan_report+0x188/0x2a0 mm/kmsan/kmsan.c:1125 kmsan_internal_check_memory+0x138/0x1f0 mm/kmsan/kmsan.c:1219 kmsan_copy_to_user+0x7a/0x160 mm/kmsan/kmsan.c:1261 copy_to_user include/linux/uaccess.h:184 [inline] put_cmsg+0x5ef/0x860 net/core/scm.c:242 ip6_datagram_recv_specific_ctl+0x1cf3/0x1eb0 net/ipv6/datagram.c:719 ip6_datagram_recv_ctl+0x41c/0x450 net/ipv6/datagram.c:733 rawv6_recvmsg+0x10fb/0x1460 net/ipv6/raw.c:521 [..] This logic and its ipv4 counterpart read the destination port from the packet at skb_transport_offset(skb) + 4. With MSG_MORE and a local SOCK_RAW sender, syzbot was able to cook a packet that stores headers exactly up to skb_transport_offset(skb) in the head and the remainder in a frag. Call pskb_may_pull before accessing the pointer to ensure that it lies in skb head. Link: http://lkml.kernel.org/r/CAF=yD-LEJwZj5a1-bAAj2Oy_hKmGygV6rsJ_WOrAYnv-fnayiQ@mail.gmail.com Reported-by: syzbot+9adb4b567003cac781f0@syzkaller.appspotmail.com Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24bnx2x: Fix invalid memory access in rss hash config path.Sudarsana Reddy Kalluru1-3/+10
Rx hash/filter table configuration uses rss_conf_obj to configure filters in the hardware. This object is initialized only when the interface is brought up. This patch adds driver changes to configure rss params only when the device is in opened state. In port disabled case, the config will be cached in the driver structure which will be applied in the successive load path. Please consider applying it to 'net' branch. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24net/mlx4_core: Save the qpn from the input modifier in RST2INIT wrapperJack Morgenstein1-1/+1
Function mlx4_RST2INIT_QP_wrapper saved the qp number passed in the qp context, rather than the one passed in the input modifier. However, the qp number in the qp context is not defined as a required parameter by the FW. Therefore, drivers may choose to not specify the qp number in the qp context for the reset-to-init transition. Thus, we must save the qp number passed in the command input modifier -- which is always present. (This saved qp number is used as the input modifier for command 2RST_QP when a slave's qp's are destroyed). Fixes: c82e9aa0a8bc ("mlx4_core: resource tracking for HCA resources used by guests") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24r8169: restore previous behavior to accept BIOS WoL settingsHeiner Kallweit1-2/+1
Commit 7edf6d314cd0 tried to resolve an inconsistency (BIOS WoL settings are accepted, but device isn't wakeup-enabled) resulting from a previous broken-BIOS workaround by making disabled WoL the default. This however had some side effects, most likely due to a broken BIOS some systems don't properly resume from suspend when the MagicPacket WoL bit isn't set in the chip, see https://bugzilla.kernel.org/show_bug.cgi?id=200195 Therefore restore the WoL behavior from 4.16. Reported-by: Albert Astals Cid <aacid@kde.org> Fixes: 7edf6d314cd0 ("r8169: disable WOL per default") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24Merge branch 'for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fix from Martin Schwidefsky. Guenter Roeck reports that the s390 allmodconfig build fails because of a gcc plugin problem. The fix won't be in-tree until 4.19, so for now disable the gcc plugins on s390. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: disable gcc plugins
2018-07-24media: staging: omap4iss: Include asm/cacheflush.h after generic includesGuenter Roeck1-1/+2
Including asm/cacheflush.h first results in the following build error when trying to build sparc32:allmodconfig, because 'struct page' has not been declared, and the function declaration ends up creating a separate (private) declaration of struct page (as a result of function arguments being in the scope of the function declaration and definition, not in global scope). The C scoping rules do not just affect variable visibility, they also affect type declaration visibility. The end result is that when the actual call site is seen in <linux/highmem.h>, the 'struct page' type in the caller is not the same 'struct page' that the function was declared with, resulting in: In file included from arch/sparc/include/asm/page.h:10:0, ... from drivers/staging/media/omap4iss/iss_video.c:15: include/linux/highmem.h: In function 'clear_user_highpage': include/linux/highmem.h:137:31: error: passing argument 1 of 'sparc_flush_page_to_ram' from incompatible pointer type Include generic includes files first to fix the problem. Fixes: fc96d58c10162 ("[media] v4l: omap4iss: Add support for OMAP4 camera interface - Video devices") Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> [ Added explanation of C scope rules - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller7-150/+191
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Make sure we don't go over the maximum jump stack boundary, from Taehee Yoo. 2) Missing rcu_barrier() in hash and rbtree sets, also from Taehee. 3) Missing check to nul-node in rbtree timeout routine, from Taehee. 4) Use dev->name from flowtable to fix a memleak, from Florian. 5) Oneliner to free flowtable object on removal, from Florian. 6) Memleak in chain rename transaction, again from Florian. 7) Don't allow two chains to use the same name in the same transaction, from Florian. 8) handle DCCP SYNC/SYNCACK as invalid, this triggers an uninitialized timer in conntrack reported by syzbot, from Florian. 9) Fix leak in case netlink_dump_start() fails, from Florian. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24Merge tag 'mac80211-for-davem-2018-07-24' of ↵David S. Miller6-53/+38
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Only a few fixes: * always keep regulatory user hint * add missing break statement in station flags parsing * fix non-linear SKBs in port-control-over-nl80211 * reconfigure VLAN stations during HW restart ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24cfg80211: never ignore user regulatory hintAmar Singhal1-25/+3
Currently user regulatory hint is ignored if all wiphys in the system are self managed. But the hint is not ignored if there is no wiphy in the system. This affects the global regulatory setting. Global regulatory setting needs to be maintained so that it can be applied to a new wiphy entering the system. Therefore, do not ignore user regulatory setting even if all wiphys in the system are self managed. Signed-off-by: Amar Singhal <asinghal@codeaurora.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-07-24s390: disable gcc pluginsMartin Schwidefsky1-1/+1
The s390 build currently fails with the latent entropy plugin: arch/s390/kernel/als.o: In function `verify_facilities': als.c:(.init.text+0x24): undefined reference to `latent_entropy' als.c:(.init.text+0xae): undefined reference to `latent_entropy' make[3]: *** [arch/s390/boot/compressed/vmlinux] Error 1 make[2]: *** [arch/s390/boot/compressed/vmlinux] Error 2 make[1]: *** [bzImage] Error 2 This will be fixed with the early boot rework from Vasily, which is planned for the 4.19 merge window. For 4.18 the simplest solution is to disable the gcc plugins and reenable them after the early boot rework is upstream. Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-07-23sock: fix sg page frag coalescing in sk_alloc_sgDaniel Borkmann1-3/+3
Current sg coalescing logic in sk_alloc_sg() (latter is used by tls and sockmap) is not quite correct in that we do fetch the previous sg entry, however the subsequent check whether the refilled page frag from the socket is still the same as from the last entry with prior offset and length matching the start of the current buffer is comparing always the first sg list entry instead of the prior one. Fixes: 3c4d7559159b ("tls: kernel TLS support") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24netfilter: nf_tables: move dumper state allocation into ->startFlorian Westphal1-104/+115
Shaochun Chen points out we leak dumper filter state allocations stored in dump_control->data in case there is an error before netlink sets cb_running (after which ->done will be called at some point). In order to fix this, add .start functions and do the allocations there. ->done is going to clean up, and in case error occurs before ->start invocation no cleanups need to be done anymore. Reported-by: shaochun chen <cscnull@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-23Merge branch 'tcp-robust-ooo'David S. Miller1-12/+50
Eric Dumazet says: ==================== Juha-Matti Tilli reported that malicious peers could inject tiny packets in out_of_order_queue, forcing very expensive calls to tcp_collapse_ofo_queue() and tcp_prune_ofo_queue() for every incoming packet. With tcp_rmem[2] default of 6MB, the ooo queue could contain ~7000 nodes. This patch series makes sure we cut cpu cycles enough to render the attack not critical. We might in the future go further, like disconnecting or black-holing proven malicious flows. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23tcp: add tcp_ooo_try_coalesce() helperEric Dumazet1-4/+21
In case skb in out_or_order_queue is the result of multiple skbs coalescing, we would like to get a proper gso_segs counter tracking, so that future tcp_drop() can report an accurate number. I chose to not implement this tracking for skbs in receive queue, since they are not dropped, unless socket is disconnected. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23tcp: call tcp_drop() from tcp_data_queue_ofo()Eric Dumazet1-2/+2
In order to be able to give better diagnostics and detect malicious traffic, we need to have better sk->sk_drops tracking. Fixes: 9f5afeae5152 ("tcp: use an RB tree for ooo receive queue") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23tcp: detect malicious patterns in tcp_collapse_ofo_queue()Eric Dumazet1-2/+13
In case an attacker feeds tiny packets completely out of order, tcp_collapse_ofo_queue() might scan the whole rb-tree, performing expensive copies, but not changing socket memory usage at all. 1) Do not attempt to collapse tiny skbs. 2) Add logic to exit early when too many tiny skbs are detected. We prefer not doing aggressive collapsing (which copies packets) for pathological flows, and revert to tcp_prune_ofo_queue() which will be less expensive. In the future, we might add the possibility of terminating flows that are proven to be malicious. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23tcp: avoid collapses in tcp_prune_queue() if possibleEric Dumazet1-0/+3
Right after a TCP flow is created, receiving tiny out of order packets allways hit the condition : if (atomic_read(&sk->sk_rmem_alloc) >= sk->sk_rcvbuf) tcp_clamp_window(sk); tcp_clamp_window() increases sk_rcvbuf to match sk_rmem_alloc (guarded by tcp_rmem[2]) Calling tcp_collapse_ofo_queue() in this case is not useful, and offers a O(N^2) surface attack to malicious peers. Better not attempt anything before full queue capacity is reached, forcing attacker to spend lots of resource and allow us to more easily detect the abuse. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23tcp: free batches of packets in tcp_prune_ofo_queue()Eric Dumazet1-4/+11
Juha-Matti Tilli reported that malicious peers could inject tiny packets in out_of_order_queue, forcing very expensive calls to tcp_collapse_ofo_queue() and tcp_prune_ofo_queue() for every incoming packet. out_of_order_queue rb-tree can contain thousands of nodes, iterating over all of them is not nice. Before linux-4.9, we would have pruned all packets in ofo_queue in one go, every XXXX packets. XXXX depends on sk_rcvbuf and skbs truesize, but is about 7000 packets with tcp_rmem[2] default of 6 MB. Since we plan to increase tcp_rmem[2] in the future to cope with modern BDP, can not revert to the old behavior, without great pain. Strategy taken in this patch is to purge ~12.5 % of the queue capacity. Fixes: 36a6503fedda ("tcp: refine tcp_prune_ofo_queue() to not drop all packets") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Juha-Matti Tilli <juha-matti.tilli@iki.fi> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23ip: hash fragments consistentlyPaolo Abeni2-0/+4
The skb hash for locally generated ip[v6] fragments belonging to the same datagram can vary in several circumstances: * for connected UDP[v6] sockets, the first fragment get its hash via set_owner_w()/skb_set_hash_from_sk() * for unconnected IPv6 UDPv6 sockets, the first fragment can get its hash via ip6_make_flowlabel()/skb_get_hash_flowi6(), if auto_flowlabel is enabled For the following frags the hash is usually computed via skb_get_hash(). The above can cause OoO for unconnected IPv6 UDPv6 socket: in that scenario the egress tx queue can be selected on a per packet basis via the skb hash. It may also fool flow-oriented schedulers to place fragments belonging to the same datagram in different flows. Fix the issue by copying the skb hash from the head frag into the others at fragmentation time. Before this commit: perf probe -a "dev_queue_xmit skb skb->hash skb->l4_hash:b1@0/8 skb->sw_hash:b1@1/8" netperf -H $IPV4 -t UDP_STREAM -l 5 -- -m 2000 -n & perf record -e probe:dev_queue_xmit -e probe:skb_set_owner_w -a sleep 0.1 perf script probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=3713014309 l4_hash=1 sw_hash=0 probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=0 l4_hash=0 sw_hash=0 After this commit: probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0 probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0 Fixes: b73c3d0e4f0e ("net: Save TX flow hash in sock and set in skbuf on xmit") Fixes: 67800f9b1f4e ("ipv6: Call skb_get_hash_flowi6 to get skb->hash in ip6_make_flowlabel") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23ipv6: use fib6_info_hold_safe() when necessaryWei Wang3-11/+38
In the code path where only rcu read lock is held, e.g. in the route lookup code path, it is not safe to directly call fib6_info_hold() because the fib6_info may already have been deleted but still exists in the rcu grace period. Holding reference to it could cause double free and crash the kernel. This patch adds a new function fib6_info_hold_safe() and replace fib6_info_hold() in all necessary places. Syzbot reported 3 crash traces because of this. One of them is: 8021q: adding VLAN 0 to HW filter on device team0 IPv6: ADDRCONF(NETDEV_CHANGE): team0: link becomes ready dst_release: dst:(____ptrval____) refcnt:-1 dst_release: dst:(____ptrval____) refcnt:-2 WARNING: CPU: 1 PID: 4845 at include/net/dst.h:239 dst_hold include/net/dst.h:239 [inline] WARNING: CPU: 1 PID: 4845 at include/net/dst.h:239 ip6_setup_cork+0xd66/0x1830 net/ipv6/ip6_output.c:1204 dst_release: dst:(____ptrval____) refcnt:-1 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 4845 Comm: syz-executor493 Not tainted 4.18.0-rc3+ #10 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113 panic+0x238/0x4e7 kernel/panic.c:184 dst_release: dst:(____ptrval____) refcnt:-2 dst_release: dst:(____ptrval____) refcnt:-3 __warn.cold.8+0x163/0x1ba kernel/panic.c:536 dst_release: dst:(____ptrval____) refcnt:-4 report_bug+0x252/0x2d0 lib/bug.c:186 fixup_bug arch/x86/kernel/traps.c:178 [inline] do_error_trap+0x1fc/0x4d0 arch/x86/kernel/traps.c:296 dst_release: dst:(____ptrval____) refcnt:-5 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:316 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:992 RIP: 0010:dst_hold include/net/dst.h:239 [inline] RIP: 0010:ip6_setup_cork+0xd66/0x1830 net/ipv6/ip6_output.c:1204 Code: c1 ed 03 89 9d 18 ff ff ff 48 b8 00 00 00 00 00 fc ff df 41 c6 44 05 00 f8 e9 2d 01 00 00 4c 8b a5 c8 fe ff ff e8 1a f6 e6 fa <0f> 0b e9 6a fc ff ff e8 0e f6 e6 fa 48 8b 85 d0 fe ff ff 48 8d 78 RSP: 0018:ffff8801a8fcf178 EFLAGS: 00010293 RAX: ffff8801a8eba5c0 RBX: 0000000000000000 RCX: ffffffff869511e6 RDX: 0000000000000000 RSI: ffffffff869515b6 RDI: 0000000000000005 RBP: ffff8801a8fcf2c8 R08: ffff8801a8eba5c0 R09: ffffed0035ac8338 R10: ffffed0035ac8338 R11: ffff8801ad6419c3 R12: ffff8801a8fcf720 R13: ffff8801a8fcf6a0 R14: ffff8801ad6419c0 R15: ffff8801ad641980 ip6_make_skb+0x2c8/0x600 net/ipv6/ip6_output.c:1768 udpv6_sendmsg+0x2c90/0x35f0 net/ipv6/udp.c:1376 inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798 sock_sendmsg_nosec net/socket.c:641 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:651 ___sys_sendmsg+0x51d/0x930 net/socket.c:2125 __sys_sendmmsg+0x240/0x6f0 net/socket.c:2220 __do_sys_sendmmsg net/socket.c:2249 [inline] __se_sys_sendmmsg net/socket.c:2246 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2246 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x446ba9 Code: e8 cc bb 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fb39a469da8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133 RAX: ffffffffffffffda RBX: 00000000006dcc54 RCX: 0000000000446ba9 RDX: 00000000000000b8 RSI: 0000000020001b00 RDI: 0000000000000003 RBP: 00000000006dcc50 R08: 00007fb39a46a700 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 45c828efc7a64843 R13: e6eeb815b9d8a477 R14: 5068caf6f713c6fc R15: 0000000000000001 Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: disabled Rebooting in 86400 seconds.. Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes") Reported-by: syzbot+902e2a1bcd4f7808cef5@syzkaller.appspotmail.com Reported-by: syzbot+8ae62d67f647abeeceb9@syzkaller.appspotmail.com Reported-by: syzbot+3f08feb14086930677d0@syzkaller.appspotmail.com Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23Merge tag 'linux-can-fixes-for-4.18-20180723' of ↵David S. Miller4-113/+321
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2018-07-23 this is a pull request of 12 patches for net/master. The patch by Stephane Grosjean for the peak_canfd CAN driver fixes a problem with older firmware. The next patch is by Roman Fietze and fixes the setup of the CCCR register in the m_can driver. Nicholas Mc Guire's patch for the mpc5xxx_can driver adds missing error checking. The two patches by Faiz Abbas fix the runtime resume and clean up the probe function in the m_can driver. The last 7 patches by Anssi Hannula fix several problem in the xilinx_can driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23can: xilinx_can: fix power management handlingAnssi Hannula1-41/+28
There are several issues with the suspend/resume handling code of the driver: - The device is attached and detached in the runtime_suspend() and runtime_resume() callbacks if the interface is running. However, during xcan_chip_start() the interface is considered running, causing the resume handler to incorrectly call netif_start_queue() at the beginning of xcan_chip_start(), and on xcan_chip_start() error return the suspend handler detaches the device leaving the user unable to bring-up the device anymore. - The device is not brought properly up on system resume. A reset is done and the code tries to determine the bus state after that. However, after reset the device is always in Configuration mode (down), so the state checking code does not make sense and communication will also not work. - The suspend callback tries to set the device to sleep mode (low-power mode which monitors the bus and brings the device back to normal mode on activity), but then immediately disables the clocks (possibly before the device reaches the sleep mode), which does not make sense to me. If a clean shutdown is wanted before disabling clocks, we can just bring it down completely instead of only sleep mode. Reorganize the PM code so that only the clock logic remains in the runtime PM callbacks and the system PM callbacks contain the device bring-up/down logic. This makes calling the runtime PM callbacks during e.g. xcan_chip_start() safe. The system PM callbacks now simply call common code to start/stop the HW if the interface was running, replacing the broken code from before. xcan_chip_stop() is updated to use the common reset code so that it will wait for the reset to complete. Reset also disables all interrupts so do not do that separately. Also, the device_may_wakeup() checks are removed as the driver does not have wakeup support. Tested on Zynq-7000 integrated CAN. Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: Michal Simek <michal.simek@xilinx.com> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-07-23can: xilinx_can: fix incorrect clear of non-processed interruptsAnssi Hannula1-5/+5
xcan_interrupt() clears ERROR|RXOFLV|BSOFF|ARBLST interrupts if any of them is asserted. This does not take into account that some of them could have been asserted between interrupt status read and interrupt clear, therefore clearing them without handling them. Fix the code to only clear those interrupts that it knows are asserted and therefore going to be processed in xcan_err_interrupt(). Fixes: b1201e44f50b ("can: xilinx CAN controller support") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: Michal Simek <michal.simek@xilinx.com> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-07-23can: xilinx_can: fix RX overflow interrupt not being enabledAnssi Hannula1-1/+1
RX overflow interrupt (RXOFLW) is disabled even though xcan_interrupt() processes it. This means that an RX overflow interrupt will only be processed when another interrupt gets asserted (e.g. for RX/TX). Fix that by enabling the RXOFLW interrupt. Fixes: b1201e44f50b ("can: xilinx CAN controller support") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: Michal Simek <michal.simek@xilinx.com> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-07-23can: xilinx_can: keep only 1-2 frames in TX FIFO to fix TX accountingAnssi Hannula1-16/+123
The xilinx_can driver assumes that the TXOK interrupt only clears after it has been acknowledged as many times as there have been successfully sent frames. However, the documentation does not mention such behavior, instead saying just that the interrupt is cleared when the clear bit is set. Similarly, testing seems to also suggest that it is immediately cleared regardless of the amount of frames having been sent. Performing some heavy TX load and then going back to idle has the tx_head drifting further away from tx_tail over time, steadily reducing the amount of frames the driver keeps in the TX FIFO (but not to zero, as the TXOK interrupt always frees up space for 1 frame from the driver's perspective, so frames continue to be sent) and delaying the local echo frames. The TX FIFO tracking is also otherwise buggy as it does not account for TX FIFO being cleared after software resets, causing BUG!, TX FIFO full when queue awake! messages to be output. There does not seem to be any way to accurately track the state of the TX FIFO for local echo support while using the full TX FIFO. The Zynq version of the HW (but not the soft-AXI version) has watermark programming support and with it an additional TX-FIFO-empty interrupt bit. Modify the driver to only put 1 frame into TX FIFO at a time on soft-AXI and 2 frames at a time on Zynq. On Zynq the TXFEMP interrupt bit is used to detect whether 1 or 2 frames have been sent at interrupt processing time. Tested with the integrated CAN on Zynq-7000 SoC. The 1-frame-FIFO mode was also tested. An alternative way to solve this would be to drop local echo support but keep using the full TX FIFO. v2: Add FIFO space check before TX queue wake with locking to synchronize with queue stop. This avoids waking the queue when xmit() had just filled it. v3: Keep local echo support and reduce the amount of frames in FIFO instead as suggested by Marc Kleine-Budde. Fixes: b1201e44f50b ("can: xilinx CAN controller support") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-07-23can: xilinx_can: fix recovery from error states not being propagatedAnssi Hannula1-28/+127
The xilinx_can driver contains no mechanism for propagating recovery from CAN_STATE_ERROR_WARNING and CAN_STATE_ERROR_PASSIVE. Add such a mechanism by factoring the handling of XCAN_STATE_ERROR_PASSIVE and XCAN_STATE_ERROR_WARNING out of xcan_err_interrupt and checking for recovery after RX and TX if the interface is in one of those states. Tested with the integrated CAN on Zynq-7000 SoC. Fixes: b1201e44f50b ("can: xilinx CAN controller support") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-07-23can: xilinx_can: fix RX loop if RXNEMP is asserted without RXOKAnssi Hannula1-13/+5
If the device gets into a state where RXNEMP (RX FIFO not empty) interrupt is asserted without RXOK (new frame received successfully) interrupt being asserted, xcan_rx_poll() will continue to try to clear RXNEMP without actually reading frames from RX FIFO. If the RX FIFO is not empty, the interrupt will not be cleared and napi_schedule() will just be called again. This situation can occur when: (a) xcan_rx() returns without reading RX FIFO due to an error condition. The code tries to clear both RXOK and RXNEMP but RXNEMP will not clear due to a frame still being in the FIFO. The frame will never be read from the FIFO as RXOK is no longer set. (b) A frame is received between xcan_rx_poll() reading interrupt status and clearing RXOK. RXOK will be cleared, but RXNEMP will again remain set as the new message is still in the FIFO. I'm able to trigger case (b) by flooding the bus with frames under load. There does not seem to be any benefit in using both RXNEMP and RXOK in the way the driver does, and the polling example in the reference manual (UG585 v1.10 18.3.7 Read Messages from RxFIFO) also says that either RXOK or RXNEMP can be used for detecting incoming messages. Fix the issue and simplify the RX processing by only using RXNEMP without RXOK. Tested with the integrated CAN on Zynq-7000 SoC. Fixes: b1201e44f50b ("can: xilinx CAN controller support") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-07-23can: xilinx_can: fix device dropping off bus on RX overrunAnssi Hannula1-1/+0
The xilinx_can driver performs a software reset when an RX overrun is detected. This causes the device to enter Configuration mode where no messages are received or transmitted. The documentation does not mention any need to perform a reset on an RX overrun, and testing by inducing an RX overflow also indicated that the device continues to work just fine without a reset. Remove the software reset. Tested with the integrated CAN on Zynq-7000 SoC. Fixes: b1201e44f50b ("can: xilinx CAN controller support") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-07-23can: m_can: Move accessing of message ram to after clocks are enabledFaiz Abbas1-4/+3
MCAN message ram should only be accessed once clocks are enabled. Therefore, move the call to parse/init the message ram to after clocks are enabled. Signed-off-by: Faiz Abbas <faiz_abbas@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-07-23can: m_can: Fix runtime resume callFaiz Abbas1-4/+4
pm_runtime_get_sync() returns a 1 if the state of the device is already 'active'. This is not a failure case and should return a success. Therefore fix error handling for pm_runtime_get_sync() call such that it returns success when the value is 1. Also cleanup the TODO for using runtime PM for sleep mode as that is implemented. Signed-off-by: Faiz Abbas <faiz_abbas@ti.com> Cc: <stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-07-23can: mpc5xxx_can: check of_iomap return before useNicholas Mc Guire1-0/+5
of_iomap() can return NULL so that return needs to be checked and NULL treated as failure. While at it also take care of the missing of_node_put() in the error path. Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Fixes: commit afa17a500a36 ("net/can: add driver for mscan family & mpc52xx_mscan") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-07-23can: m_can.c: fix setup of CCCR register: clear CCCR NISO bit before ↵Roman Fietze1-1/+2
checking can.ctrlmode Inside m_can_chip_config(), when setting up the new value of the CCCR, the CCCR_NISO bit is not cleared like the others, CCCR_TEST, CCCR_MON, CCCR_BRSE and CCCR_FDOE, before checking the can.ctrlmode bits for CAN_CTRLMODE_FD_NON_ISO. This way once the controller was configured for CAN_CTRLMODE_FD_NON_ISO, this mode could never be cleared again. This fix is only relevant for controllers with version 3.1.x or 3.2.x. Older versions do not support NISO. Signed-off-by: Roman Fietze <roman.fietze@telemotive.de> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-07-23can: peak_canfd: fix firmware < v3.3.0: limit allocation to 32-bit DMA addr onlyStephane Grosjean1-0/+19
The DMA logic in firmwares < v3.3.0 embedded in the PCAN-PCIe FD cards family is not capable of handling a mix of 32-bit and 64-bit logical addresses. If the board is equipped with 2 or 4 CAN ports, then such a situation might lead to a PCIe Bus Error "Malformed TLP" packet as well as "irq xx: nobody cared" issue. This patch adds a workaround that requests only 32-bit DMA addresses when these might be allocated outside of the 4 GB area. This issue has been fixed in firmware v3.3.0 and next. Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-07-22Linux 4.18-rc6v4.18-rc6Linus Torvalds1-1/+1
2018-07-22Merge tag 'nvme-for-4.18' of git://git.infradead.org/nvmeLinus Torvalds2-34/+41
Pull NVMe fixes from Christoph Hellwig: - fix a regression in 4.18 that causes a memory leak on probe failure (Keith Bush) - fix a deadlock in the passthrough ioctl code (Scott Bauer) - don't enable AENs if not supported (Weiping Zhang) - fix an old regression in metadata handling in the passthrough ioctl code (Roland Dreier) * tag 'nvme-for-4.18' of git://git.infradead.org/nvme: nvme: fix handling of metadata_len for NVME_IOCTL_IO_CMD nvme: don't enable AEN if not supported nvme: ensure forward progress during Admin passthru nvme-pci: fix memory leak on probe failure
2018-07-22Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds8-29/+14
Pull vfs fixes from Al Viro: "Fix several places that screw up cleanups after failures halfway through opening a file (one open-coding filp_clone_open() and getting it wrong, two misusing alloc_file()). That part is -stable fodder from the 'work.open' branch. And Christoph's regression fix for uapi breakage in aio series; include/uapi/linux/aio_abi.h shouldn't be pulling in the kernel definition of sigset_t, the reason for doing so in the first place had been bogus - there's no need to expose struct __aio_sigset in aio_abi.h at all" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: aio: don't expose __aio_sigset in uapi ocxlflash_getfile(): fix double-iput() on alloc_file() failures cxl_getfile(): fix double-iput() on alloc_file() failures drm_mode_create_lease_ioctl(): fix open-coded filp_clone_open()
2018-07-22alpha: fix osf_wait4() breakageAl Viro2-5/+2
kernel_wait4() expects a userland address for status - it's only rusage that goes as a kernel one (and needs a copyout afterwards) [ Also, fix the prototype of kernel_wait4() to have that __user annotation - Linus ] Fixes: 92ebce5ac55d ("osf_wait4: switch to kernel_wait4()") Cc: stable@kernel.org # v4.13+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-22net: prevent ISA drivers from building on PPC32Randy Dunlap3-3/+4
Prevent drivers from building on PPC32 if they use isa_bus_to_virt(), isa_virt_to_bus(), or isa_page_to_bus(), which are not available and thus cause build errors. ../drivers/net/ethernet/3com/3c515.c: In function 'corkscrew_open': ../drivers/net/ethernet/3com/3c515.c:824:9: error: implicit declaration of function 'isa_virt_to_bus'; did you mean 'virt_to_bus'? [-Werror=implicit-function-declaration] ../drivers/net/ethernet/amd/lance.c: In function 'lance_rx': ../drivers/net/ethernet/amd/lance.c:1203:23: error: implicit declaration of function 'isa_bus_to_virt'; did you mean 'bus_to_virt'? [-Werror=implicit-function-declaration] ../drivers/net/ethernet/amd/ni65.c: In function 'ni65_init_lance': ../drivers/net/ethernet/amd/ni65.c:585:20: error: implicit declaration of function 'isa_virt_to_bus'; did you mean 'virt_to_bus'? [-Werror=implicit-function-declaration] ../drivers/net/ethernet/cirrus/cs89x0.c: In function 'net_open': ../drivers/net/ethernet/cirrus/cs89x0.c:897:20: error: implicit declaration of function 'isa_virt_to_bus'; did you mean 'virt_to_bus'? [-Werror=implicit-function-declaration] Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-22nfp: flower: ensure dead neighbour entries are not offloadedJohn Hurley1-1/+1
Previously only the neighbour state was checked to decide if an offloaded entry should be removed. However, there can be situations when the entry is dead but still marked as valid. This can lead to dead entries not being removed from fw tables or even incorrect data being added. Check the entry dead bit before deciding if it should be added to or removed from fw neighbour tables. Fixes: 8e6a9046b66a ("nfp: flower vxlan neighbour offload") Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-22Merge branch 'vxlan-fix-default-fdb-entry-user-space-notify-ordering-race'David S. Miller2-42/+93
Roopa Prabhu says: ==================== vxlan: fix default fdb entry user-space notify ordering/race Problem: In vxlan_newlink, a default fdb entry is added before register_netdev. The default fdb creation function notifies user-space of the fdb entry on the vxlan device which user-space does not know about yet. (RTM_NEWNEIGH goes before RTM_NEWLINK for the same ifindex). This series fixes the user-space netlink notification ordering issue with the following changes: - decouple fdb notify from fdb create. - Move fdb notify after register_netdev. - modify rtnl_configure_link to allow configuring a link early. - Call rtnl_configure_link in vxlan newlink handler to notify userspace about the newlink before fdb notify and hence avoiding the user-space race. ==================== Fixes: afbd8bae9c79 ("vxlan: add implicit fdb entry for default destination") Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
2018-07-22vxlan: fix default fdb entry netlink notify ordering during netdev createRoopa Prabhu1-8/+21
Problem: In vxlan_newlink, a default fdb entry is added before register_netdev. The default fdb creation function also notifies user-space of the fdb entry on the vxlan device which user-space does not know about yet. (RTM_NEWNEIGH goes before RTM_NEWLINK for the same ifindex). This patch fixes the user-space netlink notification ordering issue with the following changes: - decouple fdb notify from fdb create. - Move fdb notify after register_netdev. - Call rtnl_configure_link in vxlan newlink handler to notify userspace about the newlink before fdb notify and hence avoiding the user-space race. Fixes: afbd8bae9c79 ("vxlan: add implicit fdb entry for default destination") Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-22vxlan: make netlink notify in vxlan_fdb_destroy optionalRoopa Prabhu1-6/+8
Add a new option do_notify to vxlan_fdb_destroy to make sending netlink notify optional. Used by a later patch. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-22vxlan: add new fdb alloc and create helpersRoopa Prabhu1-29/+62
- Add new vxlan_fdb_alloc helper - rename existing vxlan_fdb_create into vxlan_fdb_update: because it really creates or updates an existing fdb entry - move new fdb creation into a separate vxlan_fdb_create Main motivation for this change is to introduce the ability to decouple vxlan fdb creation and notify, used in a later patch. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-22rtnetlink: add rtnl_link_state check in rtnl_configure_linkRoopa Prabhu1-3/+6
rtnl_configure_link sets dev->rtnl_link_state to RTNL_LINK_INITIALIZED and unconditionally calls __dev_notify_flags to notify user-space of dev flags. current call sequence for rtnl_configure_link rtnetlink_newlink rtnl_link_ops->newlink rtnl_configure_link (unconditionally notifies userspace of default and new dev flags) If a newlink handler wants to call rtnl_configure_link early, we will end up with duplicate notifications to user-space. This patch fixes rtnl_configure_link to check rtnl_link_state and call __dev_notify_flags with gchanges = 0 if already RTNL_LINK_INITIALIZED. Later in the series, this patch will help the following sequence where a driver implementing newlink can call rtnl_configure_link to initialize the link early. makes the following call sequence work: rtnetlink_newlink rtnl_link_ops->newlink (vxlan) -> rtnl_configure_link (initializes link and notifies user-space of default dev flags) rtnl_configure_link (updates dev flags if requested by user ifm and notifies user-space of new dev flags) Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-22atl1c: reserve min skb headroomFlorian Westphal1-0/+1
Got crash report with following backtrace: BUG: unable to handle kernel paging request at ffff8801869daffe RIP: 0010:[<ffffffff816429c4>] [<ffffffff816429c4>] ip6_finish_output2+0x394/0x4c0 RSP: 0018:ffff880186c83a98 EFLAGS: 00010283 RAX: ffff8801869db00e ... [<ffffffff81644cdc>] ip6_finish_output+0x8c/0xf0 [<ffffffff81644d97>] ip6_output+0x57/0x100 [<ffffffff81643dc9>] ip6_forward+0x4b9/0x840 [<ffffffff81645566>] ip6_rcv_finish+0x66/0xc0 [<ffffffff81645db9>] ipv6_rcv+0x319/0x530 [<ffffffff815892ac>] netif_receive_skb+0x1c/0x70 [<ffffffffc0060bec>] atl1c_clean+0x1ec/0x310 [atl1c] ... The bad access is in neigh_hh_output(), at skb->data - 16 (HH_DATA_MOD). atl1c driver provided skb with no headroom, so 14 bytes (ethernet header) got pulled, but then 16 are copied. Reserve NET_SKB_PAD bytes headroom, like netdev_alloc_skb(). Compile tested only; I lack hardware. Fixes: 7b7017642199 ("atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21multicast: do not restore deleted record source filter mode to new oneHangbin Liu2-4/+2
There are two scenarios that we will restore deleted records. The first is when device down and up(or unmap/remap). In this scenario the new filter mode is same with previous one. Because we get it from in_dev->mc_list and we do not touch it during device down and up. The other scenario is when a new socket join a group which was just delete and not finish sending status reports. In this scenario, we should use the current filter mode instead of restore old one. Here are 4 cases in total. old_socket new_socket before_fix after_fix IN(A) IN(A) ALLOW(A) ALLOW(A) IN(A) EX( ) TO_IN( ) TO_EX( ) EX( ) IN(A) TO_EX( ) ALLOW(A) EX( ) EX( ) TO_EX( ) TO_EX( ) Fixes: 24803f38a5c0b (igmp: do not remove igmp souce list info when set link down) Fixes: 1666d49e1d416 (mld: do not remove mld souce list info when set link down) Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21net: dsa: mv88e6xxx: fix races between lock and irq freeingUwe Kleine-König1-8/+13
free_irq() waits until all handlers for this IRQ have completed. As the relevant handler (mv88e6xxx_g1_irq_thread_fn()) takes the chip's reg_lock it might never return if the thread calling free_irq() holds this lock. For the same reason kthread_cancel_delayed_work_sync() in the polling case must not hold this lock. Also first free the irq (or stop the worker respectively) such that mv88e6xxx_g1_irq_thread_work() isn't called any more before the irq mappings are dropped in mv88e6xxx_g1_irq_free_common() to prevent the worker thread to call handle_nested_irq(0) which results in a NULL-pointer exception. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21net: skb_segment() should not return NULLEric Dumazet1-5/+5
syzbot caught a NULL deref [1], caused by skb_segment() skb_segment() has many "goto err;" that assume the @err variable contains -ENOMEM. A successful call to __skb_linearize() should not clear @err, otherwise a subsequent memory allocation error could return NULL. While we are at it, we might use -EINVAL instead of -ENOMEM when MAX_SKB_FRAGS limit is reached. [1] kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN CPU: 0 PID: 13285 Comm: syz-executor3 Not tainted 4.18.0-rc4+ #146 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:tcp_gso_segment+0x3dc/0x1780 net/ipv4/tcp_offload.c:106 Code: f0 ff ff 0f 87 1c fd ff ff e8 00 88 0b fb 48 8b 75 d0 48 b9 00 00 00 00 00 fc ff df 48 8d be 90 00 00 00 48 89 f8 48 c1 e8 03 <0f> b6 14 08 48 8d 86 94 00 00 00 48 89 c6 83 e0 07 48 c1 ee 03 0f RSP: 0018:ffff88019b7fd060 EFLAGS: 00010206 RAX: 0000000000000012 RBX: 0000000000000020 RCX: dffffc0000000000 RDX: 0000000000040000 RSI: 0000000000000000 RDI: 0000000000000090 RBP: ffff88019b7fd0f0 R08: ffff88019510e0c0 R09: ffffed003b5c46d6 R10: ffffed003b5c46d6 R11: ffff8801dae236b3 R12: 0000000000000001 R13: ffff8801d6c581f4 R14: 0000000000000000 R15: ffff8801d6c58128 FS: 00007fcae64d6700(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004e8664 CR3: 00000001b669b000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: tcp4_gso_segment+0x1c3/0x440 net/ipv4/tcp_offload.c:54 inet_gso_segment+0x64e/0x12d0 net/ipv4/af_inet.c:1342 inet_gso_segment+0x64e/0x12d0 net/ipv4/af_inet.c:1342 skb_mac_gso_segment+0x3b5/0x740 net/core/dev.c:2792 __skb_gso_segment+0x3c3/0x880 net/core/dev.c:2865 skb_gso_segment include/linux/netdevice.h:4099 [inline] validate_xmit_skb+0x640/0xf30 net/core/dev.c:3104 __dev_queue_xmit+0xc14/0x3910 net/core/dev.c:3561 dev_queue_xmit+0x17/0x20 net/core/dev.c:3602 neigh_hh_output include/net/neighbour.h:473 [inline] neigh_output include/net/neighbour.h:481 [inline] ip_finish_output2+0x1063/0x1860 net/ipv4/ip_output.c:229 ip_finish_output+0x841/0xfa0 net/ipv4/ip_output.c:317 NF_HOOK_COND include/linux/netfilter.h:276 [inline] ip_output+0x223/0x880 net/ipv4/ip_output.c:405 dst_output include/net/dst.h:444 [inline] ip_local_out+0xc5/0x1b0 net/ipv4/ip_output.c:124 iptunnel_xmit+0x567/0x850 net/ipv4/ip_tunnel_core.c:91 ip_tunnel_xmit+0x1598/0x3af1 net/ipv4/ip_tunnel.c:778 ipip_tunnel_xmit+0x264/0x2c0 net/ipv4/ipip.c:308 __netdev_start_xmit include/linux/netdevice.h:4148 [inline] netdev_start_xmit include/linux/netdevice.h:4157 [inline] xmit_one net/core/dev.c:3034 [inline] dev_hard_start_xmit+0x26c/0xc30 net/core/dev.c:3050 __dev_queue_xmit+0x29ef/0x3910 net/core/dev.c:3569 dev_queue_xmit+0x17/0x20 net/core/dev.c:3602 neigh_direct_output+0x15/0x20 net/core/neighbour.c:1403 neigh_output include/net/neighbour.h:483 [inline] ip_finish_output2+0xa67/0x1860 net/ipv4/ip_output.c:229 ip_finish_output+0x841/0xfa0 net/ipv4/ip_output.c:317 NF_HOOK_COND include/linux/netfilter.h:276 [inline] ip_output+0x223/0x880 net/ipv4/ip_output.c:405 dst_output include/net/dst.h:444 [inline] ip_local_out+0xc5/0x1b0 net/ipv4/ip_output.c:124 ip_queue_xmit+0x9df/0x1f80 net/ipv4/ip_output.c:504 tcp_transmit_skb+0x1bf9/0x3f10 net/ipv4/tcp_output.c:1168 tcp_write_xmit+0x1641/0x5c20 net/ipv4/tcp_output.c:2363 __tcp_push_pending_frames+0xb2/0x290 net/ipv4/tcp_output.c:2536 tcp_push+0x638/0x8c0 net/ipv4/tcp.c:735 tcp_sendmsg_locked+0x2ec5/0x3f00 net/ipv4/tcp.c:1410 tcp_sendmsg+0x2f/0x50 net/ipv4/tcp.c:1447 inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798 sock_sendmsg_nosec net/socket.c:641 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:651 __sys_sendto+0x3d7/0x670 net/socket.c:1797 __do_sys_sendto net/socket.c:1809 [inline] __se_sys_sendto net/socket.c:1805 [inline] __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1805 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x455ab9 Code: 1d ba fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb b9 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fcae64d5c68 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007fcae64d66d4 RCX: 0000000000455ab9 RDX: 0000000000000001 RSI: 0000000020000200 RDI: 0000000000000013 RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000014 R13: 00000000004c1145 R14: 00000000004d1818 R15: 0000000000000006 Modules linked in: Dumping ftrace buffer: (ftrace buffer empty) Fixes: ddff00d42043 ("net: Move skb_has_shared_frag check out of GRE code and into segmentation") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21net/ipv6: Fix linklocal to global address with VRFDavid Ahern3-4/+12
Example setup: host: ip -6 addr add dev eth1 2001:db8:104::4 where eth1 is enslaved to a VRF switch: ip -6 ro add 2001:db8:104::4/128 dev br1 where br1 only has an LLA ping6 2001:db8:104::4 ssh 2001:db8:104::4 (NOTE: UDP works fine if the PKTINFO has the address set to the global address and ifindex is set to the index of eth1 with a destination an LLA). For ICMP, icmp6_iif needs to be updated to check if skb->dev is an L3 master. If it is then return the ifindex from rt6i_idev similar to what is done for loopback. For TCP, restore the original tcp_v6_iif definition which is needed in most places and add a new tcp_v6_iif_l3_slave that considers the l3_slave variability. This latter check is only needed for socket lookups. Fixes: 9ff74384600a ("net: vrf: Handle ipv6 multicast and link-local addresses") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21Merge tag 'armsoc-fixes' of ↵Linus Torvalds3-7/+25
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: - Fix interrupt type on ethernet switch for i.MX-based RDU2 - GPC on i.MX exposed too large a register window which resulted in userspace being able to crash the machine. - Fixup of bad merge resolution moving GPIO DT nodes under pinctrl on droid4. * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: dts: imx6: RDU2: fix irq type for mv88e6xxx switch soc: imx: gpc: restrict register range for regmap access ARM: dts: omap4-droid4: fix dts w.r.t. pwm
2018-07-21Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds1-3/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Ingo Molnar: "A single fix for a MCE-polling regression, which prevented the disabling of polling" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/MCE: Remove min interval polling limitation
2018-07-21Merge branch 'x86-pti-urgent-for-linus' of ↵Linus Torvalds3-9/+10
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 pti fixes from Ingo Molnar: "An APM fix, and a BTS hardware-tracing fix related to PTI changes" * 'x86-pti-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apm: Don't access __preempt_count with zeroed fs x86/events/intel/ds: Fix bts_interrupt_threshold alignment
2018-07-21Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds2-2/+15
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Two fixes: a stop-machine preemption fix and a SCHED_DEADLINE fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/deadline: Fix switched_from_dl() warning stop_machine: Disable preemption when waking two stopper threads
2018-07-21Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds4-8/+84
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core kernel fixes from Ingo Molnar: "This is mostly the copy_to_user_mcsafe() related fixes from Dan Williams, and an ORC fix for Clang" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm/memcpy_mcsafe: Fix copy_to_user_mcsafe() exception handling lib/iov_iter: Fix pipe handling in _copy_to_iter_mcsafe() lib/iov_iter: Document _copy_to_iter_flushcache() lib/iov_iter: Document _copy_to_iter_mcsafe() objtool: Use '.strtab' if '.shstrtab' doesn't exist, to support ORC tables on Clang
2018-07-21Merge tag 'powerpc-4.18-4' of ↵Linus Torvalds8-14/+52
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Two regression fixes, one for xmon disassembly formatting and the other to fix the E500 build. Two commits to fix a potential security issue in the VFIO code under obscure circumstances. And finally a fix to the Power9 idle code to restore SPRG3, which is user visible and used for sched_getcpu(). Thanks to: Alexey Kardashevskiy, David Gibson. Gautham R. Shenoy, James Clarke" * tag 'powerpc-4.18-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/powernv: Fix save/restore of SPRG3 on entry/exit from stop (idle) powerpc/Makefile: Assemble with -me500 when building for E500 KVM: PPC: Check if IOMMU page is contained in the pinned physical page vfio/spapr: Use IOMMU pageshift rather than pagesize powerpc/xmon: Fix disassembly since printf changes
2018-07-21Merge tag 'for-4.18-rc5-tag' of ↵Linus Torvalds1-2/+5
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "A fix of a corruption regarding fsync and clone, under some very specific conditions explained in the patch. The fix is marked for stable 3.16+ so I'd like to get it merged now given the impact" * tag 'for-4.18-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: Btrfs: fix file data corruption after cloning a range and fsync
2018-07-21bpfilter: Fix mismatch in function argument typesYueHaibing1-3/+3
Fix following warning: net/ipv4/bpfilter/sockopt.c:28:5: error: symbol 'bpfilter_ip_set_sockopt' redeclared with different type net/ipv4/bpfilter/sockopt.c:34:5: error: symbol 'bpfilter_ip_get_sockopt' redeclared with different type Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21net: phy: consider PHY_IGNORE_INTERRUPT in phy_start_aneg_privHeiner Kallweit1-1/+1
The situation described in the comment can occur also with PHY_IGNORE_INTERRUPT, therefore change the condition to include it. Fixes: f555f34fdc58 ("net: phy: fix auto-negotiation stall due to unavailable interrupt") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21Merge branch 'qed-Fix-series-II'David S. Miller6-16/+27
Sudarsana Reddy Kalluru says: ==================== qed: Fix series II. The patch series fixes few issues in the qed driver. Please consider applying it to 'net' branch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21qed: Correct Multicast API to reflect existence of 256 approximate buckets.Sudarsana Reddy Kalluru5-13/+17
FW hsi contains 256 approximation buckets which are split in ramrod into eight u32 values, but driver is using eight 'unsigned long' variables. This patch fixes the mcast logic by making the API utilize u32. Fixes: 83aeb933 ("qed*: Trivial modifications") Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21qed: Fix possible race for the link state value.Sudarsana Reddy Kalluru1-0/+1
There's a possible race where driver can read link status in mid-transition and see that virtual-link is up yet speed is 0. Since in this mid-transition we're guaranteed to see a mailbox from MFW soon, we can afford to treat this as link down. Fixes: cc875c2e ("qed: Add link support") Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21qed: Fix link flap issue due to mismatching EEE capabilities.Sudarsana Reddy Kalluru1-3/+9
Apparently, MFW publishes EEE capabilities even for Fiber-boards that don't support them, and later since qed internally sets adv_caps it would cause link-flap avoidance (LFA) to fail when driver would initiate the link. This in turn delays the link, causing traffic to fail. Driver has been modified to not to ask MFW for any EEE config if EEE isn't to be enabled. Fixes: 645874e5 ("qed: Add support for Energy efficient ethernet.") Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21net: caif: Add a missing rcu_read_unlock() in caif_flow_cbYueHaibing1-1/+3
Add a missing rcu_read_unlock in the error path Fixes: c95567c80352 ("caif: added check for potential null return") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21mm: make vm_area_alloc() initialize core fieldsLinus Torvalds7-26/+17
Like vm_area_dup(), it initializes the anon_vma_chain head, and the basic mm pointer. The rest of the fields end up being different for different users, although the plan is to also initialize the 'vm_ops' field to a dummy entry. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-21mm: make vm_area_dup() actually copy the old vma dataLinus Torvalds3-11/+7
.. and re-initialize th eanon_vma_chain head. This removes some boiler-plate from the users, and also makes it clear why it didn't need use the 'zalloc()' version. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-21mm: use helper functions for allocating and freeing vm_area structsLinus Torvalds7-27/+44
The vm_area_struct is one of the most fundamental memory management objects, but the management of it is entirely open-coded evertwhere, ranging from allocation and freeing (using kmem_cache_[z]alloc and kmem_cache_free) to initializing all the fields. We want to unify this in order to end up having some unified initialization of the vmas, and the first step to this is to at least have basic allocation functions. Right now those functions are literally just wrappers around the kmem_cache_*() calls. This is a purely mechanical conversion: # new vma: kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL) -> vm_area_alloc() # copy old vma kmem_cache_alloc(vm_area_cachep, GFP_KERNEL) -> vm_area_dup(old) # free vma kmem_cache_free(vm_area_cachep, vma) -> vm_area_free(vma) to the point where the old vma passed in to the vm_area_dup() function isn't even used yet (because I've left all the old manual initialization alone). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-21Merge branch 'akpm' (patches from Andrew)Linus Torvalds5-9/+20
Merge fixes from Andrew Morton: "5 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: memcg: fix use after free in mem_cgroup_iter() mm/huge_memory.c: fix data loss when splitting a file pmd fat: fix memory allocation failure handling of match_strdup() MAINTAINERS: Peter has moved mm/memblock: add missing include <linux/bootmem.h>
2018-07-21mm: memcg: fix use after free in mem_cgroup_iter()Jing Xia1-1/+1
It was reported that a kernel crash happened in mem_cgroup_iter(), which can be triggered if the legacy cgroup-v1 non-hierarchical mode is used. Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b8f ...... Call trace: mem_cgroup_iter+0x2e0/0x6d4 shrink_zone+0x8c/0x324 balance_pgdat+0x450/0x640 kswapd+0x130/0x4b8 kthread+0xe8/0xfc ret_from_fork+0x10/0x20 mem_cgroup_iter(): ...... if (css_tryget(css)) <-- crash here break; ...... The crashing reason is that mem_cgroup_iter() uses the memcg object whose pointer is stored in iter->position, which has been freed before and filled with POISON_FREE(0x6b). And the root cause of the use-after-free issue is that invalidate_reclaim_iterators() fails to reset the value of iter->position to NULL when the css of the memcg is released in non- hierarchical mode. Link: http://lkml.kernel.org/r/1531994807-25639-1-git-send-email-jing.xia@unisoc.com Fixes: 6df38689e0e9 ("mm: memcontrol: fix possible memcg leak due to interrupted reclaim") Signed-off-by: Jing Xia <jing.xia.mail@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: <chunyan.zhang@unisoc.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-21mm/huge_memory.c: fix data loss when splitting a file pmdHugh Dickins1-0/+2
__split_huge_pmd_locked() must check if the cleared huge pmd was dirty, and propagate that to PageDirty: otherwise, data may be lost when a huge tmpfs page is modified then split then reclaimed. How has this taken so long to be noticed? Because there was no problem when the huge page is written by a write system call (shmem_write_end() calls set_page_dirty()), nor when the page is allocated for a write fault (fault_dirty_shared_page() calls set_page_dirty()); but when allocated for a read fault (which MAP_POPULATE simulates), no set_page_dirty(). Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1807111741430.1106@eggly.anvils Fixes: d21b9e57c74c ("thp: handle file pages in split_huge_pmd()") Signed-off-by: Hugh Dickins <hughd@google.com> Reported-by: Ashwin Chaugule <ashwinch@google.com> Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: <stable@vger.kernel.org> [4.8+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-21fat: fix memory allocation failure handling of match_strdup()OGAWA Hirofumi1-7/+13
In parse_options(), if match_strdup() failed, parse_options() leaves opts->iocharset in unexpected state (i.e. still pointing the freed string). And this can be the cause of double free. To fix, this initialize opts->iocharset always when freeing. Link: http://lkml.kernel.org/r/8736wp9dzc.fsf@mail.parknet.co.jp Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reported-by: syzbot+90b8e10515ae88228a92@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-21MAINTAINERS: Peter has movedPeter Senna Tschudin1-1/+1
Update my E-mail address in the MAINTAINERS file. Link: http://lkml.kernel.org/r/20180710144702.1308-1-peter.senna@gmail.com Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com> Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk> Acked-by: Martyn Welch <martyn.welch@collabora.co.uk> Cc: David S. Miller <davem@davemloft.net> Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Martin Donnelly <martin.donnelly@ge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-21mm/memblock: add missing include <linux/bootmem.h>Mathieu Malaterre1-0/+3
Commit 26f09e9b3a06 ("mm/memblock: add memblock memory allocation apis") introduced two new function definitions: memblock_virt_alloc_try_nid_nopanic() memblock_virt_alloc_try_nid() and commit ea1f5f3712af ("mm: define memblock_virt_alloc_try_nid_raw") introduced the following function definition: memblock_virt_alloc_try_nid_raw() This commit adds an include of header file <linux/bootmem.h> to provide the missing function prototypes. This silences the following gcc warning (W=1): mm/memblock.c:1334:15: warning: no previous prototype for `memblock_virt_alloc_try_nid_raw' [-Wmissing-prototypes] mm/memblock.c:1371:15: warning: no previous prototype for `memblock_virt_alloc_try_nid_nopanic' [-Wmissing-prototypes] mm/memblock.c:1407:15: warning: no previous prototype for `memblock_virt_alloc_try_nid' [-Wmissing-prototypes] Also adds #ifdef blockers to prevent compilation failure on mips/ia64 where CONFIG_NO_BOOTMEM=n as could be seen in commit commit 6cc22dc08a24 ("revert "mm/memblock: add missing include <linux/bootmem.h>""). Because Makefile already does: obj-$(CONFIG_HAVE_MEMBLOCK) += memblock.o The #ifdef has been simplified from: #if defined(CONFIG_HAVE_MEMBLOCK) && defined(CONFIG_NO_BOOTMEM) to simply: #if defined(CONFIG_NO_BOOTMEM) Link: http://lkml.kernel.org/r/20180626184422.24974-1-malat@debian.org Signed-off-by: Mathieu Malaterre <malat@debian.org> Suggested-by: Tony Luck <tony.luck@intel.com> Suggested-by: Michal Hocko <mhocko@kernel.org> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-21bonding: set default miimon value for non-arp modes if not setJarod Wilson1-9/+14
For some time now, if you load the bonding driver and configure bond parameters via sysfs using minimal config options, such as specifying nothing but the mode, relying on defaults for everything else, modes that cannot use arp monitoring (802.3ad, balance-tlb, balance-alb) all wind up with both arp_interval=0 (as it should be) and miimon=0, which means the miimon monitor thread never actually runs. This is particularly problematic for 802.3ad. For example, from an LNST recipe I've set up: $ modprobe bonding max_bonds=0" $ echo "+t_bond0" > /sys/class/net/bonding_masters" $ ip link set t_bond0 down" $ echo "802.3ad" > /sys/class/net/t_bond0/bonding/mode" $ ip link set ens1f1 down" $ echo "+ens1f1" > /sys/class/net/t_bond0/bonding/slaves" $ ip link set ens1f0 down" $ echo "+ens1f0" > /sys/class/net/t_bond0/bonding/slaves" $ ethtool -i t_bond0" $ ip link set ens1f1 up" $ ip link set ens1f0 up" $ ip link set t_bond0 up" $ ip addr add 192.168.9.1/24 dev t_bond0" $ ip addr add 2002::1/64 dev t_bond0" This bond comes up okay, but things look slightly suspect in /proc/net/bonding/t_bond0 output: $ grep -i mii /proc/net/bonding/t_bond0 MII Status: up MII Polling Interval (ms): 0 MII Status: up MII Status: up Now, pull a cable on one of the ports in the bond, then reconnect it, and you'll see: Slave Interface: ens1f0 MII Status: down Speed: 1000 Mbps Duplex: full I believe this became a major issue as of commit 4d2c0cda0744, which for 802.3ad bonds, sets slave->link = BOND_LINK_DOWN, with a comment about relying on link monitoring via miimon to set it correctly, but since the miimon work queue never runs, the link just stays marked down. If we simply tweak bond_option_mode_set() slightly, we can check for the non-arp modes having no miimon value set, and insert BOND_DEFAULT_MIIMON, which gets things back in full working order. This problem exists as far back as 4.14, and might be worth fixing in all stable trees since, though the work-around is to simply specify an miimon value yourself. Reported-by: Bob Ball <ball@umich.edu> Signed-off-by: Jarod Wilson <jarod@redhat.com> Acked-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21Merge tag 'mlx5-fixes-2018-07-18' of ↵David S. Miller9-31/+67
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2018-07-18 The following series provides fixes to mlx5 core and net device driver. Please pull and let me know if there's any problem. For -stable v4.7 net/mlx5e: Don't allow aRFS for encapsulated packets net/mlx5e: Fix quota counting in aRFS expire flow For -stable v4.15 net/mlx5e: Only allow offloading decap egress (egdev) flows net/mlx5e: Refine ets validation function net/mlx5: Adjust clock overflow work period For -stable v4.17 net/mlx5: E-Switch, UBSAN fix undefined behavior in mlx5_eswitch_mode ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller6-35/+65
Daniel Borkmann says: ==================== pull-request: bpf 2018-07-20 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix in BPF Makefile to detect llvm-objcopy in a more robust way which is needed for pahole's BTF converter and minor UAPI tweaks in BTF_INT_BITS() to shrink the mask before eventual UAPI freeze, from Martin. 2) Fix a segfault in bpftool when prog pin id has no further arguments such as id value or file specified, from Taeung. 3) Fix powerpc JIT handling of XADD which has jumps to exit path that would potentially bypass verifier expectations e.g. with subprog calls. Also add a test case to make sure XADD is not mangling src/dst register, from Daniel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-20tls: check RCV_SHUTDOWN in tls_wait_dataDoron Roberts-Kedes1-0/+3
The current code does not check sk->sk_shutdown & RCV_SHUTDOWN. tls_sw_recvmsg may return a positive value in the case where bytes have already been copied when the socket is shutdown. sk->sk_err has been cleared, causing the tls_wait_data to hang forever on a subsequent invocation. Checking sk->sk_shutdown & RCV_SHUTDOWN, as in tcp_recvmsg, fixes this problem. Fixes: c46234ebb4d1 ("tls: RX path for ktls") Acked-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-20Merge branch 'tcp-fix-DCTCP-ECE-Ack-series'David S. Miller4-45/+44
Yuchung Cheng says: ==================== fix DCTCP ECE Ack series This patch set address that the existing DCTCP implementation does not fully implement the ACK policy specified in the RFC. This improves the responsiveness of CE status change particularly on flows with small inflight. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-20tcp: do not delay ACK in DCTCP upon CE status changeYuchung Cheng3-13/+21
Per DCTCP RFC8257 (Section 3.2) the ACK reflecting the CE status change has to be sent immediately so the sender can respond quickly: """ When receiving packets, the CE codepoint MUST be processed as follows: 1. If the CE codepoint is set and DCTCP.CE is false, set DCTCP.CE to true and send an immediate ACK. 2. If the CE codepoint is not set and DCTCP.CE is true, set DCTCP.CE to false and send an immediate ACK. """ Previously DCTCP implementation may continue to delay the ACK. This patch fixes that to implement the RFC by forcing an immediate ACK. Tested with this packetdrill script provided by Larry Brakmo 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 0.000 setsockopt(3, SOL_TCP, TCP_CONGESTION, "dctcp", 5) = 0 0.000 bind(3, ..., ...) = 0 0.000 listen(3, 1) = 0 0.100 < [ect0] SEW 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7> 0.100 > SE. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8> 0.110 < [ect0] . 1:1(0) ack 1 win 257 0.200 accept(3, ..., ...) = 4 +0 setsockopt(4, SOL_SOCKET, SO_DEBUG, [1], 4) = 0 0.200 < [ect0] . 1:1001(1000) ack 1 win 257 0.200 > [ect01] . 1:1(0) ack 1001 0.200 write(4, ..., 1) = 1 0.200 > [ect01] P. 1:2(1) ack 1001 0.200 < [ect0] . 1001:2001(1000) ack 2 win 257 +0.005 < [ce] . 2001:3001(1000) ack 2 win 257 +0.000 > [ect01] . 2:2(0) ack 2001 // Previously the ACK below would be delayed by 40ms +0.000 > [ect01] E. 2:2(0) ack 3001 +0.500 < F. 9501:9501(0) ack 4 win 257 Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-20tcp: do not cancel delay-AcK on DCTCP special ACKYuchung Cheng3-33/+12
Currently when a DCTCP receiver delays an ACK and receive a data packet with a different CE mark from the previous one's, it sends two immediate ACKs acking previous and latest sequences respectly (for ECN accounting). Previously sending the first ACK may mark off the delayed ACK timer (tcp_event_ack_sent). This may subsequently prevent sending the second ACK to acknowledge the latest sequence (tcp_ack_snd_check). The culprit is that tcp_send_ack() assumes it always acknowleges the latest sequence, which is not true for the first special ACK. The fix is to not make the assumption in tcp_send_ack and check the actual ack sequence before cancelling the delayed ACK. Further it's safer to pass the ack sequence number as a local variable into tcp_send_ack routine, instead of intercepting tp->rcv_nxt to avoid future bugs like this. Reported-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-20tcp: helpers to send special DCTCP ackYuchung Cheng1-5/+17
Refactor and create helpers to send the special ACK in DCTCP. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-20Merge tag 'vfio-v4.18-rc6' of git://github.com/awilliam/linux-vfioLinus Torvalds1-0/+4
Pull VFIO fix from Alex Williamson: "Harden potential Spectre v1 issue (Gustavo A. R. Silva)" * tag 'vfio-v4.18-rc6' of git://github.com/awilliam/linux-vfio: vfio/pci: Fix potential Spectre v1
2018-07-20Merge tag 'for-4.18/dm-fixes-2' of ↵Linus Torvalds2-14/+31
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fix from Mike Snitzer: "Fix DM writecache target to allow an optional offset to the start of the data and metadata area. This allows userspace tools (e.g. LVM2) to place a header and metadata at the front of the writecache device for its use" * tag 'for-4.18/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm writecache: support optional offset for start of device
2018-07-20Merge tag 'imx-fixes-4.18-4' of ↵Olof Johansson1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes i.MX fixes for 4.18, round 4: - A fix for i.MX6 RDU2 board on the wrong IRQ type of Marvell switch, which might result in a race condition in the interrupt handler and cause the OS to miss all future events. * tag 'imx-fixes-4.18-4' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: ARM: dts: imx6: RDU2: fix irq type for mv88e6xxx switch Signed-off-by: Olof Johansson <olof@lixom.net>
2018-07-20Merge tag 'scsi-fixes' of ↵Linus Torvalds10-30/+88
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "A set of 8 obvious fixes. Three (2 qla2xxx and the cxlflash oopses) are regressions, two from 4.17 and one from the merge window. The hpsa change is user visible, but it fixes an error users have complained about" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: cxlflash: fix assignment of the backend operations scsi: qedi: Send driver state to MFW scsi: qedf: Send the driver state to MFW scsi: hpsa: correct enclosure sas address scsi: sd_zbc: Fix variable type and bogus comment scsi: qla2xxx: Fix NULL pointer dereference for fcport search scsi: qla2xxx: Fix kernel crash due to late workqueue allocation scsi: qla2xxx: Fix inconsistent DMA mem alloc/free
2018-07-20Merge tag 'iommu-fixes-v4.18-rc5' of ↵Linus Torvalds2-2/+31
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fix from Joerg Roedel: "Only one revert, for an an Intel VT-d patch that caused issues with the i915 GPU driver" * tag 'iommu-fixes-v4.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: Revert "iommu/vt-d: Clean up pasid quirk for pre-production devices"
2018-07-20Merge tag 'platform-drivers-x86-v4.18-2' of ↵Linus Torvalds1-1/+1
git://git.infradead.org/linux-platform-drivers-x86 Pull x86 platform driver fixes from Andy Shevchenko: "The Dell laptop ACPI video brightness control is now back after fixing a regression brought by SMM refactoring" * tag 'platform-drivers-x86-v4.18-2' of git://git.infradead.org/linux-platform-drivers-x86: platform/x86: dell-laptop: Fix backlight detection
2018-07-20Merge tag 'arc-4.18-rc6' of ↵Linus Torvalds24-47/+112
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: "ARC is back after radio silence in 4.17: - Fix CONFIG_SWAP [Alexey] - Robustify cmpxchg emulation for systems w/o atomics [Alexey / PeterZ] - Allow mprotext(PROT_EXEC) for stack mappings [Vineet] - HSDK platform enable PCIe, APG GPIO [Gustavo] - miscll other fixes, config updates etc" * tag 'arc-4.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARCv2: [plat-hsdk]: Save accl reg pair by default ARC: mm: allow mprotect to make stack mappings executable ARC: Fix CONFIG_SWAP ARC: [arcompact] entry.S: minor code movement ARC: configs: Remove CONFIG_INITRAMFS_SOURCE from defconfigs ARC: configs: remove no longer needed CONFIG_DEVPTS_MULTIPLE_INSTANCES ARC: Improve cmpxchg syscall implementation ARC: [plat-hsdk]: Configure APB GPIO controller on ARC HSDK platform ARC: [plat-hsdk] Add PCIe support ARC: Enable machine_desc->init_per_cpu for !CONFIG_SMP ARC: Explicitly add -mmedium-calls to CFLAGS
2018-07-20Merge tag 'nds32-for-linus-4.18' of ↵Linus Torvalds6-70/+58
git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux Pull nds32 updates from Greentime Hu: "Bug fixes and build ixes for nds32" * tag 'nds32-for-linus-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux: nds32: fix build error "relocation truncated to fit: R_NDS32_25_PCREL_RELA" when make allyesconfig nds32: To simplify the implementation of update_mmu_cache() nds32: Fix the dts pointer is not passed correctly issue. nds32: To implement these icache invalidation APIs since nds32 cores don't snoop data cache. This issue is found by Guo Ren. Based on the Documentation/core-api/cachetlb.rst and it says: nds32: Fix build error caused by configuration flag rename nds32: define __NDS32_E[BL]__ for sparse
2018-07-20Merge tag 'pm-4.18-rc6' of ↵Linus Torvalds2-1/+20
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Fix a relatively old initialization issue in intel_pstate causing the pcc-cpufreq driver to be used instead of it on some HP Proliant systems. This turned into a functional regression during the 4.17 cycle, because pcc-cpufreq is a scalability disaster and that was amplified by the idle loop rework done at that time (Rafael Wysocki). * tag 'pm-4.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: intel_pstate: Register when ACPI PCCH is present
2018-07-20Merge tag 'acpi-4.18-rc6' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Extend the recently added suspend-to-idle quirk for Thinkpad X1 Carbon 6th to other systems from that familiy which turned out to need it too (Robin Johnson)" * tag 'acpi-4.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / EC: Use ec_no_wakeup on more Thinkpad X1 Carbon 6th systems
2018-07-20MIPS: ath79: fix register address in ath79_ddr_wb_flush()Felix Fietkau1-1/+1
ath79_ddr_wb_flush_base has the type void __iomem *, so register offsets need to be a multiple of 4 in order to access the intended register. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: John Crispin <john@phrozen.org> Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: 24b0e3e84fbf ("MIPS: ath79: Improve the DDR controller interface") Patchwork: https://patchwork.linux-mips.org/patch/19912/ Cc: Alban Bedel <albeu@free.fr> Cc: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org # 4.2+
2018-07-20nvme: fix handling of metadata_len for NVME_IOCTL_IO_CMDRoland Dreier1-1/+1
The old code in nvme_user_cmd() passed the userspace virtual address from nvme_passthru_cmd.metadata as the length of the metadata buffer as well as the address to nvme_submit_user_cmd(). Fixes: 63263d60 ("nvme: Use metadata for passthrough commands") Signed-off-by: Roland Dreier <roland@purestorage.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-20netfilter: conntrack: dccp: treat SYNC/SYNCACK as invalid if no prior stateFlorian Westphal1-4/+4
When first DCCP packet is SYNC or SYNCACK, we insert a new conntrack that has an un-initialized timeout value, i.e. such entry could be reaped at any time. Mark them as INVALID and only ignore SYNC/SYNCACK when connection had an old state. Reported-by: syzbot+6f18401420df260e37ed@syzkaller.appspotmail.com Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-20netfilter: nf_tables: don't allow to rename to already-pending nameFlorian Westphal1-13/+29
Its possible to rename two chains to the same name in one transaction: nft add chain t c1 nft add chain t c2 nft 'rename chain t c1 c3;rename chain t c2 c3' This creates two chains named 'c3'. Appears to be harmless, both chains can still be deleted both by name or handle, but, nevertheless, its a bug. Walk transaction log and also compare vs. the pending renames. Both chains can still be deleted, but nevertheless it is a bug as we don't allow to create chains with identical names, so we should prevent this from happening-by-rename too. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-20netfilter: nf_tables: fix memory leaks on chain renameFlorian Westphal1-6/+11
The new name is stored in the transaction metadata, on commit, the pointers to the old and new names are swapped. Therefore in abort and commit case we have to free the pointer in the chain_trans container. In commit case, the pointer can be used by another cpu that is currently dumping the renamed chain, thus kfree needs to happen after waiting for rcu readers to complete. Fixes: b7263e071a ("netfilter: nf_tables: Allow chain name of up to 255 chars") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-20netfilter: nf_tables: free flow table struct tooFlorian Westphal1-0/+1
Fixes: 3b49e2e94e6ebb ("netfilter: nf_tables: add flow table netlink frontend") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-20netfilter: nf_tables: use dev->name directlyFlorian Westphal2-10/+5
no need to store the name in separate area. Furthermore, it uses kmalloc but not kfree and most accesses seem to treat it as char[IFNAMSIZ] not char *. Remove this and use dev->name instead. In case event zeroed dev, just omit the name in the dump. Fixes: d92191aa84e5f1 ("netfilter: nf_tables: cache device name in flowtable object") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-20platform/x86: dell-laptop: Fix backlight detectionDamien Thébault1-1/+1
Fix return code check for "max brightness" ACPI call. The Dell laptop ACPI video brightness control is not present on dell laptops anymore, but was present in older kernel versions. The code that checks the return value is incorrect since the SMM refactoring. The old code was: if (buffer->output[0] == 0) Which was changed to: ret = dell_send_request(...) if (ret) However, dell_send_request() will return 0 if buffer->output[0] == 0, so we must change the check to: if (ret == 0) This issue was found on a Dell M4800 laptop, and the fix tested on it as well. Fixes: 549b4930f057 ("dell-smbios: Introduce dispatcher for SMM calls") Signed-off-by: Damien Thébault <damien@dtbo.net> Tested-by: Damien Thébault <damien@dtbo.net> Reviewed-by: Pali Rohár <pali.rohar@gmail.com> Reviewed-by: Mario Limonciello <mario.limonciello@dell.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2018-07-20Revert "iommu/vt-d: Clean up pasid quirk for pre-production devices"Lu Baolu2-2/+31
This reverts commit ab96746aaa344fb720a198245a837e266fad3b62. The commit ab96746aaa34 ("iommu/vt-d: Clean up pasid quirk for pre-production devices") triggers ECS mode on some platforms which have broken ECS support. As the result, graphic device will be inoperable on boot. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107017 Cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-07-20bpf: Use option "help" in the llvm-objcopy testMartin KaFai Lau1-1/+1
I noticed the "--version" option of the llvm-objcopy command has recently disappeared from the master llvm branch. It is currently used as a BTF support test in tools/testing/selftests/bpf/Makefile. This patch replaces it with "--help" which should be less error prone in the future. Fixes: c0fa1b6c3efc ("bpf: btf: Add BTF tests") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-20bpf: btf: Clean up BTF_INT_BITS() in uapi btf.hMartin KaFai Lau2-7/+11
This patch shrinks the BTF_INT_BITS() mask. The current btf_int_check_meta() ensures the nr_bits of an integer cannot exceed 64. Hence, it is mostly an uapi cleanup. The actual btf usage (i.e. seq_show()) is also modified to use u8 instead of u16. The verification (e.g. btf_int_check_meta()) path stays as is to deal with invalid BTF situation. Fixes: 69b693f0aefa ("bpf: btf: Introduce BPF Type Format (BTF)") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-20tools/bpftool: Fix segfault case regarding 'pin' argumentsTaeung Song1-3/+8
Arguments of 'pin' subcommand should be checked at the very beginning of do_pin_any(). Otherwise segfault errors can occur when using 'map pin' or 'prog pin' commands, so fix it. # bpftool prog pin id Segmentation fault Fixes: 71bb428fe2c1 ("tools: bpf: add bpftool") Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reported-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-19net-next/hinic: fix a problem in hinic_xmit_frame()Zhao Chen1-0/+1
The calculation of "wqe_size" is not correct when the tx queue is busy in hinic_xmit_frame(). When there are no free WQEs, the tx flow will unmap the skb buffer, then ring the doobell for the pending packets. But the "wqe_size" which used to calculate the doorbell address is not correct. The wqe size should be cleared to 0, otherwise, it will cause a doorbell error. This patch fixes the problem. Reported-by: Zhou Wang <wangzhou1@hisilicon.com> Signed-off-by: Zhao Chen <zhaochen6@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19net/page_pool: Fix inconsistent lock state warningTariq Toukan1-1/+1
Fix the warning below by calling the ptr_ring_consume_bh, which uses spin_[un]lock_bh. [ 179.064300] ================================ [ 179.069073] WARNING: inconsistent lock state [ 179.073846] 4.18.0-rc2+ #18 Not tainted [ 179.078133] -------------------------------- [ 179.082907] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. [ 179.089637] swapper/21/0 [HC0[0]:SC1[1]:HE1:SE0] takes: [ 179.095478] 00000000963d1995 (&(&r->consumer_lock)->rlock){+.?.}, at: __page_pool_empty_ring+0x61/0x100 [ 179.105988] {SOFTIRQ-ON-W} state was registered at: [ 179.111443] _raw_spin_lock+0x35/0x50 [ 179.115634] __page_pool_empty_ring+0x61/0x100 [ 179.120699] page_pool_destroy+0x32/0x50 [ 179.125204] mlx5e_free_rq+0x38/0xc0 [mlx5_core] [ 179.130471] mlx5e_close_channel+0x20/0x120 [mlx5_core] [ 179.136418] mlx5e_close_channels+0x26/0x40 [mlx5_core] [ 179.142364] mlx5e_close_locked+0x44/0x50 [mlx5_core] [ 179.148509] mlx5e_close+0x42/0x60 [mlx5_core] [ 179.153936] __dev_close_many+0xb1/0x120 [ 179.158749] dev_close_many+0xa2/0x170 [ 179.163364] rollback_registered_many+0x148/0x460 [ 179.169047] rollback_registered+0x56/0x90 [ 179.174043] unregister_netdevice_queue+0x7e/0x100 [ 179.179816] unregister_netdev+0x18/0x20 [ 179.184623] mlx5e_remove+0x2a/0x50 [mlx5_core] [ 179.190107] mlx5_remove_device+0xe5/0x110 [mlx5_core] [ 179.196274] mlx5_unregister_interface+0x39/0x90 [mlx5_core] [ 179.203028] cleanup+0x5/0xbfc [mlx5_core] [ 179.208031] __x64_sys_delete_module+0x16b/0x240 [ 179.213640] do_syscall_64+0x5a/0x210 [ 179.218151] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 179.224218] irq event stamp: 334398 [ 179.228438] hardirqs last enabled at (334398): [<ffffffffa511d8b7>] rcu_process_callbacks+0x1c7/0x790 [ 179.239178] hardirqs last disabled at (334397): [<ffffffffa511d872>] rcu_process_callbacks+0x182/0x790 [ 179.249931] softirqs last enabled at (334386): [<ffffffffa509732e>] irq_enter+0x5e/0x70 [ 179.259306] softirqs last disabled at (334387): [<ffffffffa509741c>] irq_exit+0xdc/0xf0 [ 179.268584] [ 179.268584] other info that might help us debug this: [ 179.276572] Possible unsafe locking scenario: [ 179.276572] [ 179.283877] CPU0 [ 179.286954] ---- [ 179.290033] lock(&(&r->consumer_lock)->rlock); [ 179.295546] <Interrupt> [ 179.298830] lock(&(&r->consumer_lock)->rlock); [ 179.304550] [ 179.304550] *** DEADLOCK *** Fixes: ff7d6b27f894 ("page_pool: refurbish version of page_pool code") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19Merge tag 'drm-fixes-2018-07-20' of git://anongit.freedesktop.org/drm/drmLinus Torvalds18-64/+152
Pull drm fixes from Dave Airlie: "Just two sets of driver fixes this week to follow up on the set from earlier in the week and hopefully get me realigned schedule wise. amdgpu: - ACP fix for boards with multiple I2S instances - DP fix for CZ, vega - hybrid laptop fixes - Resume regression fix nouveau: - large memory systems and Pascal fix - MST race fixes - runtime PM fix" * tag 'drm-fixes-2018-07-20' of git://anongit.freedesktop.org/drm/drm: drm/nouveau/fb/gp100-: disable address remapper drm/amd/amdgpu: creating two I2S instances for stoney/cz (v2) drm/amdgpu: add another ATPX quirk for TOPAZ drm/amd/display: Fix DP HBR2 Eye Diagram Pattern on Carrizo drm/amdgpu: Make sure IB tests flushed after IP resume drm/nouveau: Set DRIVER_ATOMIC cap earlier to fix debugfs drm/nouveau: Remove bogus crtc check in pmops_runtime_idle drm/nouveau/drm/nouveau: Fix runtime PM leak in nv50_disp_atomic_commit() drm/nouveau: Avoid looping through fake MST connectors drm/nouveau: Use drm_connector_list_iter_* for iterating connectors drm/nouveau/gem: off by one bugs in nouveau_gem_pushbuf_reloc_apply() drm/nouveau/kms/nv50-: ensure window updates are submitted when flushing mst disables
2018-07-20ARM: dts: imx6: RDU2: fix irq type for mv88e6xxx switchUwe Kleine-König1-1/+1
The Marvell switches report their interrupts in a level sensitive way. When using edge sensitive detection a race condition in the interrupt handler of the swich might result in the OS to miss all future events which might make the switch non-functional. The problem is that both mv88e6xxx_g2_irq_thread_fn() and mv88e6xxx_g1_irq_thread_work() sample the irq cause register (MV88E6XXX_G2_INT_SRC and MV88E6XXX_G1_STS respectively) once and then handle the observed sources. If after sampling but before all observed irq sources are handled a new irq source gets active this is not noticed by the handler which returns unsuspecting, but the interrupt line stays active which prevents the edge detector to kick in. All device trees but imx6qdl-zii-rdu2 get this right (most of them by not specifying an interrupt parent). So fix imx6qdl-zii-rdu2 accordingly. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Fixes: f64992d1a916 ("ARM: dts: imx6: RDU2: Add Switch interrupts") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2018-07-20Merge branch 'linux-4.18' of git://github.com/skeggsb/linux into drm-fixesDave Airlie12-49/+105
- fix problem with pascal and large memory systems - fix a bunch of MST problems - fix a runtime PM interaction with MST Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/CACAvsv79O8deSts2fxJ_oS6=q8yA+OgwBSEpp5R=BQBmWa+oyg@mail.gmail.com
2018-07-20Merge branch 'drm-fixes-4.18' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie6-15/+47
into drm-fixes Fixes for 4.18. The ACP patch is a bit bigger than I would like at this point, but it should have gone in long ago, it just fell through the cracks. The others are pretty small and straight-forward. - ACP fix for boards with 2 I2S instances - DP fix for CZ, vega - Fix for a hybrid graphics laptop - Fix a resume regression Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180718162603.2747-1-alexander.deucher@amd.com
2018-07-19Merge branch 'ppc-fix'Alexei Starovoitov2-24/+45
Daniel Borkmann says: ==================== This set adds a ppc64 JIT fix for xadd as well as a missing test case for verifying whether xadd messes with src/dst reg. Thanks! ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-07-19bpf: test case to check whether src/dst regs got mangled by xaddDaniel Borkmann1-0/+40
We currently do not have such a test case in test_verifier selftests but it's important to test under bpf_jit_enable=1 to make sure JIT implementations do not mistakenly mess with src/dst reg for xadd/{w,dw}. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-07-19bpf, ppc64: fix unexpected r0=0 exit path inside bpf_xaddDaniel Borkmann1-24/+5
None of the JITs is allowed to implement exit paths from the BPF insn mappings other than BPF_JMP | BPF_EXIT. In the BPF core code we have a couple of rewrites in eBPF (e.g. LD_ABS / LD_IND) and in eBPF to cBPF translation to retain old existing behavior where exceptions may occur; they are also tightly controlled by the verifier where it disallows some of the features such as BPF to BPF calls when legacy LD_ABS / LD_IND ops are present in the BPF program. During recent review of all BPF_XADD JIT implementations I noticed that the ppc64 one is buggy in that it contains two jumps to exit paths. This is problematic as this can bypass verifier expectations e.g. pointed out in commit f6b1b3bf0d5f ("bpf: fix subprog verifier bypass by div/mod by 0 exception"). The first exit path is obsoleted by the fix in ca36960211eb ("bpf: allow xadd only on aligned memory") anyway, and for the second one we need to do a fetch, add and store loop if the reservation from lwarx/ldarx was lost in the meantime. Fixes: 156d0e290e96 ("powerpc/ebpf/jit: Implement JIT compiler for extended BPF") Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Reviewed-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Tested-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-07-19Merge tag 'imx-fixes-4.18-3' of ↵Olof Johansson1-0/+21
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes i.MX fixes for 4.18, round 3: - Restrict GPC driver on register range that is accessible by regmap, so that we can avoid user space from triggering imprecise external abort exception. * tag 'imx-fixes-4.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: soc: imx: gpc: restrict register range for regmap access Signed-off-by: Olof Johansson <olof@lixom.net>
2018-07-19Merge tag 'omap-for-v4.18/fixes-rc5-signed' of ↵Olof Johansson1-6/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes One omap dts mismerge fix The dts patch for droid4 PWM vibrator has added gpio6 entries to the wrong node. Let's fix it with a note that there seems to be also other GPIO PWM issues to fix still to get the PWM vibrator working. So this can wait for v4.19 merge cycle if necessary. * tag 'omap-for-v4.18/fixes-rc5-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: omap4-droid4: fix dts w.r.t. pwm Signed-off-by: Olof Johansson <olof@lixom.net>
2018-07-19Merge tag 'pci-v4.18-fixes-3' of ↵Linus Torvalds12-30/+97
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - Fix crashes that happen when PHY drivers are left disabled in the V3 Semiconductor, MediaTek, Faraday, Aardvark, DesignWare, Versatile, and X-Gene host controller drivers (Sergei Shtylyov) - Fix a NULL pointer dereference in the endpoint library configfs support (Kishon Vijay Abraham I) - Fix a race condition in Hyper-V IRQ handling (Dexuan Cui) * tag 'pci-v4.18-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: v3-semi: Fix I/O space page leak PCI: mediatek: Fix I/O space page leak PCI: faraday: Fix I/O space page leak PCI: aardvark: Fix I/O space page leak PCI: designware: Fix I/O space page leak PCI: versatile: Fix I/O space page leak PCI: xgene: Fix I/O space page leak PCI: OF: Fix I/O space page leak PCI: endpoint: Fix NULL pointer dereference error when CONFIGFS is disabled PCI: hv: Disable/enable IRQs rather than BH in hv_compose_msi_msg()
2018-07-19ARCv2: [plat-hsdk]: Save accl reg pair by defaultVineet Gupta2-1/+3
This manifsted as strace segfaulting on HSDK because gcc was targetting the accumulator registers as GPRs, which kernek was not saving/restoring by default. Cc: stable@vger.kernel.org #4.14+ Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2018-07-19Merge tag 'sound-4.18-rc6' of ↵Linus Torvalds3-6/+17
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A rawmidi race fix and three trivial HD-audio quirks" * tag 'sound-4.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/realtek - Yet another Clevo P950 quirk entry ALSA: rawmidi: Change resized buffers atomically ALSA: hda/realtek - Add Panasonic CF-SZ6 headset jack quirk ALSA: hda: add mute led support for HP ProBook 455 G5
2018-07-19ACPI / EC: Use ec_no_wakeup on more Thinkpad X1 Carbon 6th systemsRobin H. Johnson1-1/+1
The ec_no_wakeup matcher added for Thinkpad X1 Carbon 6th gen systems beyond matched only a single DMI model (20KGS3JF01), that didn't cover my laptop (20KH002JUS). Change to match based on DMI product family to cover all X1 6th gen systems. Signed-off-by: Robin H. Johnson <robbat2@gentoo.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-07-19Merge branch 'linus' of ↵Linus Torvalds1-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "This fixes an allocation error-path bug in af_alg discovered by syzkaller" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: af_alg - Initialize sg_num_bytes in error code path
2018-07-19Btrfs: fix file data corruption after cloning a range and fsyncFilipe Manana1-2/+5
When we clone a range into a file we can end up dropping existing extent maps (or trimming them) and replacing them with new ones if the range to be cloned overlaps with a range in the destination inode. When that happens we add the new extent maps to the list of modified extents in the inode's extent map tree, so that a "fast" fsync (the flag BTRFS_INODE_NEEDS_FULL_SYNC not set in the inode) will see the extent maps and log corresponding extent items. However, at the end of range cloning operation we do truncate all the pages in the affected range (in order to ensure future reads will not get stale data). Sometimes this truncation will release the corresponding extent maps besides the pages from the page cache. If this happens, then a "fast" fsync operation will miss logging some extent items, because it relies exclusively on the extent maps being present in the inode's extent tree, leading to data loss/corruption if the fsync ends up using the same transaction used by the clone operation (that transaction was not committed in the meanwhile). An extent map is released through the callback btrfs_invalidatepage(), which gets called by truncate_inode_pages_range(), and it calls __btrfs_releasepage(). The later ends up calling try_release_extent_mapping() which will release the extent map if some conditions are met, like the file size being greater than 16Mb, gfp flags allow blocking and the range not being locked (which is the case during the clone operation) nor being the extent map flagged as pinned (also the case for cloning). The following example, turned into a test for fstests, reproduces the issue: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ xfs_io -f -c "pwrite -S 0x18 9000K 6908K" /mnt/foo $ xfs_io -f -c "pwrite -S 0x20 2572K 156K" /mnt/bar $ xfs_io -c "fsync" /mnt/bar # reflink destination offset corresponds to the size of file bar, # 2728Kb minus 4Kb. $ xfs_io -c ""reflink ${SCRATCH_MNT}/foo 0 2724K 15908K" /mnt/bar $ xfs_io -c "fsync" /mnt/bar $ md5sum /mnt/bar 95a95813a8c2abc9aa75a6c2914a077e /mnt/bar <power fail> $ mount /dev/sdb /mnt $ md5sum /mnt/bar 207fd8d0b161be8a84b945f0df8d5f8d /mnt/bar # digest should be 95a95813a8c2abc9aa75a6c2914a077e like before the # power failure In the above example, the destination offset of the clone operation corresponds to the size of the "bar" file minus 4Kb. So during the clone operation, the extent map covering the range from 2572Kb to 2728Kb gets trimmed so that it ends at offset 2724Kb, and a new extent map covering the range from 2724Kb to 11724Kb is created. So at the end of the clone operation when we ask to truncate the pages in the range from 2724Kb to 2724Kb + 15908Kb, the page invalidation callback ends up removing the new extent map (through try_release_extent_mapping()) when the page at offset 2724Kb is passed to that callback. Fix this by setting the bit BTRFS_INODE_NEEDS_FULL_SYNC whenever an extent map is removed at try_release_extent_mapping(), forcing the next fsync to search for modified extents in the fs/subvolume tree instead of relying on the presence of extent maps in memory. This way we can continue doing a "fast" fsync if the destination range of a clone operation does not overlap with an existing range or if any of the criteria necessary to remove an extent map at try_release_extent_mapping() is not met (file size not bigger then 16Mb or gfp flags do not allow blocking). CC: stable@vger.kernel.org # 3.16+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-07-19drm/nouveau/fb/gp100-: disable address remapperBen Skeggs4-0/+15
This was causing problems on a system with a large amount of RAM, where display push buffers were being fetched incorrectly when placed in high system memory addresses. While this commit will resolve the issue on that particular system, the issue will be avoided completely with another patch to more fully solve problems with display and large amounts of system memory on Pascal. It's still probably a good idea to disable this to prevent weird issues in the future. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds171-1128/+1739
Pull networking fixes from David Miller: "Lots of fixes, here goes: 1) NULL deref in qtnfmac, from Gustavo A. R. Silva. 2) Kernel oops when fw download fails in rtlwifi, from Ping-Ke Shih. 3) Lost completion messages in AF_XDP, from Magnus Karlsson. 4) Correct bogus self-assignment in rhashtable, from Rishabh Bhatnagar. 5) Fix regression in ipv6 route append handling, from David Ahern. 6) Fix masking in __set_phy_supported(), from Heiner Kallweit. 7) Missing module owner set in x_tables icmp, from Florian Westphal. 8) liquidio's timeouts are HZ dependent, fix from Nicholas Mc Guire. 9) Link setting fixes for sh_eth and ravb, from Vladimir Zapolskiy. 10) Fix NULL deref when using chains in act_csum, from Davide Caratti. 11) XDP_REDIRECT needs to check if the interface is up and whether the MTU is sufficient. From Toshiaki Makita. 12) Net diag can do a double free when killing TCP_NEW_SYN_RECV connections, from Lorenzo Colitti. 13) nf_defrag in ipv6 can unnecessarily hold onto dst entries for a full minute, delaying device unregister. From Eric Dumazet. 14) Update MAC entries in the correct order in ixgbe, from Alexander Duyck. 15) Don't leave partial mangles bpf program in jit_subprogs, from Daniel Borkmann. 16) Fix pfmemalloc SKB state propagation, from Stefano Brivio. 17) Fix ACK handling in DCTCP congestion control, from Yuchung Cheng. 18) Use after free in tun XDP_TX, from Toshiaki Makita. 19) Stale ipv6 header pointer in ipv6 gre code, from Prashant Bhole. 20) Don't reuse remainder of RX page when XDP is set in mlx4, from Saeed Mahameed. 21) Fix window probe handling of TCP rapair sockets, from Stefan Baranoff. 22) Missing socket locking in smc_ioctl(), from Ursula Braun. 23) IPV6_ILA needs DST_CACHE, from Arnd Bergmann. 24) Spectre v1 fix in cxgb3, from Gustavo A. R. Silva. 25) Two spots in ipv6 do a rol32() on a hash value but ignore the result. Fixes from Colin Ian King" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (176 commits) tcp: identify cryptic messages as TCP seq # bugs ptp: fix missing break in switch hv_netvsc: Fix napi reschedule while receive completion is busy MAINTAINERS: Drop inactive Vitaly Bordug's email net: cavium: Add fine-granular dependencies on PCI net: qca_spi: Fix log level if probe fails net: qca_spi: Make sure the QCA7000 reset is triggered net: qca_spi: Avoid packet drop during initial sync ipv6: fix useless rol32 call on hash ipv6: sr: fix useless rol32 call on hash net: sched: Using NULL instead of plain integer net: usb: asix: replace mii_nway_restart in resume path net: cxgb3_main: fix potential Spectre v1 lib/rhashtable: consider param->min_size when setting initial table size net/smc: reset recv timeout after clc handshake net/smc: add error handling for get_user() net/smc: optimize consumer cursor updates net/nfc: Avoid stalls when nfc_alloc_send_skb() returned NULL. ipv6: ila: select CONFIG_DST_CACHE net: usb: rtl8150: demote allmulti message to dev_dbg() ...
2018-07-19soc: imx: gpc: restrict register range for regmap accessAnson Huang1-0/+21
GPC registers are NOT continuous, some registers are reserved and accessing them from userspace will trigger external abort, add regmap register access table to avoid below abort: root@imx6slevk:~# cat /sys/kernel/debug/regmap/20dc000.gpc/registers [ 108.480477] Unhandled fault: imprecise external abort (0x1406) at 0xb6db5004 [ 108.487985] pgd = 42b54bfd [ 108.490741] [b6db5004] *pgd=ba1b7831 [ 108.494386] Internal error: : 1406 [#1] SMP ARM [ 108.498943] Modules linked in: [ 108.502043] CPU: 0 PID: 389 Comm: cat Not tainted 4.18.0-rc1-00074-gc9f1f60-dirty #482 [ 108.509982] Hardware name: Freescale i.MX6 SoloLite (Device Tree) [ 108.516123] PC is at regmap_mmio_read32le+0x20/0x24 [ 108.521031] LR is at regmap_mmio_read+0x40/0x60 [ 108.525586] pc : [<c059cf74>] lr : [<c059d1ac>] psr: 20060093 [ 108.531875] sp : eccf1d98 ip : eccf1da8 fp : eccf1da4 [ 108.537122] r10: ec2d3800 r9 : eccf1f60 r8 : ecfc0000 [ 108.542370] r7 : eccf1e2c r6 : eccf1e2c r5 : 00000028 r4 : ec338e00 [ 108.548920] r3 : 00000000 r2 : eccf1e2c r1 : f0980028 r0 : 00000000 [ 108.555474] Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none [ 108.562720] Control: 10c5387d Table: acf4004a DAC: 00000051 [ 108.568491] Process cat (pid: 389, stack limit = 0xd4318a65) [ 108.574174] Stack: (0xeccf1d98 to 0xeccf2000) Fixes: 721cabf6c660 ("soc: imx: move PGC handling to a new GPC driver") Signed-off-by: Anson Huang <Anson.Huang@nxp.com> Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2018-07-18net/mlx5e: Only allow offloading decap egress (egdev) flowsRoi Dayan1-0/+4
We get egress rules through the egdev mechanism when the ingress device is not supporting offload, with the expected use-case of tunnel decap ingress rule set on shared tunnel device. Make sure to offload egress/egdev rules only if decap action (tunnel key unset) exists there and err otherwise. Fixes: 717503b9cf57 ("net: sched: convert cls_flower->egress_dev users to tc_setup_cb_egdev infra") Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18net/mlx5: Fix QP fragmented buffer allocationTariq Toukan3-16/+38
Fix bad alignment of SQ buffer in fragmented QP allocation. It should start directly after RQ buffer ends. Take special care of the end case where the RQ buffer does not occupy a whole page. RQ size is a power of two, so would be the case only for small RQ sizes (RQ size < PAGE_SIZE). Fix wrong assignments for sqb->size (mistakenly assigned RQ size), and for npages value of RQ and SQ. Fixes: 3a2f70331226 ("net/mlx5: Use order-0 allocations for all WQ types") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18net/mlx5: Fix 'DON'T_TRAP' functionalityRaed Salem1-1/+1
The flow counters binding support commit introduced a code change where none NULL 'rule_dest' is always passed to mlx5_add_flow_rules, this breaks 'DON'T_TRAP' rules insertion. The fix uses the equivalent 'dest_num' value instead of dest pointer at the failed check. fixes: 3b3233fbf02e ('IB/mlx5: Add flow counters binding support') Signed-off-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18net/mlx5: E-Switch, UBSAN fix undefined behavior in mlx5_eswitch_modeSaeed Mahameed1-1/+1
With debug kernel UBSAN detects the following issue, which might happen when eswitch instance is not created, fix this by testing the eswitch pointer before returning the eswitch mode, if not set return mode = SRIOV_NONE. [ 32.528951] UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlx5/core/eswitch.c:2219:12 [ 32.528951] member access within null pointer of type 'struct mlx5_eswitch' [ 32.528951] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc3-dirty #181 [ 32.528951] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 [ 32.528951] Call Trace: [ 32.528951] dump_stack+0xc7/0x13b [ 32.528951] ? show_regs_print_info+0x5/0x5 [ 32.528951] ? __pm_runtime_use_autosuspend+0x140/0x140 [ 32.528951] ubsan_epilogue+0x9/0x49 [ 32.528951] ubsan_type_mismatch_common+0x1f9/0x2c0 [ 32.528951] ? ucs2_as_utf8+0x310/0x310 [ 32.528951] ? device_initialize+0x229/0x2e0 [ 32.528951] __ubsan_handle_type_mismatch+0x9f/0xc9 [ 32.528951] ? __ubsan_handle_divrem_overflow+0x19b/0x19b [ 32.578008] ? ib_device_get_by_index+0xf0/0xf0 [ 32.578008] mlx5_eswitch_mode+0x30/0x40 [ 32.578008] mlx5_ib_add+0x1e0/0x4a0 Fixes: 57cbd893c4c5 ("net/mlx5: E-Switch, Move representors definition to a global scope") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
2018-07-18net/mlx5e: Don't allow aRFS for encapsulated packetsEran Ben Elisha1-0/+3
Driver is yet to support aRFS for encapsulated packets, return early error in such case. Fixes: 18c908e477dc ("net/mlx5e: Add accelerated RFS support") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18net/mlx5e: Fix quota counting in aRFS expire flowEran Ben Elisha1-2/+2
Quota should follow the amount of rules which do expire, and not the number of rules that were examined, fixed that. Fixes: 18c908e477dc ("net/mlx5e: Add accelerated RFS support") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18net/mlx5: Adjust clock overflow work periodAriel Levkovich1-2/+10
When driver converts HW timestamp to wall clock time it subtracts the last saved cycle counter from the HW timestamp and converts the difference to nanoseconds. The conversion is done by multiplying the cycles difference with the clock multiplier value as a first step and therefore the cycles difference should be small enough so that the multiplication product doesn't exceed 64bit. The overflow handling routine is in charge of updating the last saved cycle counter in driver and it is called periodically using kernel delayed workqueue. The delay period for this work is calculated using the max HW cycle counter value (a 41 bit mask) as a base which doesn't take the 64bit limit into account so the delay period may be incorrect and too long to prevent a large difference between the HW counter and the last saved counter in SW. This change adjusts the work period for the HW clock overflow work by taking the minimum between the previous value and the quotient of max u64 value and the clock multiplier value. Fixes: ef9814deafd0 ("net/mlx5e: Add HW timestamping (TS) support") Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18net/mlx5e: Refine ets validation functionShay Agroskin1-9/+8
Removed an error message received when configuring ETS total bandwidth to be zero. Our hardware doesn't support such configuration, so we shall reject it in the driver. Nevertheless, we removed the error message in order to eliminate error messages caused by old userspace tools who try to pass such configuration. Fixes: ff0891915cd7 ("net/mlx5e: Fix ETS BW check") Signed-off-by: Shay Agroskin <shayag@mellanox.com> Reviewed-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18Merge tag 'devicetree-fixes-for-4.18' of ↵Linus Torvalds18-18/+31
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull DeviceTree fixes from Rob Herring: - Fix phandle cache to work with overlays - Correct the default clock-frequency for QCom geni-i2c - Binding doc quote and spelling fixes * tag 'devicetree-fixes-for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of: overlay: update phandle cache on overlay apply and remove dt-bindings: Fix unbalanced quotation marks dt-bindings: soc: qcom: Fix default clock-freq for qcom,geni-i2c dt-bindings: w1-gpio: Remove unneeded unit address Documentation: devicetree: tilcdc: fix spelling mistake "suppors" -> "supports"
2018-07-18tcp: identify cryptic messages as TCP seq # bugsRandy Dunlap1-2/+2
Attempt to make cryptic TCP seq number error messages clearer by (1) identifying the source of the message as "TCP", (2) identifying the errors as "seq # bug", and (3) grouping the field identifiers and values by separating them with commas. E.g., the following message is changed from: recvmsg bug 2: copied 73BCB6CD seq 70F17CBE rcvnxt 73BCB9AA fl 0 WARNING: CPU: 2 PID: 1501 at /linux/net/ipv4/tcp.c:1881 tcp_recvmsg+0x649/0xb90 to: TCP recvmsg seq # bug 2: copied 73BCB6CD, seq 70F17CBE, rcvnxt 73BCB9AA, fl 0 WARNING: CPU: 2 PID: 1501 at /linux/net/ipv4/tcp.c:2011 tcp_recvmsg+0x694/0xba0 Suggested-by: 積丹尼 Dan Jacobson <jidanni@jidanni.org> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18ptp: fix missing break in switchGustavo A. R. Silva1-0/+1
It seems that a *break* is missing in order to avoid falling through to the default case. Otherwise, checking *chan* makes no sense. Fixes: 72df7a7244c0 ("ptp: Allow reassigning calibration pin function") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18hv_netvsc: Fix napi reschedule while receive completion is busyHaiyang Zhang1-7/+10
If out ring is full temporarily and receive completion cannot go out, we may still need to reschedule napi if certain conditions are met. Otherwise the napi poll might be stopped forever, and cause network disconnect. Fixes: 7426b1a51803 ("netvsc: optimize receive completions") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18MAINTAINERS: Drop inactive Vitaly Bordug's emailKrzysztof Kozlowski1-1/+0
The Vitaly Bordug's email bounces ("ru.mvista.com: Name or service not known") and there was no activity (ack, review, sign) since 2009. Cc: Vitaly Bordug <vitb@kernel.crashing.org> Cc: Pantelis Antoniou <pantelis.antoniou@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18net: cavium: Add fine-granular dependencies on PCIAlexander Sverdlin1-6/+6
Add dependencies on PCI where necessary. Fixes: 7e2bc7fb65 ("net: cavium: Drop dependency of NET_VENDOR_CAVIUM on PCI") Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18Merge branch 'net-qca_spi-Minor-bugfixes'David S. Miller1-9/+12
Stefan Wahren says: ==================== net: qca_spi: Minor bugfixes This patch series contains some minor bugfixes for the qca_spi driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18net: qca_spi: Fix log level if probe failsStefan Wahren1-8/+8
In cases the probing fails the log level of the messages should be an error. Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18net: qca_spi: Make sure the QCA7000 reset is triggeredStefan Wahren1-0/+3
In case the SPI thread is not running, a simple reset of sync state won't fix the transmit timeout. We also need to wake up the kernel thread. Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Fixes: ed7d42e24eff ("net: qca_spi: fix transmit queue timeout handling") Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18net: qca_spi: Avoid packet drop during initial syncStefan Wahren1-1/+1
As long as the synchronization with the QCA7000 isn't finished, we cannot accept packets from the upper layers. So let the SPI thread enable the TX queue after sync and avoid unwanted packet drop. Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Fixes: 291ab06ecf67 ("net: qualcomm: new Ethernet over SPI driver for QCA7000") Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18ipv6: fix useless rol32 call on hashColin Ian King1-1/+1
The rol32 call is currently rotating hash but the rol'd value is being discarded. I believe the current code is incorrect and hash should be assigned the rotated value returned from rol32. Thanks to David Lebrun for spotting this. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18ipv6: sr: fix useless rol32 call on hashColin Ian King1-1/+1
The rol32 call is currently rotating hash but the rol'd value is being discarded. I believe the current code is incorrect and hash should be assigned the rotated value returned from rol32. Detected by CoverityScan, CID#1468411 ("Useless call") Fixes: b5facfdba14c ("ipv6: sr: Compute flowlabel for outer IPv6 header of seg6 encap mode") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: dlebrun@google.com Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18PCI: v3-semi: Fix I/O space page leakSergei Shtylyov1-1/+1
When testing the R-Car PCIe driver on the Condor board, if the PCIe PHY driver was left disabled, the kernel crashed with this BUG: kernel BUG at lib/ioremap.c:72! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092 Hardware name: Renesas Condor board based on r8a77980 (DT) Workqueue: events deferred_probe_work_func pstate: 80000005 (Nzcv daif -PAN -UAO) pc : ioremap_page_range+0x370/0x3c8 lr : ioremap_page_range+0x40/0x3c8 sp : ffff000008da39e0 x29: ffff000008da39e0 x28: 00e8000000000f07 x27: ffff7dfffee00000 x26: 0140000000000000 x25: ffff7dfffef00000 x24: 00000000000fe100 x23: ffff80007b906000 x22: ffff000008ab8000 x21: ffff000008bb1d58 x20: ffff7dfffef00000 x19: ffff800009c30fb8 x18: 0000000000000001 x17: 00000000000152d0 x16: 00000000014012d0 x15: 0000000000000000 x14: 0720072007200720 x13: 0720072007200720 x12: 0720072007200720 x11: 0720072007300730 x10: 00000000000000ae x9 : 0000000000000000 x8 : ffff7dffff000000 x7 : 0000000000000000 x6 : 0000000000000100 x5 : 0000000000000000 x4 : 000000007b906000 x3 : ffff80007c61a880 x2 : ffff7dfffeefffff x1 : 0000000040000000 x0 : 00e80000fe100f07 Process kworker/0:1 (pid: 39, stack limit = 0x (ptrval)) Call trace: ioremap_page_range+0x370/0x3c8 pci_remap_iospace+0x7c/0xac pci_parse_request_of_pci_ranges+0x13c/0x190 rcar_pcie_probe+0x4c/0xb04 platform_drv_probe+0x50/0xbc driver_probe_device+0x21c/0x308 __device_attach_driver+0x98/0xc8 bus_for_each_drv+0x54/0x94 __device_attach+0xc4/0x12c device_initial_probe+0x10/0x18 bus_probe_device+0x90/0x98 deferred_probe_work_func+0xb0/0x150 process_one_work+0x12c/0x29c worker_thread+0x200/0x3fc kthread+0x108/0x134 ret_from_fork+0x10/0x18 Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000) It turned out that pci_remap_iospace() wasn't undone when the driver's probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER, the probe was retried, finally causing the BUG due to trying to remap already remapped pages. The V3 Semiconductor PCI driver has the same issue. Replace devm_pci_remap_iospace() with its devm_ managed version to fix the bug. Fixes: 68a15eb7bd0c ("PCI: v3-semi: Add V3 Semiconductor PCI host driver") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> [lorenzo.pieralisi@arm.com: updated the commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
2018-07-18PCI: mediatek: Fix I/O space page leakSergei Shtylyov1-1/+1
When testing the R-Car PCIe driver on the Condor board, if the PCIe PHY driver was left disabled, the kernel crashed with this BUG: kernel BUG at lib/ioremap.c:72! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092 Hardware name: Renesas Condor board based on r8a77980 (DT) Workqueue: events deferred_probe_work_func pstate: 80000005 (Nzcv daif -PAN -UAO) pc : ioremap_page_range+0x370/0x3c8 lr : ioremap_page_range+0x40/0x3c8 sp : ffff000008da39e0 x29: ffff000008da39e0 x28: 00e8000000000f07 x27: ffff7dfffee00000 x26: 0140000000000000 x25: ffff7dfffef00000 x24: 00000000000fe100 x23: ffff80007b906000 x22: ffff000008ab8000 x21: ffff000008bb1d58 x20: ffff7dfffef00000 x19: ffff800009c30fb8 x18: 0000000000000001 x17: 00000000000152d0 x16: 00000000014012d0 x15: 0000000000000000 x14: 0720072007200720 x13: 0720072007200720 x12: 0720072007200720 x11: 0720072007300730 x10: 00000000000000ae x9 : 0000000000000000 x8 : ffff7dffff000000 x7 : 0000000000000000 x6 : 0000000000000100 x5 : 0000000000000000 x4 : 000000007b906000 x3 : ffff80007c61a880 x2 : ffff7dfffeefffff x1 : 0000000040000000 x0 : 00e80000fe100f07 Process kworker/0:1 (pid: 39, stack limit = 0x (ptrval)) Call trace: ioremap_page_range+0x370/0x3c8 pci_remap_iospace+0x7c/0xac pci_parse_request_of_pci_ranges+0x13c/0x190 rcar_pcie_probe+0x4c/0xb04 platform_drv_probe+0x50/0xbc driver_probe_device+0x21c/0x308 __device_attach_driver+0x98/0xc8 bus_for_each_drv+0x54/0x94 __device_attach+0xc4/0x12c device_initial_probe+0x10/0x18 bus_probe_device+0x90/0x98 deferred_probe_work_func+0xb0/0x150 process_one_work+0x12c/0x29c worker_thread+0x200/0x3fc kthread+0x108/0x134 ret_from_fork+0x10/0x18 Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000) It turned out that pci_remap_iospace() wasn't undone when the driver's probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER, the probe was retried, finally causing the BUG due to trying to remap already remapped pages. The MediaTek PCIe driver has the same issue. Replace devm_pci_remap_iospace() with its devm_ managed counterpart to fix the bug. Fixes: 637cfacae96f ("PCI: mediatek: Add MediaTek PCIe host controller support") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> [lorenzo.pieralisi@arm.com: updated the commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
2018-07-18PCI: faraday: Fix I/O space page leakSergei Shtylyov1-1/+1
When testing the R-Car PCIe driver on the Condor board, if the PCIe PHY driver was left disabled, the kernel crashed with this BUG: kernel BUG at lib/ioremap.c:72! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092 Hardware name: Renesas Condor board based on r8a77980 (DT) Workqueue: events deferred_probe_work_func pstate: 80000005 (Nzcv daif -PAN -UAO) pc : ioremap_page_range+0x370/0x3c8 lr : ioremap_page_range+0x40/0x3c8 sp : ffff000008da39e0 x29: ffff000008da39e0 x28: 00e8000000000f07 x27: ffff7dfffee00000 x26: 0140000000000000 x25: ffff7dfffef00000 x24: 00000000000fe100 x23: ffff80007b906000 x22: ffff000008ab8000 x21: ffff000008bb1d58 x20: ffff7dfffef00000 x19: ffff800009c30fb8 x18: 0000000000000001 x17: 00000000000152d0 x16: 00000000014012d0 x15: 0000000000000000 x14: 0720072007200720 x13: 0720072007200720 x12: 0720072007200720 x11: 0720072007300730 x10: 00000000000000ae x9 : 0000000000000000 x8 : ffff7dffff000000 x7 : 0000000000000000 x6 : 0000000000000100 x5 : 0000000000000000 x4 : 000000007b906000 x3 : ffff80007c61a880 x2 : ffff7dfffeefffff x1 : 0000000040000000 x0 : 00e80000fe100f07 Process kworker/0:1 (pid: 39, stack limit = 0x (ptrval)) Call trace: ioremap_page_range+0x370/0x3c8 pci_remap_iospace+0x7c/0xac pci_parse_request_of_pci_ranges+0x13c/0x190 rcar_pcie_probe+0x4c/0xb04 platform_drv_probe+0x50/0xbc driver_probe_device+0x21c/0x308 __device_attach_driver+0x98/0xc8 bus_for_each_drv+0x54/0x94 __device_attach+0xc4/0x12c device_initial_probe+0x10/0x18 bus_probe_device+0x90/0x98 deferred_probe_work_func+0xb0/0x150 process_one_work+0x12c/0x29c worker_thread+0x200/0x3fc kthread+0x108/0x134 ret_from_fork+0x10/0x18 Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000) It turned out that pci_remap_iospace() wasn't undone when the driver's probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER, the probe was retried, finally causing the BUG due to trying to remap already remapped pages. The Faraday PCI driver has the same issue. Replace pci_remap_iospace() with its devm_ managed version to fix the bug. Fixes: d3c68e0a7e34 ("PCI: faraday: Add Faraday Technology FTPCI100 PCI Host Bridge driver") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> [lorenzo.pieralisi@arm.com: updated the commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
2018-07-18PCI: aardvark: Fix I/O space page leakSergei Shtylyov1-1/+1
When testing the R-Car PCIe driver on the Condor board, if the PCIe PHY driver was left disabled, the kernel crashed with this BUG: kernel BUG at lib/ioremap.c:72! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092 Hardware name: Renesas Condor board based on r8a77980 (DT) Workqueue: events deferred_probe_work_func pstate: 80000005 (Nzcv daif -PAN -UAO) pc : ioremap_page_range+0x370/0x3c8 lr : ioremap_page_range+0x40/0x3c8 sp : ffff000008da39e0 x29: ffff000008da39e0 x28: 00e8000000000f07 x27: ffff7dfffee00000 x26: 0140000000000000 x25: ffff7dfffef00000 x24: 00000000000fe100 x23: ffff80007b906000 x22: ffff000008ab8000 x21: ffff000008bb1d58 x20: ffff7dfffef00000 x19: ffff800009c30fb8 x18: 0000000000000001 x17: 00000000000152d0 x16: 00000000014012d0 x15: 0000000000000000 x14: 0720072007200720 x13: 0720072007200720 x12: 0720072007200720 x11: 0720072007300730 x10: 00000000000000ae x9 : 0000000000000000 x8 : ffff7dffff000000 x7 : 0000000000000000 x6 : 0000000000000100 x5 : 0000000000000000 x4 : 000000007b906000 x3 : ffff80007c61a880 x2 : ffff7dfffeefffff x1 : 0000000040000000 x0 : 00e80000fe100f07 Process kworker/0:1 (pid: 39, stack limit = 0x (ptrval)) Call trace: ioremap_page_range+0x370/0x3c8 pci_remap_iospace+0x7c/0xac pci_parse_request_of_pci_ranges+0x13c/0x190 rcar_pcie_probe+0x4c/0xb04 platform_drv_probe+0x50/0xbc driver_probe_device+0x21c/0x308 __device_attach_driver+0x98/0xc8 bus_for_each_drv+0x54/0x94 __device_attach+0xc4/0x12c device_initial_probe+0x10/0x18 bus_probe_device+0x90/0x98 deferred_probe_work_func+0xb0/0x150 process_one_work+0x12c/0x29c worker_thread+0x200/0x3fc kthread+0x108/0x134 ret_from_fork+0x10/0x18 Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000) It turned out that pci_remap_iospace() wasn't undone when the driver's probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER, the probe was retried, finally causing the BUG due to trying to remap already remapped pages. The Aardvark PCI controller driver has the same issue. Replace pci_remap_iospace() with its devm_ managed version to fix the bug. Fixes: 8c39d710363c ("PCI: aardvark: Add Aardvark PCI host controller driver") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> [lorenzo.pieralisi@arm.com: updated the commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
2018-07-18PCI: designware: Fix I/O space page leakSergei Shtylyov1-1/+2
When testing the R-Car PCIe driver on the Condor board, if the PCIe PHY driver is left disabled, the kernel crashed with this BUG: kernel BUG at lib/ioremap.c:72! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092 Hardware name: Renesas Condor board based on r8a77980 (DT) Workqueue: events deferred_probe_work_func pstate: 80000005 (Nzcv daif -PAN -UAO) pc : ioremap_page_range+0x370/0x3c8 lr : ioremap_page_range+0x40/0x3c8 sp : ffff000008da39e0 x29: ffff000008da39e0 x28: 00e8000000000f07 x27: ffff7dfffee00000 x26: 0140000000000000 x25: ffff7dfffef00000 x24: 00000000000fe100 x23: ffff80007b906000 x22: ffff000008ab8000 x21: ffff000008bb1d58 x20: ffff7dfffef00000 x19: ffff800009c30fb8 x18: 0000000000000001 x17: 00000000000152d0 x16: 00000000014012d0 x15: 0000000000000000 x14: 0720072007200720 x13: 0720072007200720 x12: 0720072007200720 x11: 0720072007300730 x10: 00000000000000ae x9 : 0000000000000000 x8 : ffff7dffff000000 x7 : 0000000000000000 x6 : 0000000000000100 x5 : 0000000000000000 x4 : 000000007b906000 x3 : ffff80007c61a880 x2 : ffff7dfffeefffff x1 : 0000000040000000 x0 : 00e80000fe100f07 Process kworker/0:1 (pid: 39, stack limit = 0x (ptrval)) Call trace: ioremap_page_range+0x370/0x3c8 pci_remap_iospace+0x7c/0xac pci_parse_request_of_pci_ranges+0x13c/0x190 rcar_pcie_probe+0x4c/0xb04 platform_drv_probe+0x50/0xbc driver_probe_device+0x21c/0x308 __device_attach_driver+0x98/0xc8 bus_for_each_drv+0x54/0x94 __device_attach+0xc4/0x12c device_initial_probe+0x10/0x18 bus_probe_device+0x90/0x98 deferred_probe_work_func+0xb0/0x150 process_one_work+0x12c/0x29c worker_thread+0x200/0x3fc kthread+0x108/0x134 ret_from_fork+0x10/0x18 Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000) It turned out that pci_remap_iospace() wasn't undone when the driver's probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER, the probe was retried, finally causing the BUG due to trying to remap already remapped pages. The DesignWare PCIe controller driver has the same issue. Replace devm_pci_remap_iospace() with a devm_ managed version to fix the bug. Fixes: cbce7900598c ("PCI: designware: Make driver arch-agnostic") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> [lorenzo.pieralisi@arm.com: updated the commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Jingoo Han <jingoohan1@gmail.com>
2018-07-18PCI: versatile: Fix I/O space page leakSergei Shtylyov1-1/+1
When testing the R-Car PCIe driver on the Condor board, if the PCIe PHY driver was left disabled, the kernel crashed with this BUG: kernel BUG at lib/ioremap.c:72! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092 Hardware name: Renesas Condor board based on r8a77980 (DT) Workqueue: events deferred_probe_work_func pstate: 80000005 (Nzcv daif -PAN -UAO) pc : ioremap_page_range+0x370/0x3c8 lr : ioremap_page_range+0x40/0x3c8 sp : ffff000008da39e0 x29: ffff000008da39e0 x28: 00e8000000000f07 x27: ffff7dfffee00000 x26: 0140000000000000 x25: ffff7dfffef00000 x24: 00000000000fe100 x23: ffff80007b906000 x22: ffff000008ab8000 x21: ffff000008bb1d58 x20: ffff7dfffef00000 x19: ffff800009c30fb8 x18: 0000000000000001 x17: 00000000000152d0 x16: 00000000014012d0 x15: 0000000000000000 x14: 0720072007200720 x13: 0720072007200720 x12: 0720072007200720 x11: 0720072007300730 x10: 00000000000000ae x9 : 0000000000000000 x8 : ffff7dffff000000 x7 : 0000000000000000 x6 : 0000000000000100 x5 : 0000000000000000 x4 : 000000007b906000 x3 : ffff80007c61a880 x2 : ffff7dfffeefffff x1 : 0000000040000000 x0 : 00e80000fe100f07 Process kworker/0:1 (pid: 39, stack limit = 0x (ptrval)) Call trace: ioremap_page_range+0x370/0x3c8 pci_remap_iospace+0x7c/0xac pci_parse_request_of_pci_ranges+0x13c/0x190 rcar_pcie_probe+0x4c/0xb04 platform_drv_probe+0x50/0xbc driver_probe_device+0x21c/0x308 __device_attach_driver+0x98/0xc8 bus_for_each_drv+0x54/0x94 __device_attach+0xc4/0x12c device_initial_probe+0x10/0x18 bus_probe_device+0x90/0x98 deferred_probe_work_func+0xb0/0x150 process_one_work+0x12c/0x29c worker_thread+0x200/0x3fc kthread+0x108/0x134 ret_from_fork+0x10/0x18 Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000) It turned out that pci_remap_iospace() wasn't undone when the driver's probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER, the probe was retried, finally causing the BUG due to trying to remap already remapped pages. The Versatile PCI controller driver has the same issue. Replace pci_remap_iospace() with the devm_ managed version to fix the bug. Fixes: b7e78170efd4 ("PCI: versatile: Add DT-based ARM Versatile PB PCIe host driver") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> [lorenzo.pieralisi@arm.com: updated the commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
2018-07-18PCI: xgene: Fix I/O space page leakSergei Shtylyov1-1/+1
When testing the R-Car PCIe driver on the Condor board, if the PCIe PHY driver was left disabled, the kernel crashed with this BUG: kernel BUG at lib/ioremap.c:72! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092 Hardware name: Renesas Condor board based on r8a77980 (DT) Workqueue: events deferred_probe_work_func pstate: 80000005 (Nzcv daif -PAN -UAO) pc : ioremap_page_range+0x370/0x3c8 lr : ioremap_page_range+0x40/0x3c8 sp : ffff000008da39e0 x29: ffff000008da39e0 x28: 00e8000000000f07 x27: ffff7dfffee00000 x26: 0140000000000000 x25: ffff7dfffef00000 x24: 00000000000fe100 x23: ffff80007b906000 x22: ffff000008ab8000 x21: ffff000008bb1d58 x20: ffff7dfffef00000 x19: ffff800009c30fb8 x18: 0000000000000001 x17: 00000000000152d0 x16: 00000000014012d0 x15: 0000000000000000 x14: 0720072007200720 x13: 0720072007200720 x12: 0720072007200720 x11: 0720072007300730 x10: 00000000000000ae x9 : 0000000000000000 x8 : ffff7dffff000000 x7 : 0000000000000000 x6 : 0000000000000100 x5 : 0000000000000000 x4 : 000000007b906000 x3 : ffff80007c61a880 x2 : ffff7dfffeefffff x1 : 0000000040000000 x0 : 00e80000fe100f07 Process kworker/0:1 (pid: 39, stack limit = 0x (ptrval)) Call trace: ioremap_page_range+0x370/0x3c8 pci_remap_iospace+0x7c/0xac pci_parse_request_of_pci_ranges+0x13c/0x190 rcar_pcie_probe+0x4c/0xb04 platform_drv_probe+0x50/0xbc driver_probe_device+0x21c/0x308 __device_attach_driver+0x98/0xc8 bus_for_each_drv+0x54/0x94 __device_attach+0xc4/0x12c device_initial_probe+0x10/0x18 bus_probe_device+0x90/0x98 deferred_probe_work_func+0xb0/0x150 process_one_work+0x12c/0x29c worker_thread+0x200/0x3fc kthread+0x108/0x134 ret_from_fork+0x10/0x18 Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000) It turned out that pci_remap_iospace() wasn't undone when the driver's probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER, the probe was retried, finally causing the BUG due to trying to remap already remapped pages. The X-Gene PCI controller driver has the same issue. Replace pci_remap_iospace() with the devm_ managed version so that the pages get unmapped automagically on any probe failure. Fixes: 5f6b6ccdbe1c ("PCI: xgene: Add APM X-Gene PCIe driver") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> [lorenzo.pieralisi@arm.com: updated the commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
2018-07-18Merge tag 'batadv-net-for-davem-20180717' of git://git.open-mesh.org/linux-mergeDavid S. Miller6-10/+93
Simon Wunderlich says: ==================== Here are some batman-adv fixes: - Fix gateway refcounting in BATMAN IV and V, by Sven Eckelmann (2 patches) - Fix debugfs paths when renaming interfaces, by Sven Eckelmann (2 patches) - Fix TT flag issues, by Linus Luessing (2 patches) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18net: sched: Using NULL instead of plain integerYueHaibing1-2/+2
Fixes the following sparse warnings: net/sched/cls_api.c:1101:43: warning: Using plain integer as NULL pointer net/sched/cls_api.c:1492:75: warning: Using plain integer as NULL pointer Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18net: usb: asix: replace mii_nway_restart in resume pathAlexander Couzens1-1/+3
mii_nway_restart is not pm aware which results in a rtnl deadlock. Implement mii_nway_restart manual by setting BMCR_ANRESTART if BMCR_ANENABLE is set. To reproduce: * plug an asix based usb network interface * wait until the device enters PM (~5 sec) * `ip link set eth1 up` will never return Fixes: d9fe64e51114 ("net: asix: Add in_pm parameter") Signed-off-by: Alexander Couzens <lynxis@fe80.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18PCI: OF: Fix I/O space page leakSergei Shtylyov3-1/+41
When testing the R-Car PCIe driver on the Condor board, if the PCIe PHY driver was left disabled, the kernel crashed with this BUG: kernel BUG at lib/ioremap.c:72! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092 Hardware name: Renesas Condor board based on r8a77980 (DT) Workqueue: events deferred_probe_work_func pstate: 80000005 (Nzcv daif -PAN -UAO) pc : ioremap_page_range+0x370/0x3c8 lr : ioremap_page_range+0x40/0x3c8 sp : ffff000008da39e0 x29: ffff000008da39e0 x28: 00e8000000000f07 x27: ffff7dfffee00000 x26: 0140000000000000 x25: ffff7dfffef00000 x24: 00000000000fe100 x23: ffff80007b906000 x22: ffff000008ab8000 x21: ffff000008bb1d58 x20: ffff7dfffef00000 x19: ffff800009c30fb8 x18: 0000000000000001 x17: 00000000000152d0 x16: 00000000014012d0 x15: 0000000000000000 x14: 0720072007200720 x13: 0720072007200720 x12: 0720072007200720 x11: 0720072007300730 x10: 00000000000000ae x9 : 0000000000000000 x8 : ffff7dffff000000 x7 : 0000000000000000 x6 : 0000000000000100 x5 : 0000000000000000 x4 : 000000007b906000 x3 : ffff80007c61a880 x2 : ffff7dfffeefffff x1 : 0000000040000000 x0 : 00e80000fe100f07 Process kworker/0:1 (pid: 39, stack limit = 0x (ptrval)) Call trace: ioremap_page_range+0x370/0x3c8 pci_remap_iospace+0x7c/0xac pci_parse_request_of_pci_ranges+0x13c/0x190 rcar_pcie_probe+0x4c/0xb04 platform_drv_probe+0x50/0xbc driver_probe_device+0x21c/0x308 __device_attach_driver+0x98/0xc8 bus_for_each_drv+0x54/0x94 __device_attach+0xc4/0x12c device_initial_probe+0x10/0x18 bus_probe_device+0x90/0x98 deferred_probe_work_func+0xb0/0x150 process_one_work+0x12c/0x29c worker_thread+0x200/0x3fc kthread+0x108/0x134 ret_from_fork+0x10/0x18 Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000) It turned out that pci_remap_iospace() wasn't undone when the driver's probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER, the probe was retried, finally causing the BUG due to trying to remap already remapped pages. Introduce the devm_pci_remap_iospace() managed API and replace the pci_remap_iospace() call with it to fix the bug. Fixes: dbf9826d5797 ("PCI: generic: Convert to DT resource parsing API") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> [lorenzo.pieralisi@arm.com: split commit/updated the commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
2018-07-18net: cxgb3_main: fix potential Spectre v1Gustavo A. R. Silva1-0/+2
t.qset_idx can be indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c:2286 cxgb_extension_ioctl() warn: potential spectre issue 'adapter->msix_info' Fix this by sanitizing t.qset_idx before using it to index adapter->msix_info Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18lib/rhashtable: consider param->min_size when setting initial table sizeDavidlohr Bueso1-6/+11
rhashtable_init() currently does not take into account the user-passed min_size parameter unless param->nelem_hint is set as well. As such, the default size (number of buckets) will always be HASH_DEFAULT_SIZE even if the smallest allowed size is larger than that. Remediate this by unconditionally calling into rounded_hashtable_size() and handling things accordingly. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18vfio/pci: Fix potential Spectre v1Gustavo A. R. Silva1-0/+4
info.index can be indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: drivers/vfio/pci/vfio_pci.c:734 vfio_pci_ioctl() warn: potential spectre issue 'vdev->region' Fix this by sanitizing info.index before indirectly using it to index vdev->region Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-07-18Merge tag 'for-4.18-rc5-tag' of ↵Linus Torvalds3-8/+13
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "Three regression fixes. They're few-liners and fixing some corner cases missed in the origial patches" * tag 'for-4.18-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: scrub: Don't use inode page cache in scrub_handle_errored_block() btrfs: fix use-after-free of cmp workspace pages btrfs: restore uuid_mutex in btrfs_open_devices
2018-07-18Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds5-33/+63
Pull kvm fixes from Paolo Bonzini: "Miscellaneous bugfixes, plus a small patchlet related to Spectre v2" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvmclock: fix TSC calibration for nested guests KVM: VMX: Mark VMXArea with revision_id of physical CPU even when eVMCS enabled KVM: irqfd: fix race between EPOLLHUP and irq_bypass_register_consumer KVM/Eventfd: Avoid crash when assign and deassign specific eventfd in parallel. x86/kvmclock: set pvti_cpu0_va after enabling kvmclock x86/kvm/Kconfig: Ensure CRYPTO_DEV_CCP_DD state at minimum matches KVM_AMD kvm: nVMX: Restore exit qual for VM-entry failure due to MSR loading x86/kvm/vmx: don't read current->thread.{fs,gs}base of legacy tasks KVM: VMX: support MSR_IA32_ARCH_CAPABILITIES as a feature MSR
2018-07-18Merge branch 'smc-fixes'David S. Miller3-4/+14
Ursula Braun says: ==================== net/smc: fixes 2018-07-18 here are small fixes for SMC: The first patch speeds up unidirectional traffic, the second patch increases security, and the third patch fixes a problem for fallback cases. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18net/smc: reset recv timeout after clc handshakeKarsten Graul1-1/+2
During clc handshake the receive timeout is set to CLC_WAIT_TIME. Remember and reset the original timeout value after the receive calls, and remove a duplicate assignment of CLC_WAIT_TIME. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18net/smc: add error handling for get_user()Ursula Braun1-1/+2
For security reasons the return code of get_user() should always be checked. Fixes: 01d2f7e2cdd31 ("net/smc: sockopts TCP_NODELAY and TCP_CORK") Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18net/smc: optimize consumer cursor updatesUrsula Braun1-2/+10
The SMC protocol requires to send a separate consumer cursor update, if it cannot be piggybacked to updates of the producer cursor. Currently the decision to send a separate consumer cursor update just considers the amount of data already received by the socket program. It does not consider the amount of data already arrived, but not yet consumed by the receiver. Basing the decision on the difference between already confirmed and already arrived data (instead of difference between already confirmed and already consumed data), may lead to a somewhat earlier consumer cursor update send in fast unidirectional traffic scenarios, and thus to better throughput. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Suggested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18net/nfc: Avoid stalls when nfc_alloc_send_skb() returned NULL.Tetsuo Handa1-3/+6
syzbot is reporting stalls at nfc_llcp_send_ui_frame() [1]. This is because nfc_llcp_send_ui_frame() is retrying the loop without any delay when nonblocking nfc_alloc_send_skb() returned NULL. Since there is no need to use MSG_DONTWAIT if we retry until sock_alloc_send_pskb() succeeds, let's use blocking call. Also, in case an unexpected error occurred, let's break the loop if blocking nfc_alloc_send_skb() failed. [1] https://syzkaller.appspot.com/bug?id=4a131cc571c3733e0eff6bc673f4e36ae48f19c6 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot <syzbot+d29d18215e477cfbfbdd@syzkaller.appspotmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18ipv6: ila: select CONFIG_DST_CACHEArnd Bergmann1-0/+1
My randconfig builds came across an old missing dependency for ILA: ERROR: "dst_cache_set_ip6" [net/ipv6/ila/ila.ko] undefined! ERROR: "dst_cache_get" [net/ipv6/ila/ila.ko] undefined! ERROR: "dst_cache_init" [net/ipv6/ila/ila.ko] undefined! ERROR: "dst_cache_destroy" [net/ipv6/ila/ila.ko] undefined! We almost never run into this by accident because randconfig builds end up selecting DST_CACHE from some other tunnel protocol, and this one appears to be the only one missing the explicit 'select'. >From all I can tell, this problem first appeared in linux-4.9 when dst_cache support got added to ILA. Fixes: 79ff2fc31e0f ("ila: Cache a route to translated address") Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18netfilter: nft_set_rbtree: fix panic when destroying set by GCTaehee Yoo1-2/+5
This patch fixes below. 1. check null pointer of rb_next. rb_next can return null. so null check routine should be added. 2. add rcu_barrier in destroy routine. GC uses call_rcu to remove elements. but all elements should be removed before destroying set and chains. so that rcu_barrier is added. test script: %cat test.nft table inet aa { map map1 { type ipv4_addr : verdict; flags interval, timeout; elements = { 0-1 : jump a0, 3-4 : jump a0, 6-7 : jump a0, 9-10 : jump a0, 12-13 : jump a0, 15-16 : jump a0, 18-19 : jump a0, 21-22 : jump a0, 24-25 : jump a0, 27-28 : jump a0, } timeout 1s; } chain a0 { } } flush ruleset table inet aa { map map1 { type ipv4_addr : verdict; flags interval, timeout; elements = { 0-1 : jump a0, 3-4 : jump a0, 6-7 : jump a0, 9-10 : jump a0, 12-13 : jump a0, 15-16 : jump a0, 18-19 : jump a0, 21-22 : jump a0, 24-25 : jump a0, 27-28 : jump a0, } timeout 1s; } chain a0 { } } flush ruleset splat looks like: [ 2402.419838] kasan: GPF could be caused by NULL-ptr deref or user memory access [ 2402.428433] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 2402.429343] CPU: 1 PID: 1350 Comm: kworker/1:1 Not tainted 4.18.0-rc2+ #1 [ 2402.429343] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 03/23/2017 [ 2402.429343] Workqueue: events_power_efficient nft_rbtree_gc [nft_set_rbtree] [ 2402.429343] RIP: 0010:rb_next+0x1e/0x130 [ 2402.429343] Code: e9 de f2 ff ff 0f 1f 80 00 00 00 00 41 55 48 89 fa 41 54 55 53 48 c1 ea 03 48 b8 00 00 00 0 [ 2402.429343] RSP: 0018:ffff880105f77678 EFLAGS: 00010296 [ 2402.429343] RAX: dffffc0000000000 RBX: ffff8801143e3428 RCX: 1ffff1002287c69c [ 2402.429343] RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000000 [ 2402.429343] RBP: 0000000000000000 R08: ffffed0016aabc24 R09: ffffed0016aabc24 [ 2402.429343] R10: 0000000000000001 R11: ffffed0016aabc23 R12: 0000000000000000 [ 2402.429343] R13: ffff8800b6933388 R14: dffffc0000000000 R15: ffff8801143e3440 [ 2402.534486] kasan: CONFIG_KASAN_INLINE enabled [ 2402.534212] FS: 0000000000000000(0000) GS:ffff88011b600000(0000) knlGS:0000000000000000 [ 2402.534212] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2402.534212] CR2: 0000000000863008 CR3: 00000000a3c16000 CR4: 00000000001006e0 [ 2402.534212] Call Trace: [ 2402.534212] nft_rbtree_gc+0x2b5/0x5f0 [nft_set_rbtree] [ 2402.534212] process_one_work+0xc1b/0x1ee0 [ 2402.540329] kasan: GPF could be caused by NULL-ptr deref or user memory access [ 2402.534212] ? _raw_spin_unlock_irq+0x29/0x40 [ 2402.534212] ? pwq_dec_nr_in_flight+0x3e0/0x3e0 [ 2402.534212] ? set_load_weight+0x270/0x270 [ 2402.534212] ? __schedule+0x6ea/0x1fb0 [ 2402.534212] ? __sched_text_start+0x8/0x8 [ 2402.534212] ? save_trace+0x320/0x320 [ 2402.534212] ? sched_clock_local+0xe2/0x150 [ 2402.534212] ? find_held_lock+0x39/0x1c0 [ 2402.534212] ? worker_thread+0x35f/0x1150 [ 2402.534212] ? lock_contended+0xe90/0xe90 [ 2402.534212] ? __lock_acquire+0x4520/0x4520 [ 2402.534212] ? do_raw_spin_unlock+0xb1/0x350 [ 2402.534212] ? do_raw_spin_trylock+0x111/0x1b0 [ 2402.534212] ? do_raw_spin_lock+0x1f0/0x1f0 [ 2402.534212] worker_thread+0x169/0x1150 Fixes: 8d8540c4f5e0("netfilter: nft_set_rbtree: add timeout support") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-18netfilter: nft_set_hash: add rcu_barrier() in the nft_rhash_destroy()Taehee Yoo1-0/+1
GC of set uses call_rcu() to destroy elements. So that elements would be destroyed after destroying sets and chains. But, elements should be destroyed before destroying sets and chains. In order to wait calling call_rcu(), a rcu_barrier() is added. In order to test correctly, below patch should be applied. https://patchwork.ozlabs.org/patch/940883/ test scripts: %cat test.nft table ip aa { map map1 { type ipv4_addr : verdict; flags timeout; elements = { 0 : jump a0, 1 : jump a0, 2 : jump a0, 3 : jump a0, 4 : jump a0, 5 : jump a0, 6 : jump a0, 7 : jump a0, 8 : jump a0, 9 : jump a0, } timeout 1s; } chain a0 { } } flush ruleset [ ... ] table ip aa { map map1 { type ipv4_addr : verdict; flags timeout; elements = { 0 : jump a0, 1 : jump a0, 2 : jump a0, 3 : jump a0, 4 : jump a0, 5 : jump a0, 6 : jump a0, 7 : jump a0, 8 : jump a0, 9 : jump a0, } timeout 1s; } chain a0 { } } flush ruleset Splat looks like: [ 200.795603] kernel BUG at net/netfilter/nf_tables_api.c:1363! [ 200.806944] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 200.812253] CPU: 1 PID: 1582 Comm: nft Not tainted 4.17.0+ #24 [ 200.820297] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015 [ 200.830309] RIP: 0010:nf_tables_chain_destroy.isra.34+0x62/0x240 [nf_tables] [ 200.838317] Code: 43 50 85 c0 74 26 48 8b 45 00 48 8b 4d 08 ba 54 05 00 00 48 c7 c6 60 6d 29 c0 48 c7 c7 c0 65 29 c0 4c 8b 40 08 e8 58 e5 fd f8 <0f> 0b 48 89 da 48 b8 00 00 00 00 00 fc ff [ 200.860366] RSP: 0000:ffff880118dbf4d0 EFLAGS: 00010282 [ 200.866354] RAX: 0000000000000061 RBX: ffff88010cdeaf08 RCX: 0000000000000000 [ 200.874355] RDX: 0000000000000061 RSI: 0000000000000008 RDI: ffffed00231b7e90 [ 200.882361] RBP: ffff880118dbf4e8 R08: ffffed002373bcfb R09: ffffed002373bcfa [ 200.890354] R10: 0000000000000000 R11: ffffed002373bcfb R12: dead000000000200 [ 200.898356] R13: dead000000000100 R14: ffffffffbb62af38 R15: dffffc0000000000 [ 200.906354] FS: 00007fefc31fd700(0000) GS:ffff88011b800000(0000) knlGS:0000000000000000 [ 200.915533] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 200.922355] CR2: 0000557f1c8e9128 CR3: 0000000106880000 CR4: 00000000001006e0 [ 200.930353] Call Trace: [ 200.932351] ? nf_tables_commit+0x26f6/0x2c60 [nf_tables] [ 200.939525] ? nf_tables_setelem_notify.constprop.49+0x1a0/0x1a0 [nf_tables] [ 200.947525] ? nf_tables_delchain+0x6e0/0x6e0 [nf_tables] [ 200.952383] ? nft_add_set_elem+0x1700/0x1700 [nf_tables] [ 200.959532] ? nla_parse+0xab/0x230 [ 200.963529] ? nfnetlink_rcv_batch+0xd06/0x10d0 [nfnetlink] [ 200.968384] ? nfnetlink_net_init+0x130/0x130 [nfnetlink] [ 200.975525] ? debug_show_all_locks+0x290/0x290 [ 200.980363] ? debug_show_all_locks+0x290/0x290 [ 200.986356] ? sched_clock_cpu+0x132/0x170 [ 200.990352] ? find_held_lock+0x39/0x1b0 [ 200.994355] ? sched_clock_local+0x10d/0x130 [ 200.999531] ? memset+0x1f/0x40 Fixes: 9d0982927e79 ("netfilter: nft_hash: add support for timeouts") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-18drm/amd/amdgpu: creating two I2S instances for stoney/cz (v2)Vijendar Mukunda1-10/+37
Creating two I2S instances for Stoney/cz platforms. v2: squash in: "drm/amdgpu/acp: Fix slab-out-of-bounds in mfd_add_device in acp_hw_init" From Daniel Kurtz <djkurtz@chromium.org>. Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com> Signed-off-by: Akshu Agrawal <akshu.agrawal@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-18drm/amdgpu: add another ATPX quirk for TOPAZAlex Deucher1-0/+1
Needs ATPX rather than _PR3. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=200517 Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2018-07-18cpufreq: intel_pstate: Register when ACPI PCCH is presentRafael J. Wysocki2-1/+20
Currently, intel_pstate doesn't register if _PSS is not present on HP Proliant systems, because it expects the firmware to take over CPU performance scaling in that case. However, if ACPI PCCH is present, the firmware expects the kernel to use it for CPU performance scaling and the pcc-cpufreq driver is loaded for that. Unfortunately, the firmware interface used by that driver is not scalable for fundamental reasons, so pcc-cpufreq is way suboptimal on systems with more than just a few CPUs. In fact, it is better to avoid using it at all. For this reason, modify intel_pstate to look for ACPI PCCH if _PSS is not present and register if it is there. Also prevent the pcc-cpufreq driver from trying to initialize itself if intel_pstate has been registered already. Fixes: fbbcdc0744da (intel_pstate: skip the driver if ACPI has power mgmt option) Reported-by: Andreas Herrmann <aherrmann@suse.com> Reviewed-by: Andreas Herrmann <aherrmann@suse.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Tested-by: Andreas Herrmann <aherrmann@suse.com> Cc: 4.16+ <stable@vger.kernel.org> # 4.16+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-07-18powerpc/powernv: Fix save/restore of SPRG3 on entry/exit from stop (idle)Gautham R. Shenoy1-0/+2
On 64-bit servers, SPRN_SPRG3 and its userspace read-only mirror SPRN_USPRG3 are used as userspace VDSO write and read registers respectively. SPRN_SPRG3 is lost when we enter stop4 and above, and is currently not restored. As a result, any read from SPRN_USPRG3 returns zero on an exit from stop4 (Power9 only) and above. Thus in this situation, on POWER9, any call from sched_getcpu() always returns zero, as on powerpc, we call __kernel_getcpu() which relies upon SPRN_USPRG3 to report the CPU and NUMA node information. Fix this by restoring SPRN_SPRG3 on wake up from a deep stop state with the sprg_vdso value that is cached in PACA. Fixes: e1c1cfed5432 ("powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle") Cc: stable@vger.kernel.org # v4.14+ Reported-by: Florian Weimer <fweimer@redhat.com> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-18powerpc/Makefile: Assemble with -me500 when building for E500James Clarke1-0/+1
Some of the assembly files use instructions specific to BookE or E500, which are rejected with the now-default -mcpu=powerpc, so we must pass -me500 to the assembler just as we pass -me200 for E200. Fixes: 4bf4f42a2feb ("powerpc/kbuild: Set default generic machine type for 32-bit compile") Signed-off-by: James Clarke <jrtc27@jrtc27.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-18ALSA: hda/realtek - Yet another Clevo P950 quirk entryTakashi Iwai1-0/+1
The PCI SSID 1558:95e1 needs the same quirk for other Clevo P950 models, too. Otherwise no sound comes out of speakers. Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=1101143 Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-07-18kvmclock: fix TSC calibration for nested guestsPeng Hao1-0/+1
Inside a nested guest, access to hardware can be slow enough that tsc_read_refs always return ULLONG_MAX, causing tsc_refine_calibration_work to be called periodically and the nested guest to spend a lot of time reading the ACPI timer. However, if the TSC frequency is available from the pvclock page, we can just set X86_FEATURE_TSC_KNOWN_FREQ and avoid the recalibration. 'refine' operation. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Peng Hao <peng.hao2@zte.com.cn> [Commit message rewritten. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-07-18KVM: VMX: Mark VMXArea with revision_id of physical CPU even when eVMCS enabledLiran Alon1-6/+21
When eVMCS is enabled, all VMCS allocated to be used by KVM are marked with revision_id of KVM_EVMCS_VERSION instead of revision_id reported by MSR_IA32_VMX_BASIC. However, even though not explictly documented by TLFS, VMXArea passed as VMXON argument should still be marked with revision_id reported by physical CPU. This issue was found by the following setup: * L0 = KVM which expose eVMCS to it's L1 guest. * L1 = KVM which consume eVMCS reported by L0. This setup caused the following to occur: 1) L1 execute hardware_enable(). 2) hardware_enable() calls kvm_cpu_vmxon() to execute VMXON. 3) L0 intercept L1 VMXON and execute handle_vmon() which notes vmxarea->revision_id != VMCS12_REVISION and therefore fails with nested_vmx_failInvalid() which sets RFLAGS.CF. 4) L1 kvm_cpu_vmxon() don't check RFLAGS.CF for failure and therefore hardware_enable() continues as usual. 5) L1 hardware_enable() then calls ept_sync_global() which executes INVEPT. 6) L0 intercept INVEPT and execute handle_invept() which notes !vmx->nested.vmxon and thus raise a #UD to L1. 7) Raised #UD caused L1 to panic. Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Cc: stable@vger.kernel.org Fixes: 773e8a0425c923bc02668a2d6534a5ef5a43cc69 Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-07-18KVM: irqfd: fix race between EPOLLHUP and irq_bypass_register_consumerPaolo Bonzini1-5/+6
A comment warning against this bug is there, but the code is not doing what the comment says. Therefore it is possible that an EPOLLHUP races against irq_bypass_register_consumer. The EPOLLHUP handler schedules irqfd_shutdown, and if that runs soon enough, you get a use-after-free. Reported-by: syzbot <syzkaller@googlegroups.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com>
2018-07-18KVM/Eventfd: Avoid crash when assign and deassign specific eventfd in parallel.Lan Tianyu1-1/+5
Syzbot reports crashes in kvm_irqfd_assign(), caused by use-after-free when kvm_irqfd_assign() and kvm_irqfd_deassign() run in parallel for one specific eventfd. When the assign path hasn't finished but irqfd has been added to kvm->irqfds.items list, another thead may deassign the eventfd and free struct kvm_kernel_irqfd(). The assign path then uses the struct kvm_kernel_irqfd that has been freed by deassign path. To avoid such issue, keep irqfd under kvm->irq_srcu protection after the irqfd has been added to kvm->irqfds.items list, and call synchronize_srcu() in irq_shutdown() to make sure that irqfd has been fully initialized in the assign path. Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Tianyu Lan <tianyu.lan@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-07-18KVM: PPC: Check if IOMMU page is contained in the pinned physical pageAlexey Kardashevskiy5-8/+43
A VM which has: - a DMA capable device passed through to it (eg. network card); - running a malicious kernel that ignores H_PUT_TCE failure; - capability of using IOMMU pages bigger that physical pages can create an IOMMU mapping that exposes (for example) 16MB of the host physical memory to the device when only 64K was allocated to the VM. The remaining 16MB - 64K will be some other content of host memory, possibly including pages of the VM, but also pages of host kernel memory, host programs or other VMs. The attacking VM does not control the location of the page it can map, and is only allowed to map as many pages as it has pages of RAM. We already have a check in drivers/vfio/vfio_iommu_spapr_tce.c that an IOMMU page is contained in the physical page so the PCI hardware won't get access to unassigned host memory; however this check is missing in the KVM fastpath (H_PUT_TCE accelerated code). We were lucky so far and did not hit this yet as the very first time when the mapping happens we do not have tbl::it_userspace allocated yet and fall back to the userspace which in turn calls VFIO IOMMU driver, this fails and the guest does not retry, This stores the smallest preregistered page size in the preregistered region descriptor and changes the mm_iommu_xxx API to check this against the IOMMU page size. This calculates maximum page size as a minimum of the natural region alignment and compound page size. For the page shift this uses the shift returned by find_linux_pte() which indicates how the page is mapped to the current userspace - if the page is huge and this is not a zero, then it is a leaf pte and the page is mapped within the range. Fixes: 121f80ba68f1 ("KVM: PPC: VFIO: Add in-kernel acceleration for VFIO") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-18vfio/spapr: Use IOMMU pageshift rather than pagesizeAlexey Kardashevskiy1-4/+4
The size is always equal to 1 page so let's use this. Later on this will be used for other checks which use page shifts to check the granularity of access. This should cause no behavioral change. Cc: stable@vger.kernel.org # v4.12+ Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-18net: usb: rtl8150: demote allmulti message to dev_dbg()David Lechner1-1/+1
This driver can spam the kernel log with multiple messages of: net eth0: eth0: allmulti set Usually 4 or 8 at a time (probably because of using ConnMan). This message doesn't seem useful, so let's demote it from dev_info() to dev_dbg(). Signed-off-by: David Lechner <david@lechnology.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18octeon_mgmt: Fix MIX registers configuration on MTU setupAlexander Sverdlin1-3/+11
octeon_mgmt driver doesn't drop RX frames that are 1-4 bytes bigger than MTU set for the corresponding interface. The problem is in the AGL_GMX_RX0/1_FRM_MAX register setting, which should not account for VLAN tagging. According to Octeon HW manual: "For tagged frames, MAX increases by four bytes for each VLAN found up to a maximum of two VLANs, or MAX + 8 bytes." OCTEON_FRAME_HEADER_LEN "define" is fine for ring buffer management, but should not be used for AGL_GMX_RX0/1_FRM_MAX. The problem could be easily reproduced using "ping" command. If affected system has default MTU 1500, other host (having MTU >= 1504) can successfully "ping" the affected system with payload size 1473-1476, resulting in IP packets of size 1501-1504 accepted by the mgmt driver. Fixed system still accepts IP packets of 1500 bytes even with VLAN tagging, because the limits are lifted in HW as expected, for every VLAN tag. Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-17aio: don't expose __aio_sigset in uapiChristoph Hellwig3-6/+6
glibc uses a different defintion of sigset_t than the kernel does, and the current version would pull in both. To fix this just do not expose the type at all - this somewhat mirrors pselect() where we do not even have a type for the magic sigmask argument, but just use pointer arithmetics. Fixes: 7a074e96 ("aio: implement io_pgetevents") Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Adrian Reber <adrian@lisas.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-07-17drm/amd/display: Fix DP HBR2 Eye Diagram Pattern on CarrizoHersen Wu3-5/+6
[why] dp hbr2 eye diagram pattern for raven asic is not stabled. workaround is to use tp4 pattern. But this should not be applied to asic before raven. [how] add new bool varilable in asic caps. for raven asic, use the workaround. for carrizo, vega, do not use workaround. Signed-off-by: Hersen Wu <hersenxs.wu@amd.com> Reviewed-by: Harry Wentland <Harry.Wentland@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-17drm/amdgpu: Make sure IB tests flushed after IP resumeLeo Liu1-0/+3
Fixes: 2c773de2 (drm/amdgpu: defer test IBs on the rings at boot (V3)) Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-17netfilter: nf_tables: fix jumpstack depth validationTaehee Yoo4-11/+20
The level of struct nft_ctx is updated by nf_tables_check_loops(). That is used to validate jumpstack depth. But jumpstack validation routine doesn't update and validate recursively. So, in some cases, chain depth can be bigger than the NFT_JUMP_STACK_SIZE. After this patch, The jumpstack validation routine is located in the nft_chain_validate(). When new rules or new set elements are added, the nft_table_validate() is called by the nf_tables_newrule and the nf_tables_newsetelem. The nft_table_validate() calls the nft_chain_validate() that visit all their children chains recursively. So it can update depth of chain certainly. Reproducer: %cat ./test.sh #!/bin/bash nft add table ip filter nft add chain ip filter input { type filter hook input priority 0\; } for ((i=0;i<20;i++)); do nft add chain ip filter a$i done nft add rule ip filter input jump a1 for ((i=0;i<10;i++)); do nft add rule ip filter a$i jump a$((i+1)) done for ((i=11;i<19;i++)); do nft add rule ip filter a$i jump a$((i+1)) done nft add rule ip filter a10 jump a11 Result: [ 253.931782] WARNING: CPU: 1 PID: 0 at net/netfilter/nf_tables_core.c:186 nft_do_chain+0xacc/0xdf0 [nf_tables] [ 253.931915] Modules linked in: nf_tables nfnetlink ip_tables x_tables [ 253.932153] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.18.0-rc3+ #48 [ 253.932153] RIP: 0010:nft_do_chain+0xacc/0xdf0 [nf_tables] [ 253.932153] Code: 83 f8 fb 0f 84 c7 00 00 00 e9 d0 00 00 00 83 f8 fd 74 0e 83 f8 ff 0f 84 b4 00 00 00 e9 bd 00 00 00 83 bd 64 fd ff ff 0f 76 09 <0f> 0b 31 c0 e9 bc 02 00 00 44 8b ad 64 fd [ 253.933807] RSP: 0018:ffff88011b807570 EFLAGS: 00010212 [ 253.933807] RAX: 00000000fffffffd RBX: ffff88011b807660 RCX: 0000000000000000 [ 253.933807] RDX: 0000000000000010 RSI: ffff880112b39d78 RDI: ffff88011b807670 [ 253.933807] RBP: ffff88011b807850 R08: ffffed0023700ece R09: ffffed0023700ecd [ 253.933807] R10: ffff88011b80766f R11: ffffed0023700ece R12: ffff88011b807898 [ 253.933807] R13: ffff880112b39d80 R14: ffff880112b39d60 R15: dffffc0000000000 [ 253.933807] FS: 0000000000000000(0000) GS:ffff88011b800000(0000) knlGS:0000000000000000 [ 253.933807] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 253.933807] CR2: 00000000014f1008 CR3: 000000006b216000 CR4: 00000000001006e0 [ 253.933807] Call Trace: [ 253.933807] <IRQ> [ 253.933807] ? sched_clock_cpu+0x132/0x170 [ 253.933807] ? __nft_trace_packet+0x180/0x180 [nf_tables] [ 253.933807] ? sched_clock_cpu+0x132/0x170 [ 253.933807] ? debug_show_all_locks+0x290/0x290 [ 253.933807] ? __lock_acquire+0x4835/0x4af0 [ 253.933807] ? inet_ehash_locks_alloc+0x1a0/0x1a0 [ 253.933807] ? unwind_next_frame+0x159e/0x1840 [ 253.933807] ? __read_once_size_nocheck.constprop.4+0x5/0x10 [ 253.933807] ? nft_do_chain_ipv4+0x197/0x1e0 [nf_tables] [ 253.933807] ? nft_do_chain+0x5/0xdf0 [nf_tables] [ 253.933807] nft_do_chain_ipv4+0x197/0x1e0 [nf_tables] [ 253.933807] ? nft_do_chain_arp+0xb0/0xb0 [nf_tables] [ 253.933807] ? __lock_is_held+0x9d/0x130 [ 253.933807] nf_hook_slow+0xc4/0x150 [ 253.933807] ip_local_deliver+0x28b/0x380 [ 253.933807] ? ip_call_ra_chain+0x3e0/0x3e0 [ 253.933807] ? ip_rcv_finish+0x1610/0x1610 [ 253.933807] ip_rcv+0xbcc/0xcc0 [ 253.933807] ? debug_show_all_locks+0x290/0x290 [ 253.933807] ? ip_local_deliver+0x380/0x380 [ 253.933807] ? __lock_is_held+0x9d/0x130 [ 253.933807] ? ip_local_deliver+0x380/0x380 [ 253.933807] __netif_receive_skb_core+0x1c9c/0x2240 Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-17Mark HI and TASKLET softirq synchronousLinus Torvalds1-4/+8
Way back in 4.9, we committed 4cd13c21b207 ("softirq: Let ksoftirqd do its job"), and ever since we've had small nagging issues with it. For example, we've had: 1ff688209e2e ("watchdog: core: make sure the watchdog_worker is not deferred") 8d5755b3f77b ("watchdog: softdog: fire watchdog even if softirqs do not get to run") 217f69743681 ("net: busy-poll: allow preemption in sk_busy_loop()") all of which worked around some of the effects of that commit. The DVB people have also complained that the commit causes excessive USB URB latencies, which seems to be due to the USB code using tasklets to schedule USB traffic. This seems to be an issue mainly when already living on the edge, but waiting for ksoftirqd to handle it really does seem to cause excessive latencies. Now Hanna Hawa reports that this issue isn't just limited to USB URB and DVB, but also causes timeout problems for the Marvell SoC team: "I'm facing kernel panic issue while running raid 5 on sata disks connected to Macchiatobin (Marvell community board with Armada-8040 SoC with 4 ARMv8 cores of CA72) Raid 5 built with Marvell DMA engine and async_tx mechanism (ASYNC_TX_DMA [=y]); the DMA driver (mv_xor_v2) uses a tasklet to clean the done descriptors from the queue" The latency problem causes a panic: mv_xor_v2 f0400000.xor: dma_sync_wait: timeout! Kernel panic - not syncing: async_tx_quiesce: DMA error waiting for transaction We've discussed simply just reverting the original commit entirely, and also much more involved solutions (with per-softirq threads etc). This patch is intentionally stupid and fairly limited, because the issue still remains, and the other solutions either got sidetracked or had other issues. We should probably also consider the timer softirqs to be synchronous and not be delayed to ksoftirqd (since they were the issue with the earlier watchdog problems), but that should be done as a separate patch. This does only the tasklet cases. Reported-and-tested-by: Hanna Hawa <hannah@marvell.com> Reported-and-tested-by: Josef Griebichler <griebichler.josef@gmx.at> Reported-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-17x86/MCE: Remove min interval polling limitationDewet Thibaut1-3/+0
commit b3b7c4795c ("x86/MCE: Serialize sysfs changes") introduced a min interval limitation when setting the check interval for polled MCEs. However, the logic is that 0 disables polling for corrected MCEs, see Documentation/x86/x86_64/machinecheck. The limitation prevents disabling. Remove this limitation and allow the value 0 to disable polling again. Fixes: b3b7c4795c ("x86/MCE: Serialize sysfs changes") Signed-off-by: Dewet Thibaut <thibaut.dewet@nokia.com> Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> [ Massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20180716084927.24869-1-alexander.sverdlin@nokia.com
2018-07-17ALSA: rawmidi: Change resized buffers atomicallyTakashi Iwai1-6/+14
The SNDRV_RAWMIDI_IOCTL_PARAMS ioctl may resize the buffers and the current code is racy. For example, the sequencer client may write to buffer while it being resized. As a simple workaround, let's switch to the resized buffer inside the stream runtime lock. Reported-by: syzbot+52f83f0ea8df16932f7f@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-07-17nvme: don't enable AEN if not supportedWeiping Zhang1-4/+7
Avoid excuting set_feature command if there is no supported bit in Optional Asynchronous Events Supported (OAES). Fixes: c0561f82 ("nvme: submit AEN event configuration on startup") Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Weiping Zhang <zhangweiping@didichuxing.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-17nvme: ensure forward progress during Admin passthruScott Bauer1-24/+26
If the controller supports effects and goes down during the passthru admin command we will deadlock during namespace revalidation. [ 363.488275] INFO: task kworker/u16:5:231 blocked for more than 120 seconds. [ 363.488290] Not tainted 4.17.0+ #2 [ 363.488296] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 363.488303] kworker/u16:5 D 0 231 2 0x80000000 [ 363.488331] Workqueue: nvme-reset-wq nvme_reset_work [nvme] [ 363.488338] Call Trace: [ 363.488385] schedule+0x75/0x190 [ 363.488396] rwsem_down_read_failed+0x1c3/0x2f0 [ 363.488481] call_rwsem_down_read_failed+0x14/0x30 [ 363.488504] down_read+0x1d/0x80 [ 363.488523] nvme_stop_queues+0x1e/0xa0 [nvme_core] [ 363.488536] nvme_dev_disable+0xae4/0x1620 [nvme] [ 363.488614] nvme_reset_work+0xd1e/0x49d9 [nvme] [ 363.488911] process_one_work+0x81a/0x1400 [ 363.488934] worker_thread+0x87/0xe80 [ 363.488955] kthread+0x2db/0x390 [ 363.488977] ret_from_fork+0x35/0x40 Fixes: 84fef62d135b6 ("nvme: check admin passthru command effects") Signed-off-by: Scott Bauer <scott.bauer@intel.com> Reviewed-by: Keith Busch <keith.busch@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>