kernel/git/benh/linux-fsi.git - Benjamin Herrenschmidt's fork of linux.git

Age	Commit message (Collapse)	Author	Files	Lines
2018-11-26	fsi: fsi-scom.c: Remove duplicate headerHEAD fsi-updates-2018-11-26 master	Brajeswar Ghosh	1	-1/+0
	Remove linux/cdev.h which is included more than once Signed-off-by: Brajeswar Ghosh <brajeswar.linux@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2018-11-26	fsi: master-ast-cf: select GENERIC_ALLOCATOR	Arnd Bergmann	1	-0/+1
	In randconfig builds without CONFIG_GENERIC_ALLOCATOR, this driver fails to link: ERROR: "gen_pool_alloc_algo" [drivers/fsi/fsi-master-ast-cf.ko] undefined! ERROR: "gen_pool_fixed_alloc" [drivers/fsi/fsi-master-ast-cf.ko] undefined! ERROR: "of_gen_pool_get" [drivers/fsi/fsi-master-ast-cf.ko] undefined! ERROR: "gen_pool_free" [drivers/fsi/fsi-master-ast-cf.ko] undefined! Select the dependency as all other users do. Fixes: 6a794a27daca ("fsi: master-ast-cf: Add new FSI master using Aspeed ColdFire") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2018-11-25	Linux 4.20-rc4	Linus Torvalds	1	-2/+2

2018-11-25	Merge tag 'dma-mapping-4.20-3' of git://git.infradead.org/users/hch/dma-mapping	Linus Torvalds	2	-2/+3
	Pull dma-mapping fixes from Christoph Hellwig: "Two dma-direct / swiotlb regressions fixes: - zero is a valid physical address on some arm boards, we can't use it as the error value - don't try to cache flush the error return value (no matter what it is)" * tag 'dma-mapping-4.20-3' of git://git.infradead.org/users/hch/dma-mapping: swiotlb: Skip cache maintenance on map error dma-direct: Make DIRECT_MAPPING_ERROR viable for SWIOTLB
2018-11-25	Merge tag 'nfs-for-4.20-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs	Linus Torvalds	7	-37/+66
	Pull NFS client bugfixes from Trond Myklebust: - Fix a NFSv4 state manager deadlock when returning a delegation - NFSv4.2 copy do not allocate memory under the lock - flexfiles: Use the correct stateid for IO in the tightly coupled case * tag 'nfs-for-4.20-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: flexfiles: use per-mirror specified stateid for IO NFSv4.2 copy do not allocate memory under the lock NFSv4: Fix a NFSv4 state manager deadlock
2018-11-25	MAINTAINERS: change Sparse's maintainer	Luc Van Oostenryck	2	-2/+5
	I'm taking over the maintainance of Sparse so add myself as maintainer and move Christopher's info to CREDITS. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-24	Merge tag 'xarray-4.20-rc4' of git://git.infradead.org/users/willy/linux-dax	Linus Torvalds	6	-185/+387
	Pull XArray updates from Matthew Wilcox: "We found some bugs in the DAX conversion to XArray (and one bug which predated the XArray conversion). There were a couple of bugs in some of the higher-level functions, which aren't actually being called in today's kernel, but surfaced as a result of converting existing radix tree & IDR users over to the XArray. Some of the other changes to how the higher-level APIs work were also motivated by converting various users; again, they're not in use in today's kernel, so changing them has a low probability of introducing a bug. Dan can still trigger a bug in the DAX code with hot-offline/online, and we're working on tracking that down" * tag 'xarray-4.20-rc4' of git://git.infradead.org/users/willy/linux-dax: XArray tests: Add missing locking dax: Avoid losing wakeup in dax_lock_mapping_entry dax: Fix huge page faults dax: Fix dax_unlock_mapping_entry for PMD pages dax: Reinstate RCU protection of inode dax: Make sure the unlocking entry isn't locked dax: Remove optimisation from dax_lock_mapping_entry XArray tests: Correct some 64-bit assumptions XArray: Correct xa_store_range XArray: Fix Documentation XArray: Handle NULL pointers differently for allocation XArray: Unify xa_store and __xa_store XArray: Add xa_store_bh() and xa_store_irq() XArray: Turn xa_erase into an exported function XArray: Unify xa_cmpxchg and __xa_cmpxchg XArray: Regularise xa_reserve nilfs2: Use xa_erase_irq XArray: Export __xa_foo to non-GPL modules XArray: Fix xa_for_each with a single element at 0
2018-11-24	Merge branch 'for-linus' of ↵	Linus Torvalds	11	-444/+159
	git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - revert of the high-resolution scrolling feature, as it breaks certain hardware due to incompatibilities between Logitech and Microsoft worlds. Peter Hutterer is working on a fixed implementation. Until that is finished, revert by Benjamin Tissoires. - revert of incorrect strncpy->strlcpy conversion in uhid, from David Herrmann - fix for buggy sendfile() implementation on uhid device node, from Eric Biggers - a few assorted device-ID specific quirks * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: Revert "Input: Add the `REL_WHEEL_HI_RES` event code" Revert "HID: input: Create a utility class for counting scroll events" Revert "HID: logitech: Add function to enable HID++ 1.0 "scrolling acceleration"" Revert "HID: logitech: Enable high-resolution scrolling on Logitech mice" Revert "HID: logitech: Use LDJ_DEVICE macro for existing Logitech mice" Revert "HID: logitech: fix a used uninitialized GCC warning" Revert "HID: input: simplify/fix high-res scroll event handling" HID: Add quirk for Primax PIXART OEM mice HID: i2c-hid: Disable runtime PM for LG touchscreen HID: multitouch: Add pointstick support for Cirque Touchpad HID: steam: remove input device when a hid client is running. Revert "HID: uhid: use strlcpy() instead of strncpy()" HID: uhid: forbid UHID_CREATE under KERNEL_DS or elevated privileges HID: input: Ignore battery reported by Symbol DS4308 HID: Add quirk for Microsoft PIXART OEM mouse
2018-11-24	Merge tag 'arm64-fixes' of ↵	Linus Torvalds	2	-3/+3
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas:: - Fix wrong conflict resolution around CONFIG_ARM64_SSBD - Fix sparse warning on unsigned long constant * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: cpufeature: Fix mismerge of CONFIG_ARM64_SSBD block arm64: sysreg: fix sparse warnings
2018-11-24	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds	67	-335/+537
	Pull networking fixes from David Miller: 1) Need to take mutex in ath9k_add_interface(), from Dan Carpenter. 2) Fix mt76 build without CONFIG_LEDS_CLASS, from Arnd Bergmann. 3) Fix socket wmem accounting in SCTP, from Xin Long. 4) Fix failed resume crash in ena driver, from Arthur Kiyanovski. 5) qed driver passes bytes instead of bits into second arg of bitmap_weight(). From Denis Bolotin. 6) Fix reset deadlock in ibmvnic, from Juliet Kim. 7) skb_scrube_packet() needs to scrub the fwd marks too, from Petr Machata. 8) Make sure older TCP stacks see enough dup ACKs, and avoid doing SACK compression during this period, from Eric Dumazet. 9) Add atomicity to SMC protocol cursor handling, from Ursula Braun. 10) Don't leave dangling error pointer if bpf_prog_add() fails in thunderx driver, from Lorenzo Bianconi. Also, when we unmap TSO headers, set sq->tso_hdrs to NULL. 11) Fix race condition over state variables in act_police, from Davide Caratti. 12) Disable guest csum in the presence of XDP in virtio_net, from Jason Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (64 commits) net: gemini: Fix copy/paste error net: phy: mscc: fix deadlock in vsc85xx_default_config dt-bindings: dsa: Fix typo in "probed" net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queue net: amd: add missing of_node_put() team: no need to do team_notify_peers or team_mcast_rejoin when disabling port virtio-net: fail XDP set if guest csum is negotiated virtio-net: disable guest csum during XDP set net/sched: act_police: add missing spinlock initialization net: don't keep lonely packets forever in the gro hash net/ipv6: re-do dad when interface has IFF_NOARP flag change packet: copy user buffers before orphan or clone ibmvnic: Update driver queues after change in ring size support ibmvnic: Fix RX queue buffer cleanup net: thunderx: set xdp_prog to NULL if bpf_prog_add fails net/dim: Update DIM start sample after each DIM iteration net: faraday: ftmac100: remove netif_running(netdev) check before disabling interrupts net/smc: use after free fix in smc_wr_tx_put_slot() net/smc: atomic SMCD cursor handling net/smc: add SMC-D shutdown signal ...
2018-11-24	Merge tag 'xfs-4.20-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux	Linus Torvalds	10	-46/+104
	Pull xfs fixes from Darrick Wong: "Dave and I have continued our work fixing corruption problems that can be found when running long-term burn-in exercisers on xfs. Here are some patches fixing most of the problems, but there will likely be more. :/ - Numerous corruption fixes for copy on write - Numerous corruption fixes for blocksize < pagesize writes - Don't miscalculate AG reservations for small final AGs - Fix page cache truncation to work properly for reflink and extent shifting - Fix use-after-free when retrying failed inode/dquot buffer logging - Fix corruptions seen when using copy_file_range in directio mode" * tag 'xfs-4.20-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: iomap: readpages doesn't zero page tail beyond EOF vfs: vfs_dedupe_file_range() doesn't return EOPNOTSUPP iomap: dio data corruption and spurious errors when pipes fill iomap: sub-block dio needs to zeroout beyond EOF iomap: FUA is wrong for DIO O_DSYNC writes into unwritten extents xfs: delalloc -> unwritten COW fork allocation can go wrong xfs: flush removing page cache in xfs_reflink_remap_prep xfs: extent shifting doesn't fully invalidate page cache xfs: finobt AG reserves don't consider last AG can be a runt xfs: fix transient reference count error in xfs_buf_resubmit_failed_buffers xfs: uncached buffer tracing needs to print bno xfs: make xfs_file_remap_range() static xfs: fix shared extent data corruption due to missing cow reservation
2018-11-23	net: gemini: Fix copy/paste error	Andreas Fiedler	1	-1/+1
	The TX stats should be started with the tx_stats_syncp, there seems to be a copy/paste error in the driver. Signed-off-by: Andreas Fiedler <andreas.fiedler@gmx.net> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23	net: phy: mscc: fix deadlock in vsc85xx_default_config	Quentin Schulz	1	-9/+5
	The vsc85xx_default_config function called in the vsc85xx_config_init function which is used by VSC8530, VSC8531, VSC8540 and VSC8541 PHYs mistakenly calls phy_read and phy_write in-between phy_select_page and phy_restore_page. phy_select_page and phy_restore_page actually take and release the MDIO bus lock and phy_write and phy_read take and release the lock to write or read to a PHY register. Let's fix this deadlock by using phy_modify_paged which handles correctly a read followed by a write in a non-standard page. Fixes: 6a0bfbbe20b0 ("net: phy: mscc: migrate to phy_select/restore_page functions") Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23	dt-bindings: dsa: Fix typo in "probed"	Fabio Estevam	1	-1/+1
	The correct form is "can be probed", so fix the typo. Signed-off-by: Fabio Estevam <festevam@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23	net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queue	Lorenzo Bianconi	1	-1/+3
	Reset snd_queue tso_hdrs pointer to NULL in nicvf_free_snd_queue routine since it is used to check if tso dma descriptor queue has been previously allocated. The issue can be triggered with the following reproducer: $ip link set dev enP2p1s0v0 xdpdrv obj xdp_dummy.o $ip link set dev enP2p1s0v0 xdpdrv off [ 341.467649] WARNING: CPU: 74 PID: 2158 at mm/vmalloc.c:1511 __vunmap+0x98/0xe0 [ 341.515010] Hardware name: GIGABYTE H270-T70/MT70-HD0, BIOS T49 02/02/2018 [ 341.521874] pstate: 60400005 (nZCv daif +PAN -UAO) [ 341.526654] pc : __vunmap+0x98/0xe0 [ 341.530132] lr : __vunmap+0x98/0xe0 [ 341.533609] sp : ffff00001c5db860 [ 341.536913] x29: ffff00001c5db860 x28: 0000000000020000 [ 341.542214] x27: ffff810feb5090b0 x26: ffff000017e57000 [ 341.547515] x25: 0000000000000000 x24: 00000000fbd00000 [ 341.552816] x23: 0000000000000000 x22: ffff810feb5090b0 [ 341.558117] x21: 0000000000000000 x20: 0000000000000000 [ 341.563418] x19: ffff000017e57000 x18: 0000000000000000 [ 341.568719] x17: 0000000000000000 x16: 0000000000000000 [ 341.574020] x15: 0000000000000010 x14: ffffffffffffffff [ 341.579321] x13: ffff00008985eb27 x12: ffff00000985eb2f [ 341.584622] x11: ffff0000096b3000 x10: ffff00001c5db510 [ 341.589923] x9 : 00000000ffffffd0 x8 : ffff0000086868e8 [ 341.595224] x7 : 3430303030303030 x6 : 00000000000006ef [ 341.600525] x5 : 00000000003fffff x4 : 0000000000000000 [ 341.605825] x3 : 0000000000000000 x2 : ffffffffffffffff [ 341.611126] x1 : ffff0000096b3728 x0 : 0000000000000038 [ 341.616428] Call trace: [ 341.618866] __vunmap+0x98/0xe0 [ 341.621997] vunmap+0x3c/0x50 [ 341.624961] arch_dma_free+0x68/0xa0 [ 341.628534] dma_direct_free+0x50/0x80 [ 341.632285] nicvf_free_resources+0x160/0x2d8 [nicvf] [ 341.637327] nicvf_config_data_transfer+0x174/0x5e8 [nicvf] [ 341.642890] nicvf_stop+0x298/0x340 [nicvf] [ 341.647066] __dev_close_many+0x9c/0x108 [ 341.650977] dev_close_many+0xa4/0x158 [ 341.654720] rollback_registered_many+0x140/0x530 [ 341.659414] rollback_registered+0x54/0x80 [ 341.663499] unregister_netdevice_queue+0x9c/0xe8 [ 341.668192] unregister_netdev+0x28/0x38 [ 341.672106] nicvf_remove+0xa4/0xa8 [nicvf] [ 341.676280] nicvf_shutdown+0x20/0x30 [nicvf] [ 341.680630] pci_device_shutdown+0x44/0x88 [ 341.684720] device_shutdown+0x144/0x250 [ 341.688640] kernel_restart_prepare+0x44/0x50 [ 341.692986] kernel_restart+0x20/0x68 [ 341.696638] __se_sys_reboot+0x210/0x238 [ 341.700550] __arm64_sys_reboot+0x24/0x30 [ 341.704555] el0_svc_handler+0x94/0x110 [ 341.708382] el0_svc+0x8/0xc [ 341.711252] ---[ end trace 3f4019c8439959c9 ]--- [ 341.715874] page:ffff7e0003ef4000 count:0 mapcount:0 mapping:0000000000000000 index:0x4 [ 341.723872] flags: 0x1fffe000000000() [ 341.727527] raw: 001fffe000000000 ffff7e0003f1a008 ffff7e0003ef4048 0000000000000000 [ 341.735263] raw: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000 [ 341.742994] page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0) where xdp_dummy.c is a simple bpf program that forwards the incoming frames to the network stack (available here: https://github.com/altoor/xdp_walkthrough_examples/blob/master/sample_1/xdp_dummy.c) Fixes: 05c773f52b96 ("net: thunderx: Add basic XDP support") Fixes: 4863dea3fab0 ("net: Adding support for Cavium ThunderX network controller") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23	net: amd: add missing of_node_put()	Yangtao Li	1	-1/+3
	of_find_node_by_path() acquires a reference to the node returned by it and that reference needs to be dropped by its caller. This place doesn't do that, so fix it. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23	team: no need to do team_notify_peers or team_mcast_rejoin when disabling port	Hangbin Liu	1	-2/+0
	team_notify_peers() will send ARP and NA to notify peers. team_mcast_rejoin() will send multicast join group message to notify peers. We should do this when enabling/changed to a new port. But it doesn't make sense to do it when a port is disabled. On the other hand, when we set mcast_rejoin_count to 2, and do a failover, team_port_disable() will increase mcast_rejoin.count_pending to 2 and then team_port_enable() will increase mcast_rejoin.count_pending to 4. We will send 4 mcast rejoin messages at latest, which will make user confused. The same with notify_peers.count. Fix it by deleting team_notify_peers() and team_mcast_rejoin() in team_port_disable(). Reported-by: Liang Li <liali@redhat.com> Fixes: fc423ff00df3a ("team: add peer notification") Fixes: 492b200efdd20 ("team: add support for sending multicast rejoins") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23	virtio-net: fail XDP set if guest csum is negotiated	Jason Wang	1	-2/+3
	We don't support partial csumed packet since its metadata will be lost or incorrect during XDP processing. So fail the XDP set if guest_csum feature is negotiated. Fixes: f600b6905015 ("virtio_net: Add XDP support") Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Pavel Popa <pashinho1990@gmail.com> Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23	virtio-net: disable guest csum during XDP set	Jason Wang	1	-6/+2
	We don't disable VIRTIO_NET_F_GUEST_CSUM if XDP was set. This means we can receive partial csumed packets with metadata kept in the vnet_hdr. This may have several side effects: - It could be overridden by header adjustment, thus is might be not correct after XDP processing. - There's no way to pass such metadata information through XDP_REDIRECT to another driver. - XDP does not support checksum offload right now. So simply disable guest csum if possible in this the case of XDP. Fixes: 3f93522ffab2d ("virtio-net: switch off offloads on demand if possible on XDP set") Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Pavel Popa <pashinho1990@gmail.com> Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23	Merge tag 'ceph-for-4.20-rc4' of https://github.com/ceph/ceph-client	Linus Torvalds	1	-3/+9
	Pullk ceph fix from Ilya Dryomov: "A messenger fix, marked for stable" * tag 'ceph-for-4.20-rc4' of https://github.com/ceph/ceph-client: libceph: fall back to sendmsg for slab pages
2018-11-23	Merge tag 'for-linus-20181123' of git://git.kernel.dk/linux-block	Linus Torvalds	1	-10/+63
	Pull block fix from Jens Axboe: "Just a single fix for this week, fixing an issue with nvme-fc" * tag 'for-linus-20181123' of git://git.kernel.dk/linux-block: nvme-fc: resolve io failures during connect
2018-11-23	net/sched: act_police: add missing spinlock initialization	Davide Caratti	1	-0/+1
	commit f2cbd4852820 ("net/sched: act_police: fix race condition on state variables") introduces a new spinlock, but forgets its initialization. Ensure that tcf_police_init() initializes 'tcfp_lock' every time a 'police' action is newly created, to avoid the following lockdep splat: INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. <...> Call Trace: dump_stack+0x85/0xcb register_lock_class+0x581/0x590 __lock_acquire+0xd4/0x1330 ? tcf_police_init+0x2fa/0x650 [act_police] ? lock_acquire+0x9e/0x1a0 lock_acquire+0x9e/0x1a0 ? tcf_police_init+0x2fa/0x650 [act_police] ? tcf_police_init+0x55a/0x650 [act_police] _raw_spin_lock_bh+0x34/0x40 ? tcf_police_init+0x2fa/0x650 [act_police] tcf_police_init+0x2fa/0x650 [act_police] tcf_action_init_1+0x384/0x4c0 tcf_action_init+0xf6/0x160 tcf_action_add+0x73/0x170 tc_ctl_action+0x122/0x160 rtnetlink_rcv_msg+0x2a4/0x490 ? netlink_deliver_tap+0x99/0x400 ? validate_linkmsg+0x370/0x370 netlink_rcv_skb+0x4d/0x130 netlink_unicast+0x196/0x230 netlink_sendmsg+0x2e5/0x3e0 sock_sendmsg+0x36/0x40 ___sys_sendmsg+0x280/0x2f0 ? _raw_spin_unlock+0x24/0x30 ? handle_pte_fault+0xafe/0xf30 ? find_held_lock+0x2d/0x90 ? syscall_trace_enter+0x1df/0x360 ? __sys_sendmsg+0x5e/0xa0 __sys_sendmsg+0x5e/0xa0 do_syscall_64+0x60/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f1841c7cf10 Code: c3 48 8b 05 82 6f 2c 00 f7 db 64 89 18 48 83 cb ff eb dd 0f 1f 80 00 00 00 00 83 3d 8d d0 2c 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ae cc 00 00 48 89 04 24 RSP: 002b:00007ffcf9df4d68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f1841c7cf10 RDX: 0000000000000000 RSI: 00007ffcf9df4dc0 RDI: 0000000000000003 RBP: 000000005bf56105 R08: 0000000000000002 R09: 00007ffcf9df8edc R10: 00007ffcf9df47e0 R11: 0000000000000246 R12: 0000000000671be0 R13: 00007ffcf9df4e84 R14: 0000000000000008 R15: 0000000000000000 Fixes: f2cbd4852820 ("net/sched: act_police: fix race condition on state variables") Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23	net: don't keep lonely packets forever in the gro hash	Paolo Abeni	1	-2/+5
	Eric noted that with UDP GRO and NAPI timeout, we could keep a single UDP packet inside the GRO hash forever, if the related NAPI instance calls napi_gro_complete() at an higher frequency than the NAPI timeout. Willem noted that even TCP packets could be trapped there, till the next retransmission. This patch tries to address the issue, flushing the old packets - those with a NAPI_GRO_CB age before the current jiffy - before scheduling the NAPI timeout. The rationale is that such a timeout should be well below a jiffy and we are not flushing packets eligible for sane GRO. v1 -> v2: - clarified the commit message and comment RFC -> v1: - added 'Fixes tags', cleaned-up the wording. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Fixes: 3b47d30396ba ("net: gro: add a per device gro flush timer") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23	net/ipv6: re-do dad when interface has IFF_NOARP flag change	Hangbin Liu	1	-6/+13
	When we add a new IPv6 address, we should also join corresponding solicited-node multicast address, unless the interface has IFF_NOARP flag, as function addrconf_join_solict() did. But if we remove IFF_NOARP flag later, we do not do dad and add the mcast address. So we will drop corresponding neighbour discovery message that came from other nodes. A typical example is after creating a ipvlan with mode l3, setting up an ipv6 address and changing the mode to l2. Then we will not be able to ping this address as the interface doesn't join related solicited-node mcast address. Fix it by re-doing dad when interface changed IFF_NOARP flag. Then we will add corresponding mcast group and check if there is a duplicate address on the network. Reported-by: Jianlin Shi <jishi@redhat.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23	Merge tag 'iommu-fixes-v4.20-rc3' of ↵	Linus Torvalds	4	-3/+7
	git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fixes from Joerg Roedel: - Two fixes for the Intel VT-d driver to fix a NULL-ptr dereference and an unbalance in an allocate/free path (allocated with memremap, freed with iounmap) - Fix for a crash in the Renesas IOMMU driver - Fix for the Advanced Virtual Interrupt Controler (AVIC) code in the AMD IOMMU driver * tag 'iommu-fixes-v4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/vt-d: Use memunmap to free memremap amd/iommu: Fix Guest Virtual APIC Log Tail Address Register iommu/ipmmu-vmsa: Fix crash on early domain free iommu/vt-d: Fix NULL pointer dereference in prq_event_thread()
2018-11-23	packet: copy user buffers before orphan or clone	Willem de Bruijn	2	-3/+19
	tpacket_snd sends packets with user pages linked into skb frags. It notifies that pages can be reused when the skb is released by setting skb->destructor to tpacket_destruct_skb. This can cause data corruption if the skb is orphaned (e.g., on transmit through veth) or cloned (e.g., on mirror to another psock). Create a kernel-private copy of data in these cases, same as tun/tap zerocopy transmission. Reuse that infrastructure: mark the skb as SKBTX_ZEROCOPY_FRAG, which will trigger copy in skb_orphan_frags(_rx). Unlike other zerocopy packets, do not set shinfo destructor_arg to struct ubuf_info. tpacket_destruct_skb already uses that ptr to notify when the original skb is released and a timestamp is recorded. Do not change this timestamp behavior. The ubuf_info->callback is not needed anyway, as no zerocopy notification is expected. Mark destructor_arg as not-a-uarg by setting the lower bit to 1. The resulting value is not a valid ubuf_info pointer, nor a valid tpacket_snd frame address. Add skb_zcopy_.._nouarg helpers for this. The fix relies on features introduced in commit 52267790ef52 ("sock: add MSG_ZEROCOPY"), so can be backported as is only to 4.14. Tested with from `./in_netns.sh ./txring_overwrite` from http://github.com/wdebruij/kerneltools/tests Fixes: 69e3c75f4d54 ("net: TX_RING and packet mmap") Reported-by: Anand H. Krishnan <anandhkrishnan@gmail.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-23	Merge tag 'acpi-4.20-rc4' of ↵	Linus Torvalds	1	-0/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Prevent the ACPI core from registering a platform device for the SMB0001 HID to avoid IRQ allocation issues (Hans de Goede)" * tag 'acpi-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / platform: Add SMB0001 HID to forbidden_id_list
2018-11-23	Merge tag 'pm-4.20-rc4' of ↵	Linus Torvalds	10	-22/+42
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix two issues in the Operating Performance Points (OPP) framework, one cpufreq driver issue, one problem related to the tasks freezer and a few build-related issues in the cpupower utility. Specifics: - Fix tasks freezer deadlock in de_thread() that occurs if one of its sub-threads has been frozen already (Chanho Min). - Avoid registering a platform device by the ti-cpufreq driver on platforms that cannot use it (Dave Gerlach). - Fix a mistake in the ti-opp-supply operating performance points (OPP) driver that caused an incorrect reference voltage to be used and make it adjust the minimum voltage dynamically to avoid hangs or crashes in some cases (Keerthy). - Fix issues related to compiler flags in the cpupower utility and correct a linking problem in it by renaming a file with a duplicate name (Jiri Olsa, Konstantin Khlebnikov)" * tag 'pm-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: exec: make de_thread() freezable cpufreq: ti-cpufreq: Only register platform_device when supported opp: ti-opp-supply: Correct the supply in _get_optimal_vdd_voltage call opp: ti-opp-supply: Dynamically update u_volt_min tools cpupower: Override CFLAGS assignments tools cpupower debug: Allow to use outside build flags tools/power/cpupower: fix compilation with STATIC=true
2018-11-23	arm64: cpufeature: Fix mismerge of CONFIG_ARM64_SSBD block	Will Deacon	1	-1/+1
	When merging support for SSBD and the CRC32 instructions, the conflict resolution for the new capability entries in arm64_features[] inadvertedly predicated the availability of the CRC32 instructions on CONFIG_ARM64_SSBD, despite the functionality being entirely unrelated. Move the #ifdef CONFIG_ARM64_SSBD down so that it only covers the SSBD capability. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-11-23	Merge tag 'gpio-v4.20-2' of ↵	Linus Torvalds	4	-19/+30
	git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "Minor stuff except the IDA leak which was kind of important to fix. Also new maintainers, yay. - Do not lose an IDA on the gpiochip register errorpath. - Fix the PXA non-pincontrol GPIO-using platforms. - Fix the direction on the mockup GPIO driver. - Add some MAINTAINERS stuff: Bartosz stepped up as GPIO co-maintainer, and Andy established an Intel git tree" * tag 'gpio-v4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: MAINTAINERS: Do maintain Intel GPIO drivers via separate tree gpio: mockup: fix indicated direction gpio: pxa: fix legacy non pinctrl aware builds again gpio: don't free unallocated ida on gpiochip_add_data_with_key() error path MAINTAINERS: add myself as co-maintainer of gpiolib
2018-11-23	Merge tag 'mmc-v4.20-rc2' of ↵	Linus Torvalds	1	-3/+83
	git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: "MMC host: - sdhci-pci: Fixup card detect lookup - sdhci-pci: Workaround GLK firmware bug for tuning" * tag 'mmc-v4.20-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci-pci: Workaround GLK firmware failing to restore the tuning value mmc: sdhci-pci: Try "cd" for card-detect lookup before using NULL
2018-11-23	Merge tag 'drm-fixes-2018-11-23' of git://anongit.freedesktop.org/drm/drm	Linus Torvalds	24	-63/+273
	Pull drm fixes from Dave Airlie: "Regular drm fixes: amdgpu: - Vega20 fixes - firmware loading fix - panel display fix - override fix i915: - Sandybridge lockup fix - fastboot DSI panel fix - GPU hang on Broxton - GPU reloc fixes on pineview/bearlake ast: - screen blurring fix - cursor appearance fix udmabuf: - mmap fix vc4: - NULL deref fix - async cursor update fix All seems pretty normal at this stage" * tag 'drm-fixes-2018-11-23' of git://anongit.freedesktop.org/drm/drm: drm/ast: fixed cursor may disappear sometimes drm/ast: change resolution may cause screen blurred drm/i915: Add rotation readout for plane initial config drm/i915: Force a LUT update in intel_initial_commit() drm/fb-helper: Blacklist writeback when adding connectors to fbdev drm/i915: Write GPU relocs harder with gen3 drm/amdgpu: Enable HDP memory light sleep drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture drm/amd/pp: handle negative values when reading OD drm/amdgpu: Add missing firmware entry for HAINAN drm/amd/powerplay: disable Vega20 DS related features drm/amdgpu: Fix oops when pp_funcs->switch_power_profile is unset drm/i915: Disable LP3 watermarks on all SNB machines drm/ast: Remove existing framebuffers before loading driver udmabuf: set read/write flag when exporting drm/amd/display: Support amdgpu "max bpc" connector property (v2) drm/amdgpu: Add amdgpu "max bpc" connector property (v2) drm/vc4: Set ->legacy_cursor_update to false when doing non-async updates drm/vc4: Fix NULL pointer dereference in the async update path
2018-11-23	arm64: sysreg: fix sparse warnings	Sergey Matyukevich	1	-2/+2
	Specify correct type for the constants to avoid the following sparse complaints: ./arch/arm64/include/asm/sysreg.h:471:42: warning: constant 0xffffffffffffffff is so big it is unsigned long ./arch/arm64/include/asm/sysreg.h:512:42: warning: constant 0xffffffffffffffff is so big it is unsigned long Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-11-23	Merge branches 'pm-cpufreq' and 'pm-sleep'	Rafael J. Wysocki	2	-7/+24
	* pm-cpufreq: cpufreq: ti-cpufreq: Only register platform_device when supported * pm-sleep: exec: make de_thread() freezable
2018-11-23	Merge branches 'pm-opp' and 'pm-tools'	Rafael J. Wysocki	7	-14/+14
	* pm-opp: opp: ti-opp-supply: Correct the supply in _get_optimal_vdd_voltage call opp: ti-opp-supply: Dynamically update u_volt_min * pm-tools: tools cpupower: Override CFLAGS assignments tools cpupower debug: Allow to use outside build flags tools/power/cpupower: fix compilation with STATIC=true
2018-11-23	Merge tag 'drm-intel-fixes-2018-11-22' of ↵	Dave Airlie	7	-4/+112
	git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - Fix for fastboot DSI panel boot time flicker regression, also fixes Bugzilla #108225 - Fix Bugzilla #101269 to avoid GPU hangs on Sandybridge machines - Avoid GPU hang on error capture on Broxton with Vt-d enabled - Avoid missing GPU relocations on Pineview and Bearlake (Gen3) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181122120555.GA18282@jlahtine-desk.ger.corp.intel.com
2018-11-22	Merge branch 'ibmvnic-Fix-queue-and-buffer-accounting-errors'	David S. Miller	1	-3/+10
	Thomas Falcon says: ==================== ibmvnic: Fix queue and buffer accounting errors This series includes two small fixes. The first resolves a typo bug in the code to clean up unused RX buffers during device queue removal. The second ensures that device queue memory is updated to reflect new supported queue ring sizes after migration to other backing hardware. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-22	ibmvnic: Update driver queues after change in ring size support	Thomas Falcon	1	-1/+8
	During device reset, queue memory is not being updated to accommodate changes in ring buffer sizes supported by backing hardware. Track any differences in ring buffer sizes following the reset and update queue memory when possible. Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-22	ibmvnic: Fix RX queue buffer cleanup	Thomas Falcon	1	-2/+2
	The wrong index is used when cleaning up RX buffer objects during release of RX queues. Update to use the correct index counter. Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-22	net: thunderx: set xdp_prog to NULL if bpf_prog_add fails	Lorenzo Bianconi	1	-2/+7
	Set xdp_prog pointer to NULL if bpf_prog_add fails since that routine reports the error code instead of NULL in case of failure and xdp_prog pointer value is used in the driver to verify if XDP is currently enabled. Moreover report the error code to userspace if nicvf_xdp_setup fails Fixes: 05c773f52b96 ("net: thunderx: Add basic XDP support") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-22	net/dim: Update DIM start sample after each DIM iteration	Tal Gilboa	1	-0/+2
	On every iteration of net_dim, the algorithm may choose to check for the system state by comparing current data sample with previous data sample. After each of these comparison, regardless of the action taken, the sample used as baseline is needed to be updated. This patch fixes a bug that causes DIM to take wrong decisions, due to never updating the baseline sample for comparison between iterations. This way, DIM always compares current sample with zeros. Although this is a functional fix, it also improves and stabilizes performance as the algorithm works properly now. Performance: Tested single UDP TX stream with pktgen: samples/pktgen/pktgen_sample03_burst_single_flow.sh -i p4p2 -d 1.1.1.1 -m 24:8a:07:88:26:8b -f 3 -b 128 ConnectX-5 100GbE packet rate improved from 15-19Mpps to 19-20Mpps. Also, toggling between profiles is less frequent with the fix. Fixes: 8115b750dbcb ("net/dim: use struct net_dim_sample as arg to net_dim") Signed-off-by: Tal Gilboa <talgi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-22	flexfiles: use per-mirror specified stateid for IO	Tigran Mkrtchyan	3	-12/+32
	rfc8435 says: For tight coupling, ffds_stateid provides the stateid to be used by the client to access the file. However current implementation replaces per-mirror provided stateid with by open or lock stateid. Ensure that per-mirror stateid is used by ff_layout_write_prepare_v4 and nfs4_ff_layout_prepare_ds. Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-11-22	NFSv4.2 copy do not allocate memory under the lock	Olga Kornievskaia	2	-20/+21
	Bruce pointed out that we shouldn't allocate memory while holding a lock in the nfs4_callback_offload() and handle_async_copy() that deal with a racing CB_OFFLOAD and reply to COPY case. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-11-22	Merge tag 'sound-4.20-rc4' of ↵	Linus Torvalds	4	-8/+10
	git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "The only significant change is for OSS PCM emulation to convert with kvcalloc() to address both performance and security issues. It's a pretty straightforward change, which should be safe. The rest are, as usual, device-specific small fixes for HD-audio" * tag 'sound-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/ca0132 - fix AE-5 pincfg ALSA: hda/ca0132 - Add new ZxR quirk ALSA: hda/ca0132 - Call pci_iounmap() instead of iounmap() ALSA: hda/realtek - Add quirk entry for HP Pavilion 15 ALSA: oss: Use kvzalloc() for local buffer allocations
2018-11-22	Merge tag 'char-misc-4.20-rc4' of ↵	Linus Torvalds	12	-32/+55
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are some small char/misc driver fixes for issues that have been reported. Nothing major, highlights include: - gnss sync write fixes - uio oops fix - nvmem fixes - other minor fixes and some documentation/maintainers updates Full details are in the shortlog. All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: Documentation/security-bugs: Postpone fix publication in exceptional cases MAINTAINERS: Add Sasha as a stable branch maintainer gnss: sirf: fix synchronous write timeout gnss: serial: fix synchronous write timeout uio: Fix an Oops on load test_firmware: fix error return getting clobbered nvmem: core: fix regression in of_nvmem_cell_get() misc: atmel-ssc: Fix section annotation on atmel_ssc_get_driver_data drivers/misc/sgi-gru: fix Spectre v1 vulnerability Drivers: hv: kvp: Fix the recent regression caused by incorrect clean-up slimbus: ngd: remove unnecessary check
2018-11-22	Merge tag 'usb-4.20-rc4' of ↵	Linus Torvalds	20	-55/+167
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are a number of small USB fixes for 4.20-rc4. There's the usual xhci and dwc2/3 fixes as well as a few minor other issues resolved for problems that have been reported. Full details are in the shortlog. All have been in linux-next for a while with no reported issues" * tag 'usb-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: cdc-acm: add entry for Hiro (Conexant) modem usb: xhci: Prevent bus suspend if a port connect change or polling state is detected usb: core: Fix hub port connection events lost usb: dwc3: gadget: fix ISOC TRB type on unaligned transfers Revert "usb: gadget: ffs: Fix BUG when userland exits with submitted AIO transfers" usb: dwc2: pci: Fix an error code in probe usb: dwc3: Fix NULL pointer exception in dwc3_pci_remove() xhci: Add quirk to workaround the errata seen on Cavium Thunder-X2 Soc usb: xhci: fix timeout for transition from RExit to U0 usb: xhci: fix uninitialized completion when USB3 port got wrong status xhci: Add check for invalid byte size error when UAS devices are connected. xhci: handle port status events for removed USB3 hcd xhci: Fix leaking USB3 shared_hcd at xhci removal USB: misc: appledisplay: add 20" Apple Cinema Display USB: quirks: Add no-lpm quirk for Raydium touchscreens usb: quirks: Add delay-init quirk for Corsair K70 LUX RGB USB: Wait for extra delay time after USB_PORT_FEAT_RESET for quirky hub usb: dwc3: gadget: Properly check last unaligned/zero chain TRB usb: dwc3: core: Clean up ULPI device
2018-11-22	Merge tag 'mtd/fixes-for-4.20-rc4' of git://git.infradead.org/linux-mtd	Linus Torvalds	4	-55/+137
	Pull mtd fixes from Boris Brezillon: "SPI NOR fixes: - Various fixes related to the SFDP parsing code merged in 4.20 - Fix for a page fault in the cadence-qspi NAND fixes: - Fix a macro name conflict between the QCOM NAND controller driver and the RISC-V asm headers - Fix of-node handling in the atmel driver" * tag 'mtd/fixes-for-4.20-rc4' of git://git.infradead.org/linux-mtd: mtd: spi-nor: fix selection of uniform erase type in flexible conf mtd: spi-nor: Fix Cadence QSPI page fault kernel panic mtd: rawnand: qcom: Namespace prefix some commands mtd: rawnand: atmel: fix OF child-node lookup mtd: spi_nor: pass DMA-able buffer to spi_nor_read_raw() mtd: spi-nor: don't overwrite errno in spi_nor_get_map_in_use() mtd: spi-nor: fix iteration over smpt array mtd: spi-nor: don't drop sfdp data if optional parsers fail
2018-11-22	Merge tag 'scsi-fixes' of ↵	Linus Torvalds	4	-2/+25
	git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two small fixes. The qla2xxx is a regression from 4.18 and the ufs one is a device enablement fix" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: Fix hynix ufs bug with quirk on hi36xx SoC scsi: qla2xxx: Timeouts occur on surprise removal of QLogic adapter
2018-11-22	iommu/vt-d: Use memunmap to free memremap	Pan Bian	1	-1/+1
	memunmap() should be used to free the return of memremap(), not iounmap(). Fixes: dfddb969edf0 ('iommu/vt-d: Switch from ioremap_cache to memremap') Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-11-22	Revert "Input: Add the `REL_WHEEL_HI_RES` event code"	Benjamin Tissoires	2	-20/+1
	This reverts commit aaf9978c3c0291ef3beaa97610bc9c3084656a85. Quoting Peter: There is a HID feature report called "Resolution Multiplier" Described in the "Enhanced Wheel Support in Windows" doc and the "USB HID Usage Tables" page 30. http://download.microsoft.com/download/b/d/1/bd1f7ef4-7d72-419e-bc5c-9f79ad7bb66e/wheel.docx https://www.usb.org/sites/default/files/documents/hut1_12v2.pdf This was new for Windows Vista, so we're only a decade behind here. I only accidentally found this a few days ago while debugging a stuck button on a Microsoft mouse. The docs above describe it like this: a wheel control by default sends value 1 per notch. If the resolution multiplier is active, the wheel is expected to send a value of $multiplier per notch (e.g. MS Sculpt mouse) or just send events more often, i.e. for less physical motion (e.g. MS Comfort mouse). For the latter, you need the right HW of course. The Sculpt mouse has tactile wheel clicks, so nothing really changes. The Comfort mouse has continuous motion with no tactile clicks. Similar to the free-wheeling Logitech mice but without any inertia. Note that the doc also says that Vista and onwards always enable this feature where available. An example HID definition looks like this: Usage Page Generic Desktop (0x01) Usage Resolution Multiplier (0x48) Logical Minimum 0 Logical Maximum 1 Physical Minimum 1 Physical Maximum 16 Report Size 2 # in bits Report Count 1 Feature (Data, Var, Abs) So the actual bits have values 0 or 1 and that reflects real values 1 or 16. We've only seen single-bits so far, so there's low-res and hi-res, but nothing in between. The multiplier is available for HID usages "Wheel" and "AC Pan" (horiz wheel). Microsoft suggests that > Vendors should ship their devices with smooth scrolling disabled and allow > Windows to enable it. This ensures that the device works like a regular HID > device on legacy operating systems that do not support smooth scrolling. (see the wheel doc linked above) The mice that we tested so far do reset on unplug. Device Support looks to be all (?) Microsoft mice but nothing else Not supported: - Logitech G500s, G303 - Roccat Kone XTD - all the cheap Lenovo, HP, Dell, Logitech USB mice that come with a workstation that I could find don't have it. - Etekcity something something - Razer Imperator Supported: - Microsoft Comfort Optical Mouse 3000 - yes, physical: 1:4 - Microsoft Sculpt Ergonomic Mouse - yes, physical: 1:12 - Microsoft Surface mouse - yes, physical: 1:4 So again, I think this is really just available on Microsoft mice, but probably all decent MS mice released over the last decade. Looking at the hardware itself: - no noticeable notches in the weel - low-res: 18 events per 360deg rotation (click angle 20 deg) - high-res: 72 events per 360deg → matches multiplier of 4 - I can feel the notches during wheel turns - low-res: 24 events per 360 deg rotation (click angle 15 deg) - horiz wheel is tilt-based, continuous output value 1 - high-res: 24 events per 360deg with value 12 → matches multiplier of 12 - horiz wheel output rate doubles/triples?, values is 3 - It's a touch strip, not a wheel so no notches - high-res: events have value 4 instead of 1 a bit strange given that it doesn't actually have notches. Ok, why is this an issue for the current API? First, because the logitech multiplier used in Harry's patches looks suspiciously like the Resolution Multiplier so I think we should assume it's the same thing. Nestor, can you shed some light on that? - `REL_WHEEL` is defined as the number of notches, emulated where needed. - `REL_WHEEL_HI_RES` is the movement of the user's finger in microns. - `WM_MOUSEWHEEL` (Windows) is is a multiple of 120, defined as "the threshold for action to be taken and one such action" https://docs.microsoft.com/en-us/windows/desktop/inputdev/wm-mousewheel If the multiplier is set to M, this means we need an accumulated value of M until we can claim there was a wheel click. So after enabling the multiplier and setting it to the maximum (like Windows): - M units are 15deg rotation → 1 unit is 2620/M micron (see below). This is the `REL_WHEEL_HI_RES` value. - wheel diameter 20mm: 15 deg rotation is 2.62mm, 2620 micron (pi * 20mm / (360deg/15deg)) - For every M units accumulated, send one `REL_WHEEL` event The problem here is that we've now hardcoded 20mm/15 deg into the kernel and we have no way of getting the size of the wheel or the click angle into the kernel. In userspace we now have to undo the kernel's calculation. If our click angle is e.g. 20 degree we have to undo the (lossy) calculation from the kernel and calculate the correct angle instead. This also means the 15 is a hardcoded option forever and cannot be changed. In hid-logitech-hidpp.c, the microns per unit is hardcoded per device. Harry, did you measure those by hand? We'd need to update the kernel for every device and there are 10 years worth of devices from MS alone. The multiplier default is 8 which is in the right ballpark, so I'm pretty sure this is the same as the Resolution Multiplier, just in HID++ lingo. And given that the 120 magic factor is what Windows uses in the end, I can't imagine Logitech rolling their own thing here. Nestor? And we're already fairly inaccurate with the microns anyway. The MX Anywhere 2S has a click angle of 20 degrees (18 stops) and a 17mm wheel, so a wheel notch is approximately 2.67mm, one event at multiplier 8 (1/8 of a notch) would be 334 micron. That's only 80% of the fallback value of 406 in the kernel. Multiplier 6 gives us 445micron (10% off). I'm assuming multiplier 7 doesn't exist because it's not a factor of 120. Summary: Best option may be to simply do what Windows is doing, all the HW manufacturers have to use that approach after all. Switch `REL_WHEEL_HI_RES` to report in fractions of 120, with 120 being one notch and divide that by the multiplier for the actual events. So e.g. the Logitech multiplier 8 would send value 15 for each event in hi-res mode. This can be converted in userspace to whatever userspace needs (combined with a hwdb there that tells you wheel size/click angle/...). Conflicts: include/uapi/linux/input-event-codes.h -> I kept the new reserved event in the code, so I had to adapt the revert slightly Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Acked-by: Harry Cutts <hcutts@chromium.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Jiri Kosina <jkosina@suse.cz>
2018-11-22	Revert "HID: input: Create a utility class for counting scroll events"	Benjamin Tissoires	2	-73/+0
	This reverts commit 1ff2e1a44e02d4bdbb9be67c7d9acc240a67141f. It turns out the current API is not that compatible with some Microsoft mice, so better start again from scratch. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Acked-by: Harry Cutts <hcutts@chromium.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Jiri Kosina <jkosina@suse.cz>
2018-11-22	Revert "HID: logitech: Add function to enable HID++ 1.0 "scrolling ↵	Benjamin Tissoires	1	-34/+13
	acceleration"" This reverts commit 051dc9b0579602bd63e9df74d0879b5293e71581. It turns out the current API is not that compatible with some Microsoft mice, so better start again from scratch. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Acked-by: Harry Cutts <hcutts@chromium.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Jiri Kosina <jkosina@suse.cz>
2018-11-22	Revert "HID: logitech: Enable high-resolution scrolling on Logitech mice"	Benjamin Tissoires	1	-245/+4
	This reverts commit d56ca9855bf924f3bc9807a3e42f38539df3f41f. It turns out the current API is not that compatible with some Microsoft mice, so better start again from scratch. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Acked-by: Harry Cutts <hcutts@chromium.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Jiri Kosina <jkosina@suse.cz>
2018-11-22	Revert "HID: logitech: Use LDJ_DEVICE macro for existing Logitech mice"	Benjamin Tissoires	1	-5/+10
	This reverts commit 3fe1d6bbcd16f384d2c7dab2caf8e4b2df9ea7e6. It turns out the current API is not that compatible with some Microsoft mice, so better start again from scratch. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Acked-by: Harry Cutts <hcutts@chromium.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Jiri Kosina <jkosina@suse.cz>
2018-11-22	Revert "HID: logitech: fix a used uninitialized GCC warning"	Benjamin Tissoires	1	-3/+5
	This reverts commit 5fe2ccbef9d7aecf5c4402c753444f1a12096cfd. It turns out the current API is not that compatible with some Microsoft mice, so better start again from scratch. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Acked-by: Harry Cutts <hcutts@chromium.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Jiri Kosina <jkosina@suse.cz>
2018-11-22	Revert "HID: input: simplify/fix high-res scroll event handling"	Benjamin Tissoires	1	-21/+22
	This reverts commit 044ee890286153a1aefb40cb8b6659921aecb38b. It turns out the current API is not that compatible with some Microsoft mice, so better start again from scratch. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Acked-by: Harry Cutts <hcutts@chromium.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Jiri Kosina <jkosina@suse.cz>
2018-11-22	Merge branch 'drm-fixes-4.20' of git://people.freedesktop.org/~agd5f/linux ↵	Dave Airlie	11	-56/+115
	into drm-fixes - OD fixes for powerplay - Vega20 fixes - KFD fix for Kaveri - add missing firmware declaration for hainan (SI chip) - Fix DC user experience regressions related to panels that support >8 bpc Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181121163647.2847-1-alexander.deucher@amd.com
2018-11-22	Merge tag 'drm-misc-fixes-2018-11-21' of ↵	Dave Airlie	4	-2/+23
	git://anongit.freedesktop.org/drm/drm-misc into drm-fixes - vc4: Fix NULL deref in async path (Boris) - vc4: Avoid taking async path for cursor updates when impossible (Boris) - udmabuf: Fix mmap with PROT_WRITE (Gerd) - fb-helper: Don't use writeback connectors for fbdev (Paul) Cc: Boris Brezillon <boris.brezillon@bootlin.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Signed-off-by: Dave Airlie <airlied@redhat.com> From: Sean Paul <sean@poorly.run> Link: https://patchwork.freedesktop.org/patch/msgid/20181121155248.GA241511@art_vandelay
2018-11-22	drm/ast: fixed cursor may disappear sometimes	Y.C. Chen	1	-1/+1
	Signed-off-by: Y.C. Chen <yc_chen@aspeedtech.com> Cc: <stable@vger.kernel.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-11-21	net: faraday: ftmac100: remove netif_running(netdev) check before disabling ↵	Vincent Chen	1	-4/+3
	interrupts In the original ftmac100_interrupt(), the interrupts are only disabled when the condition "netif_running(netdev)" is true. However, this condition causes kerenl hang in the following case. When the user requests to disable the network device, kernel will clear the bit __LINK_STATE_START from the dev->state and then call the driver's ndo_stop function. Network device interrupts are not blocked during this process. If an interrupt occurs between clearing __LINK_STATE_START and stopping network device, kernel cannot disable the interrupts due to the condition "netif_running(netdev)" in the ISR. Hence, kernel will hang due to the continuous interruption of the network device. In order to solve the above problem, the interrupts of the network device should always be disabled in the ISR without being restricted by the condition "netif_running(netdev)". [V2] Remove unnecessary curly braces. Signed-off-by: Vincent Chen <vincentc@andestech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-22	drm/ast: change resolution may cause screen blurred	Y.C. Chen	1	-0/+1
	The value of pitches is not correct while calling mode_set. The issue we found so far on following system: - Debian8 with XFCE Desktop - Ubuntu with KDE Desktop - SUSE15 with KDE Desktop Signed-off-by: Y.C. Chen <yc_chen@aspeedtech.com> Cc: <stable@vger.kernel.org> Tested-by: Jean Delvare <jdelvare@suse.de> Reviewed-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-11-21	Merge branch 'smc-fixes'	David S. Miller	8	-50/+120
	Ursula Braun says: ==================== net/smc: fixes 2018-11-12 here is V4 of some net/smc fixes in different areas for the net tree. v1->v2: do not define 8-byte alignment for union smcd_cdc_cursor in patch 4/5 "net/smc: atomic SMCD cursor handling" v2->v3: stay with 8-byte alignment for union smcd_cdc_cursor in patch 4/5 "net/smc: atomic SMCD cursor handling", but get rid of __packed for struct smcd_cdc_msg v3->v4: get rid of another __packed for struct smc_cdc_msg in patch 4/5 "net/smc: atomic SMCD cursor handling" ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21	net/smc: use after free fix in smc_wr_tx_put_slot()	Ursula Braun	1	-1/+3
	In smc_wr_tx_put_slot() field pend->idx is used after being cleared. That means always idx 0 is cleared in the wr_tx_mask. This results in a broken administration of available WR send payload buffers. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21	net/smc: atomic SMCD cursor handling	Ursula Braun	2	-26/+60
	Running uperf tests with SMCD on LPARs results in corrupted cursors. SMCD cursors should be treated atomically to fix cursor corruption. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21	net/smc: add SMC-D shutdown signal	Hans Wippel	4	-14/+43
	When a SMC-D link group is freed, a shutdown signal should be sent to the peer to indicate that the link group is invalid. This patch adds the shutdown signal to the SMC code. Signed-off-by: Hans Wippel <hwippel@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21	net/smc: use queue pair number when matching link group	Karsten Graul	3	-9/+12
	When searching for an existing link group the queue pair number is also to be taken into consideration. When the SMC server sends a new number in a CLC packet (keeping all other values equal) then a new link group is to be created on the SMC client side. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21	net/smc: abort CLC connection in smc_release	Hans Wippel	1	-0/+2
	In case of a non-blocking SMC socket, the initial CLC handshake is performed over a blocking TCP connection in a worker. If the SMC socket is released, smc_release has to wait for the blocking CLC socket operations (e.g., kernel_connect) inside the worker. This patch aborts a CLC connection when the respective non-blocking SMC socket is released to avoid waiting on socket operations or timeouts. Signed-off-by: Hans Wippel <hwippel@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21	Merge tag 'wireless-drivers-for-davem-2018-11-20' of ↵	David S. Miller	16	-38/+82
	git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for 4.20 First set of fixes for 4.20, this time we have quite a few them but all very small. ath9k * fix a locking regression found by a static checker wlcore * fix a crash which was a regression with wakeirq handling brcm80211 * yet another fix for 160 MHz channel handling mt76 * fix a longstaning build problem when CONFIG_LEDS_CLASS is disabled * don't use uninitialised mutex iwlwifi * do note that the iwlwifi merge tag (commit 4ec321c14693) seems to contain wrong list of changes so ignore that * fix ACPI data handling, a memory leak and other smaller fixes ath10k * fix a crash during suspend which was a recent regression ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21	tcp: defer SACK compression after DupThresh	Eric Dumazet	4	-6/+17
	Jean-Louis reported a TCP regression and bisected to recent SACK compression. After a loss episode (receiver not able to keep up and dropping packets because its backlog is full), linux TCP stack is sending a single SACK (DUPACK). Sender waits a full RTO timer before recovering losses. While RFC 6675 says in section 5, "Algorithm Details", (2) If DupAcks < DupThresh but IsLost (HighACK + 1) returns true -- indicating at least three segments have arrived above the current cumulative acknowledgment point, which is taken to indicate loss -- go to step (4). ... (4) Invoke fast retransmit and enter loss recovery as follows: there are old TCP stacks not implementing this strategy, and still counting the dupacks before starting fast retransmit. While these stacks probably perform poorly when receivers implement LRO/GRO, we should be a little more gentle to them. This patch makes sure we do not enable SACK compression unless 3 dupacks have been sent since last rcv_nxt update. Ideally we should even rearm the timer to send one or two more DUPACK if no more packets are coming, but that will be work aiming for linux-4.21. Many thanks to Jean-Louis for bisecting the issue, providing packet captures and testing this patch. Fixes: 5d9f4262b7ea ("tcp: add SACK compression") Reported-by: Jean-Louis Dupond <jean-louis@dupond.be> Tested-by: Jean-Louis Dupond <jean-louis@dupond.be> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21	net: skb_scrub_packet(): Scrub offload_fwd_mark	Petr Machata	1	-0/+5
	When a packet is trapped and the corresponding SKB marked as already-forwarded, it retains this marking even after it is forwarded across veth links into another bridge. There, since it ingresses the bridge over veth, which doesn't have offload_fwd_mark, it triggers a warning in nbp_switchdev_frame_mark(). Then nbp_switchdev_allowed_egress() decides not to allow egress from this bridge through another veth, because the SKB is already marked, and the mark (of 0) of course matches. Thus the packet is incorrectly blocked. Solve by resetting offload_fwd_mark() in skb_scrub_packet(). That function is called from tunnels and also from veth, and thus catches the cases where traffic is forwarded between bridges and transformed in a way that invalidates the marking. Fixes: 6bc506b4fb06 ("bridge: switchdev: Add forward mark support for stacked devices") Fixes: abf4bb6b63d0 ("skbuff: Add the offload_mr_fwd_mark field") Signed-off-by: Petr Machata <petrm@mellanox.com> Suggested-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21	Merge tag 'riscv-for-linus-4.20-rc4' of ↵	Linus Torvalds	11	-17/+150
	git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux Pull RISC-V fixes from Palmer Dabbelt: "This week is a bit bigger than I expected. That's my fault, as I missed a few patches while I was at Plumbers last week. We have: - A fix to a quite embarassing issue where raw_copy_to_user() was implemented with asm_copy_from_user() (and vice versa). - Improvements to our makefile to allow flat binaries to be generated. - A build fix that predeclares "struct module" at the top of <asm/module.h>, which triggers warnings later in that header. - The addition of our own <uapi/asm/unistd> header, which is necessary to align our stat ABI on 32-bit systems. - A fix to avoid printing a warning when the S or U bits are set in print_isa(). I already have one patch in the queue for next week" * tag 'riscv-for-linus-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux: RISC-V: recognize S/U mode bits in print_isa riscv: add asm/unistd.h UAPI header riscv: fix warning in arch/riscv/include/asm/module.h RISC-V: Build flat and compressed kernel images RISC-V: Fix raw_copy_{to,from}_user()
2018-11-21	iomap: readpages doesn't zero page tail beyond EOF	Dave Chinner	1	-3/+8
	When we read the EOF page of the file via readpages, we need to zero the region beyond EOF that we either do not read or should not contain data so that mmap does not expose stale data to user applications. However, iomap_adjust_read_range() fails to detect EOF correctly, and so fsx on 1k block size filesystems fails very quickly with mapreads exposing data beyond EOF. There are two problems here. Firstly, when calculating the end block of the EOF byte, we have to round the size by one to avoid a block aligned EOF from reporting a block too large. i.e. a size of 1024 bytes is 1 block, which in index terms is block 0. Therefore we have to calculate the end block from (isize - 1), not isize. The second bug is determining if the current page spans EOF, and so whether we need split it into two half, one for the IO, and the other for zeroing. Unfortunately, the code that checks whether we should split the block doesn't actually check if we span EOF, it just checks if the read spans the /offset in the page/ that EOF sits on. So it splits every read into two if EOF is not page aligned, regardless of whether we are reading the EOF block or not. Hence we need to restrict the "does the read span EOF" check to just the page that spans EOF, not every page we read. This patch results in correct EOF detection through readpages: xfs_vm_readpages: dev 259:0 ino 0x43 nr_pages 24 xfs_iomap_found: dev 259:0 ino 0x43 size 0x66c00 offset 0x4f000 count 98304 type hole startoff 0x13c startblock 1368 blockcount 0x4 iomap_readpage_actor: orig pos 323584 pos 323584, length 4096, poff 0 plen 4096, isize 420864 xfs_iomap_found: dev 259:0 ino 0x43 size 0x66c00 offset 0x50000 count 94208 type hole startoff 0x140 startblock 1497 blockcount 0x5c iomap_readpage_actor: orig pos 327680 pos 327680, length 94208, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 331776 pos 331776, length 90112, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 335872 pos 335872, length 86016, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 339968 pos 339968, length 81920, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 344064 pos 344064, length 77824, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 348160 pos 348160, length 73728, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 352256 pos 352256, length 69632, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 356352 pos 356352, length 65536, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 360448 pos 360448, length 61440, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 364544 pos 364544, length 57344, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 368640 pos 368640, length 53248, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 372736 pos 372736, length 49152, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 376832 pos 376832, length 45056, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 380928 pos 380928, length 40960, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 385024 pos 385024, length 36864, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 389120 pos 389120, length 32768, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 393216 pos 393216, length 28672, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 397312 pos 397312, length 24576, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 401408 pos 401408, length 20480, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 405504 pos 405504, length 16384, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 409600 pos 409600, length 12288, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 413696 pos 413696, length 8192, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 417792 pos 417792, length 4096, poff 0 plen 3072, isize 420864 iomap_readpage_actor: orig pos 420864 pos 420864, length 1024, poff 3072 plen 1024, isize 420864 As you can see, it now does full page reads until the last one which is split correctly at the block aligned EOF, reading 3072 bytes and zeroing the last 1024 bytes. The original version of the patch got this right, but it got another case wrong. The EOF detection crossing really needs to the the original length as plen, while it starts at the end of the block, will be shortened as up-to-date blocks are found on the page. This means "orig_pos + plen" no longer points to the end of the page, and so will not correctly detect EOF crossing. Hence we have to use the length passed in to detect this partial page case: xfs_filemap_fault: dev 259:1 ino 0x43 write_fault 0 xfs_vm_readpage: dev 259:1 ino 0x43 nr_pages 1 xfs_iomap_found: dev 259:1 ino 0x43 size 0x2cc00 offset 0x2c000 count 4096 type hole startoff 0xb0 startblock 282 blockcount 0x4 iomap_readpage_actor: orig pos 180224 pos 181248, length 4096, poff 1024 plen 2048, isize 183296 xfs_iomap_found: dev 259:1 ino 0x43 size 0x2cc00 offset 0x2cc00 count 1024 type hole startoff 0xb3 startblock 285 blockcount 0x1 iomap_readpage_actor: orig pos 183296 pos 183296, length 1024, poff 3072 plen 1024, isize 183296 Heere we see a trace where the first block on the EOF page is up to date, hence poff = 1024 bytes. The offset into the page of EOF is 3072, so the range we want to read is 1024 - 3071, and the range we want to zero is 3072 - 4095. You can see this is split correctly now. This fixes the stale data beyond EOF problem that fsx quickly uncovers on 1k block size filesystems. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-11-21	vfs: vfs_dedupe_file_range() doesn't return EOPNOTSUPP	Dave Chinner	1	-8/+7
	It returns EINVAL when the operation is not supported by the filesystem. Fix it to return EOPNOTSUPP to be consistent with the man page and clone_file_range(). Clean up the inconsistent error return handling while I'm there. (I know, lipstick on a pig, but every little bit helps...) Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-11-21	iomap: dio data corruption and spurious errors when pipes fill	Dave Chinner	1	-3/+19
	When doing direct IO to a pipe for do_splice_direct(), then pipe is trivial to fill up and overflow as it can only hold 16 pages. At this point bio_iov_iter_get_pages() then returns -EFAULT, and we abort the IO submission process. Unfortunately, iomap_dio_rw() propagates the error back up the stack. The error is converted from the EFAULT to EAGAIN in generic_file_splice_read() to tell the splice layers that the pipe is full. do_splice_direct() completely fails to handle EAGAIN errors (it aborts on error) and returns EAGAIN to the caller. copy_file_write() then completely fails to handle EAGAIN as well, and so returns EAGAIN to userspace, having failed to copy the data it was asked to. Avoid this whole steaming pile of fail by having iomap_dio_rw() silently swallow EFAULT errors and so do short reads. To make matters worse, iomap_dio_actor() has a stale data exposure bug bio_iov_iter_get_pages() fails - it does not zero the tail block that it may have been left uncovered by partial IO. Fix the error handling case to drop to the sub-block zeroing rather than immmediately returning the -EFAULT error. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-11-21	iomap: sub-block dio needs to zeroout beyond EOF	Dave Chinner	1	-1/+8
	If we are doing sub-block dio that extends EOF, we need to zero the unused tail of the block to initialise the data in it it. If we do not zero the tail of the block, then an immediate mmap read of the EOF block will expose stale data beyond EOF to userspace. Found with fsx running sub-block DIO sizes vs MAPREAD/MAPWRITE operations. Fix this by detecting if the end of the DIO write is beyond EOF and zeroing the tail if necessary. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-11-21	iomap: FUA is wrong for DIO O_DSYNC writes into unwritten extents	Dave Chinner	1	-5/+6
	When we write into an unwritten extent via direct IO, we dirty metadata on IO completion to convert the unwritten extent to written. However, when we do the FUA optimisation checks, the inode may be clean and so we issue a FUA write into the unwritten extent. This means we then bypass the generic_write_sync() call after unwritten extent conversion has ben done and we don't force the modified metadata to stable storage. This violates O_DSYNC semantics. The window of exposure is a single IO, as the next DIO write will see the inode has dirty metadata and hence will not use the FUA optimisation. Calling generic_write_sync() after completion of the second IO will also sync the first write and it's metadata. Fix this by avoiding the FUA optimisation when writing to unwritten extents. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-11-21	xfs: delalloc -> unwritten COW fork allocation can go wrong	Dave Chinner	1	-1/+4
	Long saga. There have been days spent following this through dead end after dead end in multi-GB event traces. This morning, after writing a trace-cmd wrapper that enabled me to be more selective about XFS trace points, I discovered that I could get just enough essential tracepoints enabled that there was a 50:50 chance the fsx config would fail at ~115k ops. If it didn't fail at op 115547, I stopped fsx at op 115548 anyway. That gave me two traces - one where the problem manifested, and one where it didn't. After refining the traces to have the necessary information, I found that in the failing case there was a real extent in the COW fork compared to an unwritten extent in the working case. Walking back through the two traces to the point where the CWO fork extents actually diverged, I found that the bad case had an extra unwritten extent in it. This is likely because the bug it led me to had triggered multiple times in those 115k ops, leaving stray COW extents around. What I saw was a COW delalloc conversion to an unwritten extent (as they should always be through xfs_iomap_write_allocate()) resulted in a /written extent/: xfs_writepage: dev 259:0 ino 0x83 pgoff 0x17000 size 0x79a00 offset 0 length 0 xfs_iext_remove: dev 259:0 ino 0x83 state RC\|LF\|RF\|COW cur 0xffff888247b899c0/2 offset 32 block 152 count 20 flag 1 caller xfs_bmap_add_extent_delay_real xfs_bmap_pre_update: dev 259:0 ino 0x83 state RC\|LF\|RF\|COW cur 0xffff888247b899c0/1 offset 1 block 4503599627239429 count 31 flag 0 caller xfs_bmap_add_extent_delay_real xfs_bmap_post_update: dev 259:0 ino 0x83 state RC\|LF\|RF\|COW cur 0xffff888247b899c0/1 offset 1 block 121 count 51 flag 0 caller xfs_bmap_add_ex Basically, Cow fork before: 0 1 32 52 +H+DDDDDDDDDDDD+UUUUUUUUUUU+ PREV RIGHT COW delalloc conversion allocates: 1 32 +uuuuuuuuuuuu+ NEW And the result according to the xfs_bmap_post_update trace was: 0 1 32 52 +H+wwwwwwwwwwwwwwwwwwwwwwww+ PREV Which is clearly wrong - it should be a merged unwritten extent, not an unwritten extent. That lead me to look at the LEFT_FILLING\|RIGHT_FILLING\|RIGHT_CONTIG case in xfs_bmap_add_extent_delay_real(), and sure enough, there's the bug. It takes the old delalloc extent (PREV) and adds the length of the RIGHT extent to it, takes the start block from NEW, removes the RIGHT extent and then updates PREV with the new extent. What it fails to do is update PREV.br_state. For delalloc, this is always XFS_EXT_NORM, while in this case we are converting the delayed allocation to unwritten, so it needs to be updated to XFS_EXT_UNWRITTEN. This LF\|RF\|RC case does not do this, and so the resultant extent is always written. And that's the bug I've been chasing for a week - a bmap btree bug, not a reflink/dedupe/copy_file_range bug, but a BMBT bug introduced with the recent in core extent tree scalability enhancements. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-11-21	xfs: flush removing page cache in xfs_reflink_remap_prep	Dave Chinner	3	-5/+17
	On a sub-page block size filesystem, fsx is failing with a data corruption after a series of operations involving copying a file with the destination offset beyond EOF of the destination of the file: 8093(157 mod 256): TRUNCATE DOWN from 0x7a120 to 0x50000 ******WWWW 8094(158 mod 256): INSERT 0x25000 thru 0x25fff (0x1000 bytes) 8095(159 mod 256): COPY 0x18000 thru 0x1afff (0x3000 bytes) to 0x2f400 8096(160 mod 256): WRITE 0x5da00 thru 0x651ff (0x7800 bytes) HOLE 8097(161 mod 256): COPY 0x2000 thru 0x5fff (0x4000 bytes) to 0x6fc00 The second copy here is beyond EOF, and it is to sub-page (4k) but block aligned (1k) offset. The clone runs the EOF zeroing, landing in a pre-existing post-eof delalloc extent. This zeroes the post-eof extents in the page cache just fine, dirtying the pages correctly. The problem is that xfs_reflink_remap_prep() now truncates the page cache over the range that it is copying it to, and rounds that down to cover the entire start page. This removes the dirty page over the delalloc extent from the page cache without having written it back. Hence later, when the page cache is flushed, the page at offset 0x6f000 has not been written back and hence exposes stale data, which fsx trips over less than 10 operations later. Fix this by changing xfs_reflink_remap_prep() to use xfs_flush_unmap_range(). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-11-21	swiotlb: Skip cache maintenance on map error	Robin Murphy	1	-1/+2
	If swiotlb_bounce_page() failed, calling arch_sync_dma_for_device() may lead to such delights as performing cache maintenance on whatever address phys_to_virt(SWIOTLB_MAP_ERROR) looks like, which is typically outside the kernel memory map and goes about as well as expected. Don't do that. Fixes: a4a4330db46a ("swiotlb: add support for non-coherent DMA") Tested-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-11-21	dma-direct: Make DIRECT_MAPPING_ERROR viable for SWIOTLB	Robin Murphy	1	-1/+1
	With the overflow buffer removed, we no longer have a unique address which is guaranteed not to be a valid DMA target to use as an error token. The DIRECT_MAPPING_ERROR value of 0 tries to at least represent an unlikely DMA target, but unfortunately there are already SWIOTLB users with DMA-able memory at physical address 0 which now gets falsely treated as a mapping failure and leads to all manner of misbehaviour. The best we can do to mitigate that is flip DIRECT_MAPPING_ERROR to the other commonly-used error value of all-bits-set, since the last single byte of memory is by far the least-likely-valid DMA target. Fixes: dff8d6c1ed58 ("swiotlb: remove the overflow buffer") Reported-by: John Stultz <john.stultz@linaro.org> Tested-by: John Stultz <john.stultz@linaro.org> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-11-21	Merge branch 'nvme-4.20' of git://git.infradead.org/nvme into for-linus	Jens Axboe	1	-10/+63
	Pull NVMe fix from Christoph. * 'nvme-4.20' of git://git.infradead.org/nvme: nvme-fc: resolve io failures during connect
2018-11-21	Merge tag 'linux-cpupower-4.20-rc4' of ↵	Rafael J. Wysocki	7	-14/+14
	git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux Pull cpupower utility updates for 4.20-rc4 from Shuah Khan: "This cpupower update for Linux 4.20-rc4 consists of compile fixes to allow use of outside build flags and override of CFLAGS from Jiri Olsa, and fix to compilation with STATIC=true from Konstantin Khlebnikov." * tag 'linux-cpupower-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux: tools cpupower: Override CFLAGS assignments tools cpupower debug: Allow to use outside build flags tools/power/cpupower: fix compilation with STATIC=true
2018-11-21	drm/i915: Add rotation readout for plane initial config	Ville Syrjälä	2	-0/+32
	If we need to force a full plane update before userspace/fbdev have given us a proper plane state we should try to maintain the current plane state as much as possible (apart from the parts of the state we're trying to fix up with the plane update). To that end add basic readout for the plane rotation and maintain it during the initial fb takeover. Cc: Hans de Goede <hdegoede@redhat.com> Fixes: 516a49cc1946 ("drm/i915: Fix assert_plane() warning on bootup with external display") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181120135450.3634-2-ville.syrjala@linux.intel.com Tested-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> (cherry picked from commit f43348a3db89305bb1935da9fe4499fdcdde9796) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2018-11-21	drm/i915: Force a LUT update in intel_initial_commit()	Ville Syrjälä	1	-0/+8
	If we force a plane update to fix up our half populated plane state we'll also force on the pipe gamma for the plane (since we always enable pipe gamma currently). If the BIOS hasn't programmed a sensible LUT into the hardware this will cause the image to become corrupted. Typical symptoms are a purple/yellow/etc. flash when the driver loads. To avoid this let's program something sensible into the LUT when we do the plane update. In the future I plan to add proper plane gamma enable readout so this is just a temporary measure. Cc: Hans de Goede <hdegoede@redhat.com> Fixes: 516a49cc1946 ("drm/i915: Fix assert_plane() warning on bootup with external display") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181120135450.3634-1-ville.syrjala@linux.intel.com Tested-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit fa6af5145b4e87a30a530be0d80734a9dd40da77) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2018-11-21	ACPI / platform: Add SMB0001 HID to forbidden_id_list	Hans de Goede	1	-0/+1
	Many HP AMD based laptops contain an SMB0001 device like this: Device (SMBD) { Name (_HID, "SMB0001") // _HID: Hardware ID Name (_CRS, ResourceTemplate () // _CRS: Current Resource Settings { IO (Decode16, 0x0B20, // Range Minimum 0x0B20, // Range Maximum 0x20, // Alignment 0x20, // Length ) IRQ (Level, ActiveLow, Shared, ) {7} }) } The legacy style IRQ resource here causes acpi_dev_get_irqresource() to be called with legacy=true and this message to show in dmesg: ACPI: IRQ 7 override to edge, high This causes issues when later on the AMD0030 GPIO device gets enumerated: Device (GPIO) { Name (_HID, "AMDI0030") // _HID: Hardware ID Name (_CID, "AMDI0030") // _CID: Compatible ID Name (_UID, Zero) // _UID: Unique ID Method (_CRS, 0, NotSerialized) // _CRS: Current Resource Settings { Name (RBUF, ResourceTemplate () { Interrupt (ResourceConsumer, Level, ActiveLow, Shared, ,, ) { 0x00000007, } Memory32Fixed (ReadWrite, 0xFED81500, // Address Base 0x00000400, // Address Length ) }) Return (RBUF) /* \_SB_.GPIO._CRS.RBUF */ } } Now acpi_dev_get_irqresource() gets called with legacy=false, but because of the earlier override of the trigger-type acpi_register_gsi() returns -EBUSY (because we try to register the same interrupt with a different trigger-type) and we end up setting IORESOURCE_DISABLED in the flags. The setting of IORESOURCE_DISABLED causes platform_get_irq() to call acpi_irq_get() which is not implemented on x86 and returns -EINVAL. resulting in the following in dmesg: amd_gpio AMDI0030:00: Failed to get gpio IRQ: -22 amd_gpio: probe of AMDI0030:00 failed with error -22 The SMB0001 is a "virtual" device in the sense that the only way the OS interacts with it is through calling a couple of methods to do SMBus transfers. As such it is weird that it has IO and IRQ resources at all, because the driver for it is not expected to ever access the hardware directly. The Linux driver for the SMB0001 device directly binds to the acpi_device through the acpi_bus, so we do not need to instantiate a platform_device for this ACPI device. This commit adds the SMB0001 HID to the forbidden_id_list, avoiding the instantiating of a platform_device for it. Not instantiating a platform_device means we will no longer call acpi_dev_get_irqresource() for the legacy IRQ resource fixing the probe of the AMDI0030 device failing. BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1644013 BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=198715 BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=199523 Reported-by: Lukas Kahnert <openproggerfreak@gmail.com> Tested-by: Marc <suaefar@googlemail.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-11-21	drm/fb-helper: Blacklist writeback when adding connectors to fbdev	Paul Kocialkowski	1	-0/+3
	Writeback connectors do not produce any on-screen output and require special care for use. Such connectors are hidden from enumeration in DRM resources by default, but they are still picked-up by fbdev. This makes rather little sense since fbdev is not really adapted for dealing with writeback. Moreover, this is also a source of issues when userspace disables the CRTC (and associated plane) without detaching the CRTC from the connector (which is hidden by default). In this case, the connector is still using the CRTC, leading to am "enabled/connectors mismatch" and eventually the failure of the associated atomic commit. This situation happens with VC4 testing under IGT GPU Tools. Filter out writeback connectors in the fbdev helper to solve this. Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Reviewed-by: Boris Brezillon <boris.brezillon@bootlin.com> Reviewed-by: Maxime Ripard <maxime.ripard@bootlin.com> Tested-by: Maxime Ripard <maxime.ripard@bootlin.com> Fixes: 935774cd71fe ("drm: Add writeback connector type") Cc: <stable@vger.kernel.org> # v4.19+ Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20181115163248.21168-1-paul.kocialkowski@bootlin.com
2018-11-21	drm/i915: Write GPU relocs harder with gen3	Chris Wilson	1	-1/+6
	Under moderate amounts of GPU stress, we can observe on Bearlake and Pineview (later gen3 models) that we execute the following batch buffer before the write into the batch is coherent. Adding extra (tested with upto 32x) MI_FLUSH to either the invalidation, flush or both phases does not solve the incoherency issue with the relocations, but emitting the MI_STORE_DWORD_IMM twice does. So be it. Fixes: 7dd4f6729f92 ("drm/i915: Async GPU relocation processing") Testcase: igt/gem_tiled_fence_blits # blb/pnv Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181119154153.15327-1-chris@chris-wilson.co.uk (cherry picked from commit 7fa28e146994da1e8a4124623d7da97b798ea520) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2018-11-20	net/sched: act_police: fix race condition on state variables	Davide Caratti	1	-14/+21
	after 'police' configuration parameters were converted to use RCU instead of spinlock, the state variables used to compute the traffic rate (namely 'tcfp_toks', 'tcfp_ptoks' and 'tcfp_t_c') are erroneously read/updated in the traffic path without any protection. Use a dedicated spinlock to avoid race conditions on these variables, and ensure proper cache-line alignment. In this way, 'police' is still faster than what we observed when 'tcf_lock' was used in the traffic path _ i.e. reverting commit 2d550dbad83c ("net/sched: act_police: don't use spinlock in the data path"). Moreover, we preserve the throughput improvement that was obtained after 'police' started using per-cpu counters, when 'avrate' is used instead of 'rate'. Changes since v1 (thanks to Eric Dumazet): - call ktime_get_ns() before acquiring the lock in the traffic path - use a dedicated spinlock instead of tcf_lock - improve cache-line usage Fixes: 2d550dbad83c ("net/sched: act_police: don't use spinlock in the data path") Reported-and-suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com>
2018-11-20	Merge tag 'mips_fixes_4.20_3' of ↵	Linus Torvalds	5	-22/+6
	git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Paul Burton: "A few MIPS fixes for 4.20: - Re-enable the Cavium Octeon USB driver in its defconfig after it was accidentally removed back in 4.14. - Have early memblock allocations be performed bottom-up to more closely match the behaviour we used to have with bootmem, which seems a safer choice since we've seen fallout from the change made in the 4.20 merge window. - Simplify max_low_pfn calculation in the NUMA code for the Loongson3 and SGI IP27 platforms to both clean up the code & ensure max_low_pfn has been set appropriately before it is used" * tag 'mips_fixes_4.20_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: Loongson3,SGI-IP27: Simplify max_low_pfn calculation MIPS: Let early memblock_alloc*() allocate memories bottom-up MIPS: OCTEON: cavium_octeon_defconfig: re-enable OCTEON USB driver
2018-11-20	MAINTAINERS: add myself as co-maintainer for r8169	Heiner Kallweit	1	-0/+1
	Meanwhile I know the driver quite well and I refactored bigger parts of it. As a result people contact me already with r8169 questions. Therefore I'd volunteer to become co-maintainer of the driver also officially. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	drm/amdgpu: Enable HDP memory light sleep	Kenneth Feng	1	-7/+32
	Due to the register name and setting change of HDP memory light sleep on Vega20,change accordingly in the driver. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-20	xfs: extent shifting doesn't fully invalidate page cache	Dave Chinner	1	-7/+1
	The extent shifting code uses a flush and invalidate mechainsm prior to shifting extents around. This is similar to what xfs_free_file_space() does, but it doesn't take into account things like page cache vs block size differences, and it will fail if there is a page that it currently busy. xfs_flush_unmap_range() handles all of these cases, so just convert xfs_prepare_shift() to us that mechanism rather than having it's own special sauce. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-11-20	xfs: finobt AG reserves don't consider last AG can be a runt	Dave Chinner	1	-4/+7
	The last AG may be very small comapred to all other AGs, and hence AG reservations based on the superblock AG size may actually consume more space than the AG actually has. This results on assert failures like: XFS: Assertion failed: xfs_perag_resv(pag, XFS_AG_RESV_METADATA)->ar_reserved + xfs_perag_resv(pag, XFS_AG_RESV_RMAPBT)->ar_reserved <= pag->pagf_freeblks + pag->pagf_flcount, file: fs/xfs/libxfs/xfs_ag_resv.c, line: 319 [ 48.932891] xfs_ag_resv_init+0x1bd/0x1d0 [ 48.933853] xfs_fs_reserve_ag_blocks+0x37/0xb0 [ 48.934939] xfs_mountfs+0x5b3/0x920 [ 48.935804] xfs_fs_fill_super+0x462/0x640 [ 48.936784] ? xfs_test_remount_options+0x60/0x60 [ 48.937908] mount_bdev+0x178/0x1b0 [ 48.938751] mount_fs+0x36/0x170 [ 48.939533] vfs_kern_mount.part.43+0x54/0x130 [ 48.940596] do_mount+0x20e/0xcb0 [ 48.941396] ? memdup_user+0x3e/0x70 [ 48.942249] ksys_mount+0xba/0xd0 [ 48.943046] __x64_sys_mount+0x21/0x30 [ 48.943953] do_syscall_64+0x54/0x170 [ 48.944835] entry_SYSCALL_64_after_hwframe+0x49/0xbe Hence we need to ensure the finobt per-ag space reservations take into account the size of the last AG rather than treat it like all the other full size AGs. Note that both refcountbt and rmapbt already take the size of the AG into account via reading the AGF length directly. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-11-20	xfs: fix transient reference count error in xfs_buf_resubmit_failed_buffers	Dave Chinner	1	-7/+21
	When retrying a failed inode or dquot buffer, xfs_buf_resubmit_failed_buffers() clears all the failed flags from the inde/dquot log items. In doing so, it also drops all the reference counts on the buffer that the failed log items hold. This means it can drop all the active references on the buffer and hence free the buffer before it queues it for write again. Putting the buffer on the delwri queue takes a reference to the buffer (so that it hangs around until it has been written and completed), but this goes bang if the buffer has already been freed. Hence we need to add the buffer to the delwri queue before we remove the failed flags from the log items attached to the buffer to ensure it always remains referenced during the resubmit process. Reported-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-11-20	xfs: uncached buffer tracing needs to print bno	Dave Chinner	1	-1/+4
	Useless: xfs_buf_get_uncached: dev 253:32 bno 0xffffffffffffffff nblks 0x1 ... xfs_buf_unlock: dev 253:32 bno 0xffffffffffffffff nblks 0x1 ... xfs_buf_submit: dev 253:32 bno 0xffffffffffffffff nblks 0x1 ... xfs_buf_hold: dev 253:32 bno 0xffffffffffffffff nblks 0x1 ... xfs_buf_iowait: dev 253:32 bno 0xffffffffffffffff nblks 0x1 ... xfs_buf_iodone: dev 253:32 bno 0xffffffffffffffff nblks 0x1 ... xfs_buf_iowait_done: dev 253:32 bno 0xffffffffffffffff nblks 0x1 ... xfs_buf_rele: dev 253:32 bno 0xffffffffffffffff nblks 0x1 ... Useful: xfs_buf_get_uncached: dev 253:32 bno 0xffffffffffffffff nblks 0x1 ... xfs_buf_unlock: dev 253:32 bno 0xffffffffffffffff nblks 0x1 ... xfs_buf_submit: dev 253:32 bno 0x200b5 nblks 0x1 ... xfs_buf_hold: dev 253:32 bno 0x200b5 nblks 0x1 ... xfs_buf_iowait: dev 253:32 bno 0x200b5 nblks 0x1 ... xfs_buf_iodone: dev 253:32 bno 0x200b5 nblks 0x1 ... xfs_buf_iowait_done: dev 253:32 bno 0x200b5 nblks 0x1 ... xfs_buf_rele: dev 253:32 bno 0x200b5 nblks 0x1 ... Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-11-20	tcp: Fix SOF_TIMESTAMPING_RX_HARDWARE to use the latest timestamp during TCP ↵	Stephen Mallon	1	-0/+1
	coalescing During tcp coalescing ensure that the skb hardware timestamp refers to the highest sequence number data. Previously only the software timestamp was updated during coalescing. Signed-off-by: Stephen Mallon <stephen.mallon@sydney.edu.au> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	tg3: Add PHY reset for 5717/5719/5720 in change ring and flow control paths	Siva Reddy Kallam	1	-2/+16
	This patch has the fix to avoid PHY lockup with 5717/5719/5720 in change ring and flow control paths. This patch solves the RX hang while doing continuous ring or flow control parameters with heavy traffic from peer. Signed-off-by: Siva Reddy Kallam <siva.kallam@broadcom.com> Acked-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	Documentation/security-bugs: Postpone fix publication in exceptional cases	Will Deacon	1	-10/+11
	At the request of the reporter, the Linux kernel security team offers to postpone the publishing of a fix for up to 5 business days from the date of a report. While it is generally undesirable to keep a fix private after it has been developed, this short window is intended to allow distributions to package the fix into their kernel builds and permits early inclusion of the security team in the case of a co-ordinated disclosure with other parties. Unfortunately, discussions with major Linux distributions and cloud providers has revealed that 5 business days is not sufficient to achieve either of these two goals. As an example, cloud providers need to roll out KVM security fixes to a global fleet of hosts with sufficient early ramp-up and monitoring. An end-to-end timeline of less than two weeks dramatically cuts into the amount of early validation and increases the chance of guest-visible regressions. The consequence of this timeline mismatch is that security issues are commonly fixed without the involvement of the Linux kernel security team and are instead analysed and addressed by an ad-hoc group of developers across companies contributing to Linux. In some cases, mainline (and therefore the official stable kernels) can be left to languish for extended periods of time. This undermines the Linux kernel security process and puts upstream developers in a difficult position should they find themselves involved with an undisclosed security problem that they are unable to report due to restrictions from their employer. To accommodate the needs of these users of the Linux kernel and encourage them to engage with the Linux security team when security issues are first uncovered, extend the maximum period for which fixes may be delayed to 7 calendar days, or 14 calendar days in exceptional cases, where the logistics of QA and large scale rollouts specifically need to be accommodated. This brings parity with the linux-distros@ maximum embargo period of 14 calendar days. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Amit Shah <aams@amazon.com> Cc: Laura Abbott <labbott@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Co-developed-by: Thomas Gleixner <tglx@linutronix.de> Co-developed-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Tyler Hicks <tyhicks@canonical.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-20	MAINTAINERS: Add Sasha as a stable branch maintainer	Greg Kroah-Hartman	1	-0/+1
	Sasha has somehow been convinced into helping me with the stable kernel maintenance. Codify this slip in good judgement before he realizes what he really signed up for :) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-20	Merge tag 'media/v4.20-3' of ↵	Linus Torvalds	15	-43/+89
	git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: - add a missing include at v4l2-controls uAPI header - minor kAPI update for the request API - some fixes at CEC core - use a lower minimum height for the virtual codec driver - cleanup a gcc warning due to the lack of a fall though markup - tc358743: Remove unnecessary self assignment - fix the V4L event subscription logic - docs: Document metadata format in struct v4l2_format - omap3isp and ipu3-cio2: fix unbinding logic * tag 'media/v4.20-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: ipu3-cio2: Use cio2_queues_exit media: ipu3-cio2: Unregister device nodes first, then release resources media: omap3isp: Unregister media device as first media: docs: Document metadata format in struct v4l2_format media: v4l: event: Add subscription to list before calling "add" operation media: dm365_ipipeif: better annotate a fall though media: Rename vb2_m2m_request_queue -> v4l2_m2m_request_queue media: cec: increase debug level for 'queue full' media: cec: check for non-OK/NACK conditions while claiming a LA media: vicodec: lower minimum height to 360 media: tc358743: Remove unnecessary self assignment media: v4l: fix uapi mpeg slice params definition v4l2-controls: add a missing include
2018-11-20	mtd: spi-nor: fix selection of uniform erase type in flexible conf	Tudor.Ambarus@microchip.com	1	-11/+36
	There are uniform, non-uniform and flexible erase flash configurations. The non-uniform erase types, are the erase types that can _not_ erase the entire flash by their own. As the code was, in case flashes had flexible erase capabilities (support both uniform and non-uniform erase types in the same flash configuration) and supported multiple uniform erase type sizes, the code did not sort the uniform erase types, and could select a wrong erase type size. Sort the uniform erase mask in case of flexible erase flash configurations, in order to select the best uniform erase type size. Uniform, non-uniform, and flexible configurations with just a valid uniform erase type, are not affected by this change. Uniform erase tested on mx25l3273fm2i-08g and sst26vf064B-104i/sn. Non uniform erase tested on sst26vf064B-104i/sn. Fixes: 5390a8df769e ("mtd: spi-nor: add support to non-uniform SFDP SPI NOR flash memories") Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-11-20	RISC-V: recognize S/U mode bits in print_isa	Patrick Stählin	1	-3/+6
	Removes the warning about an unsupported ISA when reading /proc/cpuinfo on QEMU. The "S" extension is not being returned as it is not accessible from userspace. Signed-off-by: Patrick Stählin <me@packi.ch> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-11-20	riscv: add asm/unistd.h UAPI header	David Abdurachmanov	2	-10/+21
	Marcin Juszkiewicz reported issues while generating syscall table for riscv using 4.20-rc1. The patch refactors our unistd.h files to match some other architectures. - Add asm/unistd.h UAPI header, which has __ARCH_WANT_NEW_STAT only for 64-bit - Remove asm/syscalls.h UAPI header and merge to asm/unistd.h - Adjust kernel asm/unistd.h So now asm/unistd.h UAPI header should show all syscalls for riscv. Before this, Makefile simply put `#include <asm-generic/unistd.h>` into generated asm/unistd.h UAPI header thus user didn't see: - __NR_riscv_flush_icache - __NR_newfstatat - __NR_fstat which are supported by riscv kernel. Signed-off-by: David Abdurachmanov <david.abdurachmanov@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org> Cc: Guenter Roeck <linux@roeck-us.net> Fixes: 67314ec7b025 ("RISC-V: Request newstat syscalls") Signed-off-by: David Abdurachmanov <david.abdurachmanov@gmail.com> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-11-20	riscv: fix warning in arch/riscv/include/asm/module.h	David Abdurachmanov	1	-0/+1
	Fixes warning: 'struct module' declared inside parameter list will not be visible outside of this definition or declaration Signed-off-by: David Abdurachmanov <david.abdurachmanov@gmail.com> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-11-20	RISC-V: Build flat and compressed kernel images	Anup Patel	6	-2/+120
	This patch extends Linux RISC-V build system to build and install: Image - Flat uncompressed kernel image Image.gz - Flat and GZip compressed kernel image Quiet a few bootloaders (such as Uboot, UEFI, etc) are capable of booting flat and compressed kernel images. In case of Uboot, booting Image or Image.gz is achieved using bootm command. The flat and uncompressed kernel image (i.e. Image) is very useful in pre-silicon developent and testing because we can create back-door HEX files for RAM on FPGAs from Image. Signed-off-by: Anup Patel <anup@brainfault.org> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-11-20	RISC-V: Fix raw_copy_{to,from}_user()	Olof Johansson	1	-2/+2
	Sparse highlighted it, and appears to be a pure bug (from vs to). ./arch/riscv/include/asm/uaccess.h:403:35: warning: incorrect type in argument 1 (different address spaces) ./arch/riscv/include/asm/uaccess.h:403:39: warning: incorrect type in argument 2 (different address spaces) ./arch/riscv/include/asm/uaccess.h:409:37: warning: incorrect type in argument 1 (different address spaces) ./arch/riscv/include/asm/uaccess.h:409:41: warning: incorrect type in argument 2 (different address spaces) Signed-off-by: Olof Johansson <olof@lixom.net> Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-11-20	HID: Add quirk for Primax PIXART OEM mice	Sebastian Parschauer	2	-0/+4
	The PixArt OEM mice are known for disconnecting every minute in runlevel 1 or 3 if they are not always polled. So add quirk ALWAYS_POLL for two Primax mice as well. 0x4e22 is the Dell MS111-P and 0x4d0f is the unbranded HP Portia mouse HP 697738-001. Both were built until approx. 2014. Those were the standard mice from those vendors and are still around - even as new old stock. Reference: https://github.com/sriemer/fix-linux-mouse/issues/11 Signed-off-by: Sebastian Parschauer <sparschauer@suse.de> CC: stable@vger.kernel.org Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-11-20	usb: cdc-acm: add entry for Hiro (Conexant) modem	Maarten Jacobs	1	-0/+3
	The cdc-acm kernel module currently does not support the Hiro (Conexant) H05228 USB modem. The patch below adds the device specific information: idVendor 0x0572 idProduct 0x1349 Signed-off-by: Maarten Jacobs <maarten256@outlook.com> Acked-by: Oliver Neukum <oneukum@suse.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-20	drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture	Chris Wilson	3	-2/+26
	Since capturing the error state requires fiddling around with the GGTT to read arbitrary buffers and is itself run under stop_machine(), it deadlocks the machine (effectively a hard hang) when run in conjunction with Broxton's VTd workaround to serialize GGTT access. v2: Store the ERR_PTR in first_error so that the error can be reported to the user via sysfs. v3: Mention the quirk in dmesg (using info as per usual) Fixes: 0ef34ad6222a ("drm/i915: Serialize GTT/Aperture accesses on BXT") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: John Harrison <john.C.Harrison@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181102161232.17742-5-chris@chris-wilson.co.uk (cherry picked from commit fb6f0b64e455b207a636346588e65bf9598d30eb) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2018-11-19	Merge tag 'mlx5-fixes-2018-11-19' of ↵	David S. Miller	12	-83/+116
	git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2018-11-19 The following fixes are for mlx5 core and netdev driver. For -stable v4.16 bc7fda7d4637 ('net/mlx5e: IPoIB, Reset QP after channels are closed') For -stable v4.17 36917a270395 ('net/mlx5: IPSec, Fix the SA context hash key') For -stable v4.18 6492a432be3a ('net/mlx5e: Always use the match level enum when parsing TC rule match') c3f81be236b1 ('net/mlx5e: Removed unnecessary warnings in FEC caps query') c5ce2e736b64 ('net/mlx5e: Fix selftest for small MTUs') For -stable v4.19 effcd896b25e ('net/mlx5e: Adjust to max number of channles when re-attaching') 394cbc5acd68 ('net/mlx5e: RX, verify received packet size in Linear Striding RQ') 447cbb3613c8 ('net/mlx5e: Don't match on vlan non-existence if ethertype is wildcarded') c223c1574612 ('net/mlx5e: Claim TC hw offloads support only under a proper build config') Please pull and let me know if there's any problem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19	net/ibmnvic: Fix deadlock problem in reset	Juliet Kim	2	-39/+22
	This patch changes to use rtnl_lock only during a reset to avoid deadlock that could occur when a thread operating close is holding rtnl_lock and waiting for reset_lock acquired by another thread, which is waiting for rtnl_lock in order to set the number of tx/rx queues during a reset. Also, we now setting the number of tx/rx queues during a soft reset for failover or LPM events. Signed-off-by: Juliet Kim <julietk@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19	Merge branch 'qed-Fix-Queue-Manager-getters'	David S. Miller	1	-5/+24
	Denis Bolotin says: ==================== qed: Fix Queue Manager getters This patch series fixes various queue manager getter functions. It is important to make sure the getter's caller will receive a valid queue even in error case to prevent more serious bugs. Please consider applying to net. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19	qed: Fix QM getters to always return a valid pq	Denis Bolotin	1	-4/+20
	The getter callers doesn't know the valid Physical Queues (PQ) values. This patch makes sure that a valid PQ will always be returned. The patch consists of 3 fixes: - When qed_init_qm_get_idx_from_flags() receives a disabled flag, it returned PQ 0, which can potentially be another function's pq. Verify that flag is enabled, otherwise return default start_pq. - When qed_init_qm_get_idx_from_flags() receives an unknown flag, it returned NULL and could lead to a segmentation fault. Return default start_pq instead. - A modulo operation was added to MCOS/VFS PQ getters to make sure the PQ returned is in range of the required flag. Fixes: b5a9ee7cf3be ("qed: Revise QM cofiguration") Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com> Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19	qed: Fix bitmap_weight() check	Denis Bolotin	1	-1/+4
	Fix the condition which verifies that only one flag is set. The API bitmap_weight() should receive size in bits instead of bytes. Fixes: b5a9ee7cf3be ("qed: Revise QM cofiguration") Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com> Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19	NFSv4: Fix a NFSv4 state manager deadlock	Trond Myklebust	2	-5/+13
	Fix a deadlock whereby the NFSv4 state manager can get stuck in the delegation return code, waiting for a layout return to complete in another thread. If the server reboots before that other thread completes, then we need to be able to start a second state manager thread in order to perform recovery. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-11-19	net/mlx5e: Fix failing ethtool query on FEC query error	Shay Agroskin	1	-2/+1
	If FEC caps query fails when executing 'ethtool <interface>' the whole callback fails unnecessarily, fixed that by replacing the error return code with debug logging only. Fixes: 6cfa94605091 ("net/mlx5e: Ethtool driver callback for query/set FEC policy") Signed-off-by: Shay Agroskin <shayag@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-19	net/mlx5e: Removed unnecessary warnings in FEC caps query	Shay Agroskin	2	-4/+4
	Querying interface FEC caps with 'ethtool [int]' after link reset throws warning regading link speed. This warning is not needed as there is already an indication in user space that the link is not up. Fixes: 0696d60853d5 ("net/mlx5e: Receive buffer configuration") Signed-off-by: Shay Agroskin <shayag@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-19	net/mlx5e: Fix wrong field name in FEC related functions	Shay Agroskin	1	-2/+2
	This bug would result in reading wrong FEC capabilities for 10G/40G. Fixes: 2095b2641477 ("net/mlx5e: Add port FEC get/set functions") Signed-off-by: Shay Agroskin <shayag@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-19	net/mlx5e: Fix a bug in turning off FEC policy in unsupported speeds	Shay Agroskin	1	-16/+12
	Some speeds don't support turning FEC policy off. In case a requested FEC policy is not supported for a speed (including current speed), its new FEC policy would be: no FEC - if disabling FEC is supported for that speed unchanged - else Fixes: 2095b2641477 ("net/mlx5e: Add port FEC get/set functions") Signed-off-by: Shay Agroskin <shayag@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-19	Merge branch 'ena-hibernation-and-rmmod-bug-fixes'	David S. Miller	2	-13/+12
	Arthur Kiyanovski says: ==================== net: ena: hibernation and rmmod bug fixes This patchset includes 2 bug fixes: 1. A fix to a crash during resume from hibernation. 2. A fix to an illegal memory access during driver removal (e.g. during rmmod) which might cause a crash in certain systems. The subminor number in the driver version is also promoted to indicate driver was changed. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19	net: ena: update driver version from 2.0.1 to 2.0.2	Arthur Kiyanovski	1	-1/+1
	Update driver version due to critical bug fixes. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19	net: ena: fix crash during ena_remove()	Arthur Kiyanovski	1	-11/+10
	In ena_remove() we have the following stack call: ena_remove() unregister_netdev() ena_destroy_device() netif_carrier_off() Calling netif_carrier_off() causes linkwatch to try to handle the link change event on the already unregistered netdev, which leads to a read from an unreadable memory address. This patch switches the order of the two functions, so that netif_carrier_off() is called on a regiestered netdev. To accomplish this fix we also had to: 1. Remove the set bit ENA_FLAG_TRIGGER_RESET 2. Add a sanitiy check in ena_close() both to prevent double device reset (when calling unregister_netdev() ena_close is called, but the device was already deleted in ena_destroy_device()). 3. Set the admin_queue running state to false to avoid using it after device was reset (for example when calling ena_destroy_all_io_queues() right after ena_com_dev_reset() in ena_down) Fixes: 944b28aa2982 ("net: ena: fix missing lock during device destruction") Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19	net: ena: fix crash during failed resume from hibernation	Arthur Kiyanovski	1	-1/+1
	During resume from hibernation if ena_restore_device fails, ena_com_dev_reset() is called, and uses the readless read mechanism, which was already destroyed by the call to ena_com_mmio_reg_read_request_destroy(). This causes a NULL pointer reference. In this commit we switch the call order of the above two functions to avoid this crash. Fixes: d7703ddbd7c9 ("net: ena: fix rare bug when failed restart/resume is followed by driver removal") Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19	sctp: not increase stream's incnt before sending addstrm_in request	Xin Long	1	-1/+0
	Different from processing the addstrm_out request, The receiver handles an addstrm_in request by sending back an addstrm_out request to the sender who will increase its stream's in and incnt later. Now stream->incnt has been increased since it sent out the addstrm_in request in sctp_send_add_streams(), with the wrong stream->incnt will even cause crash when copying stream info from the old stream's in to the new one's in sctp_process_strreset_addstrm_out(). This patch is to fix it by simply removing the stream->incnt change from sctp_send_add_streams(). Fixes: 242bd2d519d7 ("sctp: implement sender-side procedures for Add Incoming/Outgoing Streams Request Parameter") Reported-by: Jianwen Ji <jiji@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19	net/mlx5e: Fix selftest for small MTUs	Valentine Fatiev	1	-16/+10
	Loopback test had fixed packet size, which can be bigger than configured MTU. Shorten the loopback packet size to be bigger than minimal MTU allowed by the device. Text field removed from struct 'mlx5ehdr' as redundant to allow send small packets as minimal allowed MTU. Fixes: d605d66 ("net/mlx5e: Add support for ethtool self diagnostics test") Signed-off-by: Valentine Fatiev <valentinef@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-19	net/mlx5e: RX, verify received packet size in Linear Striding RQ	Moshe Shemesh	5	-1/+15
	In case of striding RQ, we use MPWRQ (Multi Packet WQE RQ), which means that WQE (RX descriptor) can be used for many packets and so the WQE is much bigger than MTU. In virtualization setups where the port mtu can be larger than the vf mtu, if received packet is bigger than MTU, it won't be dropped by HW on too small receive WQE. If we use linear SKB in striding RQ, since each stride has room for mtu size payload and skb info, an oversized packet can lead to crash for crossing allocated page boundary upon the call to build_skb. So driver needs to check packet size and drop it. Introduce new SW rx counter, rx_oversize_pkts_sw_drop, which counts the number of packets dropped by the driver for being too large. As a new field is added to the RQ struct, re-open the channels whenever this field is being used in datapath (i.e., in the case of linear Striding RQ). Fixes: 619a8f2a42f1 ("net/mlx5e: Use linear SKB in Striding RQ") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-19	net/mlx5e: Apply the correct check for supporting TC esw rules split	Roi Dayan	1	-1/+1
	The mirror and not the output count is the one denoting a split. Fix to condition the offload attempt on the mirror count being > 0 along the firmware to have the related capability. Fixes: 592d36515969 ("net/mlx5e: Parse mirroring action for offloaded TC eswitch flows") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Yossi Kuperman <yossiku@mellanox.com> Reviewed-by: Chris Mi <chrism@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-19	net/mlx5e: Adjust to max number of channles when re-attaching	Yuval Avnery	1	-5/+22
	When core driver enters deattach/attach flow after pci reset, Number of logical CPUs may have changed. As a result we need to update the cpu affiliated resource tables. 1. indirect rqt list 2. eq table Reproduction (PowerPC): echo 1000 > /sys/kernel/debug/powerpc/eeh_max_freezes ppc64_cpu --smt=on # Restart driver modprobe -r ... ; modprobe ... # Link up ifconfig ... # Only physical CPUs ppc64_cpu --smt=off # Inject PCI errors so PCI will reset - calling the pci error handler echo 0x8000000000000000 > /sys/kernel/debug/powerpc/<PCI BUS>/err_injct_inboundA Call trace when trying to add non-existing rqs to an indirect rqt: mlx5e_redirect_rqt+0x84/0x260 [mlx5_core] (unreliable) mlx5e_redirect_rqts+0x188/0x190 [mlx5_core] mlx5e_activate_priv_channels+0x488/0x570 [mlx5_core] mlx5e_open_locked+0xbc/0x140 [mlx5_core] mlx5e_open+0x50/0x130 [mlx5_core] mlx5e_nic_enable+0x174/0x1b0 [mlx5_core] mlx5e_attach_netdev+0x154/0x290 [mlx5_core] mlx5e_attach+0x88/0xd0 [mlx5_core] mlx5_attach_device+0x168/0x1e0 [mlx5_core] mlx5_load_one+0x1140/0x1210 [mlx5_core] mlx5_pci_resume+0x6c/0xf0 [mlx5_core] Create cq will fail when trying to use non-existing EQ. Fixes: 89d44f0a6c73 ("net/mlx5_core: Add pci error handlers to mlx5_core driver") Signed-off-by: Yuval Avnery <yuvalav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-19	net/mlx5e: Always use the match level enum when parsing TC rule match	Or Gerlitz	1	-2/+2
	We get the match level (none, l2, l3, l4) while going over the match dissectors of an offloaded tc rule. When doing this, the match level enum and the not min inline enum values should be used, fix that. This worked accidentally b/c both enums have the same numerical values. Fixes: d708f902989b ('net/mlx5e: Get the required HW match level while parsing TC flow matches') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-19	net/mlx5e: Claim TC hw offloads support only under a proper build config	Or Gerlitz	1	-0/+6
	Currently, we are only supporting tc hw offloads when the eswitch support is compiled in, but we are not gating the adevertizment of the NETIF_F_HW_TC feature on this config being set. Fix it, and while doing that, also avoid dealing with the feature on ethtool when the config is not set. Fixes: e8f887ac6a45 ('net/mlx5e: Introduce tc offload support') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-19	net/mlx5e: Don't match on vlan non-existence if ethertype is wildcarded	Or Gerlitz	1	-31/+32
	For the "all" ethertype we should not care whether the packet has vlans. Besides being wrong, the way we did it caused FW error for rules such as: tc filter add dev eth0 protocol all parent ffff: \ prio 1 flower skip_sw action drop b/c the matching meta-data (outer headers bit in struct mlx5_flow_spec) wasn't set. Fix that by matching on vlan non-existence only if we were also told to match on the ethertype. Fixes: cee26487620b ('net/mlx5e: Set vlan masks for all offloaded TC rules') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Slava Ovsiienko <viacheslavo@mellanox.com> Reviewed-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-19	net/mlx5e: IPoIB, Reset QP after channels are closed	Denis Drozdov	1	-1/+1
	The mlx5e channels should be closed before mlx5i_uninit_underlay_qp puts the QP into RST (reset) state during mlx5i_close. Currently QP state incorrectly set to RST before channels got deactivated and closed, since mlx5_post_send request expects QP in RTS (Ready To Send) state. The fix is to keep QP in RTS state until mlx5e channels get closed and to reset QP afterwards. Also this fix is simply correct in order to keep the open/close flow symmetric, i.e mlx5i_init_underlay_qp() is called first thing at open, the correct thing to do is to call mlx5i_uninit_underlay_qp() last thing at close, which is exactly what this patch is doing. Fixes: dae37456c8ac ("net/mlx5: Support for attaching multiple underlay QPs to root flow table") Signed-off-by: Denis Drozdov <denisd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-19	net/mlx5: IPSec, Fix the SA context hash key	Raed Salem	1	-2/+8
	The commit "net/mlx5: Refactor accel IPSec code" introduced a bug where asynchronous short time change in hash key value by create/release SA context might happen during an asynchronous hash resize operation this could cause a subsequent remove SA context operation to fail as the key value used during resize is not the same key value used when remove SA context operation is invoked. This commit fixes the bug by defining the SA context hash key such that it includes only fields that never change during the lifetime of the SA context object. Fixes: d6c4f0298cec ("net/mlx5: Refactor accel IPSec code") Signed-off-by: Raed Salem <raeds@mellanox.com> Reviewed-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-19	xfs: make xfs_file_remap_range() static	Eric Biggers	1	-1/+1
	xfs_file_remap_range() is only used in fs/xfs/xfs_file.c, so make it static. This addresses a gcc warning when -Wmissing-prototypes is enabled. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-11-19	xfs: fix shared extent data corruption due to missing cow reservation	Brian Foster	1	-0/+1
	Page writeback indirectly handles shared extents via the existence of overlapping COW fork blocks. If COW fork blocks exist, writeback always performs the associated copy-on-write regardless if the underlying blocks are actually shared. If the blocks are shared, then overlapping COW fork blocks must always exist. fstests shared/010 reproduces a case where a buffered write occurs over a shared block without performing the requisite COW fork reservation. This ultimately causes writeback to the shared extent and data corruption that is detected across md5 checks of the filesystem across a mount cycle. The problem occurs when a buffered write lands over a shared extent that crosses an extent size hint boundary and that also happens to have a partial COW reservation that doesn't cover the start and end blocks of the data fork extent. For example, a buffered write occurs across the file offset (in FSB units) range of [29, 57]. A shared extent exists at blocks [29, 35] and COW reservation already exists at blocks [32, 34]. After accommodating a COW extent size hint of 32 blocks and the existing reservation at offset 32, xfs_reflink_reserve_cow() allocates 32 blocks of reservation at offset 0 and returns with COW reservation across the range of [0, 34]. The associated data fork extent is still [29, 35], however, which isn't fully covered by the COW reservation. This leads to a buffered write at file offset 35 over a shared extent without associated COW reservation. Writeback eventually kicks in, performs an overwrite of the underlying shared block and causes the associated data corruption. Update xfs_reflink_reserve_cow() to accommodate the fact that a delalloc allocation request may not fully cover the extent in the data fork. Trim the data fork extent appropriately, just as is done for shared extent boundaries and/or existing COW reservations that happen to overlap the start of the data fork extent. This prevents shared/010 failures due to data corruption on reflink enabled filesystems. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-11-19	mtd: spi-nor: Fix Cadence QSPI page fault kernel panic	Thor Thayer	1	-3/+16
	The current Cadence QSPI driver caused a kernel panic sporadically when writing to QSPI. The problem was caused by writing more bytes than needed because the QSPI operated on 4 bytes at a time. <snip> [ 11.202044] Unable to handle kernel paging request at virtual address bffd3000 [ 11.209254] pgd = e463054d [ 11.211948] [bffd3000] pgd=2fffb811, pte=00000000, *ppte=00000000 [ 11.218202] Internal error: Oops: 7 [#1] SMP ARM [ 11.222797] Modules linked in: [ 11.225844] CPU: 1 PID: 1317 Comm: systemd-hwdb Not tainted 4.17.7-d0c45cd44a8f [ 11.235796] Hardware name: Altera SOCFPGA Arria10 [ 11.240487] PC is at __raw_writesl+0x70/0xd4 [ 11.244741] LR is at cqspi_write+0x1a0/0x2cc </snip> On a page boundary limit the number of bytes copied from the tx buffer to remain within the page. This patch uses a temporary buffer to hold the 4 bytes to write and then copies only the bytes required from the tx buffer. Reported-by: Adrian Amborzewicz <adrian.ambrozewicz@intel.com> Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-11-19	drm/amd/pp: handle negative values when reading OD	Greathouse, Joseph	4	-46/+40
	Reading the sysfs files pp_sclk_od and pp_mclk_od return the percentage difference between the VBIOS-provided default frequency and the current (possibly user-set) frequency in the highest SCLK and MCLK DPM states, respectively. Writing to these files provides an easy mechanism for setting a higher-than-default maximum frequency. We normally only allow values >= 0 to be written here. However, with the addition of pp_od_clk_voltage, we now allow users to set custom DPM tables. If they then set the maximum DPM state to something less than the default, later reads of pp_*_od should return a negative value. The highest DPM state is now less than the VBIOS-provided default, so the percentage is negative. The math to calculate this was originally performed with unsigned values, meaning reads that should return negative values returned meaningless data. This patch corrects that issue and normalizes how all of the calculations are done across the various hwmgr types. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-19	drm/amdgpu: Add missing firmware entry for HAINAN	Takashi Iwai	1	-0/+1
	Due to lack of MODULE_FIRMWARE() with hainan_mc.bin, the driver doesn't work properly in initrd. Let's add it. Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1116239 Fixes: 8eaf2b1faaf4 ("drm/amdgpu: switch firmware path for SI parts") Cc: <stable@vger.kernel.org> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-19	drm/amd/powerplay: disable Vega20 DS related features	Evan Quan	1	-1/+11
	Disable these features on Vega20 for now. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Feifei Xu<Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-19	drm/amdgpu: Fix oops when pp_funcs->switch_power_profile is unset	Felix Kuehling	1	-2/+5
	On Vega20 and other pre-production GPUs, powerplay is not enabled yet. Check for NULL pointers before calling pp_funcs function pointers. Also affects Kaveri. CC: Joerg Roedel <jroedel@suse.de> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2018-11-19	Revert "sctp: remove sctp_transport_pmtu_check"	Xin Long	2	-0/+15
	This reverts commit 22d7be267eaa8114dcc28d66c1c347f667d7878a. The dst's mtu in transport can be updated by a non sctp place like in xfrm where the MTU information didn't get synced between asoc, transport and dst, so it is still needed to do the pmtu check in sctp_packet_config. Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19	sctp: not allow to set asoc prsctp_enable by sockopt	Xin Long	1	-21/+5
	As rfc7496#section4.5 says about SCTP_PR_SUPPORTED: This socket option allows the enabling or disabling of the negotiation of PR-SCTP support for future associations. For existing associations, it allows one to query whether or not PR-SCTP support was negotiated on a particular association. It means only sctp sock's prsctp_enable can be set. Note that for the limitation of SCTP_{CURRENT\|ALL}_ASSOC, we will add it when introducing SCTP_{FUTURE\|CURRENT\|ALL}_ASSOC for linux sctp in another patchset. v1->v2: - drop the params.assoc_id check as Neil suggested. Fixes: 28aa4c26fce2 ("sctp: add SCTP_PR_SUPPORTED on sctp sockopt") Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19	sctp: count sk_wmem_alloc by skb truesize in sctp_packet_transmit	Xin Long	1	-20/+1
	Now sctp increases sk_wmem_alloc by 1 when doing set_owner_w for the skb allocked in sctp_packet_transmit and decreases by 1 when freeing this skb. But when this skb goes through networking stack, some subcomponents might change skb->truesize and add the same amount on sk_wmem_alloc. However sctp doesn't know the amount to decrease by, it would cause a leak on sk->sk_wmem_alloc and the sock can never be freed. Xiumei found this issue when it hit esp_output_head() by using sctp over ipsec, where skb->truesize is added and so is sk->sk_wmem_alloc. Since sctp has used sk_wmem_queued to count for writable space since Commit cd305c74b0f8 ("sctp: use sk_wmem_queued to check for writable space"), it's ok to fix it by counting sk_wmem_alloc by skb truesize in sctp_packet_transmit. Fixes: cac2661c53f3 ("esp4: Avoid skb_cow_data whenever possible") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19	MAINTAINERS: Add myself as third phylib maintainer	Heiner Kallweit	1	-0/+1
	Add myself as third phylib maintainer. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds	58	-233/+539
	Pull networking fixes from David Miller: 1) Fix some potentially uninitialized variables and use-after-free in kvaser_usb can drier, from Jimmy Assarsson. 2) Fix leaks in qed driver, from Denis Bolotin. 3) Socket leak in l2tp, from Xin Long. 4) RSS context allocation fix in bnxt_en from Michael Chan. 5) Fix cxgb4 build errors, from Ganesh Goudar. 6) Route leaks in ipv6 when removing exceptions, from Xin Long. 7) Memory leak in IDR allocation handling of act_pedit, from Davide Caratti. 8) Use-after-free of bridge vlan stats, from Nikolay Aleksandrov. 9) When MTU is locked, do not force DF bit on ipv4 tunnels. From Sabrina Dubroca. 10) When NAPI cached skb is reused, we must set it to the proper initial state which includes skb->pkt_type. From Eric Dumazet. 11) Lockdep and non-linear SKB handling fix in tipc from Jon Maloy. 12) Set RX queue properly in various tuntap receive paths, from Matthew Cover. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits) tuntap: fix multiqueue rx ipv6: Fix PMTU updates for UDP/raw sockets in presence of VRF tipc: don't assume linear buffer when reading ancillary data tipc: fix lockdep warning when reinitilaizing sockets net-gro: reset skb->pkt_type in napi_reuse_skb() tc-testing: tdc.py: Guard against lack of returncode in executed command tc-testing: tdc.py: ignore errors when decoding stdout/stderr ip_tunnel: don't force DF when MTU is locked MAINTAINERS: Add entry for CAKE qdisc net: bridge: fix vlan stats use-after-free on destruction socket: do a generic_file_splice_read when proto_ops has no splice_read net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs Revert "net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs" net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs net/sched: act_pedit: fix memory leak when IDR allocation fails net: lantiq: Fix returned value in case of error in 'xrx200_probe()' ipv6: fix a dst leak when removing its exception net: mvneta: Don't advertise 2.5G modes drivers/net/ethernet/qlogic/qed/qed_rdma.h: fix typo net/mlx4: Fix UBSAN warning of signed integer overflow ...
2018-11-19	libceph: fall back to sendmsg for slab pages	Ilya Dryomov	1	-3/+9
	skb_can_coalesce() allows coalescing neighboring slab objects into a single frag: return page == skb_frag_page(frag) && off == frag->page_offset + skb_frag_size(frag); ceph_tcp_sendpage() can be handed slab pages. One example of this is XFS: it passes down sector sized slab objects for its metadata I/O. If the kernel client is co-located on the OSD node, the skb may go through loopback and pop on the receive side with the exact same set of frags. When tcp_recvmsg() attempts to copy out such a frag, hardened usercopy complains because the size exceeds the object's allocated size: usercopy: kernel memory exposure attempt detected from ffff9ba917f20a00 (kmalloc-512) (1024 bytes) Although skb_can_coalesce() could be taught to return false if the resulting frag would cross a slab object boundary, we already have a fallback for non-refcounted pages. Utilize it for slab pages too. Cc: stable@vger.kernel.org # 4.8+ Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-11-19	HID: i2c-hid: Disable runtime PM for LG touchscreen	Kai-Heng Feng	2	-0/+3
	LG touchscreen (1fd2:8001) stops working after reboot: [ 4.859153] i2c_hid i2c-SAPS2101:00: i2c_hid_get_input: incomplete report (64/66) [ 4.936070] i2c_hid i2c-SAPS2101:00: i2c_hid_get_input: incomplete report (64/66) [ 9.948224] i2c_hid i2c-SAPS2101:00: failed to reset device. The device in question stops working after receives SLEEP, ON, SLEEP commands in a short period. The scenario is like this: - Once the desktop session closes, it also closed the hid device, so the device gets runtime suspended and receives a SLEEP command. - Before calling shutdown callback, it gets runtime resumed and received an ON command. - In the shutdown callback, it receives another SLEEP command. I failed to find a reliable interval between ON/SLEEP commands that can make it work, so let's simply disable runtime PM for the device. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-11-19	HID: multitouch: Add pointstick support for Cirque Touchpad	Kai-Heng Feng	2	-0/+9
	Cirque Touchpad/Pointstick combo is similar to Alps devices, it requires MT_CLS_WIN_8_DUAL to expose its pointstick as a mouse. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-11-19	HID: steam: remove input device when a hid client is running.	Rodrigo Rivas Costa	1	-64/+90
	Previously, when a HID client such as the Steam Client was running, this driver disabled its input device to avoid doubling the input events. While it worked mostly fine, some games got confused by the idle gamepad, and switched to two player mode, or asked the user to choose which gamepad to use. Other games just crashed, probably a bug in Unity [1]. With this commit, when a HID client starts, the input device is removed; when the HID client ends the input device is recreated. [1]: https://github.com/ValveSoftware/steam-for-linux/issues/5645 Signed-off-by: Rodrigo Rivas Costa <rodrigorivascosta@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-11-19	XArray tests: Add missing locking	Matthew Wilcox	1	-0/+10
	Lockdep caught me being sloppy in the test suite and failing to lock the XArray appropriately. Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-11-19	dax: Avoid losing wakeup in dax_lock_mapping_entry	Matthew Wilcox	1	-0/+1
	After calling get_unlocked_entry(), you have to call put_unlocked_entry() to avoid subsequent waiters losing wakeups. Fixes: c2a7d2a11552 ("filesystem-dax: Introduce dax_lock_mapping_entry()") Cc: stable@vger.kernel.org Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-11-19	Revert "HID: uhid: use strlcpy() instead of strncpy()"	David Herrmann	1	-6/+7
	This reverts commit 336fd4f5f25157e9e8bd50e898a1bbcd99eaea46. Please note that `strlcpy()` does NOT do what you think it does. strlcpy() ALWAYS reads the full input string, regardless of the 'length' parameter. That is, if the input is not zero-terminated, strlcpy() will READ beyond input boundaries. It does this, because it always returns the size it would copy if the target was big enough, not the truncated size it actually copied. The original code was perfectly fine. The hid device is zero-initialized and the strncpy() functions copied up to n-1 characters. The result is always zero-terminated this way. This is the third time someone tried to replace strncpy with strlcpy in this function, and gets it wrong. I now added a comment that should at least make people reconsider. Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-11-19	HID: uhid: forbid UHID_CREATE under KERNEL_DS or elevated privileges	Eric Biggers	1	-0/+12
	When a UHID_CREATE command is written to the uhid char device, a copy_from_user() is done from a user pointer embedded in the command. When the address limit is KERNEL_DS, e.g. as is the case during sys_sendfile(), this can read from kernel memory. Alternatively, information can be leaked from a setuid binary that is tricked to write to the file descriptor. Therefore, forbid UHID_CREATE in these cases. No other commands in uhid_char_write() are affected by this bug and UHID_CREATE is marked as "obsolete", so apply the restriction to UHID_CREATE only rather than to uhid_char_write() entirely. Thanks to Dmitry Vyukov for adding uhid definitions to syzkaller and to Jann Horn for commit 9da3f2b740544 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses"), allowing this bug to be found. Reported-by: syzbot+72473edc9bf4eb1c6556@syzkaller.appspotmail.com Fixes: d365c6cfd337 ("HID: uhid: add UHID_CREATE and UHID_DESTROY events") Cc: <stable@vger.kernel.org> # v3.6+ Cc: Jann Horn <jannh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Jann Horn <jannh@google.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-11-19	mmc: sdhci-pci: Workaround GLK firmware failing to restore the tuning value	Adrian Hunter	1	-2/+77
	GLK firmware can indicate that the tuning value will be restored after runtime suspend, but not actually do that. Add a workaround that detects such cases, and lets the driver do re-tuning instead. Reported-by: Anisse Astier <anisse@astier.eu> Tested-by: Anisse Astier <anisse@astier.eu> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2018-11-19	drm/i915: Disable LP3 watermarks on all SNB machines	Ville Syrjälä	1	-1/+40
	I have a Thinkpad X220 Tablet in my hands that is losing vblank interrupts whenever LP3 watermarks are used. If I nudge the latency value written to the WM3 register just by one in either direction the problem disappears. That to me suggests that the punit will not enter the corrsponding powersave mode (MPLL shutdown IIRC) unless the latency value in the register matches exactly what we read from SSKPD. Ie. it's not really a latency value but rather just a cookie by which the punit can identify the desired power saving state. On HSW/BDW this was changed such that we actually just write the WM level number into those bits, which makes much more sense given the observed behaviour. We could try to handle this by disallowing LP3 watermarks only when vblank interrupts are enabled but we'd first have to prove that only vblank interrupts are affected, which seems unlikely. Also we can't grab the wm mutex from the vblank enable/disable hooks because those are called with various spinlocks held. Thus we'd have to redesigne the watermark locking. So to play it safe and keep the code simple we simply disable LP3 watermarks on all SNB machines. To do that we simply zero out the latency values for watermark level 3, and we adjust the watermark computation to check for that. The behaviour now matches that of the g4x/vlv/skl wm code in the presence of a zeroed latency value. v2: s/USHRT_MAX/U32_MAX/ for consistency with the types (Chris) Cc: stable@vger.kernel.org Cc: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101269 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103713 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181114173440.6730-1-ville.syrjala@linux.intel.com (cherry picked from commit 03981c6ebec4fc7056b9b45f847393aeac90d060) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2018-11-19	ALSA: hda/ca0132 - fix AE-5 pincfg	Connor McAdams	1	-1/+1
	This patch fixes the pincfg assignment for the AE-5, which was previously using the Recon3D pincfg's by mistake. Fixes: d06feaf02fe6 ("ALSA: hda/ca0132 - Add pincfg for AE-5") Signed-off-by: Connor McAdams <conmanx360@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-11-19	ALSA: hda/ca0132 - Add new ZxR quirk	Connor McAdams	1	-0/+1
	This patch adds a new PCI subsys ID for the ZxR, as found and tested by other users. Without a way to know if any Z's use it as well, it keeps the quirk of QUIRK_SBZ and goes through the HDA subsys test function. Signed-off-by: Connor McAdams <conmanx360@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-11-19	mmc: sdhci-pci: Try "cd" for card-detect lookup before using NULL	Rajat Jain	1	-1/+6
	Problem: The card detect IRQ does not work with modern BIOS (that want to use _DSD to provide the card detect GPIO to the driver). Details: The mmc core provides the mmc_gpiod_request_cd() API to let host drivers request the gpio descriptor for the "card detect" pin. This pin is specified in the ACPI for the SDHC device: * Either as a resource using _CRS. This is a method used by legacy BIOS. (The driver needs to tell which resource index). * Or as a named property ("cd-gpios"/"cd-gpio") in _DSD (which internally points to an entry in _CRS). This way, the driver can lookup using a string. This is what modern BIOS prefer to use. This API finally results in a call to the following code: struct gpio_desc acpi_find_gpio(..., const char con_id,...) { ... /* Lookup gpio (using "<con_id>-gpio") in the _DSD / ... if (!acpi_can_fallback_to_crs(adev, con_id)) return ERR_PTR(-ENOENT); ... / Falling back to _CRS is allowed, Lookup gpio in the _CRS / ... } Note that this means that if the ACPI has _DSD properties, the kernel will never use _CRS for the lookup (Because acpi_can_fallback_to_crs() will always be false for any device hat has _DSD entries). The SDHCI driver is thus currently broken on a modern BIOS, even if BIOS provides both _CRS (for index based lookup) and _DSD entries (for string based lookup). Ironically, none of these will be used for the lookup currently because: Since the con_id is NULL, acpi_find_gpio() does not find a matching entry in DSDT. (The _DSDT entry has the property name = "cd-gpios") * Because ACPI contains DSDT entries, thus acpi_can_fallback_to_crs() returns false (because device properties have been populated from _DSD), thus the _CRS is never used for the lookup. Fix: Try "cd" for lookup in the _DSD before falling back to using NULL so as to try looking up in the _CRS. I've tested this patch successfully with both Legacy BIOS (that provide only _CRS method) as well as modern BIOS (that provide both _CRS and _DSD). Also the use of "cd" appears to be fairly consistent across other users of this API (other MMC host controller drivers). Link: https://lkml.org/lkml/2018/9/25/1113 Signed-off-by: Rajat Jain <rajatja@google.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: f10e4bf6632b ("gpio: acpi: Even more tighten up ACPI GPIO lookups") Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2018-11-19	exec: make de_thread() freezable	Chanho Min	1	-2/+3
	Suspend fails due to the exec family of functions blocking the freezer. The casue is that de_thread() sleeps in TASK_UNINTERRUPTIBLE waiting for all sub-threads to die, and we have the deadlock if one of them is frozen. This also can occur with the schedule() waiting for the group thread leader to exit if it is frozen. In our machine, it causes freeze timeout as bellows. Freezing of tasks failed after 20.010 seconds (1 tasks refusing to freeze, wq_busy=0): setcpushares-ls D ffffffc00008ed70 0 5817 1483 0x0040000d Call trace: [<ffffffc00008ed70>] __switch_to+0x88/0xa0 [<ffffffc000d1c30c>] __schedule+0x1bc/0x720 [<ffffffc000d1ca90>] schedule+0x40/0xa8 [<ffffffc0001cd784>] flush_old_exec+0xdc/0x640 [<ffffffc000220360>] load_elf_binary+0x2a8/0x1090 [<ffffffc0001ccff4>] search_binary_handler+0x9c/0x240 [<ffffffc00021c584>] load_script+0x20c/0x228 [<ffffffc0001ccff4>] search_binary_handler+0x9c/0x240 [<ffffffc0001ce8e0>] do_execveat_common.isra.14+0x4f8/0x6e8 [<ffffffc0001cedd0>] compat_SyS_execve+0x38/0x48 [<ffffffc00008de30>] el0_svc_naked+0x24/0x28 To fix this, make de_thread() freezable. It looks safe and works fine. Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Chanho Min <chanho.min@lge.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-11-19	cpufreq: ti-cpufreq: Only register platform_device when supported	Dave Gerlach	1	-5/+21
	Currently the ti-cpufreq driver blindly registers a 'ti-cpufreq' to force the driver to probe on any platforms where the driver is built in. However, this should only happen on platforms that actually can make use of the driver. There is already functionality in place to match the SoC compatible so let's factor this out into a separate call and make sure we find a match before creating the ti-cpufreq platform device. Reviewed-by: Johan Hovold <johan@kernel.org> Signed-off-by: Dave Gerlach <d-gerlach@ti.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-11-19	Merge branch 'opp/fixes-for-4.20' of ↵	Rafael J. Wysocki	1	-1/+4
	git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Pull operating performance points (OPP) framework fixes for 4.20 from Viresh Kumar. * 'opp/fixes-for-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: opp: ti-opp-supply: Correct the supply in _get_optimal_vdd_voltage call opp: ti-opp-supply: Dynamically update u_volt_min
2018-11-18	tuntap: fix multiqueue rx	Matthew Cover	1	-1/+6
	When writing packets to a descriptor associated with a combined queue, the packets should end up on that queue. Before this change all packets written to any descriptor associated with a tap interface end up on rx-0, even when the descriptor is associated with a different queue. The rx traffic can be generated by either of the following. 1. a simple tap program which spins up multiple queues and writes packets to each of the file descriptors 2. tx from a qemu vm with a tap multiqueue netdev The queue for rx traffic can be observed by either of the following (done on the hypervisor in the qemu case). 1. a simple netmap program which opens and reads from per-queue descriptors 2. configuring RPS and doing per-cpu captures with rxtxcpu Alternatively, if you printk() the return value of skb_get_rx_queue() just before each instance of netif_receive_skb() in tun.c, you will get 65535 for every skb. Calling skb_record_rx_queue() to set the rx queue to the queue_index fixes the association between descriptor and rx queue. Signed-off-by: Matthew Cover <matthew.cover@stackpath.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-18	ipv6: Fix PMTU updates for UDP/raw sockets in presence of VRF	David Ahern	1	-2/+5
	Preethi reported that PMTU discovery for UDP/raw applications is not working in the presence of VRF when the socket is not bound to a device. The problem is that ip6_sk_update_pmtu does not consider the L3 domain of the skb device if the socket is not bound. Update the function to set oif to the L3 master device if relevant. Fixes: ca254490c8df ("net: Add VRF support to IPv6 stack") Reported-by: Preethi Ramachandra <preethir@juniper.net> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19	drm/ast: Remove existing framebuffers before loading driver	Thomas Zimmermann	1	-0/+21
	If vesafb attaches to the AST device, it configures the framebuffer memory for uncached access by default. When ast.ko later tries to attach itself to the device, it wants to use write-combining on the framebuffer memory, but vesefb's existing configuration for uncached access takes precedence. This results in reduced performance. Removing the framebuffer's configuration before loding the AST driver fixes the problem. Other DRM drivers already contain equivalent code. Link: https://bugzilla.opensuse.org/show_bug.cgi?id=1112963 Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Cc: <stable@vger.kernel.org> Tested-by: Y.C. Chen <yc_chen@aspeedtech.com> Reviewed-by: Jean Delvare <jdelvare@suse.de> Tested-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-11-18	Linux 4.20-rc3	Linus Torvalds	1	-1/+1

2018-11-18	Merge tag 'libnvdimm-fixes-4.20-rc3' of ↵	Linus Torvalds	2	-18/+9
	git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "A small batch of fixes for v4.20-rc3. The overflow continuation fix addresses something that has been broken for several releases. Arguably it could wait even longer, but it's a one line fix and this finishes the last of the known address range scrub bug reports. The revert addresses a lockdep regression. The unit tests are not critical to fix, but no reason to hold this fix back. Summary: - Address Range Scrub overflow continuation handling has been broken since it was initially merged. It was only recently that error injection and platform-BIOS support enabled this corner case to be exercised. - The recent attempt to provide more isolation for the kernel Address Range Scrub state machine from userapace initiated sessions triggers a lockdep report. Revert and try again at the next merge window. - Fix a kasan reported buffer overflow in libnvdimm unit test infrastrucutre (nfit_test)" * tag 'libnvdimm-fixes-4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: Revert "acpi, nfit: Further restrict userspace ARS start requests" acpi, nfit: Fix ARS overflow continuation tools/testing/nvdimm: Fix the array size for dimm devices.
2018-11-18	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds	17	-97/+169
	Merge misc fixes from Andrew Morton: "16 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/memblock.c: fix a typo in __next_mem_pfn_range() comments mm, page_alloc: check for max order in hot path scripts/spdxcheck.py: make python3 compliant tmpfs: make lseek(SEEK_DATA/SEK_HOLE) return ENXIO with a negative offset lib/ubsan.c: don't mark __ubsan_handle_builtin_unreachable as noreturn mm/vmstat.c: fix NUMA statistics updates mm/gup.c: fix follow_page_mask() kerneldoc comment ocfs2: free up write context when direct IO failed scripts/faddr2line: fix location of start_kernel in comment mm: don't reclaim inodes with many attached pages mm, memory_hotplug: check zone_movable in has_unmovable_pages mm/swapfile.c: use kvzalloc for swap_info_struct allocation MAINTAINERS: update OMAP MMC entry hugetlbfs: fix kernel BUG at fs/hugetlbfs/inode.c:444! kernel/sched/psi.c: simplify cgroup_move_task() z3fold: fix possible reclaim races
2018-11-18	Merge branch 'sched-urgent-for-linus' of ↵	Linus Torvalds	1	-14/+48
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Ingo Molnar: "Fix an exec() related scalability/performance regression, which was caused by incorrectly calculating load and migrating tasks on exec() when they shouldn't be" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Fix cpu_util_wake() for 'execl' type workloads
2018-11-18	Merge branch 'perf-urgent-for-linus' of ↵	Linus Torvalds	2	-10/+144
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Fix uncore PMU enumeration for CofeeLake CPUs" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Support CoffeeLake 8th CBOX perf/x86/intel/uncore: Add more IMC PCI IDs for KabyLake and CoffeeLake CPUs
2018-11-18	Merge branch 'efi-urgent-for-linus' of ↵	Linus Torvalds	9	-14/+47
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fixes from Ingo Molnar: "Misc fixes: two warning splat fixes, a leak fix and persistent memory allocation fixes for ARM" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi: Permit calling efi_mem_reserve_persistent() from atomic context efi/arm: Defer persistent reservations until after paging_init() efi/arm/libstub: Pack FDT after populating it efi/arm: Revert deferred unmap of early memmap mapping efi: Fix debugobjects warning on 'efi_rts_work'
2018-11-18	Merge branch 'spectre' of git://git.armlinux.org.uk/~rmk/linux-arm	Linus Torvalds	7	-47/+113
	Pull ARM spectre updates from Russell King: "These are the currently known final bits that resolve the Spectre issues. big.Little systems used to be sufficiently identical in that there were no differences between individual CPUs in the system that mattered to the kernel. With the advent of the Spectre problem, the CPUs now have differences in how the workaround is applied. As a result of previous Spectre patches, these systems ended up reporting quite a lot of: "CPUx: Spectre v2: incorrect context switching function, system vulnerable" messages due to the action of the big.Little switcher causing the CPUs to be re-initialised regularly. This series resolves that issue by making the CPU vtable unique to each CPU. However, since this is used very early, before per-cpu is setup, per-cpu can't be used. We also have a problem that two of the methods are not called from preempt-safe paths, but thankfully these remain identical between all CPUs in the system. To make sure, we validate that these are identical during boot" * 'spectre' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: spectre-v2: per-CPU vtables to work around big.Little systems ARM: add PROC_VTABLE and PROC_TABLE macros ARM: clean up per-processor check_bugs method call ARM: split out processor lookup ARM: make lookup_processor_type() non-__init
2018-11-18	mm/memblock.c: fix a typo in __next_mem_pfn_range() comments	Chen Chang	1	-1/+1
	Link: http://lkml.kernel.org/r/20181107100247.13359-1-rainccrun@gmail.com Signed-off-by: Chen Chang <rainccrun@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-18	mm, page_alloc: check for max order in hot path	Michal Hocko	1	-11/+9
	Konstantin has noticed that kvmalloc might trigger the following warning: WARNING: CPU: 0 PID: 6676 at mm/vmstat.c:986 __fragmentation_index+0x54/0x60 [...] Call Trace: fragmentation_index+0x76/0x90 compaction_suitable+0x4f/0xf0 shrink_node+0x295/0x310 node_reclaim+0x205/0x250 get_page_from_freelist+0x649/0xad0 __alloc_pages_nodemask+0x12a/0x2a0 kmalloc_large_node+0x47/0x90 __kmalloc_node+0x22b/0x2e0 kvmalloc_node+0x3e/0x70 xt_alloc_table_info+0x3a/0x80 [x_tables] do_ip6t_set_ctl+0xcd/0x1c0 [ip6_tables] nf_setsockopt+0x44/0x60 SyS_setsockopt+0x6f/0xc0 do_syscall_64+0x67/0x120 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 the problem is that we only check for an out of bound order in the slow path and the node reclaim might happen from the fast path already. This is fixable by making sure that kvmalloc doesn't ever use kmalloc for requests that are larger than KMALLOC_MAX_SIZE but this also shows that the code is rather fragile. A recent UBSAN report just underlines that by the following report UBSAN: Undefined behaviour in mm/page_alloc.c:3117:19 shift exponent 51 is too large for 32-bit type 'int' CPU: 0 PID: 6520 Comm: syz-executor1 Not tainted 4.19.0-rc2 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xd2/0x148 lib/dump_stack.c:113 ubsan_epilogue+0x12/0x94 lib/ubsan.c:159 __ubsan_handle_shift_out_of_bounds+0x2b6/0x30b lib/ubsan.c:425 __zone_watermark_ok+0x2c7/0x400 mm/page_alloc.c:3117 zone_watermark_fast mm/page_alloc.c:3216 [inline] get_page_from_freelist+0xc49/0x44c0 mm/page_alloc.c:3300 __alloc_pages_nodemask+0x21e/0x640 mm/page_alloc.c:4370 alloc_pages_current+0xcc/0x210 mm/mempolicy.c:2093 alloc_pages include/linux/gfp.h:509 [inline] __get_free_pages+0x12/0x60 mm/page_alloc.c:4414 dma_mem_alloc+0x36/0x50 arch/x86/include/asm/floppy.h:156 raw_cmd_copyin drivers/block/floppy.c:3159 [inline] raw_cmd_ioctl drivers/block/floppy.c:3206 [inline] fd_locked_ioctl+0xa00/0x2c10 drivers/block/floppy.c:3544 fd_ioctl+0x40/0x60 drivers/block/floppy.c:3571 __blkdev_driver_ioctl block/ioctl.c:303 [inline] blkdev_ioctl+0xb3c/0x1a30 block/ioctl.c:601 block_ioctl+0x105/0x150 fs/block_dev.c:1883 vfs_ioctl fs/ioctl.c:46 [inline] do_vfs_ioctl+0x1c0/0x1150 fs/ioctl.c:687 ksys_ioctl+0x9e/0xb0 fs/ioctl.c:702 __do_sys_ioctl fs/ioctl.c:709 [inline] __se_sys_ioctl fs/ioctl.c:707 [inline] __x64_sys_ioctl+0x7e/0xc0 fs/ioctl.c:707 do_syscall_64+0xc4/0x510 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Note that this is not a kvmalloc path. It is just that the fast path really depends on having sanitzed order as well. Therefore move the order check to the fast path. Link: http://lkml.kernel.org/r/20181113094305.GM15120@dhcp22.suse.cz Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Reported-by: Kyungtae Kim <kt0755@gmail.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Pavel Tatashin <pavel.tatashin@microsoft.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Aaron Lu <aaron.lu@intel.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Byoungyoung Lee <lifeasageek@gmail.com> Cc: "Dae R. Jeong" <threeearcat@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-18	scripts/spdxcheck.py: make python3 compliant	Uwe Kleine-König	1	-1/+0
	Without this change the following happens when using Python3 (3.6.6): $ echo "GPL-2.0" \| python3 scripts/spdxcheck.py - FAIL: 'str' object has no attribute 'decode' Traceback (most recent call last): File "scripts/spdxcheck.py", line 253, in <module> parser.parse_lines(sys.stdin, args.maxlines, '-') File "scripts/spdxcheck.py", line 171, in parse_lines line = line.decode(locale.getpreferredencoding(False), errors='ignore') AttributeError: 'str' object has no attribute 'decode' So as the line is already a string, there is no need to decode it and the line can be dropped. /usr/bin/python on Arch is Python 3. So this would indeed be worth going into 4.19. Link: http://lkml.kernel.org/r/20181023070802.22558-1-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Joe Perches <joe@perches.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-18	tmpfs: make lseek(SEEK_DATA/SEK_HOLE) return ENXIO with a negative offset	Yufen Yu	1	-3/+1
	Other filesystems such as ext4, f2fs and ubifs all return ENXIO when lseek (SEEK_DATA or SEEK_HOLE) requests a negative offset. man 2 lseek says : EINVAL whence is not valid. Or: the resulting file offset would be : negative, or beyond the end of a seekable device. : : ENXIO whence is SEEK_DATA or SEEK_HOLE, and the file offset is beyond : the end of the file. Make tmpfs return ENXIO under these circumstances as well. After this, tmpfs also passes xfstests's generic/448. [akpm@linux-foundation.org: rewrite changelog] Link: http://lkml.kernel.org/r/1540434176-14349-1-git-send-email-yuyufen@huawei.com Signed-off-by: Yufen Yu <yuyufen@huawei.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Hugh Dickins <hughd@google.com> Cc: William Kucharski <william.kucharski@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-18	lib/ubsan.c: don't mark __ubsan_handle_builtin_unreachable as noreturn	Arnd Bergmann	1	-2/+1
	gcc-8 complains about the prototype for this function: lib/ubsan.c:432:1: error: ignoring attribute 'noreturn' in declaration of a built-in function '__ubsan_handle_builtin_unreachable' because it conflicts with attribute 'const' [-Werror=attributes] This is actually a GCC's bug. In GCC internals __ubsan_handle_builtin_unreachable() declared with both 'noreturn' and 'const' attributes instead of only 'noreturn': https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84210 Workaround this by removing the noreturn attribute. [aryabinin: add information about GCC bug in changelog] Link: http://lkml.kernel.org/r/20181107144516.4587-1-aryabinin@virtuozzo.com Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Olof Johansson <olof@lixom.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-18	mm/vmstat.c: fix NUMA statistics updates	Janne Huttunen	1	-3/+4
	Scan through the whole array to see if an update is needed. While we're at it, use sizeof() to be safe against any possible type changes in the future. The bug here is that we wouldn't sync per-cpu counters into global ones if there was an update of numa_stats for higher cpus. Highly theoretical one though because it is much more probable that zone_stats are updated so we would refresh anyway. So I wouldn't bother to mark this for stable, yet something nice to fix. [mhocko@suse.com: changelog enhancement] Link: http://lkml.kernel.org/r/1541601517-17282-1-git-send-email-janne.huttunen@nokia.com Fixes: 1d90ca897cb0 ("mm: update NUMA counter threshold size") Signed-off-by: Janne Huttunen <janne.huttunen@nokia.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-18	mm/gup.c: fix follow_page_mask() kerneldoc comment	Mike Rapoport	1	-2/+8
	Commit df06b37ffe5a ("mm/gup: cache dev_pagemap while pinning pages") modified the signature of follow_page_mask() but left the parameter description behind. Update the description to make the code and comments agree again. While at it, update formatting of the return value description to match Documentation/doc-guide/kernel-doc.rst guidelines. Link: http://lkml.kernel.org/r/1541603316-27832-1-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-18	ocfs2: free up write context when direct IO failed	Wengang Wang	2	-2/+19
	The write context should also be freed even when direct IO failed. Otherwise a memory leak is introduced and entries remain in oi->ip_unwritten_list causing the following BUG later in unlink path: ERROR: bug expression: !list_empty(&oi->ip_unwritten_list) ERROR: Clear inode of 215043, inode has unwritten extents ... Call Trace: ? __set_current_blocked+0x42/0x68 ocfs2_evict_inode+0x91/0x6a0 [ocfs2] ? bit_waitqueue+0x40/0x33 evict+0xdb/0x1af iput+0x1a2/0x1f7 do_unlinkat+0x194/0x28f SyS_unlinkat+0x1b/0x2f do_syscall_64+0x79/0x1ae entry_SYSCALL_64_after_hwframe+0x151/0x0 This patch also logs, with frequency limit, direct IO failures. Link: http://lkml.kernel.org/r/20181102170632.25921-1-wen.gang.wang@oracle.com Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Changwei Ge <ge.changwei@h3c.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-18	scripts/faddr2line: fix location of start_kernel in comment	Randy Dunlap	1	-1/+1
	Fix a source file reference location to the correct path name. Link: http://lkml.kernel.org/r/1d50bd3d-178e-dcd8-779f-9711887440eb@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-18	mm: don't reclaim inodes with many attached pages	Roman Gushchin	1	-2/+5
	Spock reported that commit 172b06c32b94 ("mm: slowly shrink slabs with a relatively small number of objects") leads to a regression on his setup: periodically the majority of the pagecache is evicted without an obvious reason, while before the change the amount of free memory was balancing around the watermark. The reason behind is that the mentioned above change created some minimal background pressure on the inode cache. The problem is that if an inode is considered to be reclaimed, all belonging pagecache page are stripped, no matter how many of them are there. So, if a huge multi-gigabyte file is cached in the memory, and the goal is to reclaim only few slab objects (unused inodes), we still can eventually evict all gigabytes of the pagecache at once. The workload described by Spock has few large non-mapped files in the pagecache, so it's especially noticeable. To solve the problem let's postpone the reclaim of inodes, which have more than 1 attached page. Let's wait until the pagecache pages will be evicted naturally by scanning the corresponding LRU lists, and only then reclaim the inode structure. Link: http://lkml.kernel.org/r/20181023164302.20436-1-guro@fb.com Signed-off-by: Roman Gushchin <guro@fb.com> Reported-by: Spock <dairinin@gmail.com> Tested-by: Spock <dairinin@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Rik van Riel <riel@surriel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: <stable@vger.kernel.org> [4.19.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-18	mm, memory_hotplug: check zone_movable in has_unmovable_pages	Michal Hocko	1	-0/+8
	Page state checks are racy. Under a heavy memory workload (e.g. stress -m 200 -t 2h) it is quite easy to hit a race window when the page is allocated but its state is not fully populated yet. A debugging patch to dump the struct page state shows has_unmovable_pages: pfn:0x10dfec00, found:0x1, count:0x0 page:ffffea0437fb0000 count:1 mapcount:1 mapping:ffff880e05239841 index:0x7f26e5000 compound_mapcount: 1 flags: 0x5fffffc0090034(uptodate\|lru\|active\|head\|swapbacked) Note that the state has been checked for both PageLRU and PageSwapBacked already. Closing this race completely would require some sort of retry logic. This can be tricky and error prone (think of potential endless or long taking loops). Workaround this problem for movable zones at least. Such a zone should only contain movable pages. Commit 15c30bc09085 ("mm, memory_hotplug: make has_unmovable_pages more robust") has told us that this is not strictly true though. Bootmem pages should be marked reserved though so we can move the original check after the PageReserved check. Pages from other zones are still prone to races but we even do not pretend that memory hotremove works for those so pre-mature failure doesn't hurt that much. Link: http://lkml.kernel.org/r/20181106095524.14629-1-mhocko@kernel.org Fixes: 15c30bc09085 ("mm, memory_hotplug: make has_unmovable_pages more robust") Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Baoquan He <bhe@redhat.com> Tested-by: Baoquan He <bhe@redhat.com> Acked-by: Baoquan He <bhe@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-18	mm/swapfile.c: use kvzalloc for swap_info_struct allocation	Vasily Averin	1	-3/+3
	Commit a2468cc9bfdf ("swap: choose swap device according to numa node") changed 'avail_lists' field of 'struct swap_info_struct' to an array. In popular linux distros it increased size of swap_info_struct up to 40 Kbytes and now swap_info_struct allocation requires order-4 page. Switch to kvzmalloc allows to avoid unexpected allocation failures. Link: http://lkml.kernel.org/r/fc23172d-3c75-21e2-d551-8b1808cbe593@virtuozzo.com Fixes: a2468cc9bfdf ("swap: choose swap device according to numa node") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Acked-by: Aaron Lu <aaron.lu@intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Huang Ying <ying.huang@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-18	MAINTAINERS: update OMAP MMC entry	Aaro Koskinen	2	-2/+6
	Jarkko's e-mail address hasn't worked for a long time. We still want to keep this driver working as it is critical for some of the OMAP boards. I use and test this driver frequently, so change myself as a maintainer with "Odd Fixes" status. Link: http://lkml.kernel.org/r/20181106222750.12939-1-aaro.koskinen@iki.fi Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-18	hugetlbfs: fix kernel BUG at fs/hugetlbfs/inode.c:444!	Mike Kravetz	1	-4/+19
	This bug has been experienced several times by the Oracle DB team. The BUG is in remove_inode_hugepages() as follows: /* * If page is mapped, it was faulted in after being * unmapped in caller. Unmap (again) now after taking * the fault mutex. The mutex will prevent faults * until we finish removing the page. * * This race can only happen in the hole punch case. * Getting here in a truncate operation is a bug. / if (unlikely(page_mapped(page))) { BUG_ON(truncate_op); In this case, the elevated map count is not the result of a race. Rather it was incorrectly incremented as the result of a bug in the huge pmd sharing code. Consider the following: - Process A maps a hugetlbfs file of sufficient size and alignment (PUD_SIZE) that a pmd page could be shared. - Process B maps the same hugetlbfs file with the same size and alignment such that a pmd page is shared. - Process B then calls mprotect() to change protections for the mapping with the shared pmd. As a result, the pmd is 'unshared'. - Process B then calls mprotect() again to chage protections for the mapping back to their original value. pmd remains unshared. - Process B then forks and process C is created. During the fork process, we do dup_mm -> dup_mmap -> copy_page_range to copy page tables. Copying page tables for hugetlb mappings is done in the routine copy_hugetlb_page_range. In copy_hugetlb_page_range(), the destination pte is obtained by: dst_pte = huge_pte_alloc(dst, addr, sz); If pmd sharing is possible, the returned pointer will be to a pte in an existing page table. In the situation above, process C could share with either process A or process B. Since process A is first in the list, the returned pte is a pointer to a pte in process A's page table. However, the check for pmd sharing in copy_hugetlb_page_range is: / If the pagetables are shared don't copy or take references */ if (dst_pte == src_pte) continue; Since process C is sharing with process A instead of process B, the above test fails. The code in copy_hugetlb_page_range which follows assumes dst_pte points to a huge_pte_none pte. It copies the pte entry from src_pte to dst_pte and increments this map count of the associated page. This is how we end up with an elevated map count. To solve, check the dst_pte entry for huge_pte_none. If !none, this implies PMD sharing so do not copy. Link: http://lkml.kernel.org/r/20181105212315.14125-1-mike.kravetz@oracle.com Fixes: c5c99429fa57 ("fix hugepages leak due to pagetable page sharing") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Prakash Sangappa <prakash.sangappa@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-18	kernel/sched/psi.c: simplify cgroup_move_task()	Olof Johansson	1	-21/+22
	The existing code triggered an invalid warning about 'rq' possibly being used uninitialized. Instead of doing the silly warning suppression by initializa it to NULL, refactor the code to bail out early instead. Warning was: kernel/sched/psi.c: In function `cgroup_move_task': kernel/sched/psi.c:639:13: warning: `rq' may be used uninitialized in this function [-Wmaybe-uninitialized] Link: http://lkml.kernel.org/r/20181103183339.8669-1-olof@lixom.net Fixes: 2ce7135adc9ad ("psi: cgroup support") Signed-off-by: Olof Johansson <olof@lixom.net> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-18	z3fold: fix possible reclaim races	Vitaly Wool	1	-39/+62
	Reclaim and free can race on an object which is basically fine but in order for reclaim to be able to map "freed" object we need to encode object length in the handle. handle_to_chunks() is then introduced to extract object length from a handle and use it during mapping. Moreover, to avoid racing on a z3fold "headless" page release, we should not try to free that page in z3fold_free() if the reclaim bit is set. Also, in the unlikely case of trying to reclaim a page being freed, we should not proceed with that page. While at it, fix the page accounting in reclaim function. This patch supersedes "[PATCH] z3fold: fix reclaim lock-ups". Link: http://lkml.kernel.org/r/20181105162225.74e8837d03583a9b707cf559@gmail.com Signed-off-by: Vitaly Wool <vitaly.vul@sony.com> Signed-off-by: Jongseok Kim <ks77sj@gmail.com> Reported-by-by: Jongseok Kim <ks77sj@gmail.com> Reviewed-by: Snild Dolkow <snild@sony.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-18	mtd: rawnand: qcom: Namespace prefix some commands	Olof Johansson	1	-16/+16
	PAGE_READ is used by RISC-V arch code included through mm headers, and it makes sense to bring in a prefix on these in the driver. drivers/mtd/nand/raw/qcom_nandc.c:153: warning: "PAGE_READ" redefined #define PAGE_READ 0x2 In file included from include/linux/memremap.h:7, from include/linux/mm.h:27, from include/linux/scatterlist.h:8, from include/linux/dma-mapping.h:11, from drivers/mtd/nand/raw/qcom_nandc.c:17: arch/riscv/include/asm/pgtable.h:48: note: this is the location of the previous definition Caught by riscv allmodconfig. Signed-off-by: Olof Johansson <olof@lixom.net> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-11-18	mtd: rawnand: atmel: fix OF child-node lookup	Johan Hovold	1	-4/+7
	Use the new of_get_compatible_child() helper to lookup the nfc child node instead of using of_find_compatible_node(), which searches the entire tree from a given start node and thus can return an unrelated (i.e. non-child) node. This also addresses a potential use-after-free (e.g. after probe deferral) as the tree-wide helper drops a reference to its first argument (i.e. the node of the device being probed). While at it, also fix a related nfc-node reference leak. Fixes: f88fc122cc34 ("mtd: nand: Cleanup/rework the atmel_nand driver") Cc: stable <stable@vger.kernel.org> # 4.11 Cc: Nicolas Ferre <nicolas.ferre@microchip.com> Cc: Josh Wu <rainyfeeling@outlook.com> Cc: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-11-17	tipc: don't assume linear buffer when reading ancillary data	Jon Maloy	1	-4/+11
	The code for reading ancillary data from a received buffer is assuming the buffer is linear. To make this assumption true we have to linearize the buffer before message data is read. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	tipc: fix lockdep warning when reinitilaizing sockets	Jon Maloy	3	-18/+48
	We get the following warning: [ 47.926140] 32-bit node address hash set to 2010a0a [ 47.927202] [ 47.927433] ================================ [ 47.928050] WARNING: inconsistent lock state [ 47.928661] 4.19.0+ #37 Tainted: G E [ 47.929346] -------------------------------- [ 47.929954] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. [ 47.930116] swapper/3/0 [HC0[0]:SC1[3]:HE1:SE0] takes: [ 47.930116] 00000000af8bc31e (&(&ht->lock)->rlock){+.?.}, at: rhashtable_walk_enter+0x36/0xb0 [ 47.930116] {SOFTIRQ-ON-W} state was registered at: [ 47.930116] _raw_spin_lock+0x29/0x60 [ 47.930116] rht_deferred_worker+0x556/0x810 [ 47.930116] process_one_work+0x1f5/0x540 [ 47.930116] worker_thread+0x64/0x3e0 [ 47.930116] kthread+0x112/0x150 [ 47.930116] ret_from_fork+0x3a/0x50 [ 47.930116] irq event stamp: 14044 [ 47.930116] hardirqs last enabled at (14044): [<ffffffff9a07fbba>] __local_bh_enable_ip+0x7a/0xf0 [ 47.938117] hardirqs last disabled at (14043): [<ffffffff9a07fb81>] __local_bh_enable_ip+0x41/0xf0 [ 47.938117] softirqs last enabled at (14028): [<ffffffff9a0803ee>] irq_enter+0x5e/0x60 [ 47.938117] softirqs last disabled at (14029): [<ffffffff9a0804a5>] irq_exit+0xb5/0xc0 [ 47.938117] [ 47.938117] other info that might help us debug this: [ 47.938117] Possible unsafe locking scenario: [ 47.938117] [ 47.938117] CPU0 [ 47.938117] ---- [ 47.938117] lock(&(&ht->lock)->rlock); [ 47.938117] <Interrupt> [ 47.938117] lock(&(&ht->lock)->rlock); [ 47.938117] [ 47.938117] * DEADLOCK * [ 47.938117] [ 47.938117] 2 locks held by swapper/3/0: [ 47.938117] #0: 0000000062c64f90 ((&d->timer)){+.-.}, at: call_timer_fn+0x5/0x280 [ 47.938117] #1: 00000000ee39619c (&(&d->lock)->rlock){+.-.}, at: tipc_disc_timeout+0xc8/0x540 [tipc] [ 47.938117] [ 47.938117] stack backtrace: [ 47.938117] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G E 4.19.0+ #37 [ 47.938117] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 47.938117] Call Trace: [ 47.938117] <IRQ> [ 47.938117] dump_stack+0x5e/0x8b [ 47.938117] print_usage_bug+0x1ed/0x1ff [ 47.938117] mark_lock+0x5b5/0x630 [ 47.938117] __lock_acquire+0x4c0/0x18f0 [ 47.938117] ? lock_acquire+0xa6/0x180 [ 47.938117] lock_acquire+0xa6/0x180 [ 47.938117] ? rhashtable_walk_enter+0x36/0xb0 [ 47.938117] _raw_spin_lock+0x29/0x60 [ 47.938117] ? rhashtable_walk_enter+0x36/0xb0 [ 47.938117] rhashtable_walk_enter+0x36/0xb0 [ 47.938117] tipc_sk_reinit+0xb0/0x410 [tipc] [ 47.938117] ? mark_held_locks+0x6f/0x90 [ 47.938117] ? __local_bh_enable_ip+0x7a/0xf0 [ 47.938117] ? lockdep_hardirqs_on+0x20/0x1a0 [ 47.938117] tipc_net_finalize+0xbf/0x180 [tipc] [ 47.938117] tipc_disc_timeout+0x509/0x540 [tipc] [ 47.938117] ? call_timer_fn+0x5/0x280 [ 47.938117] ? tipc_disc_msg_xmit.isra.19+0xa0/0xa0 [tipc] [ 47.938117] ? tipc_disc_msg_xmit.isra.19+0xa0/0xa0 [tipc] [ 47.938117] call_timer_fn+0xa1/0x280 [ 47.938117] ? tipc_disc_msg_xmit.isra.19+0xa0/0xa0 [tipc] [ 47.938117] run_timer_softirq+0x1f2/0x4d0 [ 47.938117] __do_softirq+0xfc/0x413 [ 47.938117] irq_exit+0xb5/0xc0 [ 47.938117] smp_apic_timer_interrupt+0xac/0x210 [ 47.938117] apic_timer_interrupt+0xf/0x20 [ 47.938117] </IRQ> [ 47.938117] RIP: 0010:default_idle+0x1c/0x140 [ 47.938117] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 41 54 55 53 65 8b 2d d8 2b 74 65 0f 1f 44 00 00 e8 c6 2c 8b ff fb f4 <65> 8b 2d c5 2b 74 65 0f 1f 44 00 00 5b 5d 41 5c c3 65 8b 05 b4 2b [ 47.938117] RSP: 0018:ffffaf6ac0207ec8 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13 [ 47.938117] RAX: ffff8f5b3735e200 RBX: 0000000000000003 RCX: 0000000000000001 [ 47.938117] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff8f5b3735e200 [ 47.938117] RBP: 0000000000000003 R08: 0000000000000001 R09: 0000000000000000 [ 47.938117] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 47.938117] R13: 0000000000000000 R14: ffff8f5b3735e200 R15: ffff8f5b3735e200 [ 47.938117] ? default_idle+0x1a/0x140 [ 47.938117] do_idle+0x1bc/0x280 [ 47.938117] cpu_startup_entry+0x19/0x20 [ 47.938117] start_secondary+0x187/0x1c0 [ 47.938117] secondary_startup_64+0xa4/0xb0 The reason seems to be that tipc_net_finalize()->tipc_sk_reinit() is calling the function rhashtable_walk_enter() within a timer interrupt. We fix this by executing tipc_net_finalize() in work queue context. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	net-gro: reset skb->pkt_type in napi_reuse_skb()	Eric Dumazet	1	-0/+4
	eth_type_trans() assumes initial value for skb->pkt_type is PACKET_HOST. This is indeed the value right after a fresh skb allocation. However, it is possible that GRO merged a packet with a different value (like PACKET_OTHERHOST in case macvlan is used), so we need to make sure napi->skb will have pkt_type set back to PACKET_HOST. Otherwise, valid packets might be dropped by the stack because their pkt_type is not PACKET_HOST. napi_reuse_skb() was added in commit 96e93eab2033 ("gro: Add internal interfaces for VLAN"), but this bug always has been there. Fixes: 96e93eab2033 ("gro: Add internal interfaces for VLAN") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	Merge branch 'tdc-fixes'	David S. Miller	1	-5/+13
	Lucas Bates says: ==================== Prevent uncaught exceptions in tdc This patch series addresses two potential bugs in tdc that can cause exceptions to be raised in certain circumstances. These exceptions are generally not handled, so instead we will prevent them from being raised. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	tc-testing: tdc.py: Guard against lack of returncode in executed command	Brenda J. Butler	1	-3/+11
	Add some defensive coding in case one of the subprocesses created by tdc returns nothing. If no object is returned from exec_cmd, then tdc will halt with an unhandled exception. Signed-off-by: Brenda J. Butler <bjb@mojatatu.com> Signed-off-by: Lucas Bates <lucasb@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	tc-testing: tdc.py: ignore errors when decoding stdout/stderr	Lucas Bates	1	-2/+2
	Prevent exceptions from being raised while decoding output from an executed command. There is no impact on tdc's execution and the verify command phase would fail the pattern match. Signed-off-by: Lucas Bates <lucasb@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	ip_tunnel: don't force DF when MTU is locked	Sabrina Dubroca	1	-1/+1
	The various types of tunnels running over IPv4 can ask to set the DF bit to do PMTU discovery. However, PMTU discovery is subject to the threshold set by the net.ipv4.route.min_pmtu sysctl, and is also disabled on routes with "mtu lock". In those cases, we shouldn't set the DF bit. This patch makes setting the DF bit conditional on the route's MTU locking state. This issue seems to be older than git history. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	MAINTAINERS: Add entry for CAKE qdisc	Toke Høiland-Jørgensen	1	-0/+6
	We would like the existing community to be kept in the loop for any new developments on CAKE; and I certainly plan to keep maintaining it. Reflect this in MAINTAINERS. Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	net: bridge: fix vlan stats use-after-free on destruction	Nikolay Aleksandrov	2	-1/+9
	Syzbot reported a use-after-free of the global vlan context on port vlan destruction. When I added per-port vlan stats I missed the fact that the global vlan context can be freed before the per-port vlan rcu callback. There're a few different ways to deal with this, I've chosen to add a new private flag that is set only when per-port stats are allocated so we can directly check it on destruction without dereferencing the global context at all. The new field in net_bridge_vlan uses a hole. v2: cosmetic change, move the check to br_process_vlan_info where the other checks are done v3: add change log in the patch, add private (in-kernel only) flags in a hole in net_bridge_vlan struct and use that instead of mixing user-space flags with private flags Fixes: 9163a0fc1f0c ("net: bridge: add support for per-port vlan stats") Reported-by: syzbot+04681da557a0e49a52e5@syzkaller.appspotmail.com Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	socket: do a generic_file_splice_read when proto_ops has no splice_read	Slavomir Kaslev	1	-1/+1
	splice(2) fails with -EINVAL when called reading on a socket with no splice_read set in its proto_ops (such as vsock sockets). Switch this to fallbacks to a generic_file_splice_read instead. Signed-off-by: Slavomir Kaslev <kaslevs@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs	Martin Schiller	1	-5/+5
	Up until commit 7e5fbd1e0700 ("net: mdio-gpio: Convert to use gpiod functions where possible"), the _cansleep variants of the gpio_ API was used. After that commit and the change to gpiod_ API, the _cansleep() was dropped. This then results in WARN_ON() when used with GPIO devices which do sleep. Add back the _cansleep() to avoid this. Fixes: 7e5fbd1e0700 ("net: mdio-gpio: Convert to use gpiod functions where possible") Signed-off-by: Martin Schiller <ms@dev.tdt.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>