aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2023-06-16Merge tag 'drm-fixes-2023-06-17' of git://anongit.freedesktop.org/drm/drmHEADmasterLinus Torvalds18-27/+281
Pull drm fixes from Dave Airlie: "A bunch of misc fixes across the board. amdgpu is the usual bulk with a revert and other fixes, nouveau has a race fix that was causing a UAF that was hard hanging systems, otherwise some qaic, bridge and radeon. amdgpu: - GFX9 preemption fixes - Add missing radeon secondary PCI ID - vblflash fixes - SMU 13 fix - VCN 4.0 fix - Re-enable TOPDOWN flag for large BAR systems to fix regression - eDP fix - PSR hang fix - DPIA fix radeon: - fbdev client warning fix qaic: - leak fix - null ptr deref fix nouveau: - use-after-free caused by fence race fix - runtime pm fix - NULL ptr checks bridge: - ti-sn65dsi86: Avoid possible buffer overflow" * tag 'drm-fixes-2023-06-17' of git://anongit.freedesktop.org/drm/drm: (21 commits) nouveau: fix client work fence deletion race drm/amd/display: limit DPIA link rate to HBR3 drm/amd/display: fix the system hang while disable PSR drm/amd/display: edp do not add non-edid timings Revert "drm/amdgpu: remove TOPDOWN flags when allocating VRAM in large bar system" drm/amdgpu: vcn_4_0 set instance 0 init sched score to 1 drm/radeon: Disable outputs when releasing fbdev client drm/amd/pm: workaround for compute workload type on some skus drm/amd: Tighten permissions on VBIOS flashing attributes drm/amd: Make sure image is written to trigger VBIOS image update flow drm/amdgpu: add missing radeon secondary PCI ID drm/amdgpu: Implement gfx9 patch functions for resubmission drm/amdgpu: Modify indirect buffer packages for resubmission drm/amdgpu: Program gds backup address as zero if no gds allocated drm/nouveau: add nv_encoder pointer check for NULL drm/amdgpu: Reset CP_VMID_PREEMPT after trailing fence signaled drm/nouveau/dp: check for NULL nv_connector->native_mode drm/bridge: ti-sn65dsi86: Avoid possible buffer overflow drm/nouveau: don't detect DSM for non-NVIDIA device accel/qaic: Fix NULL pointer deref in qaic_destroy_drm_device() ...
2023-06-16afs: Fix vlserver probe RTT handlingDavid Howells1-2/+2
In the same spirit as commit ca57f02295f1 ("afs: Fix fileserver probe RTT handling"), don't rule out using a vlserver just because there haven't been enough packets yet to calculate a real rtt. Always set the server's probe rtt from the estimate provided by rxrpc_kernel_get_srtt, which is capped at 1 second. This could lead to EDESTADDRREQ errors when accessing a cell for the first time, even though the vl servers are known and have responded to a probe. Fixes: 1d4adfaf6574 ("rxrpc: Make rxrpc_kernel_get_srtt() indicate validity") Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-afs@lists.infradead.org Link: http://lists.infradead.org/pipermail/linux-afs/2023-June/006746.html Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-17Merge tag 'drm-misc-fixes-2023-06-16' of ↵Dave Airlie4-3/+15
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes drm-misc-fixes maybe in time for v6.4-rc7: - qaic leak and null deref fix. - Fix runtime pm in nouveau. - Fix array overflow in ti-sn65dsi86 pwm chip handling. - Assorted null check fixes in nouveau. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <dev@lankhorst.se> Link: https://patchwork.freedesktop.org/patch/msgid/641eb8a8-fbd7-90ad-0805-310b7fec9344@lankhorst.se
2023-06-16Merge tag 'for-6.4-rc6-tag' of ↵Linus Torvalds3-10/+22
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "Two fixes for NOCOW files, a regression fix in scrub and an assertion fix: - NOCOW fixes: - keep length of iomap direct io request in case of a failure - properly pass mode of extent reference checking, this can break some cases for swapfile - fix error value confusion when scrubbing a stripe - convert assertion to a proper error handling when loading global roots, reported by syzbot" * tag 'for-6.4-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: scrub: fix a return value overwrite in scrub_stripe() btrfs: do not ASSERT() on duplicated global roots btrfs: can_nocow_file_extent should pass down args->strict from callers btrfs: fix iomap_begin length for nocow writes
2023-06-16Merge tag 'block-6.4-2023-06-15' of git://git.kernel.dk/linuxLinus Torvalds1-9/+31
Pull block fix from Jens Axboe: "Just a single fix for blk-cg stats flushing" * tag 'block-6.4-2023-06-15' of git://git.kernel.dk/linux: blk-cgroup: Flush stats before releasing blkcg_gq
2023-06-16Merge tag 'io_uring-6.4-2023-06-15' of git://git.kernel.dk/linuxLinus Torvalds2-2/+13
Pull io_uring fixes from Jens Axboe: "A fix for sendmsg with CMSG, and the followup fix discussed for avoiding touching task->worker_private after the worker has started exiting" * tag 'io_uring-6.4-2023-06-15' of git://git.kernel.dk/linux: io_uring/io-wq: clear current->worker_private on exit io_uring/net: save msghdr->msg_control for retries
2023-06-16Merge tag 'sound-6.4-rc7' of ↵Linus Torvalds6-15/+35
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Just a few small fixes. The only change to the core code is for a minor race in ALSA OSS sequencer, and the rest are all device-specific fixes (regression fixes and a usual quirk)" * tag 'sound-6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: usb-audio: Add quirk flag for HEM devices to enable native DSD playback ALSA: usb-audio: Fix broken resume due to UAC3 power state ALSA: seq: oss: Fix racy open/close of MIDI devices ASoC: tegra: Fix Master Volume Control ALSA: hda/realtek: Add a quirk for Compaq N14JP6 firmware: cs_dsp: Log correct region name in bin error messages
2023-06-16Merge tag 'urgent-rcu.2023.06.11a' of ↵Linus Torvalds1-0/+10
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull RCU fix from Paul McKenney: "This fixes a spinlock-initialization regression in SRCU that causes the SRCU notifier to fail. The fix simply adds the initialization, but introduces a #ifdef because there is no spinlock to initialize for the Tiny SRCU used in !SMP builds. Yes, it would be nice to abstract this somehow in order to hide it in SRCU, but I still don't see a good way of doing this" * tag 'urgent-rcu.2023.06.11a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: notifier: Initialize new struct srcu_usage field
2023-06-16Merge tag 'riscv-for-linus-6.4-rc7' of ↵Linus Torvalds1-0/+18
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fix from Palmer Dabbelt: - A documentation patch describing how we use patchwork * tag 'riscv-for-linus-6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: Documentation: RISC-V: patch-acceptance: mention patchwork's role
2023-06-16Merge tag 'asoc-fix-v6.4-rc6-2' of ↵Takashi Iwai2-2/+6
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v6.4 A couple more fixes for v6.4, one fixing a misleading error log and another stopping us seeing spurious failures setting the master volume on some Tegra systems introduced by a change to how we calculate delay times.
2023-06-16ALSA: usb-audio: Add quirk flag for HEM devices to enable native DSD playbackLukasz Tyl1-0/+2
This commit adds new DEVICE_FLG with QUIRK_FLAG_DSD_RAW and Vendor Id for HEM devices which supports native DSD. Prior to this change Linux kernel was not enabling native DSD playback for HEM devices, and as a result, DSD audio was being converted to PCM "on the fly". HEM devices, when connected to the system, would only play audio in PCM format, even if the source material was in DSD format. With the addition of new VENDOR_FLG in the quircks.c file, the devices are now correctly recognized, and raw DSD data is transmitted to the device, allowing for native DSD playback. Signed-off-by: Lukasz Tyl <ltyl@hem-e.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20230614122524.30271-1-ltyl@hem-e.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-06-16ALSA: usb-audio: Fix broken resume due to UAC3 power stateTakashi Iwai1-0/+4
As reported in the bugzilla below, the PM resume of a UAC3 device may fail due to the incomplete power state change, stuck at D1. The reason is that the driver expects the full D0 power state change only at hw_params, while the normal PCM resume procedure doesn't call hw_params. For fixing the bug, we add the same power state update to D0 at the prepare callback, which is certainly called by the resume procedure. Note that, with this change, the power state change in the hw_params becomes almost redundant, since snd_usb_hw_params() doesn't touch the parameters (at least it tires so). But dropping it is still a bit risky (e.g. we have the media-driver binding), so I leave the D0 power state change in snd_usb_hw_params() as is for now. Fixes: a0a4959eb4e9 ("ALSA: usb-audio: Operate UAC3 Power Domains in PCM callbacks") Cc: <stable@vger.kernel.org> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217539 Link: https://lore.kernel.org/r/20230612132818.29486-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-06-16ALSA: seq: oss: Fix racy open/close of MIDI devicesTakashi Iwai1-13/+22
Although snd_seq_oss_midi_open() and snd_seq_oss_midi_close() can be called concurrently from different code paths, we have no proper data protection against races. Introduce open_mutex to each seq_oss_midi object for avoiding the races. Reported-by: "Gong, Sishuai" <sishuai@purdue.edu> Closes: https://lore.kernel.org/r/7DC9AF71-F481-4ABA-955F-76C535661E33@purdue.edu Link: https://lore.kernel.org/r/20230612125533.27461-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-06-15Merge tag 'net-6.4-rc7' of ↵Linus Torvalds76-442/+945
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from wireless, and netfilter. Selftests excluded - we have 58 patches and diff of +442/-199, which isn't really small but perhaps with the exception of the WiFi locking change it's old(ish) bugs. We have no known problems with v6.4. The selftest changes are rather large as MPTCP folks try to apply Greg's guidance that selftest from torvalds/linux should be able to run against stable kernels. Last thing I should call out is the DCCP/UDP-lite deprecation notices. We are fairly sure those are dead, but if we're wrong reverting them back in won't be fun. Current release - regressions: - wifi: - cfg80211: fix double lock bug in reg_wdev_chan_valid() - iwlwifi: mvm: spin_lock_bh() to fix lockdep regression Current release - new code bugs: - handshake: remove fput() that causes use-after-free Previous releases - regressions: - sched: cls_u32: fix reference counter leak leading to overflow - sched: cls_api: fix lockup on flushing explicitly created chain Previous releases - always broken: - nf_tables: integrate pipapo into commit protocol - nf_tables: incorrect error path handling with NFT_MSG_NEWRULE, fix dangling pointer on failure - ping6: fix send to link-local addresses with VRF - sched: act_pedit: parse L3 header for L4 offset, the skb may not have the offset saved - sched: act_ct: fix promotion of offloaded unreplied tuple - sched: refuse to destroy an ingress and clsact Qdiscs if there are lockless change operations in flight - wifi: mac80211: fix handful of bugs in multi-link operation - ipvlan: fix bound dev checking for IPv6 l3s mode - eth: enetc: correct the indexes of highest and 2nd highest TCs - eth: ice: fix XDP memory leak when NIC is brought up and down Misc: - add deprecation notices for UDP-lite and DCCP - selftests: mptcp: skip tests not supported by old kernels - sctp: handle invalid error codes without calling BUG()" * tag 'net-6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits) dccp: Print deprecation notice. udplite: Print deprecation notice. octeon_ep: Add missing check for ioremap selftests/ptp: Fix timestamp printf format for PTP_SYS_OFFSET net: ethernet: stmicro: stmmac: fix possible memory leak in __stmmac_open net: tipc: resize nlattr array to correct size sfc: fix XDP queues mode with legacy IRQ net: macsec: fix double free of percpu stats net: lapbether: only support ethernet devices MAINTAINERS: add reviewers for SMC Sockets s390/ism: Fix trying to free already-freed IRQ by repeated ism_dev_exit() net: dsa: felix: fix taprio guard band overflow at 10Mbps with jumbo frames net/sched: cls_api: Fix lockup on flushing explicitly created chain ice: Fix ice module unload net/handshake: remove fput() that causes use-after-free selftests: forwarding: hw_stats_l3: Set addrgenmode in a separate step net/sched: qdisc_destroy() old ingress and clsact Qdiscs before grafting net/sched: Refactor qdisc_graft() for ingress and clsact Qdiscs net/sched: act_ct: Fix promotion of offloaded unreplied tuple wifi: iwlwifi: mvm: spin_lock_bh() to fix lockdep regression ...
2023-06-15Merge tag 'loongarch-fixes-6.4-1' of ↵Linus Torvalds6-6/+11
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Some trivial bug fixes for v6.4-rc7" * tag 'loongarch-fixes-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: Fix debugfs_create_dir() error checking LoongArch: Avoid uninitialized alignment_mask LoongArch: Fix perf event id calculation LoongArch: Fix the write_fcsr() macro LoongArch: Let pmd_present() return true when splitting pmd
2023-06-15Merge tag 'for-6.4/dm-fixes' of ↵Linus Torvalds4-23/+34
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Fix DM thinp discard performance regression introduced during this merge window where DM core was splitting large discards every 128K (max_sectors_kb) rather than every 64M (discard_max_bytes). - Extend DM core LOCKFS fix, made during 6.4 merge, to also fix race between do_mount and dm's do_suspend (in addition to the earlier fix's do_mount race with dm's do_resume). - Fix DM thin metadata operations to first check if the thin-pool is in "fail_io" mode; otherwise UAF can occur. - Fix DM thinp's call to __blkdev_issue_discard to use GFP_NOIO rather than GFP_NOWAIT (__blkdev_issue_discard cannot handle NULL return from bio_alloc). * tag 'for-6.4/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm: use op specific max_sectors when splitting abnormal io dm thin: fix issue_discard to pass GFP_NOIO to __blkdev_issue_discard dm thin metadata: check fail_io before using data_sm dm: don't lock fs when the map is NULL during suspend or resume
2023-06-15Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds22-130/+447
Pull rdma fixes from Jason Gunthorpe: "This is an unusually large bunch of bug fixes for the later rc cycle, rxe and mlx5 both dumped a lot of things at once. rxe continues to fix itself, and mlx5 is fixing a bunch of "queue counters" related bugs. There is one highly notable bug fix regarding the qkey. This small security check was missed in the original 2005 implementation and it allows some significant issues. Summary: - Two rtrs bug fixes for error unwind bugs - Several rxe bug fixes: * Incorrect Rx packet validation * Using memory without a refcount * Syzkaller found use before initialization * Regression fix for missing locking with the tasklet conversion from this merge window - Have bnxt report the correct link properties to userspace, this was a regression in v6.3 - Several mlx5 bug fixes: * Kernel crash triggerable by userspace for the RAW ethernet profile * Defend against steering refcounting issues created by userspace * Incorrect change of QP port affinity parameters in some LAG configurations - Fix mlx5 Q counters: * Do not over allocate Q counters to allow userspace to use the full port capacity * Kernel crash triggered by eswitch due to mis-use of Q counters * Incorrect mlx5_device for Q counters in some LAG configurations - Properly implement the IBA spec restricting privileged qkeys to root - Always an error when reading from a disassociated device's event queue - isert bug fixes: * Avoid a deadlock with the CM handler and CM ID destruction * Correct list corruption due to incorrect locking * Fix a use after free around connection tear down" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/rxe: Fix rxe_cq_post IB/isert: Fix incorrect release of isert connection IB/isert: Fix possible list corruption in CMA handler IB/isert: Fix dead lock in ib_isert RDMA/mlx5: Fix affinity assignment IB/uverbs: Fix to consider event queue closing also upon non-blocking mode RDMA/uverbs: Restrict usage of privileged QKEYs RDMA/cma: Always set static rate to 0 for RoCE RDMA/mlx5: Fix Q-counters query in LAG mode RDMA/mlx5: Remove vport Q-counters dependency on normal Q-counters RDMA/mlx5: Fix Q-counters per vport allocation RDMA/mlx5: Create an indirect flow table for steering anchor RDMA/mlx5: Initiate dropless RQ for RAW Ethernet functions RDMA/rxe: Fix the use-before-initialization error of resp_pkts RDMA/bnxt_re: Fix reporting active_{speed,width} attributes RDMA/rxe: Fix ref count error in check_rkey() RDMA/rxe: Fix packet length checks RDMA/rtrs: Fix rxe_dealloc_pd warning RDMA/rtrs: Fix the last iu->buf leak in err path
2023-06-15Merge tag 'spi-fix-v6.4-rc6' of ↵Linus Torvalds3-3/+21
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A few more driver specific fixes. The DesignWare fix is for an issue introduced by conversion to the chip select accessor functions and is pretty important but the other two are less severe" * tag 'spi-fix-v6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: dw: Replace incorrect spi_get_chipselect with set spi: fsl-dspi: avoid SCK glitches with continuous transfers spi: cadence-quadspi: Add missing check for dma_set_mask
2023-06-15Merge tag 'regulator-fix-v6.4-rc6' of ↵Linus Torvalds1-15/+15
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fix from Mark Brown: "The set of regulators described for the Qualcomm PM8550 just seems to have been completely wrong and would likely not have worked at all if anything tried to actually configure anything except for enabling and disabling at runtime" * tag 'regulator-fix-v6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: qcom-rpmh: Fix regulators for PM8550
2023-06-15Merge tag 'regmap-fix-v6.4-rc6' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap fix from Mark Brown: "Another fix for the maple tree cache, Takashi noticed that unlike other caches the maple tree cache didn't check for read only registers before trying to sync which would result in spurious syncs for read only registers where we don't have a default. This was due to the check being open coded in the caches, we now check in the shared 'does this register need sync' function so that is fixed for this and future caches" * tag 'regmap-fix-v6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: regcache: Don't sync read-only registers
2023-06-15Merge tag 'media/v6.4-6' of ↵Linus Torvalds2-49/+10
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "A fix for dvb-core to avoid a race condition during DVB board registration" * tag 'media/v6.4-6' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: Revert "media: dvb-core: Fix use-after-free on race condition at dvb_frontend"
2023-06-16nouveau: fix client work fence deletion raceDave Airlie1-4/+10
This seems to have existed for ever but is now more apparant after commit 9bff18d13473 ("drm/ttm: use per BO cleanup workers") My analysis: two threads are running, one in the irq signalling the fence, in dma_fence_signal_timestamp_locked, it has done the DMA_FENCE_FLAG_SIGNALLED_BIT setting, but hasn't yet reached the callbacks. The second thread in nouveau_cli_work_ready, where it sees the fence is signalled, so then puts the fence, cleanups the object and frees the work item, which contains the callback. Thread one goes again and tries to call the callback and causes the use-after-free. Proposed fix: lock the fence signalled check in nouveau_cli_work_ready, so either the callbacks are done or the memory is freed. Reviewed-by: Karol Herbst <kherbst@redhat.com> Fixes: 11e451e74050 ("drm/nouveau: remove fence wait code from deferred client work handler") Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://lore.kernel.org/dri-devel/20230615024008.1600281-1-airlied@gmail.com/
2023-06-16Merge tag 'amd-drm-fixes-6.4-2023-06-14' of ↵Dave Airlie13-20/+256
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.4-2023-06-14: amdgpu: - GFX9 preemption fixes - Add missing radeon secondary PCI ID - vblflash fixes - SMU 13 fix - VCN 4.0 fix - Re-enable TOPDOWN flag for large BAR systems to fix regression - eDP fix - PSR hang fix - DPIA fix radeon: - fbdev client warning fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230615024011.7773-1-alexander.deucher@amd.com
2023-06-15Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds1-12/+13
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Fix two regressions in ext4, one report by syzkaller[1], and reported by multiple users (and tracked by regzbot[2])" [1] https://syzkaller.appspot.com/bug?extid=4acc7d910e617b360859 [2] https://linux-regtracking.leemhuis.info/regzbot/regression/ZIauBR7YiV3rVAHL@glitch/ * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: drop the call to ext4_error() from ext4_get_group_info() Revert "ext4: remove unnecessary check in ext4_bg_num_gdb_nometa"
2023-06-15Merge tag '6.4-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds9-56/+190
Pull smb client fixes from Steve French: "Eight, mostly small, smb3 client fixes: - important fix for deferred close oops (race with unmount) found with xfstest generic/098 to some servers - important reconnect fix - fix problem with max_credits mount option - two multichannel (interface related) fixes - one trivial removal of confusing comment - two small debugging improvements (to better spot crediting problems)" * tag '6.4-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: add a warning when the in-flight count goes negative cifs: fix lease break oops in xfstest generic/098 cifs: fix max_credits implementation cifs: fix sockaddr comparison in iface_cmp smb/client: print "Unknown" instead of bogus link speed value cifs: print all credit counters in DebugData cifs: fix status checks in cifs_tree_connect smb: remove obsolete comment
2023-06-15Merge branch 'udplite-dccp-print-deprecation-notice'Jakub Kicinski3-0/+9
Kuniyuki Iwashima says: ==================== udplite/dccp: Print deprecation notice. UDP-Lite is assumed to have no users for 7 years, and DCCP is orphaned for 7 years too. Let's add deprecation notice and see if anyone responds to it. ==================== Link: https://lore.kernel.org/r/20230614194705.90673-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-15dccp: Print deprecation notice.Kuniyuki Iwashima1-0/+3
DCCP was marked as Orphan in the MAINTAINERS entry 2 years ago in commit 054c4610bd05 ("MAINTAINERS: dccp: move Gerrit Renker to CREDITS"). It says we haven't heard from the maintainer for five years, so DCCP is not well maintained for 7 years now. Recently DCCP only receives updates for bugs, and major distros disable it by default. Removing DCCP would allow for better organisation of TCP fields to reduce the number of cache lines hit in the fast path. Let's add a deprecation notice when DCCP socket is created and schedule its removal to 2025. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-15udplite: Print deprecation notice.Kuniyuki Iwashima2-0/+6
Recently syzkaller reported a 7-year-old null-ptr-deref [0] that occurs when a UDP-Lite socket tries to allocate a buffer under memory pressure. Someone should have stumbled on the bug much earlier if UDP-Lite had been used in a real app. Also, we do not always need a large UDP-Lite workload to hit the bug since UDP and UDP-Lite share the same memory accounting limit. Removing UDP-Lite would simplify UDP code removing a bunch of conditionals in fast path. Let's add a deprecation notice when UDP-Lite socket is created and schedule its removal to 2025. Link: https://lore.kernel.org/netdev/20230523163305.66466-1-kuniyu@amazon.com/ [0] Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-15octeon_ep: Add missing check for ioremapJiasheng Jiang1-1/+6
Add check for ioremap() and return the error if it fails in order to guarantee the success of ioremap(). Fixes: 862cd659a6fb ("octeon_ep: Add driver framework and device initialization") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://lore.kernel.org/r/20230615033400.2971-1-jiasheng@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-15selftests/ptp: Fix timestamp printf format for PTP_SYS_OFFSETAlex Maftei1-3/+3
Previously, timestamps were printed using "%lld.%u" which is incorrect for nanosecond values lower than 100,000,000 as they're fractional digits, therefore leading zeros are meaningful. This patch changes the format strings to "%lld.%09u" in order to add leading zeros to the nanosecond value. Fixes: 568ebc5985f5 ("ptp: add the PTP_SYS_OFFSET ioctl to the testptp program") Fixes: 4ec54f95736f ("ptp: Fix compiler warnings in the testptp utility") Fixes: 6ab0e475f1f3 ("Documentation: fix misc. warnings") Signed-off-by: Alex Maftei <alex.maftei@amd.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://lore.kernel.org/r/20230615083404.57112-1-alex.maftei@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-15net: ethernet: stmicro: stmmac: fix possible memory leak in __stmmac_openChristian Marangi1-2/+7
Fix a possible memory leak in __stmmac_open when stmmac_init_phy fails. It's also needed to free everything allocated by stmmac_setup_dma_desc and not just the dma_conf struct. Drop free_dma_desc_resources from __stmmac_open and correctly call free_dma_desc_resources on each user of __stmmac_open on error. Reported-by: Jose Abreu <Jose.Abreu@synopsys.com> Fixes: ba39b344e924 ("net: ethernet: stmicro: stmmac: generate stmmac dma conf before open") Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Cc: stable@vger.kernel.org Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Jose Abreu <Jose.Abreu@synopsys.com> Link: https://lore.kernel.org/r/20230614091714.15912-1-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-15net: tipc: resize nlattr array to correct sizeLin Ma1-2/+2
According to nla_parse_nested_deprecated(), the tb[] is supposed to the destination array with maxtype+1 elements. In current tipc_nl_media_get() and __tipc_nl_media_set(), a larger array is used which is unnecessary. This patch resize them to a proper size. Fixes: 1e55417d8fc6 ("tipc: add media set to new netlink api") Fixes: 46f15c6794fb ("tipc: add media get/dump to new netlink api") Signed-off-by: Lin Ma <linma@zju.edu.cn> Reviewed-by: Florian Westphal <fw@strlen.de> Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Link: https://lore.kernel.org/r/20230614120604.1196377-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-15dm: use op specific max_sectors when splitting abnormal ioMike Snitzer1-9/+16
Split abnormal IO in terms of the corresponding operation specific max_sectors (max_discard_sectors, max_secure_erase_sectors or max_write_zeroes_sectors). This fixes a significant dm-thinp discard performance regression that was introduced with commit e2dd8aca2d76 ("dm bio prison v1: improve concurrent IO performance"). Relative to discard: max_discard_sectors is used instead of max_sectors; which fixes excessive discard splitting (e.g. max_sectors=128K vs max_discard_sectors=64M). Tested by discarding an 1 Petabyte dm-thin device: lvcreate -V 1125899906842624B -T test/pool -n thin time blkdiscard /dev/test/thin Before this fix (splitting discards every 128K): ~116m After this fix (splitting discards every 64M) : 0m33.460s Reported-by: Zorro Lang <zlang@redhat.com> Fixes: 06961c487a33 ("dm: split discards further if target sets max_discard_granularity") Requires: 13f6facf3fae ("dm: allow targets to require splitting WRITE_ZEROES and SECURE_ERASE") Fixes: e2dd8aca2d76 ("dm bio prison v1: improve concurrent IO performance") Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-06-15dm thin: fix issue_discard to pass GFP_NOIO to __blkdev_issue_discardMike Snitzer1-2/+1
issue_discard() passes GFP_NOWAIT to __blkdev_issue_discard() despite its code assuming bio_alloc() always succeeds. Commit 3dba53a958a75 ("dm thin: use __blkdev_issue_discard for async discard support") clearly shows where things went bad: Before commit 3dba53a958a75, dm-thin.c's open-coded __blkdev_issue_discard_async() properly handled using GFP_NOWAIT. Unfortunately __blkdev_issue_discard() doesn't and it was missed during review. Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-06-15dm thin metadata: check fail_io before using data_smLi Lingfeng1-8/+12
Must check pmd->fail_io before using pmd->data_sm since pmd->data_sm may be destroyed by other processes. P1(kworker) P2(message) do_worker process_prepared process_prepared_discard_passdown_pt2 dm_pool_dec_data_range pool_message commit dm_pool_commit_metadata ↓ // commit failed metadata_operation_failed abort_transaction dm_pool_abort_metadata __open_or_format_metadata ↓ dm_sm_disk_open ↓ // open failed // pmd->data_sm is NULL dm_sm_dec_blocks ↓ // try to access pmd->data_sm --> UAF As shown above, if dm_pool_commit_metadata() and dm_pool_abort_metadata() fail in pool_message process, kworker may trigger UAF. Fixes: be500ed721a6 ("dm space maps: improve performance with inc/dec on ranges of blocks") Cc: stable@vger.kernel.org Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-06-15dm: don't lock fs when the map is NULL during suspend or resumeLi Lingfeng2-4/+5
As described in commit 38d11da522aa ("dm: don't lock fs when the map is NULL in process of resume"), a deadlock may be triggered between do_resume() and do_mount(). This commit preserves the fix from commit 38d11da522aa but moves it to where it also serves to fix a similar deadlock between do_suspend() and do_mount(). It does so, if the active map is NULL, by clearing DM_SUSPEND_LOCKFS_FLAG in dm_suspend() which is called by both do_suspend() and do_resume(). Fixes: 38d11da522aa ("dm: don't lock fs when the map is NULL in process of resume") Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-06-15sfc: fix XDP queues mode with legacy IRQÍñigo Huguet2-0/+4
In systems without MSI-X capabilities, xdp_txq_queues_mode is calculated in efx_allocate_msix_channels, but when enabling MSI-X fails, it was not changed to a proper default value. This was leading to the driver thinking that it has dedicated XDP queues, when it didn't. Fix it by setting xdp_txq_queues_mode to the correct value if the driver fallbacks to MSI or legacy IRQ mode. The correct value is EFX_XDP_TX_QUEUES_BORROWED because there are no XDP dedicated queues. The issue can be easily visible if the kernel is started with pci=nomsi, then a call trace is shown. It is not shown only with sfc's modparam interrupt_mode=2. Call trace example: WARNING: CPU: 2 PID: 663 at drivers/net/ethernet/sfc/efx_channels.c:828 efx_set_xdp_channels+0x124/0x260 [sfc] [...skip...] Call Trace: <TASK> efx_set_channels+0x5c/0xc0 [sfc] efx_probe_nic+0x9b/0x15a [sfc] efx_probe_all+0x10/0x1a2 [sfc] efx_pci_probe_main+0x12/0x156 [sfc] efx_pci_probe_post_io+0x18/0x103 [sfc] efx_pci_probe.cold+0x154/0x257 [sfc] local_pci_probe+0x42/0x80 Fixes: 6215b608a8c4 ("sfc: last resort fallback for lack of xdp tx queues") Reported-by: Yanghang Liu <yanghliu@redhat.com> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-15net: macsec: fix double free of percpu statsFedor Pchelkin1-7/+5
Inside macsec_add_dev() we free percpu macsec->secy.tx_sc.stats and macsec->stats on some of the memory allocation failure paths. However, the net_device is already registered to that moment: in macsec_newlink(), just before calling macsec_add_dev(). This means that during unregister process its priv_destructor - macsec_free_netdev() - will be called and will free the stats again. Remove freeing percpu stats inside macsec_add_dev() because macsec_free_netdev() will correctly free the already allocated ones. The pointers to unallocated stats stay NULL, and free_percpu() treats that correctly. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: 0a28bfd4971f ("net/macsec: Add MACsec skb_metadata_dst Tx Data path support") Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-15net: lapbether: only support ethernet devicesEric Dumazet1-0/+3
It probbaly makes no sense to support arbitrary network devices for lapbether. syzbot reported: skbuff: skb_under_panic: text:ffff80008934c100 len:44 put:40 head:ffff0000d18dd200 data:ffff0000d18dd1ea tail:0x16 end:0x140 dev:bond1 kernel BUG at net/core/skbuff.c:200 ! Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 5643 Comm: dhcpcd Not tainted 6.4.0-rc5-syzkaller-g4641cff8e810 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/25/2023 pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : skb_panic net/core/skbuff.c:196 [inline] pc : skb_under_panic+0x13c/0x140 net/core/skbuff.c:210 lr : skb_panic net/core/skbuff.c:196 [inline] lr : skb_under_panic+0x13c/0x140 net/core/skbuff.c:210 sp : ffff8000973b7260 x29: ffff8000973b7270 x28: ffff8000973b7360 x27: dfff800000000000 x26: ffff0000d85d8150 x25: 0000000000000016 x24: ffff0000d18dd1ea x23: ffff0000d18dd200 x22: 000000000000002c x21: 0000000000000140 x20: 0000000000000028 x19: ffff80008934c100 x18: ffff8000973b68a0 x17: 0000000000000000 x16: ffff80008a43bfbc x15: 0000000000000202 x14: 0000000000000000 x13: 0000000000000001 x12: 0000000000000001 x11: 0000000000000201 x10: 0000000000000000 x9 : f22f7eb937cced00 x8 : f22f7eb937cced00 x7 : 0000000000000001 x6 : 0000000000000001 x5 : ffff8000973b6b78 x4 : ffff80008df9ee80 x3 : ffff8000805974f4 x2 : 0000000000000001 x1 : 0000000100000201 x0 : 0000000000000086 Call trace: skb_panic net/core/skbuff.c:196 [inline] skb_under_panic+0x13c/0x140 net/core/skbuff.c:210 skb_push+0xf0/0x108 net/core/skbuff.c:2409 ip6gre_header+0xbc/0x738 net/ipv6/ip6_gre.c:1383 dev_hard_header include/linux/netdevice.h:3137 [inline] lapbeth_data_transmit+0x1c4/0x298 drivers/net/wan/lapbether.c:257 lapb_data_transmit+0x8c/0xb0 net/lapb/lapb_iface.c:447 lapb_transmit_buffer+0x178/0x204 net/lapb/lapb_out.c:149 lapb_send_control+0x220/0x320 net/lapb/lapb_subr.c:251 lapb_establish_data_link+0x94/0xec lapb_device_event+0x348/0x4e0 notifier_call_chain+0x1a4/0x510 kernel/notifier.c:93 raw_notifier_call_chain+0x3c/0x50 kernel/notifier.c:461 __dev_notify_flags+0x2bc/0x544 dev_change_flags+0xd0/0x15c net/core/dev.c:8643 devinet_ioctl+0x858/0x17e4 net/ipv4/devinet.c:1150 inet_ioctl+0x2ac/0x4d8 net/ipv4/af_inet.c:979 sock_do_ioctl+0x134/0x2dc net/socket.c:1201 sock_ioctl+0x4ec/0x858 net/socket.c:1318 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:870 [inline] __se_sys_ioctl fs/ioctl.c:856 [inline] __arm64_sys_ioctl+0x14c/0x1c8 fs/ioctl.c:856 __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline] invoke_syscall+0x98/0x2c0 arch/arm64/kernel/syscall.c:52 el0_svc_common+0x138/0x244 arch/arm64/kernel/syscall.c:142 do_el0_svc+0x64/0x198 arch/arm64/kernel/syscall.c:191 el0_svc+0x4c/0x160 arch/arm64/kernel/entry-common.c:647 el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:665 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591 Code: aa1803e6 aa1903e7 a90023f5 947730f5 (d4210000) Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Martin Schiller <ms@dev.tdt.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-15MAINTAINERS: add reviewers for SMC SocketsJan Karcher1-0/+3
adding three people from Alibaba as reviewers for SMC. They are currently working on improving SMC on other architectures than s390 and help with reviewing patches on top. Thank you D. Wythe, Tony Lu and Wen Gu for your contributions and collaboration and welcome on board as reviewers! Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Acked-by: Tony Lu <tonylu@linux.alibaba.com> Acked-by: Wen Gu <guwen@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-15s390/ism: Fix trying to free already-freed IRQ by repeated ism_dev_exit()Julian Ruess1-8/+0
This patch prevents the system from crashing when unloading the ISM module. How to reproduce: Attach an ISM device and execute 'rmmod ism'. Error-Log: - Trying to free already-free IRQ 0 - WARNING: CPU: 1 PID: 966 at kernel/irq/manage.c:1890 free_irq+0x140/0x540 After calling ism_dev_exit() for each ISM device in the exit routine, pci_unregister_driver() will execute ism_remove() for each ISM device. Because ism_remove() also calls ism_dev_exit(), free_irq(pci_irq_vector(pdev, 0), ism) is called twice for each ISM device. This results in a crash with the error 'Trying to free already-free IRQ'. In the exit routine, it is enough to call pci_unregister_driver() because it ensures that ism_dev_exit() is called once per ISM device. Cc: <stable@vger.kernel.org> # 6.3+ Fixes: 89e7d2ba61b7 ("net/ism: Add new API for client registration") Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Julian Ruess <julianr@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-15LoongArch: Fix debugfs_create_dir() error checkingImmad Mir1-1/+1
The debugfs_create_dir() returns ERR_PTR in case of an error and the correct way of checking it is using the IS_ERR_OR_NULL inline function rather than the simple null comparision. This patch fixes the issue. Cc: stable@vger.kernel.org Suggested-By: Ivan Orlov <ivan.orlov0322@gmail.com> Signed-off-by: Immad Mir <mirimmad17@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-15LoongArch: Avoid uninitialized alignment_maskQing Zhang1-0/+2
The hardware monitoring points for instruction fetching and load/store operations need to align 4 bytes and 1/2/4/8 bytes respectively. Reported-by: Colin King <colin.i.king@gmail.com> Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-15LoongArch: Fix perf event id calculationHuacai Chen1-3/+3
LoongArch PMCFG has 10bit event id rather than 8 bit, so fix it. Cc: stable@vger.kernel.org Signed-off-by: Jun Yi <yijun@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-15LoongArch: Fix the write_fcsr() macroQi Hu1-1/+1
The "write_fcsr()" macro uses wrong the positions for val and dest in asm. Fix it! Reported-by: Miao HAO <haomiao19@mails.ucas.ac.cn> Signed-off-by: Qi Hu <huqi@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-15LoongArch: Let pmd_present() return true when splitting pmdHongchen Zhang2-1/+4
When we split a pmd into ptes, pmd_present() and pmd_trans_huge() should return true, otherwise it would be treated as a swap pmd. This is the same as arm64 does in commit b65399f6111b ("arm64/mm: Change THP helpers to comply with generic MM semantics"), we also add a new bit named _PAGE_PRESENT_INVALID for LoongArch. Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-14net: dsa: felix: fix taprio guard band overflow at 10Mbps with jumbo framesVladimir Oltean1-1/+1
The DEV_MAC_MAXLEN_CFG register contains a 16-bit value - up to 65535. Plus 2 * VLAN_HLEN (4), that is up to 65543. The picos_per_byte variable is the largest when "speed" is lowest - SPEED_10 = 10. In that case it is (1000000L * 8) / 10 = 800000. Their product - 52434400000 - exceeds 32 bits, which is a problem, because apparently, a multiplication between two 32-bit factors is evaluated as 32-bit before being assigned to a 64-bit variable. In fact it's a problem for any MTU value larger than 5368. Cast one of the factors of the multiplication to u64 to force the multiplication to take place on 64 bits. Issue found by Coverity. Fixes: 55a515b1f5a9 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230613170907.2413559-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-14net/sched: cls_api: Fix lockup on flushing explicitly created chainVlad Buslov1-5/+7
Mingshuai Ren reports: When a new chain is added by using tc, one soft lockup alarm will be generated after delete the prio 0 filter of the chain. To reproduce the problem, perform the following steps: (1) tc qdisc add dev eth0 root handle 1: htb default 1 (2) tc chain add dev eth0 (3) tc filter del dev eth0 chain 0 parent 1: prio 0 (4) tc filter add dev eth0 chain 0 parent 1: Fix the issue by accounting for additional reference to chains that are explicitly created by RTM_NEWCHAIN message as opposed to implicitly by RTM_NEWTFILTER message. Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush") Reported-by: Mingshuai Ren <renmingshuai@huawei.com> Closes: https://lore.kernel.org/lkml/87legswvi3.fsf@nvidia.com/T/ Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Link: https://lore.kernel.org/r/20230612093426.2867183-1-vladbu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-14ice: Fix ice module unloadJakub Buchocki1-11/+5
Clearing the interrupt scheme before PFR reset, during the removal routine, could cause the hardware errors and possibly lead to system reboot, as the PF reset can cause the interrupt to be generated. Place the call for PFR reset inside ice_deinit_dev(), wait until reset and all pending transactions are done, then call ice_clear_interrupt_scheme(). This introduces a PFR reset to multiple error paths. Additionally, remove the call for the reset from ice_load() - it will be a part of ice_unload() now. Error example: [ 75.229328] ice 0000:ca:00.1: Failed to read Tx Scheduler Tree - User Selection data from flash [ 77.571315] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1 [ 77.571418] {1}[Hardware Error]: event severity: recoverable [ 77.571459] {1}[Hardware Error]: Error 0, type: recoverable [ 77.571500] {1}[Hardware Error]: section_type: PCIe error [ 77.571540] {1}[Hardware Error]: port_type: 4, root port [ 77.571580] {1}[Hardware Error]: version: 3.0 [ 77.571615] {1}[Hardware Error]: command: 0x0547, status: 0x4010 [ 77.571661] {1}[Hardware Error]: device_id: 0000:c9:02.0 [ 77.571703] {1}[Hardware Error]: slot: 25 [ 77.571736] {1}[Hardware Error]: secondary_bus: 0xca [ 77.571773] {1}[Hardware Error]: vendor_id: 0x8086, device_id: 0x347a [ 77.571821] {1}[Hardware Error]: class_code: 060400 [ 77.571858] {1}[Hardware Error]: bridge: secondary_status: 0x2800, control: 0x0013 [ 77.572490] pcieport 0000:c9:02.0: AER: aer_status: 0x00200000, aer_mask: 0x00100020 [ 77.572870] pcieport 0000:c9:02.0: [21] ACSViol (First) [ 77.573222] pcieport 0000:c9:02.0: AER: aer_layer=Transaction Layer, aer_agent=Receiver ID [ 77.573554] pcieport 0000:c9:02.0: AER: aer_uncor_severity: 0x00463010 [ 77.691273] {2}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1 [ 77.691738] {2}[Hardware Error]: event severity: recoverable [ 77.691971] {2}[Hardware Error]: Error 0, type: recoverable [ 77.692192] {2}[Hardware Error]: section_type: PCIe error [ 77.692403] {2}[Hardware Error]: port_type: 4, root port [ 77.692616] {2}[Hardware Error]: version: 3.0 [ 77.692825] {2}[Hardware Error]: command: 0x0547, status: 0x4010 [ 77.693032] {2}[Hardware Error]: device_id: 0000:c9:02.0 [ 77.693238] {2}[Hardware Error]: slot: 25 [ 77.693440] {2}[Hardware Error]: secondary_bus: 0xca [ 77.693641] {2}[Hardware Error]: vendor_id: 0x8086, device_id: 0x347a [ 77.693853] {2}[Hardware Error]: class_code: 060400 [ 77.694054] {2}[Hardware Error]: bridge: secondary_status: 0x0800, control: 0x0013 [ 77.719115] pci 0000:ca:00.1: AER: can't recover (no error_detected callback) [ 77.719140] pcieport 0000:c9:02.0: AER: device recovery failed [ 77.719216] pcieport 0000:c9:02.0: AER: aer_status: 0x00200000, aer_mask: 0x00100020 [ 77.719390] pcieport 0000:c9:02.0: [21] ACSViol (First) [ 77.719557] pcieport 0000:c9:02.0: AER: aer_layer=Transaction Layer, aer_agent=Receiver ID [ 77.719723] pcieport 0000:c9:02.0: AER: aer_uncor_severity: 0x00463010 Fixes: 5b246e533d01 ("ice: split probe into smaller functions") Signed-off-by: Jakub Buchocki <jakubx.buchocki@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230612171421.21570-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-14Merge branch '1GbE' of ↵Jakub Kicinski2-1/+14
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-06-12 (igc, igb) This series contains updates to igc and igb drivers. Husaini clears Tx rings when interface is brought down for igc. Vinicius disables PTM and PCI busmaster when removing igc driver. Alex adds error check and path for NVM read error on igb. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: igb: fix nvm.ops.read() error handling igc: Fix possible system crash when loading module igc: Clean the TX buffer and TX descriptor ring ==================== Link: https://lore.kernel.org/r/20230612205208.115292-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-14net/handshake: remove fput() that causes use-after-freeLin Ma2-5/+0
A reference underflow is found in TLS handshake subsystem that causes a direct use-after-free. Part of the crash log is like below: [ 2.022114] ------------[ cut here ]------------ [ 2.022193] refcount_t: underflow; use-after-free. [ 2.022288] WARNING: CPU: 0 PID: 60 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110 [ 2.022432] Modules linked in: [ 2.022848] RIP: 0010:refcount_warn_saturate+0xbe/0x110 [ 2.023231] RSP: 0018:ffffc900001bfe18 EFLAGS: 00000286 [ 2.023325] RAX: 0000000000000000 RBX: 0000000000000007 RCX: 00000000ffffdfff [ 2.023438] RDX: 0000000000000000 RSI: 00000000ffffffea RDI: 0000000000000001 [ 2.023555] RBP: ffff888004c20098 R08: ffffffff82b392c8 R09: 00000000ffffdfff [ 2.023693] R10: ffffffff82a592e0 R11: ffffffff82b092e0 R12: ffff888004c200d8 [ 2.023813] R13: 0000000000000000 R14: ffff888004c20000 R15: ffffc90000013ca8 [ 2.023930] FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000 [ 2.024062] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2.024161] CR2: ffff888003601000 CR3: 0000000002a2e000 CR4: 00000000000006f0 [ 2.024275] Call Trace: [ 2.024322] <TASK> [ 2.024367] ? __warn+0x7f/0x130 [ 2.024430] ? refcount_warn_saturate+0xbe/0x110 [ 2.024513] ? report_bug+0x199/0x1b0 [ 2.024585] ? handle_bug+0x3c/0x70 [ 2.024676] ? exc_invalid_op+0x18/0x70 [ 2.024750] ? asm_exc_invalid_op+0x1a/0x20 [ 2.024830] ? refcount_warn_saturate+0xbe/0x110 [ 2.024916] ? refcount_warn_saturate+0xbe/0x110 [ 2.024998] __tcp_close+0x2f4/0x3d0 [ 2.025065] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10 [ 2.025168] tcp_close+0x1f/0x70 [ 2.025231] inet_release+0x33/0x60 [ 2.025297] sock_release+0x1f/0x80 [ 2.025361] handshake_req_cancel_test2+0x100/0x2d0 [ 2.025457] kunit_try_run_case+0x4c/0xa0 [ 2.025532] kunit_generic_run_threadfn_adapter+0x15/0x20 [ 2.025644] kthread+0xe1/0x110 [ 2.025708] ? __pfx_kthread+0x10/0x10 [ 2.025780] ret_from_fork+0x2c/0x50 One can enable CONFIG_NET_HANDSHAKE_KUNIT_TEST config to reproduce above crash. The root cause of this bug is that the commit 1ce77c998f04 ("net/handshake: Unpin sock->file if a handshake is cancelled") adds one additional fput() function. That patch claims that the fput() is used to enable sock->file to be freed even when user space never calls DONE. However, it seems that the intended DONE routine will never give an additional fput() of ths sock->file. The existing two of them are just used to balance the reference added in sockfd_lookup(). This patch revert the mentioned commit to avoid the use-after-free. The patched kernel could successfully pass the KUNIT test and boot to shell. [ 0.733613] # Subtest: Handshake API tests [ 0.734029] 1..11 [ 0.734255] KTAP version 1 [ 0.734542] # Subtest: req_alloc API fuzzing [ 0.736104] ok 1 handshake_req_alloc NULL proto [ 0.736114] ok 2 handshake_req_alloc CLASS_NONE [ 0.736559] ok 3 handshake_req_alloc CLASS_MAX [ 0.737020] ok 4 handshake_req_alloc no callbacks [ 0.737488] ok 5 handshake_req_alloc no done callback [ 0.737988] ok 6 handshake_req_alloc excessive privsize [ 0.738529] ok 7 handshake_req_alloc all good [ 0.739036] # req_alloc API fuzzing: pass:7 fail:0 skip:0 total:7 [ 0.739444] ok 1 req_alloc API fuzzing [ 0.740065] ok 2 req_submit NULL req arg [ 0.740436] ok 3 req_submit NULL sock arg [ 0.740834] ok 4 req_submit NULL sock->file [ 0.741236] ok 5 req_lookup works [ 0.741621] ok 6 req_submit max pending [ 0.741974] ok 7 req_submit multiple [ 0.742382] ok 8 req_cancel before accept [ 0.742764] ok 9 req_cancel after accept [ 0.743151] ok 10 req_cancel after done [ 0.743510] ok 11 req_destroy works [ 0.743882] # Handshake API tests: pass:11 fail:0 skip:0 total:11 [ 0.744205] # Totals: pass:17 fail:0 skip:0 total:17 Acked-by: Chuck Lever <chuck.lever@oracle.com> Fixes: 1ce77c998f04 ("net/handshake: Unpin sock->file if a handshake is cancelled") Signed-off-by: Lin Ma <linma@zju.edu.cn> Link: https://lore.kernel.org/r/20230613083204.633896-1-linma@zju.edu.cn Link: https://lore.kernel.org/r/20230614015249.987448-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-14Merge tag 'wireless-2023-06-14' of ↵Jakub Kicinski10-24/+36
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== A couple of straggler fixes, mostly in the stack: - fix fragmentation for multi-link related elements - fix callback copy/paste error - fix multi-link locking - remove double-locking of wiphy mutex - transmit only on active links, not all - activate links in the correct order - don't remove links that weren't added - disable soft-IRQs for LQ lock in iwlwifi * tag 'wireless-2023-06-14' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: iwlwifi: mvm: spin_lock_bh() to fix lockdep regression wifi: mac80211: fragment per STA profile correctly wifi: mac80211: Use active_links instead of valid_links in Tx wifi: cfg80211: remove links only on AP wifi: mac80211: take lock before setting vif links wifi: cfg80211: fix link del callback to call correct handler wifi: mac80211: fix link activation settings order wifi: cfg80211: fix double lock bug in reg_wdev_chan_valid() ==================== Link: https://lore.kernel.org/r/20230614075502.11765-1-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-14ext4: drop the call to ext4_error() from ext4_get_group_info()Fabio M. De Francesco1-11/+9
A recent patch added a call to ext4_error() which is problematic since some callers of the ext4_get_group_info() function may be holding a spinlock, whereas ext4_error() must never be called in atomic context. This triggered a report from Syzbot: "BUG: sleeping function called from invalid context in ext4_update_super" (see the link below). Therefore, drop the call to ext4_error() from ext4_get_group_info(). In the meantime use eight characters tabs instead of nine characters ones. Reported-by: syzbot+4acc7d910e617b360859@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/00000000000070575805fdc6cdb2@google.com/ Fixes: 5354b2af3406 ("ext4: allow ext4_get_group_info() to fail") Suggested-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Link: https://lore.kernel.org/r/20230614100446.14337-1-fmdefrancesco@gmail.com
2023-06-14Revert "ext4: remove unnecessary check in ext4_bg_num_gdb_nometa"Kemeng Shi1-1/+4
This reverts commit ad3f09be6cfe332be8ff46c78e6ec0f8839107aa. The reverted commit was intended to simpfy the code to get group descriptor block number in non-meta block group by assuming s_gdb_count is block number used for all non-meta block group descriptors. However s_gdb_count is block number used for all meta *and* non-meta group descriptors. So s_gdb_group will be > actual group descriptor block number used for all non-meta block group which should be "total non-meta block group" / "group descriptors per block", e.g. s_first_meta_bg. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://lore.kernel.org/r/20230613225025.3859522-1-shikemeng@huaweicloud.com Fixes: ad3f09be6cfe ("ext4: remove unnecessary check in ext4_bg_num_gdb_nometa") Cc: stable@kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-06-14Revert "media: dvb-core: Fix use-after-free on race condition at dvb_frontend"Mauro Carvalho Chehab2-49/+10
As reported by Thomas Voegtle <tv@lio96.de>, sometimes a DVB card does not initialize properly booting Linux 6.4-rc4. This is not always, maybe in 3 out of 4 attempts. After double-checking, the root cause seems to be related to the UAF fix, which is causing a race issue: [ 26.332149] tda10071 7-0005: found a 'NXP TDA10071' in cold state, will try to load a firmware [ 26.340779] tda10071 7-0005: downloading firmware from file 'dvb-fe-tda10071.fw' [ 989.277402] INFO: task vdr:743 blocked for more than 491 seconds. [ 989.283504] Not tainted 6.4.0-rc5-i5 #249 [ 989.288036] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 989.295860] task:vdr state:D stack:0 pid:743 ppid:711 flags:0x00004002 [ 989.295865] Call Trace: [ 989.295867] <TASK> [ 989.295869] __schedule+0x2ea/0x12d0 [ 989.295877] ? asm_sysvec_apic_timer_interrupt+0x16/0x20 [ 989.295881] schedule+0x57/0xc0 [ 989.295884] schedule_preempt_disabled+0xc/0x20 [ 989.295887] __mutex_lock.isra.16+0x237/0x480 [ 989.295891] ? dvb_get_property.isra.10+0x1bc/0xa50 [ 989.295898] ? dvb_frontend_stop+0x36/0x180 [ 989.338777] dvb_frontend_stop+0x36/0x180 [ 989.338781] dvb_frontend_open+0x2f1/0x470 [ 989.338784] dvb_device_open+0x81/0xf0 [ 989.338804] ? exact_lock+0x20/0x20 [ 989.338808] chrdev_open+0x7f/0x1c0 [ 989.338811] ? generic_permission+0x1a2/0x230 [ 989.338813] ? link_path_walk.part.63+0x340/0x380 [ 989.338815] ? exact_lock+0x20/0x20 [ 989.338817] do_dentry_open+0x18e/0x450 [ 989.374030] path_openat+0xca5/0xe00 [ 989.374031] ? terminate_walk+0xec/0x100 [ 989.374034] ? path_lookupat+0x93/0x140 [ 989.374036] do_filp_open+0xc0/0x140 [ 989.374038] ? __call_rcu_common.constprop.91+0x92/0x240 [ 989.374041] ? __check_object_size+0x147/0x260 [ 989.374043] ? __check_object_size+0x147/0x260 [ 989.374045] ? alloc_fd+0xbb/0x180 [ 989.374048] ? do_sys_openat2+0x243/0x310 [ 989.374050] do_sys_openat2+0x243/0x310 [ 989.374052] do_sys_open+0x52/0x80 [ 989.374055] do_syscall_64+0x5b/0x80 [ 989.421335] ? __task_pid_nr_ns+0x92/0xa0 [ 989.421337] ? syscall_exit_to_user_mode+0x20/0x40 [ 989.421339] ? do_syscall_64+0x67/0x80 [ 989.421341] ? syscall_exit_to_user_mode+0x20/0x40 [ 989.421343] ? do_syscall_64+0x67/0x80 [ 989.421345] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 989.421348] RIP: 0033:0x7fe895d067e3 [ 989.421349] RSP: 002b:00007fff933c2ba0 EFLAGS: 00000293 ORIG_RAX: 0000000000000101 [ 989.421351] RAX: ffffffffffffffda RBX: 00007fff933c2c10 RCX: 00007fe895d067e3 [ 989.421352] RDX: 0000000000000802 RSI: 00005594acdce160 RDI: 00000000ffffff9c [ 989.421353] RBP: 0000000000000802 R08: 0000000000000000 R09: 0000000000000000 [ 989.421353] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000001 [ 989.421354] R13: 00007fff933c2ca0 R14: 00000000ffffffff R15: 00007fff933c2c90 [ 989.421355] </TASK> This reverts commit 6769a0b7ee0c3b31e1b22c3fadff2bfb642de23f. Fixes: 6769a0b7ee0c ("media: dvb-core: Fix use-after-free on race condition at dvb_frontend") Link: https://lore.kernel.org/all/da5382ad-09d6-20ac-0d53-611594b30861@lio96.de/ Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
2023-06-14io_uring/io-wq: clear current->worker_private on exitJens Axboe1-1/+6
A recent fix stopped clearing PF_IO_WORKER from current->flags on exit, which meant that we can now call inc/dec running on the worker after it has been removed if it ends up scheduling in/out as part of exit. If this happens after an RCU grace period has passed, then the struct pointed to by current->worker_private may have been freed, and we can now be accessing memory that is freed. Ensure this doesn't happen by clearing the task worker_private field. Both io_wq_worker_running() and io_wq_worker_sleeping() check this field before going any further, and we don't need any accounting etc done after this worker has exited. Fixes: fd37b884003c ("io_uring/io-wq: don't clear PF_IO_WORKER on exit") Reported-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-14RDMA/rxe: Fix rxe_cq_postBob Pearson1-2/+2
A recent patch replaced a tasklet execution of cq->comp_handler by a direct call. While this made sense it let changes to cq->notify state be unprotected and assumed that the cq completion machinery and the ulp done callbacks were reentrant. The result is that in some cases completion events can be lost. This patch moves the cq->comp_handler call inside of the spinlock in rxe_cq_post which solves both issues. This is compatible with the matching code in the request notify verb. Fixes: 78b26a335310 ("RDMA/rxe: Remove tasklet call from rxe_cq.c") Link: https://lore.kernel.org/r/20230612155032.17036-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-06-14btrfs: scrub: fix a return value overwrite in scrub_stripe()Qu Wenruo1-1/+1
[RETURN VALUE OVERWRITE] Inside scrub_stripe(), we would submit all the remaining stripes after iterating all extents. But since flush_scrub_stripes() can return error, we need to avoid overwriting the existing @ret if there is any error. However the existing check is doing the wrong check: ret2 = flush_scrub_stripes(); if (!ret2) ret = ret2; This would overwrite the existing @ret to 0 as long as the final flush detects no critical errors. [FIX] We should check @ret other than @ret2 in that case. Fixes: 8eb3dd17eadd ("btrfs: dev-replace: error out if we have unrepaired metadata error during") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-14cifs: add a warning when the in-flight count goes negativeShyam Prasad N1-0/+1
We've seen the in-flight count go into negative with some internal stress testing in Microsoft. Adding a WARN when this happens, in hope of understanding why this happens when it happens. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-14cifs: fix lease break oops in xfstest generic/098Steve French1-2/+6
umount can race with lease break so need to check if tcon->ses->server is still valid to send the lease break response. Reviewed-by: Bharath SM <bharathsm@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Fixes: 59a556aebc43 ("SMB3: drop reference to cfile before sending oplock break") Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-14Documentation: RISC-V: patch-acceptance: mention patchwork's roleConor Dooley1-0/+18
Palmer suggested at some point, not sure if it was in one of the weekly linux-riscv syncs, or a conversation at FOSDEM, that we should document the role of the automation running on our patchwork instance plays in patch acceptance. Add a short note to the patch-acceptance document to that end. Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20230606-rehab-monsoon-12c17bbe08e3@wendy Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-06-14selftests: forwarding: hw_stats_l3: Set addrgenmode in a separate stepDanielle Ratson1-4/+7
Setting the IPv6 address generation mode of a net device during its creation never worked, but after commit b0ad3c179059 ("rtnetlink: call validate_linkmsg in rtnl_create_link") it explicitly fails [1]. The failure is caused by the fact that validate_linkmsg() is called before the net device is registered, when it still does not have an 'inet6_dev'. Likewise, raising the net device before setting the address generation mode is meaningless, because by the time the mode is set, the address has already been generated. Therefore, fix the test to first create the net device, then set its IPv6 address generation mode and finally bring it up. [1] # ip link add name mydev addrgenmode eui64 type dummy RTNETLINK answers: Address family not supported by protocol Fixes: ba95e7930957 ("selftests: forwarding: hw_stats_l3: Add a new test") Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/f3b05d85b2bc0c3d6168fe8f7207c6c8365703db.1686580046.git.petrm@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14Merge branch 'net-sched-fix-race-conditions-in-mini_qdisc_pair_swap'Paolo Abeni3-16/+50
Peilin Ye says: ==================== net/sched: Fix race conditions in mini_qdisc_pair_swap() These 2 patches fix race conditions for ingress and clsact Qdiscs as reported [1] by syzbot, split out from another [2] series (last 2 patches of it). Per-patch changelog omitted. Patch 1 hasn't been touched since last version; I just included everybody's tag. Patch 2 bases on patch 6 v1 of [2], with comments and commit log slightly changed. We also need rtnl_dereference() to load ->qdisc_sleeping since commit d636fc5dd692 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping"), so I changed that; please take yet another look, thanks! Patch 2 has been tested with the new reproducer Pedro posted [3]. [1] https://syzkaller.appspot.com/bug?extid=b53a9c0d1ea4ad62da8b [2] https://lore.kernel.org/r/cover.1684887977.git.peilin.ye@bytedance.com/ [3] https://lore.kernel.org/r/7879f218-c712-e9cc-57ba-665990f5f4c9@mojatatu.com/ ==================== Link: https://lore.kernel.org/r/cover.1686355297.git.peilin.ye@bytedance.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14net/sched: qdisc_destroy() old ingress and clsact Qdiscs before graftingPeilin Ye3-8/+42
mini_Qdisc_pair::p_miniq is a double pointer to mini_Qdisc, initialized in ingress_init() to point to net_device::miniq_ingress. ingress Qdiscs access this per-net_device pointer in mini_qdisc_pair_swap(). Similar for clsact Qdiscs and miniq_egress. Unfortunately, after introducing RTNL-unlocked RTM_{NEW,DEL,GET}TFILTER requests (thanks Hillf Danton for the hint), when replacing ingress or clsact Qdiscs, for example, the old Qdisc ("@old") could access the same miniq_{in,e}gress pointer(s) concurrently with the new Qdisc ("@new"), causing race conditions [1] including a use-after-free bug in mini_qdisc_pair_swap() reported by syzbot: BUG: KASAN: slab-use-after-free in mini_qdisc_pair_swap+0x1c2/0x1f0 net/sched/sch_generic.c:1573 Write of size 8 at addr ffff888045b31308 by task syz-executor690/14901 ... Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106 print_address_description.constprop.0+0x2c/0x3c0 mm/kasan/report.c:319 print_report mm/kasan/report.c:430 [inline] kasan_report+0x11c/0x130 mm/kasan/report.c:536 mini_qdisc_pair_swap+0x1c2/0x1f0 net/sched/sch_generic.c:1573 tcf_chain_head_change_item net/sched/cls_api.c:495 [inline] tcf_chain0_head_change.isra.0+0xb9/0x120 net/sched/cls_api.c:509 tcf_chain_tp_insert net/sched/cls_api.c:1826 [inline] tcf_chain_tp_insert_unique net/sched/cls_api.c:1875 [inline] tc_new_tfilter+0x1de6/0x2290 net/sched/cls_api.c:2266 ... @old and @new should not affect each other. In other words, @old should never modify miniq_{in,e}gress after @new, and @new should not update @old's RCU state. Fixing without changing sch_api.c turned out to be difficult (please refer to Closes: for discussions). Instead, make sure @new's first call always happen after @old's last call (in {ingress,clsact}_destroy()) has finished: In qdisc_graft(), return -EBUSY if @old has any ongoing filter requests, and call qdisc_destroy() for @old before grafting @new. Introduce qdisc_refcount_dec_if_one() as the counterpart of qdisc_refcount_inc_nz() used for filter requests. Introduce a non-static version of qdisc_destroy() that does a TCQ_F_BUILTIN check, just like qdisc_put() etc. Depends on patch "net/sched: Refactor qdisc_graft() for ingress and clsact Qdiscs". [1] To illustrate, the syzkaller reproducer adds ingress Qdiscs under TC_H_ROOT (no longer possible after commit c7cfbd115001 ("net/sched: sch_ingress: Only create under TC_H_INGRESS")) on eth0 that has 8 transmission queues: Thread 1 creates ingress Qdisc A (containing mini Qdisc a1 and a2), then adds a flower filter X to A. Thread 2 creates another ingress Qdisc B (containing mini Qdisc b1 and b2) to replace A, then adds a flower filter Y to B. Thread 1 A's refcnt Thread 2 RTM_NEWQDISC (A, RTNL-locked) qdisc_create(A) 1 qdisc_graft(A) 9 RTM_NEWTFILTER (X, RTNL-unlocked) __tcf_qdisc_find(A) 10 tcf_chain0_head_change(A) mini_qdisc_pair_swap(A) (1st) | | RTM_NEWQDISC (B, RTNL-locked) RCU sync 2 qdisc_graft(B) | 1 notify_and_destroy(A) | tcf_block_release(A) 0 RTM_NEWTFILTER (Y, RTNL-unlocked) qdisc_destroy(A) tcf_chain0_head_change(B) tcf_chain0_head_change_cb_del(A) mini_qdisc_pair_swap(B) (2nd) mini_qdisc_pair_swap(A) (3rd) | ... ... Here, B calls mini_qdisc_pair_swap(), pointing eth0->miniq_ingress to its mini Qdisc, b1. Then, A calls mini_qdisc_pair_swap() again during ingress_destroy(), setting eth0->miniq_ingress to NULL, so ingress packets on eth0 will not find filter Y in sch_handle_ingress(). This is just one of the possible consequences of concurrently accessing miniq_{in,e}gress pointers. Fixes: 7a096d579e8e ("net: sched: ingress: set 'unlocked' flag for Qdisc ops") Fixes: 87f373921c4e ("net: sched: ingress: set 'unlocked' flag for clsact Qdisc ops") Reported-by: syzbot+b53a9c0d1ea4ad62da8b@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/0000000000006cf87705f79acf1a@google.com/ Cc: Hillf Danton <hdanton@sina.com> Cc: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14net/sched: Refactor qdisc_graft() for ingress and clsact QdiscsPeilin Ye1-10/+10
Grafting ingress and clsact Qdiscs does not need a for-loop in qdisc_graft(). Refactor it. No functional changes intended. Tested-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14net/sched: act_ct: Fix promotion of offloaded unreplied tuplePaul Blakey4-7/+21
Currently UNREPLIED and UNASSURED connections are added to the nf flow table. This causes the following connection packets to be processed by the flow table which then skips conntrack_in(), and thus such the connections will remain UNREPLIED and UNASSURED even if reply traffic is then seen. Even still, the unoffloaded reply packets are the ones triggering hardware update from new to established state, and if there aren't any to triger an update and/or previous update was missed, hardware can get out of sync with sw and still mark packets as new. Fix the above by: 1) Not skipping conntrack_in() for UNASSURED packets, but still refresh for hardware, as before the cited patch. 2) Try and force a refresh by reply-direction packets that update the hardware rules from new to established state. 3) Remove any bidirectional flows that didn't failed to update in hardware for re-insertion as bidrectional once any new packet arrives. Fixes: 6a9bad0069cf ("net/sched: act_ct: offload UDP NEW connections") Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/r/1686313379-117663-1-git-send-email-paulb@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14wifi: iwlwifi: mvm: spin_lock_bh() to fix lockdep regressionHugh Dickins1-6/+6
Lockdep on 6.4-rc on ThinkPad X1 Carbon 5th says ===================================================== WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected 6.4.0-rc5 #1 Not tainted ----------------------------------------------------- kworker/3:1/49 [HC0[0]:SC0[4]:HE1:SE0] is trying to acquire: ffff8881066fa368 (&mvm_sta->deflink.lq_sta.rs_drv.pers.lock){+.+.}-{2:2}, at: rs_drv_get_rate+0x46/0xe7 and this task is already holding: ffff8881066f80a8 (&sta->rate_ctrl_lock){+.-.}-{2:2}, at: rate_control_get_rate+0xbd/0x126 which would create a new lock dependency: (&sta->rate_ctrl_lock){+.-.}-{2:2} -> (&mvm_sta->deflink.lq_sta.rs_drv.pers.lock){+.+.}-{2:2} but this new dependency connects a SOFTIRQ-irq-safe lock: (&sta->rate_ctrl_lock){+.-.}-{2:2} etc. etc. etc. Changing the spin_lock() in rs_drv_get_rate() to spin_lock_bh() was not enough to pacify lockdep, but changing them all on pers.lock has worked. Fixes: a8938bc881d2 ("wifi: iwlwifi: mvm: Add locking to the rate read flow") Signed-off-by: Hugh Dickins <hughd@google.com> Link: https://lore.kernel.org/r/79ffcc22-9775-cb6d-3ffd-1a517c40beef@google.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-13Merge branch 'fix-small-bugs-and-annoyances-in-tc-testing'Jakub Kicinski3-7/+4
Vlad Buslov says: ==================== Fix small bugs and annoyances in tc-testing ==================== Link: https://lore.kernel.org/r/20230612075712.2861848-1-vladbu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13selftests/tc-testing: Remove configs that no longer existVlad Buslov1-5/+0
Some qdiscs and classifiers have recently been retired from kernel. However, tc-testing config is still cluttered with them which causes noise when using merge_config.sh script to update existing config for tc-testing compatibility. Remove the config settings for affected qdiscs and classifiers. Fixes: fb38306ceb9e ("net/sched: Retire ATM qdisc") Fixes: 051d44209842 ("net/sched: Retire CBQ qdisc") Fixes: bbe77c14ee61 ("net/sched: Retire dsmark qdisc") Fixes: 265b4da82dbf ("net/sched: Retire rsvp classifier") Fixes: 8c710f75256b ("net/sched: Retire tcindex classifier") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13selftests/tc-testing: Fix SFB db testVlad Buslov1-2/+2
Setting very small value of db like 10ms introduces rounding errors when converting to/from jiffies on some kernel configs. For example, on 250hz the actual value will be set to 12ms which causes the test to fail: # $ sudo ./tdc.py -d eth2 -e 3410 # -- ns/SubPlugin.__init__ # Test 3410: Create SFB with db setting # # All test results: # # 1..1 # not ok 1 3410 - Create SFB with db setting # Could not match regex pattern. Verify command output: # qdisc sfb 1: root refcnt 2 rehash 600s db 12ms limit 1000p max 25p target 20p increment 0.000503548 decrement 4.57771e-05 penalty_rate 10pps penalty_burst 20p Set the value to 100ms instead which currently seem to work on 100hz, 250hz, 300hz and 1000hz kernel configs. Fixes: 6ad92dc56fca ("selftests/tc-testing: add selftests for sfb qdisc") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13selftests/tc-testing: Fix Error: failed to find target LOGVlad Buslov1-0/+1
Add missing netfilter config dependency. Fixes following example error when running tests via tdc.sh for all XT tests: # $ sudo ./tdc.py -d eth2 -e 2029 # Test 2029: Add xt action with log-prefix # exit: 255 # exit: 0 # failed to find target LOG # # bad action parsing # parse_action: bad value (7:xt)! # Illegal "action" # # -----> teardown stage *** Could not execute: "$TC actions flush action xt" # # -----> teardown stage *** Error message: "Error: Cannot flush unknown TC action. # We have an error flushing # " # returncode 1; expected [0] # # -----> teardown stage *** Aborting test run. # # <_io.BufferedReader name=3> *** stdout *** # # <_io.BufferedReader name=5> *** stderr *** # "-----> teardown stage" did not complete successfully # Exception <class '__main__.PluginMgrTestFail'> ('teardown', ' failed to find target LOG\n\nbad action parsing\nparse_action: bad value (7:xt)!\nIllegal "action"\n', '"-----> teardown stage" did not complete successfully') (caught in test_runner, running test 2 2029 Add xt action with log-prefix stage teardown) # --------------- # traceback # File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 495, in test_runner # res = run_one_test(pm, args, index, tidx) # File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 434, in run_one_test # prepare_env(args, pm, 'teardown', '-----> teardown stage', tidx['teardown'], procout) # File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 245, in prepare_env # raise PluginMgrTestFail( # --------------- # accumulated output for this test: # failed to find target LOG # # bad action parsing # parse_action: bad value (7:xt)! # Illegal "action" # # --------------- # # All test results: # # 1..1 # ok 1 2029 - Add xt action with log-prefix # skipped - "-----> teardown stage" did not complete successfully Fixes: 910d504bc187 ("selftests/tc-testings: add selftests for xt action") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13selftests/tc-testing: Fix Error: Specified qdisc kind is unknown.Vlad Buslov1-0/+1
All TEQL tests assume that sch_teql module is loaded. Load module in tdc.sh before running qdisc tests. Fixes following example error when running tests via tdc.sh for all TEQL tests: # $ sudo ./tdc.py -d eth2 -e 84a0 # -- ns/SubPlugin.__init__ # Test 84a0: Create TEQL with default setting # exit: 2 # exit: 0 # Error: Specified qdisc kind is unknown. # # -----> teardown stage *** Could not execute: "$TC qdisc del dev $DUMMY handle 1: root" # # -----> teardown stage *** Error message: "Error: Invalid handle. # " # returncode 2; expected [0] # # -----> teardown stage *** Aborting test run. # # <_io.BufferedReader name=3> *** stdout *** # # <_io.BufferedReader name=5> *** stderr *** # "-----> teardown stage" did not complete successfully # Exception <class '__main__.PluginMgrTestFail'> ('teardown', 'Error: Specified qdisc kind is unknown.\n', '"-----> teardown stage" did not complete successfully') (caught in test_runner, running test 2 84a0 Create TEQL with default setting stage teardown) # --------------- # traceback # File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 495, in test_runner # res = run_one_test(pm, args, index, tidx) # File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 434, in run_one_test # prepare_env(args, pm, 'teardown', '-----> teardown stage', tidx['teardown'], procout) # File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 245, in prepare_env # raise PluginMgrTestFail( # --------------- # accumulated output for this test: # Error: Specified qdisc kind is unknown. # # --------------- # # All test results: # # 1..1 # ok 1 84a0 - Create TEQL with default setting # skipped - "-----> teardown stage" did not complete successfully Fixes: cc62fbe114c9 ("selftests/tc-testing: add selftests for teql qdisc") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13net: ethernet: ti: am65-cpsw: Call of_node_put() on error pathDan Carpenter1-1/+1
This code returns directly but it should instead call of_node_put() to drop some reference counts. Fixes: dab2b265dd23 ("net: ethernet: ti: am65-cpsw: Add support for SERDES configuration") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Roger Quadros <rogerq@kernel.org> Link: https://lore.kernel.org/r/e3012f0c-1621-40e6-bf7d-03c276f6e07f@kili.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13io_uring/net: save msghdr->msg_control for retriesJens Axboe1-1/+7
If the application sets ->msg_control and we have to later retry this command, or if it got queued with IOSQE_ASYNC to begin with, then we need to retain the original msg_control value. This is due to the net stack overwriting this field with an in-kernel pointer, to copy it in. Hitting that path for the second time will now fail the copy from user, as it's attempting to copy from a non-user address. Cc: stable@vger.kernel.org # 5.10+ Link: https://github.com/axboe/liburing/issues/880 Reported-and-tested-by: Marek Majkowski <marek@cloudflare.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-13Merge tag 'nios2_fix_v6.4' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux Pull NIOS2 dts fix from Dinh Nguyen: - Fix tse_mac "max-frame-size" property * tag 'nios2_fix_v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux: nios2: dts: Fix tse_mac "max-frame-size" property
2023-06-13nios2: dts: Fix tse_mac "max-frame-size" propertyJanne Grunau2-2/+2
The given value of 1518 seems to refer to the layer 2 ethernet frame size without 802.1Q tag. Actual use of the "max-frame-size" including in the consumer of the "altr,tse-1.0" compatible is the MTU. Fixes: 95acd4c7b69c ("nios2: Device tree support") Fixes: 61c610ec61bb ("nios2: Add Max10 device tree") Cc: <stable@vger.kernel.org> Signed-off-by: Janne Grunau <j@jannau.net> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2023-06-13drm/amd/display: limit DPIA link rate to HBR3Peichen Huang1-0/+5
[Why] DPIA doesn't support UHBR, driver should not enable UHBR for dp tunneling [How] limit DPIA link rate to HBR3 Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Peichen Huang <peichen.huang@amd.com> Reviewed-by: Mustapha Ghaddar <Mustapha.Ghaddar@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-13drm/amd/display: fix the system hang while disable PSRTom Chung1-4/+6
[Why] When the PSR enabled. If you try to adjust the timing parameters, it may cause system hang. Because the timing mismatch with the DMCUB settings. [How] Disable the PSR before adjusting timing parameters. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Reviewed-by: Wayne Lin <Wayne.Lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-13drm/amd/display: edp do not add non-edid timingsHersen Wu1-1/+7
[Why] most edp support only timings from edid. applying non-edid timings, especially those timings out of edp bandwidth, may damage edp. [How] do not add non-edid timings for edp. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Hersen Wu <hersenxs.wu@amd.com> Reviewed-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-13Revert "drm/amdgpu: remove TOPDOWN flags when allocating VRAM in large bar ↵Arunpravin Paneer Selvam1-1/+1
system" This reverts commit c105518679b6e87232874ffc989ec403bee59664. This patch disables the TOPDOWN flag for APU and few dGPU cards which has the VRAM size equal to the BAR size. When we enable the TOPDOWN flag, we get the free blocks at the highest available memory region and we don't split the lower order blocks. This change is required to keep off the fragmentation related issues particularly in ASIC which has VRAM space <= 500MiB Hence, we are reverting this patch. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2270 Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-06-13drm/amdgpu: vcn_4_0 set instance 0 init sched score to 1Sonny Jiang1-1/+5
Only vcn0 can process AV1 codecx. In order to use both vcn0 and vcn1 in h264/265 transcode to AV1 cases, set vcn0 sched score to 1 at initialization time. Signed-off-by: Sonny Jiang <sonjiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-06-13drm/radeon: Disable outputs when releasing fbdev clientThomas Zimmermann1-0/+1
Disable the modesetting pipeline before release the radeon's fbdev client. Fixes the following error: [ 17.217408] WARNING: CPU: 5 PID: 1464 at drivers/gpu/drm/ttm/ttm_bo.c:326 ttm_bo_release+0x27e/0x2d0 [ttm] [ 17.217418] Modules linked in: edac_mce_amd radeon(+) drm_ttm_helper ttm video drm_suballoc_helper drm_display_helper kvm irqbypass drm_kms_helper syscopyarea crc32_pclmul sysfillrect sha512_ssse3 sysimgblt sha512_generic cfbfillrect cfbimgblt wmi_bmof aesni_intel cfbcopyarea crypto_simd cryptd k10temp acpi_cpufreq wmi dm_mod [ 17.217432] CPU: 5 PID: 1464 Comm: systemd-udevd Not tainted 6.4.0-rc4+ #1 [ 17.217436] Hardware name: Micro-Star International Co., Ltd. MS-7A38/B450M PRO-VDH MAX (MS-7A38), BIOS B.G0 07/26/2022 [ 17.217438] RIP: 0010:ttm_bo_release+0x27e/0x2d0 [ttm] [ 17.217444] Code: 48 89 43 38 48 89 43 40 48 8b 5c 24 30 48 8b b5 40 08 00 00 48 8b 6c 24 38 48 83 c4 58 e9 7a 49 f7 e0 48 89 ef e9 6c fe ff ff <0f> 0b 48 83 7b 20 00 0f 84 b7 fd ff ff 0f 0b 0f 1f 00 e9 ad fd ff [ 17.217448] RSP: 0018:ffffc9000095fbb0 EFLAGS: 00010202 [ 17.217451] RAX: 0000000000000001 RBX: ffff8881052c8de0 RCX: 0000000000000000 [ 17.217453] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8881052c8de0 [ 17.217455] RBP: ffff888104a66e00 R08: ffff8881052c8de0 R09: ffff888104a7cf08 [ 17.217457] R10: ffffc9000095fbe0 R11: ffffc9000095fbe8 R12: ffff8881052c8c78 [ 17.217458] R13: ffff8881052c8c78 R14: dead000000000100 R15: ffff88810528b108 [ 17.217460] FS: 00007f319fcbb8c0(0000) GS:ffff88881a540000(0000) knlGS:0000000000000000 [ 17.217463] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 17.217464] CR2: 000055dc8b0224a0 CR3: 000000010373d000 CR4: 0000000000750ee0 [ 17.217466] PKRU: 55555554 [ 17.217468] Call Trace: [ 17.217470] <TASK> [ 17.217472] ? __warn+0x97/0x160 [ 17.217476] ? ttm_bo_release+0x27e/0x2d0 [ttm] [ 17.217481] ? report_bug+0x1ec/0x200 [ 17.217487] ? handle_bug+0x3c/0x70 [ 17.217490] ? exc_invalid_op+0x1f/0x90 [ 17.217493] ? preempt_count_sub+0xb5/0x100 [ 17.217496] ? asm_exc_invalid_op+0x16/0x20 [ 17.217500] ? ttm_bo_release+0x27e/0x2d0 [ttm] [ 17.217505] ? ttm_resource_move_to_lru_tail+0x1ab/0x1d0 [ttm] [ 17.217511] radeon_bo_unref+0x1a/0x30 [radeon] [ 17.217547] radeon_gem_object_free+0x20/0x30 [radeon] [ 17.217579] radeon_fbdev_fb_destroy+0x57/0x90 [radeon] [ 17.217616] unregister_framebuffer+0x72/0x110 [ 17.217620] drm_client_dev_unregister+0x6d/0xe0 [ 17.217623] drm_dev_unregister+0x2e/0x90 [ 17.217626] drm_put_dev+0x26/0x90 [ 17.217628] pci_device_remove+0x44/0xc0 [ 17.217631] really_probe+0x257/0x340 [ 17.217635] __driver_probe_device+0x73/0x120 [ 17.217638] driver_probe_device+0x2c/0xb0 [ 17.217641] __driver_attach+0xa0/0x150 [ 17.217643] ? __pfx___driver_attach+0x10/0x10 [ 17.217646] bus_for_each_dev+0x67/0xa0 [ 17.217649] bus_add_driver+0x10e/0x210 [ 17.217651] driver_register+0x5c/0x120 [ 17.217653] ? __pfx_radeon_module_init+0x10/0x10 [radeon] [ 17.217681] do_one_initcall+0x44/0x220 [ 17.217684] ? kmalloc_trace+0x37/0xc0 [ 17.217688] do_init_module+0x64/0x240 [ 17.217691] __do_sys_finit_module+0xb2/0x100 [ 17.217694] do_syscall_64+0x3b/0x90 [ 17.217697] entry_SYSCALL_64_after_hwframe+0x72/0xdc [ 17.217700] RIP: 0033:0x7f319feaa5a9 [ 17.217702] Code: 08 89 e8 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 08 0d 00 f7 d8 64 89 01 48 [ 17.217706] RSP: 002b:00007ffc6bf3e7f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 17.217709] RAX: ffffffffffffffda RBX: 00005607204f3170 RCX: 00007f319feaa5a9 [ 17.217710] RDX: 0000000000000000 RSI: 00007f31a002eefd RDI: 0000000000000018 [ 17.217712] RBP: 00007f31a002eefd R08: 0000000000000000 R09: 00005607204f1860 [ 17.217714] R10: 0000000000000018 R11: 0000000000000246 R12: 0000000000020000 [ 17.217716] R13: 0000000000000000 R14: 0000560720522450 R15: 0000560720255899 [ 17.217718] </TASK> [ 17.217719] ---[ end trace 0000000000000000 ]--- The buffer object backing the fbdev emulation got pinned twice: by the fb_probe helper radeon_fbdev_create_pinned_object() and the modesetting code when the framebuffer got displayed. It only got unpinned once by the fbdev helper radeon_fbdev_destroy_pinned_object(). Hence TTM's BO- release function complains about the pin counter. Forcing the outputs off also undoes the modesettings pin increment. Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Reported-by: Borislav Petkov <bp@alien8.de> Closes: https://lore.kernel.org/dri-devel/20230603174814.GCZHt83pN+wNjf63sC@fat_crate.local/ Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: e317a69fe891 ("drm/radeon: Implement client-based fbdev emulation") Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: amd-gfx@lists.freedesktop.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-13drm/amd/pm: workaround for compute workload type on some skusKenneth Feng1-2/+31
On smu 13.0.0, the compute workload type cannot be set on all the skus due to some other problems. This workaround is to make sure compute workload type can also run on some specific skus. v2: keep the variable consistent Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-06-13drm/amd: Tighten permissions on VBIOS flashing attributesMario Limonciello1-2/+2
Non-root users shouldn't be able to try to trigger a VBIOS flash or query the flashing status. This should be reserved for users with the appropriate permissions. Cc: stable@vger.kernel.org Fixes: 8424f2ccb3c0 ("drm/amdgpu/psp: Add vbflash sysfs interface support") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-13drm/amd: Make sure image is written to trigger VBIOS image update flowMario Limonciello1-0/+3
The VBIOS image update flow requires userspace to: 1) Write the image to `psp_vbflash` 2) Read `psp_vbflash` 3) Poll `psp_vbflash_status` to check for completion If userspace reads `psp_vbflash` before writing an image, it's possible that it causes problems that can put the dGPU into an invalid state. Explicitly check that an image has been written before letting a read succeed. Cc: stable@vger.kernel.org Fixes: 8424f2ccb3c0 ("drm/amdgpu/psp: Add vbflash sysfs interface support") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-13drm/amdgpu: add missing radeon secondary PCI IDAlex Deucher1-0/+1
0x5b70 is a missing RV370 secondary id. Add it so we don't try and probe it with amdgpu. Cc: michel@daenzer.net Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Tested-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-06-13drm/amdgpu: Implement gfx9 patch functions for resubmissionJiadong Zhu1-0/+80
Patch the packages including CONTEXT_CONTROL and WRITE_DATA for gfx9 during the resubmission scenario. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.3.x
2023-06-13drm/amdgpu: Modify indirect buffer packages for resubmissionJiadong Zhu4-0/+102
When the preempted IB frame resubmitted to cp, we need to modify the frame data including: 1. set PRE_RESUME 1 in CONTEXT_CONTROL. 2. use meta data(DE and CE) read from CSA in WRITE_DATA. Add functions to save the location the first time IBs emitted and callback to patch the package when resubmission happens. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.3.x
2023-06-13drm/amdgpu: Program gds backup address as zero if no gds allocatedJiadong Zhu1-5/+8
It is firmware requirement to set gds_backup_addrlo and gds_backup_addrhi of DE meta both zero if no gds partition is allocated for the frame. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.3.x
2023-06-13drm/nouveau: add nv_encoder pointer check for NULLNatalia Petrova1-1/+2
Pointer nv_encoder could be dereferenced at nouveau_connector.c in case it's equal to NULL by jumping to goto label. This patch adds a NULL-check to avoid it. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 3195c5f9784a ("drm/nouveau: set encoder for lvds") Signed-off-by: Natalia Petrova <n.petrova@fintech.ru> Reviewed-by: Lyude Paul <lyude@redhat.com> [Fixed patch title] Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230512103320.82234-1-n.petrova@fintech.ru
2023-06-13drm/amdgpu: Reset CP_VMID_PREEMPT after trailing fence signaledJiadong Zhu1-4/+4
When MEC executes unmap_queue for mid command buffer preemption, it will kick the write pointer of the gfx ring, set CP_VMID_PREEMPT to trigger the preemption and wait for CP_VMID_PREEMPT becomes zero after the preemption done. There is a race condition that PFP may excute the resetting command before MEC set CP_VMID_PREEMPT. As a result, hang happens as CP_VMID_PREEMPT is always 0xffff. To avoid this, we send resetting CP_VMID_PREEMPT command after the trailing fence is siganled and update gfx write pointer explicitly. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.3.x Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2535
2023-06-13drm/nouveau/dp: check for NULL nv_connector->native_modeNatalia Petrova1-2/+2
Add checking for NULL before calling nouveau_connector_detect_depth() in nouveau_connector_get_modes() function because nv_connector->native_mode could be dereferenced there since connector pointer passed to nouveau_connector_detect_depth() and the same value of nv_connector->native_mode is used there. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: d4c2c99bdc83 ("drm/nouveau/dp: remove broken display depth function, use the improved one") Signed-off-by: Natalia Petrova <n.petrova@fintech.ru> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230512111526.82408-1-n.petrova@fintech.ru
2023-06-13spi: dw: Replace incorrect spi_get_chipselect with setAbe Kohandel1-1/+1
Commit 445164e8c136 ("spi: dw: Replace spi->chip_select references with function calls") replaced direct access to spi.chip_select with spi_*_chipselect calls but incorrectly replaced a set instance with a get instance, replace the incorrect instance. Fixes: 445164e8c136 ("spi: dw: Replace spi->chip_select references with function calls") Signed-off-by: Abe Kohandel <abe.kohandel@intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Serge Semin <fancer.lancer@gmail.com> Link: https://lore.kernel.org/r/20230613162103.569812-1-abe.kohandel@intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-06-13drm/bridge: ti-sn65dsi86: Avoid possible buffer overflowSu Hui1-0/+4
Smatch error:buffer overflow 'ti_sn_bridge_refclk_lut' 5 <= 5. Fixes: cea86c5bb442 ("drm/bridge: ti-sn65dsi86: Implement the pwm_chip") Signed-off-by: Su Hui <suhui@nfschina.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20230608012443.839372-1-suhui@nfschina.com
2023-06-13Merge tag 'devicetree-fixes-for-6.4-3' of ↵Linus Torvalds12-13/+15
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: - Fix missing of_node_put() in init_overlay_changeset() - Fix schema for qcom,pmic-mpp "qcom,paired" property - Fix 'additionalProperties' in silvaco,i3c-master binding - usage-model.rst: Use documented "arm,primecell" compatible string - Update Damien Le Moal's email address - Fixes in Realtek Bluetooth binding * tag 'devicetree-fixes-for-6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: dt-bindings: pinctrl: qcom,pmic-mpp: Fix schema for "qcom,paired" dt-bindings: i3c: silvaco,i3c-master: fix missing schema restriction of: overlay: Fix missing of_node_put() in error case of init_overlay_changeset() docs: zh_CN/devicetree: sync usage-model fix docs: dt: fix documented Primecell compatible string dt-bindings: Change Damien Le Moal's contact email dt-bindings: net: realtek-bluetooth: Fix double RTL8723CS in desc dt-bindings: net: realtek-bluetooth: Fix RTL8821CS binding
2023-06-13dt-bindings: pinctrl: qcom,pmic-mpp: Fix schema for "qcom,paired"Rob Herring1-2/+3
The "qcom,paired" schema is all wrong. First, it's a list rather than an object(dictionary). Second, it is missing a required type. The meta-schema normally catches this, but schemas under "$defs" was not getting checked. A fix for that is pending. Fixes: f9a06b810951 ("dt-bindings: pinctrl: qcom,pmic-mpp: Convert qcom pmic mpp bindings to YAML") Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20230418150606.1528107-1-robh@kernel.org Signed-off-by: Rob Herring <robh@kernel.org>
2023-06-13regmap: regcache: Don't sync read-only registersTakashi Iwai1-0/+3
regcache_maple_sync() tries to sync all cached values no matter whether it's writable or not. OTOH, regache_sync_val() does care the wrtability and returns -EIO for a read-only register. This results in an error message like: snd_hda_codec_realtek hdaudioC0D0: Unable to sync register 0x2f0009. -5 and the sync loop is aborted incompletely. This patch adds the writable register check to regcache_sync_val() for addressing the bug above. Note that, although we may add the check in the caller side (regcache_maple_sync()), here we put in regcache_sync_val(), so that a similar case like this can be avoided in future. Fixes: f033c26de5a5 ("regmap: Add maple tree based register cache") Link: https://lore.kernel.org/r/877cs7g6f1.wl-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://lore.kernel.org/r/20230613112240.3361-1-tiwai@suse.de Signed-off-by: Mark Brown <broonie@kernel.org>
2023-06-13ASoC: tegra: Fix Master Volume ControlJon Hunter1-0/+3
Commit 3ed2b549b39f ("ALSA: pcm: fix wait_time calculations") corrected the PCM wait_time calculations and in doing so reduced the calculated wait_time. This exposed an issue with the Tegra Master Volume Control (MVC) device where the reduced wait_time caused the MVC to fail. For now fix this by setting the default wait_time for Tegra to be 500ms. Fixes: 3ed2b549b39f ("ALSA: pcm: fix wait_time calculations") Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Link: https://lore.kernel.org/r/20230613093453.13927-1-jonathanh@nvidia.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-06-13drm/nouveau: don't detect DSM for non-NVIDIA deviceRatchanan Srirattanamet1-0/+3
The call site of nouveau_dsm_pci_probe() uses single set of output variables for all invocations. So, we must not write anything to them unless it's an NVIDIA device. Otherwise, if we are called with another device after the NVIDIA device, we'll clober the result of the NVIDIA device. For example, if the other device doesn't have _PR3 resources, the detection later would miss the presence of power resource support, and the rest of the code will keep using Optimus DSM, breaking power management for that machine. Also, because we're detecting NVIDIA's DSM, it doesn't make sense to run this detection on a non-NVIDIA device anyway. Thus, check at the beginning of the detection code if this is an NVIDIA card, and just return if it isn't. This, together with commit d22915d22ded ("drm/nouveau/devinit/tu102-: wait for GFW_BOOT_PROGRESS == COMPLETED") developed independently and landed earlier, fixes runtime power management of the NVIDIA card in Lenovo Legion 5-15ARH05. Without this patch, the GPU resumption code will "timeout", sometimes hanging userspace. As a bonus, we'll also stop preventing _PR3 usage from the bridge for unrelated devices, which is always nice, I guess. Fixes: ccfc2d5cdb02 ("drm/nouveau: Use generic helper to check _PR3 presence") Signed-off-by: Ratchanan Srirattanamet <peathot@hotmail.com> Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/79 Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/DM6PR19MB2780805D4BE1E3F9B3AC96D0BC409@DM6PR19MB2780.namprd19.prod.outlook.com
2023-06-12Merge branch 'mptcp-fixes'Jakub Kicinski2-185/+354
Matthieu Baerts says: ==================== selftests: mptcp: skip tests not supported by old kernels (part 3) After a few years of increasing test coverage in the MPTCP selftests, we realised [1] the last version of the selftests is supposed to run on old kernels without issues. Supporting older versions is not that easy for this MPTCP case: these selftests are often validating the internals by checking packets that are exchanged, when some MIB counters are incremented after some actions, how connections are getting opened and closed in some cases, etc. In other words, it is not limited to the socket interface between the userspace and the kernelspace. In addition to that, the current MPTCP selftests run a lot of different sub-tests but the TAP13 protocol used in the selftests don't support sub-tests: one failure in sub-tests implies that the whole selftest is seen as failed at the end because sub-tests are not tracked. It is then important to skip sub-tests not supported by old kernels. To minimise the modifications and reduce the complexity to support old versions, the idea is to look at external signs and skip the whole selftest or just some sub-tests before starting them. This cannot be applied in all cases. Similar to the second part, this third one focuses on marking different sub-tests as skipped if some MPTCP features are not supported. This time, only in "mptcp_join.sh" selftest, the remaining one, is modified. Several techniques are used here to achieve this task: - Before starting some tests: - Check if a file (sysctl knob) is present: that's what patch 12/17 is doing for the userspace PM feature. - Check if a required kernel symbol is present in /proc/kallsyms: patches 9, 10, 14 and 15/17 are using this technique. - Check if it is possible to setup a particular network environment requiring Netfilter or TC: if the preparation step fail, the linked sub-test is marked as skipped. Patch 5/17 is doing that. - Check if a MIB counter is available: patches 7 and 13/17 do that. - Check if the kernel version is newer than a specific one: patch 1/17 adds some helpers in mptcp_lib.sh to ease its use. That's not ideal and it is only used as last resort but as mentioned above, it is important to skip tests if they are not supported not to have the whole selftest always being marked as failed on old kernels. Patches 11 and 17/17 are checking the kernel version. An alternative would be to ignore the results for some sub-tests but that's not ideal too. Note that SELFTESTS_MPTCP_LIB_NO_KVERSION_CHECK env var can be set to 1 not to skip these tests if the running kernel doesn't have a supported version. - After having launched the tests: - Adapt the expectations depending on the presence of a kernel symbol (patch 6/17) or a kernel version (patch 8/17). - Check is a MIB counter is available and skip the verification if not. Patch 4/17 is using this technique. Before skipping tests, SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var value is checked: if it is set to 1, the test is marked as "failed" instead of "skipped". MPTCP public CI expects to have all features supported and it sets this env var to 1 to catch regressions in these new checks. Patch 2/17 uses 'iptables-legacy' if available because it might be needed when using an older kernel not supporting iptables-nft. Patch 3/17 adds some helpers used in the other patches mentioned to easily mark sub-tests as skipped. Patch 16/17 uniforms MPTCP Join "listener" tests: it was imported code from userspace_pm.sh but without using the "code style" and ways of using tools and printing messages from MPTCP Join selftest. Link: https://lore.kernel.org/stable/CA+G9fYtDGpgT4dckXD-y-N92nqUxuvue_7AtDdBcHrbOMsDZLg@mail.gmail.com/ [1] Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 ==================== Link: https://lore.kernel.org/r/20230609-upstream-net-20230610-mptcp-selftests-support-old-kernels-part-3-v1-0-2896fe2ee8a3@tessares.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12selftests: mptcp: join: skip mixed tests if not supportedMatthieu Baerts1-4/+8
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the support of a mix of subflows in v4 and v6 by the in-kernel PM introduced by commit b9d69db87fb7 ("mptcp: let the in-kernel PM use mixed IPv4 and IPv6 addresses"). It looks like there is no external sign we can use to predict the expected behaviour. Instead of accepting different behaviours and thus not really checking for the expected behaviour, we are looking here for a specific kernel version. That's not ideal but it looks better than removing the test because it cannot support older kernel versions. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: ad3493746ebe ("selftests: mptcp: add test-cases for mixed v4/v6 subflows") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12selftests: mptcp: join: uniform listener testsMatthieu Baerts1-16/+14
The alignment was different from the other tests because tabs were used instead of spaces. While at it, also use 'echo' instead of 'printf' to print the result to keep the same style as done in the other sub-tests. And, even if it should be better with, also remove 'stdbuf' and sed's '--unbuffered' option because they are not used in the other subtests and they are not available when using a minimal environment with busybox. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 178d023208eb ("selftests: mptcp: listener test for in-kernel PM") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12selftests: mptcp: join: skip PM listener tests if not supportedMatthieu Baerts1-0/+5
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the support of PM listener events introduced by commit f8c9dfbd875b ("mptcp: add pm listener events"). It is possible to look for "mptcp_event_pm_listener" in kallsyms to know in advance if the kernel supports this feature. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 178d023208eb ("selftests: mptcp: listener test for in-kernel PM") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12selftests: mptcp: join: skip MPC backups tests if not supportedMatthieu Baerts1-4/+8
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the support of sending an MP_PRIO signal for the initial subflow, introduced by commit c157bbe776b7 ("mptcp: allow the in kernel PM to set MPC subflow priority"). It is possible to look for "mptcp_subflow_send_ack" in kallsyms because it was needed to introduce the mentioned feature. So we can know in advance if the feature is supported instead of trying and accepting any results. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 914f6a59b10f ("selftests: mptcp: add MPC backup tests") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12selftests: mptcp: join: skip fail tests if not supportedMatthieu Baerts1-1/+1
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the support of the MP_FAIL / infinite mapping introduced by commit 1e39e5a32ad7 ("mptcp: infinite mapping sending") and the following ones. It is possible to look for one of the infinite mapping counters to know in advance if the this feature is available. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: b6e074e171bc ("selftests: mptcp: add infinite map testcase") Cc: stable@vger.kernel.org Fixes: 2ba18161d407 ("selftests: mptcp: add MP_FAIL reset testcase") Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12selftests: mptcp: join: skip userspace PM tests if not supportedMatthieu Baerts1-9/+17
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the support of the userspace PM introduced by commit 4638de5aefe5 ("mptcp: handle local addrs announced by userspace PMs") and the following ones. It is possible to look for the MPTCP pm_type's sysctl knob to know in advance if the userspace PM is available. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 5ac1d2d63451 ("selftests: mptcp: Add tests for userspace PM type") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12selftests: mptcp: join: skip fullmesh flag tests if not supportedMatthieu Baerts1-4/+8
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the support of the fullmesh flag for the in-kernel PM introduced by commit 2843ff6f36db ("mptcp: remote addresses fullmesh") and commit 1a0d6136c5f0 ("mptcp: local addresses fullmesh"). It looks like there is no easy external sign we can use to predict the expected behaviour. We could add the flag and then check if it has been added but for that, and for each fullmesh test, we would need to setup a new environment, do the checks, clean it and then only start the test from yet another clean environment. To keep it simple and avoid introducing new issues, we look for a specific kernel version. That's not ideal but an acceptable solution for this case. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 6a0653b96f5d ("selftests: mptcp: add fullmesh setting tests") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12selftests: mptcp: join: skip backup if set flag on ID not supportedMatthieu Baerts1-3/+6
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. Commit bccefb762439 ("selftests: mptcp: simplify pm_nl_change_endpoint") has simplified the way the backup flag is set on an endpoint. Instead of doing: ./pm_nl_ctl set 10.0.2.1 flags backup Now we do: ./pm_nl_ctl set id 1 flags backup The new way is easier to maintain but it is also incompatible with older kernels not supporting the implicit endpoints putting in place the infrastructure to set flags per ID, hence the second Fixes tag. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: bccefb762439 ("selftests: mptcp: simplify pm_nl_change_endpoint") Cc: stable@vger.kernel.org Fixes: 4cf86ae84c71 ("mptcp: strict local address ID selection") Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12selftests: mptcp: join: skip implicit tests if not supportedMatthieu Baerts1-2/+5
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the support of the implicit endpoints introduced by commit d045b9eb95a9 ("mptcp: introduce implicit endpoints"). It is possible to look for "mptcp_subflow_send_ack" in kallsyms because it was needed to introduce the mentioned feature. So we can know in advance if the feature is supported instead of trying and accepting any results. Note that here and in the following commits, we re-do the same check for each sub-test of the same function for a few reasons. The main one is not to break the ID assign to each test in order to be able to easily compare results between different kernel versions. Also, we can still run a specific test even if it is skipped. Another reason is that it makes it clear during the review that a specific subtest will be skipped or not under certain conditions. At the end, it looks OK to call the exact same helper multiple times: it is not a critical path and it is the same code that is executed, not really more cases to maintain. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 69c6ce7b6eca ("selftests: mptcp: add implicit endpoint test case") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12selftests: mptcp: join: support RM_ADDR for used endpoints or notMatthieu Baerts1-1/+6
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. At some points, a new feature caused internal behaviour changes we are verifying in the selftests, see the Fixes tag below. It was not a UAPI change but because in these selftests, we check some internal behaviours, it is normal we have to adapt them from time to time after having added some features. It looks like there is no external sign we can use to predict the expected behaviour. Instead of accepting different behaviours and thus not really checking for the expected behaviour, we are looking here for a specific kernel version. That's not ideal but it looks better than removing the test because it cannot support older kernel versions. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 6fa0174a7c86 ("mptcp: more careful RM_ADDR generation") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12selftests: mptcp: join: skip Fastclose tests if not supportedMatthieu Baerts1-2/+15
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the support of MP_FASTCLOSE introduced in commit f284c0c77321 ("mptcp: implement fastclose xmit path"). If the MIB counter is not available, the test cannot be verified and the behaviour will not be the expected one. So we can skip the test if the counter is missing. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 01542c9bf9ab ("selftests: mptcp: add fastclose testcase") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12selftests: mptcp: join: support local endpoint being tracked or notMatthieu Baerts1-4/+11
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. At some points, a new feature caused internal behaviour changes we are verifying in the selftests, see the Fixes tag below. It was not a uAPI change but because in these selftests, we check some internal behaviours, it is normal we have to adapt them from time to time after having added some features. It is possible to look for "mptcp_pm_subflow_check_next" in kallsyms because it was needed to introduce the mentioned feature. So we can know in advance what the behaviour we are expecting here instead of supporting the two behaviours. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 86e39e04482b ("mptcp: keep track of local endpoint still available for each msk") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12selftests: mptcp: join: skip test if iptables/tc cmds failMatthieu Baerts1-31/+57
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. Some tests are using IPTables and/or TC commands to force some behaviours. If one of these commands fails -- likely because some features are not available due to missing kernel config -- we should intercept the error and skip the tests requiring these features. Note that if we expect to have these features available and if SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var is set to 1, the tests will be marked as failed instead of skipped. This patch also replaces the 'exit 1' by 'return 1' not to stop the selftest in the middle without the conclusion if there is an issue with NF or TC. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 8d014eaa9254 ("selftests: mptcp: add ADD_ADDR timeout test case") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12selftests: mptcp: join: skip check if MIB counter not supportedMatthieu Baerts1-100/+130
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the MPTCP MIB counters introduced in commit fc518953bc9c ("mptcp: add and use MIB counter infrastructure") and more later. The MPTCP Join selftest heavily relies on these counters. If a counter is not supported by the kernel, it is not displayed when using 'nstat -z'. We can then detect that and skip the verification. A new helper (get_counter()) has been added to do the required checks and return an error if the counter is not available. Note that if we expect to have these features available and if SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var is set to 1, the tests will be marked as failed instead of skipped. This new helper also makes sure we get the exact counter we want to avoid issues we had in the past, e.g. with MPTcpExtRmAddr and MPTcpExtRmAddrDrop sharing the same prefix. While at it, we uniform the way we fetch a MIB counter. Note for the backports: we rarely change these modified blocks so if there is are conflicts, it is very likely because a counter is not used in the older kernels and we don't need that chunk. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: b08fbf241064 ("selftests: add test-cases for MPTCP MP_JOIN") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12selftests: mptcp: join: helpers to skip testsMatthieu Baerts1-0/+27
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. Here are some helpers that will be used to mark subtests as skipped if a feature is not supported. Marking as a fix for the commit introducing this selftest to help with the backports. While at it, also check if kallsyms feature is available as it will also be used in the following commits to check if MPTCP features are available before starting a test. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: b08fbf241064 ("selftests: add test-cases for MPTCP MP_JOIN") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12selftests: mptcp: join: use 'iptables-legacy' if availableMatthieu Baerts1-6/+12
IPTables commands using 'iptables-nft' fail on old kernels, at least 5.15 because it doesn't see the default IPTables chains: $ iptables -L iptables/1.8.2 Failed to initialize nft: Protocol not supported As a first step before switching to NFTables, we can use iptables-legacy if available. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 8d014eaa9254 ("selftests: mptcp: add ADD_ADDR timeout test case") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12selftests: mptcp: lib: skip if not below kernel versionMatthieu Baerts1-0/+26
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. A new function is now available to easily detect if a feature is missing by looking at the kernel version. That's clearly not ideal and this kind of check should be avoided as soon as possible. But sometimes, there are no external sign that a "feature" is available or not: internal behaviours can change without modifying the uAPI and these selftests are verifying the internal behaviours. Sometimes, the only (easy) way to verify if the feature is present is to run the test but then the validation cannot determine if there is a failure with the feature or if the feature is missing. Then it looks better to check the kernel version instead of having tests that can never fail. In any case, we need a solution not to have a whole selftest being marked as failed just because one sub-test has failed. Note that this env var car be set to 1 not to do such check and run the linked sub-test: SELFTESTS_MPTCP_LIB_NO_KVERSION_CHECK. This new helper is going to be used in the following commits. In order to ease the backport of such future patches, it would be good if this patch is backported up to the introduction of MPTCP selftests, hence the Fixes tag below: this type of check was supposed to be done from the beginning. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 048d19d444be ("mptcp: add basic kselftest for mptcp") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12Merge branch 'fixes-for-q-usgmii-speeds-and-autoneg'Jakub Kicinski1-2/+39
Maxime Chevallier says: ==================== fixes for Q-USGMII speeds and autoneg This is the second version of a small changeset for QUSGMII support, fixing inconsistencies in reported max speed and control word parsing. As reported here [1], there are some inconsistencies for the Q-USGMII mode speeds and configuration. The first patch in this fixup series makes so that we correctly report the max speed of 1Gbps for this mode. The second patch uses a dedicated helper to decode the control word. This is necessary as although USGMII control words are close to USXGMII, they don't support the same speeds. [1] : https://lore.kernel.org/netdev/ZHnd+6FUO77XFJvQ@shell.armlinux.org.uk/ ==================== Link: https://lore.kernel.org/r/20230609080305.546028-1-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12net: phylink: use a dedicated helper to parse usgmii control wordMaxime Chevallier1-1/+38
Q-USGMII is a derivative of USGMII, that uses a specific formatting for the control word. The layout is close to the USXGMII control word, but doesn't support speeds over 1Gbps. Use a dedicated decoding logic for the USGMII control word, re-using USXGMII definitions but only considering 10/100/1000Mbps speeds Fixes: 5e61fe157a27 ("net: phy: Introduce QUSGMII PHY mode") Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12net: phylink: report correct max speed for QUSGMIIMaxime Chevallier1-1/+1
Q-USGMII is the quad port version of USGMII, and supports a max speed of 1Gbps on each line. Make so that phylink_interface_max_speed() reports this information correctly. Fixes: ae0e4bb2a0e0 ("net: phylink: Adjust link settings based on rate matching") Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13btrfs: do not ASSERT() on duplicated global rootsQu Wenruo1-2/+8
[BUG] Syzbot reports a reproducible ASSERT() when using rescue=usebackuproot mount option on a corrupted fs. The full report can be found here: https://syzkaller.appspot.com/bug?extid=c4614eae20a166c25bf0 BTRFS error (device loop0: state C): failed to load root csum assertion failed: !tmp, in fs/btrfs/disk-io.c:1103 ------------[ cut here ]------------ kernel BUG at fs/btrfs/ctree.h:3664! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 3608 Comm: syz-executor356 Not tainted 6.0.0-rc7-syzkaller-00029-g3800a713b607 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022 RIP: 0010:assertfail+0x1a/0x1c fs/btrfs/ctree.h:3663 RSP: 0018:ffffc90003aaf250 EFLAGS: 00010246 RAX: 0000000000000032 RBX: 0000000000000000 RCX: f21c13f886638400 RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000 RBP: ffff888021c640a0 R08: ffffffff816bd38d R09: ffffed10173667f1 R10: ffffed10173667f1 R11: 1ffff110173667f0 R12: dffffc0000000000 R13: ffff8880229c21f7 R14: ffff888021c64060 R15: ffff8880226c0000 FS: 0000555556a73300(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055a2637d7a00 CR3: 00000000709c4000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> btrfs_global_root_insert+0x1a7/0x1b0 fs/btrfs/disk-io.c:1103 load_global_roots_objectid+0x482/0x8c0 fs/btrfs/disk-io.c:2467 load_global_roots fs/btrfs/disk-io.c:2501 [inline] btrfs_read_roots fs/btrfs/disk-io.c:2528 [inline] init_tree_roots+0xccb/0x203c fs/btrfs/disk-io.c:2939 open_ctree+0x1e53/0x33df fs/btrfs/disk-io.c:3574 btrfs_fill_super+0x1c6/0x2d0 fs/btrfs/super.c:1456 btrfs_mount_root+0x885/0x9a0 fs/btrfs/super.c:1824 legacy_get_tree+0xea/0x180 fs/fs_context.c:610 vfs_get_tree+0x88/0x270 fs/super.c:1530 fc_mount fs/namespace.c:1043 [inline] vfs_kern_mount+0xc9/0x160 fs/namespace.c:1073 btrfs_mount+0x3d3/0xbb0 fs/btrfs/super.c:1884 [CAUSE] Since the introduction of global roots, we handle csum/extent/free-space-tree roots as global roots, even if no extent-tree-v2 feature is enabled. So for regular csum/extent/fst roots, we load them into fs_info::global_root_tree rb tree. And we should not expect any conflicts in that rb tree, thus we have an ASSERT() inside btrfs_global_root_insert(). But rescue=usebackuproot can break the assumption, as we will try to load those trees again and again as long as we have bad roots and have backup roots slot remaining. So in that case we can have conflicting roots in the rb tree, and triggering the ASSERT() crash. [FIX] We can safely remove that ASSERT(), as the caller will properly put the offending root. To make further debugging easier, also add two explicit error messages: - Error message for conflicting global roots - Error message when using backup roots slot Reported-by: syzbot+a694851c6ab28cbcfb9c@syzkaller.appspotmail.com Fixes: abed4aaae4f7 ("btrfs: track the csum, extent, and free space trees in a rb tree") CC: stable@vger.kernel.org # 6.1+ Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-12Merge tag 'mm-hotfixes-stable-2023-06-12-12-22' of ↵Linus Torvalds21-29/+161
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "19 hotfixes. 14 are cc:stable and the remainder address issues which were introduced during this development cycle or which were considered inappropriate for a backport" * tag 'mm-hotfixes-stable-2023-06-12-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: zswap: do not shrink if cgroup may not zswap page cache: fix page_cache_next/prev_miss off by one ocfs2: check new file size on fallocate call mailmap: add entry for John Keeping mm/damon/core: fix divide error in damon_nr_accesses_to_accesses_bp() epoll: ep_autoremove_wake_function should use list_del_init_careful mm/gup_test: fix ioctl fail for compat task nilfs2: reject devices with insufficient block count ocfs2: fix use-after-free when unmounting read-only filesystem lib/test_vmalloc.c: avoid garbage in page array nilfs2: fix possible out-of-bounds segment allocation in resize ioctl riscv/purgatory: remove PGO flags powerpc/purgatory: remove PGO flags x86/purgatory: remove PGO flags kexec: support purgatories with .text.hot sections mm/uffd: allow vma to merge as much as possible mm/uffd: fix vma operation where start addr cuts part of vma radix-tree: move declarations to header nilfs2: fix incomplete buffer cleanup in nilfs_btnode_abort_change_key()
2023-06-13btrfs: can_nocow_file_extent should pass down args->strict from callersChris Mason1-1/+1
Commit 619104ba453ad0 ("btrfs: move common NOCOW checks against a file extent into a helper") changed our call to btrfs_cross_ref_exist() to always pass false for the 'strict' parameter. We're passing this down through the stack so that we can do a full check for cross references during swapfile activation. With strict always false, this test fails: btrfs subvol create swappy chattr +C swappy fallocate -l1G swappy/swapfile chmod 600 swappy/swapfile mkswap swappy/swapfile btrfs subvol snap swappy swapsnap btrfs subvol del -C swapsnap btrfs fi sync / sync;sync;sync swapon swappy/swapfile The fix is to just use args->strict, and everyone except swapfile activation is passing false. Fixes: 619104ba453ad0 ("btrfs: move common NOCOW checks against a file extent into a helper") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-13btrfs: fix iomap_begin length for nocow writesChristoph Hellwig1-6/+12
can_nocow_extent can reduce the len passed in, which needs to be propagated to btrfs_dio_iomap_begin so that iomap does not submit more data then is mapped. This problems exists since the btrfs_get_blocks_direct helper was added in commit c5794e51784a ("btrfs: Factor out write portion of btrfs_get_blocks_direct"), but the ordered_extent splitting added in commit b73a6fd1b1ef ("btrfs: split partial dio bios before submit") added a WARN_ON that made a syzkaller test fail. Reported-by: syzbot+ee90502d5c8fd1d0dd93@syzkaller.appspotmail.com Fixes: c5794e51784a ("btrfs: Factor out write portion of btrfs_get_blocks_direct") CC: stable@vger.kernel.org # 6.1+ Tested-by: syzbot+ee90502d5c8fd1d0dd93@syzkaller.appspotmail.com Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-12igb: fix nvm.ops.read() error handlingAleksandr Loktionov1-0/+3
Add error handling into igb_set_eeprom() function, in case nvm.ops.read() fails just quit with error code asap. Fixes: 9d5c824399de ("igb: PCI-Express 82575 Gigabit Ethernet driver") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-06-12igc: Fix possible system crash when loading moduleVinicius Costa Gomes1-0/+3
Guarantee that when probe() is run again, PTM and PCI busmaster will be in the same state as it was if the driver was never loaded. Avoid an i225/i226 hardware issue that PTM requests can be made even though PCI bus mastering is not enabled. These unexpected PTM requests can crash some systems. So, "force" disable PTM and busmastering before removing the driver, so they can be re-enabled in the right order during probe(). This is more like a workaround and should be applicable for i225 and i226, in any platform. Fixes: 1b5d73fb8624 ("igc: Enable PCIe PTM") Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Reviewed-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-06-12igc: Clean the TX buffer and TX descriptor ringMuhammad Husaini Zulkifli1-1/+8
There could be a race condition during link down where interrupt being generated and igc_clean_tx_irq() been called to perform the TX completion. Properly clear the TX buffer/descriptor ring and disable the TX Queue ring in igc_free_tx_resources() to avoid that. Kernel trace: [ 108.237177] Hardware name: Intel Corporation Tiger Lake Client Platform/TigerLake U DDR4 SODIMM RVP, BIOS TGLIFUI1.R00.4204.A00.2105270302 05/27/2021 [ 108.237178] RIP: 0010:refcount_warn_saturate+0x55/0x110 [ 108.242143] RSP: 0018:ffff9e7980003db0 EFLAGS: 00010286 [ 108.245555] Code: 84 bc 00 00 00 c3 cc cc cc cc 85 f6 74 46 80 3d 20 8c 4d 01 00 75 ee 48 c7 c7 88 f4 03 ab c6 05 10 8c 4d 01 01 e8 0b 10 96 ff <0f> 0b c3 cc cc cc cc 80 3d fc 8b 4d 01 00 75 cb 48 c7 c7 b0 f4 03 [ 108.250434] [ 108.250434] RSP: 0018:ffff9e798125f910 EFLAGS: 00010286 [ 108.254358] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 108.259325] [ 108.259325] RAX: 0000000000000000 RBX: ffff8ddb935b8000 RCX: 0000000000000027 [ 108.261868] RDX: ffff8de250a28800 RSI: ffff8de250a1c580 RDI: ffff8de250a1c580 [ 108.265538] RDX: 0000000000000027 RSI: 0000000000000002 RDI: ffff8de250a9c588 [ 108.265539] RBP: ffff8ddb935b8000 R08: ffffffffab2655a0 R09: ffff9e798125f898 [ 108.267914] RBP: ffff8ddb8a5b8d80 R08: 0000005648eba354 R09: 0000000000000000 [ 108.270196] R10: 0000000000000001 R11: 000000002d2d2d2d R12: ffff9e798125f948 [ 108.270197] R13: ffff9e798125fa1c R14: ffff8ddb8a5b8d80 R15: 7fffffffffffffff [ 108.273001] R10: 000000002d2d2d2d R11: 000000002d2d2d2d R12: ffff8ddb8a5b8ed4 [ 108.276410] FS: 00007f605851b740(0000) GS:ffff8de250a80000(0000) knlGS:0000000000000000 [ 108.280597] R13: 00000000000002ac R14: 00000000ffffff99 R15: ffff8ddb92561b80 [ 108.282966] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 108.282967] CR2: 00007f053c039248 CR3: 0000000185850003 CR4: 0000000000f70ee0 [ 108.286206] FS: 0000000000000000(0000) GS:ffff8de250a00000(0000) knlGS:0000000000000000 [ 108.289701] PKRU: 55555554 [ 108.289702] Call Trace: [ 108.289704] <TASK> [ 108.293977] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 108.297562] sock_alloc_send_pskb+0x20c/0x240 [ 108.301494] CR2: 00007f053c03a168 CR3: 0000000184394002 CR4: 0000000000f70ef0 [ 108.301495] PKRU: 55555554 [ 108.306464] __ip_append_data.isra.0+0x96f/0x1040 [ 108.309441] Call Trace: [ 108.309443] ? __pfx_ip_generic_getfrag+0x10/0x10 [ 108.314927] <IRQ> [ 108.314928] sock_wfree+0x1c7/0x1d0 [ 108.318078] ? __pfx_ip_generic_getfrag+0x10/0x10 [ 108.320276] skb_release_head_state+0x32/0x90 [ 108.324812] ip_make_skb+0xf6/0x130 [ 108.327188] skb_release_all+0x16/0x40 [ 108.330775] ? udp_sendmsg+0x9f3/0xcb0 [ 108.332626] napi_consume_skb+0x48/0xf0 [ 108.334134] ? xfrm_lookup_route+0x23/0xb0 [ 108.344285] igc_poll+0x787/0x1620 [igc] [ 108.346659] udp_sendmsg+0x9f3/0xcb0 [ 108.360010] ? ttwu_do_activate+0x40/0x220 [ 108.365237] ? __pfx_ip_generic_getfrag+0x10/0x10 [ 108.366744] ? try_to_wake_up+0x289/0x5e0 [ 108.376987] ? sock_sendmsg+0x81/0x90 [ 108.395698] ? __pfx_process_timeout+0x10/0x10 [ 108.395701] sock_sendmsg+0x81/0x90 [ 108.409052] __napi_poll+0x29/0x1c0 [ 108.414279] ____sys_sendmsg+0x284/0x310 [ 108.419507] net_rx_action+0x257/0x2d0 [ 108.438216] ___sys_sendmsg+0x7c/0xc0 [ 108.439723] __do_softirq+0xc1/0x2a8 [ 108.444950] ? finish_task_switch+0xb4/0x2f0 [ 108.452077] irq_exit_rcu+0xa9/0xd0 [ 108.453584] ? __schedule+0x372/0xd00 [ 108.460713] common_interrupt+0x84/0xa0 [ 108.467840] ? clockevents_program_event+0x95/0x100 [ 108.474968] </IRQ> [ 108.482096] ? do_nanosleep+0x88/0x130 [ 108.489224] <TASK> [ 108.489225] asm_common_interrupt+0x26/0x40 [ 108.496353] ? __rseq_handle_notify_resume+0xa9/0x4f0 [ 108.503478] RIP: 0010:cpu_idle_poll+0x2c/0x100 [ 108.510607] __sys_sendmsg+0x5d/0xb0 [ 108.518687] Code: 05 e1 d9 c8 00 65 8b 15 de 64 85 55 85 c0 7f 57 e8 b9 ef ff ff fb 65 48 8b 1c 25 00 cc 02 00 48 8b 03 a8 08 74 0b eb 1c f3 90 <48> 8b 03 a8 08 75 13 8b 05 77 63 cd 00 85 c0 75 ed e8 ce ec ff ff [ 108.525817] do_syscall_64+0x44/0xa0 [ 108.531563] RSP: 0018:ffffffffab203e70 EFLAGS: 00000202 [ 108.538693] entry_SYSCALL_64_after_hwframe+0x72/0xdc [ 108.546775] [ 108.546777] RIP: 0033:0x7f605862b7f7 [ 108.549495] RAX: 0000000000000001 RBX: ffffffffab20c940 RCX: 000000000000003b [ 108.551955] Code: 0e 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10 [ 108.554068] RDX: 4000000000000000 RSI: 000000002da97f6a RDI: 00000000002b8ff4 [ 108.559816] RSP: 002b:00007ffc99264058 EFLAGS: 00000246 [ 108.564178] RBP: 0000000000000000 R08: 00000000002b8ff4 R09: ffff8ddb01554c80 [ 108.571302] ORIG_RAX: 000000000000002e [ 108.571303] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f605862b7f7 [ 108.574023] R10: 000000000000015b R11: 000000000000000f R12: ffffffffab20c940 [ 108.574024] R13: 0000000000000000 R14: ffff8de26fbeef40 R15: ffffffffab20c940 [ 108.578727] RDX: 0000000000000000 RSI: 00007ffc992640a0 RDI: 0000000000000003 [ 108.578728] RBP: 00007ffc99264110 R08: 0000000000000000 R09: 175f48ad1c3a9c00 [ 108.581187] do_idle+0x62/0x230 [ 108.585890] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc992642d8 [ 108.585891] R13: 00005577814ab2ba R14: 00005577814addf0 R15: 00007f605876d000 [ 108.587920] cpu_startup_entry+0x1d/0x20 [ 108.591422] </TASK> [ 108.596127] rest_init+0xc5/0xd0 [ 108.600490] ---[ end trace 0000000000000000 ]--- Test Setup: DUT: - Change mac address on DUT Side. Ensure NIC not having same MAC Address - Running udp_tai on DUT side. Let udp_tai running throughout the test Example: ./udp_tai -i enp170s0 -P 100000 -p 90 -c 1 -t 0 -u 30004 Host: - Perform link up/down every 5 second. Result: Kernel panic will happen on DUT Side. Fixes: 13b5b7fd6a4a ("igc: Add support for Tx/Rx rings") Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-06-12zswap: do not shrink if cgroup may not zswapNhat Pham1-2/+9
Before storing a page, zswap first checks if the number of stored pages exceeds the limit specified by memory.zswap.max, for each cgroup in the hierarchy. If this limit is reached or exceeded, then zswap shrinking is triggered and short-circuits the store attempt. However, since the zswap's LRU is not memcg-aware, this can create the following pathological behavior: the cgroup whose zswap limit is 0 will evict pages from other cgroups continually, without lowering its own zswap usage. This means the shrinking will continue until the need for swap ceases or the pool becomes empty. As a result of this, we observe a disproportionate amount of zswap writeback and a perpetually small zswap pool in our experiments, even though the pool limit is never hit. More generally, a cgroup might unnecessarily evict pages from other cgroups before we drive the memcg back below its limit. This patch fixes the issue by rejecting zswap store attempt without shrinking the pool when obj_cgroup_may_zswap() returns false. [akpm@linux-foundation.org: fix return of unintialized value] [akpm@linux-foundation.org: s/ENOSPC/ENOMEM/] Link: https://lkml.kernel.org/r/20230530222440.2777700-1-nphamcs@gmail.com Link: https://lkml.kernel.org/r/20230530232435.3097106-1-nphamcs@gmail.com Fixes: f4840ccfca25 ("zswap: memcg accounting") Signed-off-by: Nhat Pham <nphamcs@gmail.com> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Seth Jennings <sjenning@redhat.com> Cc: Vitaly Wool <vitaly.wool@konsulko.com> Cc: Yosry Ahmed <yosryahmed@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12page cache: fix page_cache_next/prev_miss off by oneMike Kravetz1-10/+16
Ackerley Tng reported an issue with hugetlbfs fallocate here[1]. The issue showed up after the conversion of hugetlb page cache lookup code to use page_cache_next_miss. Code in hugetlb fallocate, userfaultfd and GUP is now using page_cache_next_miss to determine if a page is present the page cache. The following statement is used. present = page_cache_next_miss(mapping, index, 1) != index; There are two issues with page_cache_next_miss when used in this way. 1) If the passed value for index is equal to the 'wrap-around' value, the same index will always be returned. This wrap-around value is 0, so 0 will be returned even if page is present at index 0. 2) If there is no gap in the range passed, the last index in the range will be returned. When passed a range of 1 as above, the passed index value will be returned even if the page is present. The end result is the statement above will NEVER indicate a page is present in the cache, even if it is. As noted by Ackerley in [1], users can see this by hugetlb fallocate incorrectly returning EEXIST if pages are already present in the file. In addition, hugetlb pages will not be included in core dumps if they need to be brought in via GUP. userfaultfd UFFDIO_COPY also uses this code and will not notice pages already present in the cache. It may try to allocate a new page and potentially return ENOMEM as opposed to EEXIST. Both page_cache_next_miss and page_cache_prev_miss have similar issues. Fix by: - Check for index equal to 'wrap-around' value and do not exit early. - If no gap is found in range, return index outside range. - Update function description to say 'wrap-around' value could be returned if passed as index. [1] https://lore.kernel.org/linux-mm/cover.1683069252.git.ackerleytng@google.com/ Link: https://lkml.kernel.org/r/20230602225747.103865-2-mike.kravetz@oracle.com Fixes: d0ce0e47b323 ("mm/hugetlb: convert hugetlb fault paths to use alloc_hugetlb_folio()") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reported-by: Ackerley Tng <ackerleytng@google.com> Reviewed-by: Ackerley Tng <ackerleytng@google.com> Tested-by: Ackerley Tng <ackerleytng@google.com> Cc: Erdem Aktas <erdemaktas@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Vishal Annapurve <vannapurve@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12ocfs2: check new file size on fallocate callLuís Henriques1-1/+7
When changing a file size with fallocate() the new size isn't being checked. In particular, the FSIZE ulimit isn't being checked, which makes fstest generic/228 fail. Simply adding a call to inode_newsize_ok() fixes this issue. Link: https://lkml.kernel.org/r/20230529152645.32680-1-lhenriques@suse.de Signed-off-by: Luís Henriques <lhenriques@suse.de> Reviewed-by: Mark Fasheh <mark@fasheh.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12mailmap: add entry for John KeepingJohn Keeping1-0/+1
Map my corporate address to my personal one, as I am leaving the company. Link: https://lkml.kernel.org/r/20230531144839.1157112-1-john@keeping.me.uk Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12mm/damon/core: fix divide error in damon_nr_accesses_to_accesses_bp()Kefeng Wang1-0/+2
If 'aggr_interval' is smaller than 'sample_interval', max_nr_accesses in damon_nr_accesses_to_accesses_bp() becomes zero which leads to divide error, let's validate the values of them in damon_set_attrs() to fix it, which similar to others attrs check. Link: https://lkml.kernel.org/r/20230527032101.167788-1-wangkefeng.wang@huawei.com Fixes: 2f5bef5a590b ("mm/damon/core: update monitoring results for new monitoring attributes") Reported-by: syzbot+841a46899768ec7bec67@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=841a46899768ec7bec67 Link: https://lore.kernel.org/damon/00000000000055fc4e05fc975bc2@google.com/ Reviewed-by: SeongJae Park <sj@kernel.org> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12epoll: ep_autoremove_wake_function should use list_del_init_carefulBenjamin Segall1-1/+5
autoremove_wake_function uses list_del_init_careful, so should epoll's more aggressive variant. It only doesn't because it was copied from an older wait.c rather than the most recent. [bsegall@google.com: add comment] Link: https://lkml.kernel.org/r/xm26bki0ulsr.fsf_-_@google.com Link: https://lkml.kernel.org/r/xm26pm6hvfer.fsf@google.com Fixes: a16ceb139610 ("epoll: autoremove wakers even more aggressively") Signed-off-by: Ben Segall <bsegall@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12mm/gup_test: fix ioctl fail for compat taskHaibo Li1-0/+1
When tools/testing/selftests/mm/gup_test.c is compiled as 32bit, then run on arm64 kernel, it reports "ioctl: Inappropriate ioctl for device". Fix it by filling compat_ioctl in gup_test_fops Link: https://lkml.kernel.org/r/20230526022125.175728-1-haibo.li@mediatek.com Signed-off-by: Haibo Li <haibo.li@mediatek.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12nilfs2: reject devices with insufficient block countRyusuke Konishi1-1/+42
The current sanity check for nilfs2 geometry information lacks checks for the number of segments stored in superblocks, so even for device images that have been destructively truncated or have an unusually high number of segments, the mount operation may succeed. This causes out-of-bounds block I/O on file system block reads or log writes to the segments, the latter in particular causing "a_ops->writepages" to repeatedly fail, resulting in sync_inodes_sb() to hang. Fix this issue by checking the number of segments stored in the superblock and avoiding mounting devices that can cause out-of-bounds accesses. To eliminate the possibility of overflow when calculating the number of blocks required for the device from the number of segments, this also adds a helper function to calculate the upper bound on the number of segments and inserts a check using it. Link: https://lkml.kernel.org/r/20230526021332.3431-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+7d50f1e54a12ba3aeae2@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=7d50f1e54a12ba3aeae2 Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12ocfs2: fix use-after-free when unmounting read-only filesystemLuís Henriques1-2/+4
It's trivial to trigger a use-after-free bug in the ocfs2 quotas code using fstest generic/452. After a read-only remount, quotas are suspended and ocfs2_mem_dqinfo is freed through ->ocfs2_local_free_info(). When unmounting the filesystem, an UAF access to the oinfo will eventually cause a crash. BUG: KASAN: slab-use-after-free in timer_delete+0x54/0xc0 Read of size 8 at addr ffff8880389a8208 by task umount/669 ... Call Trace: <TASK> ... timer_delete+0x54/0xc0 try_to_grab_pending+0x31/0x230 __cancel_work_timer+0x6c/0x270 ocfs2_disable_quotas.isra.0+0x3e/0xf0 [ocfs2] ocfs2_dismount_volume+0xdd/0x450 [ocfs2] generic_shutdown_super+0xaa/0x280 kill_block_super+0x46/0x70 deactivate_locked_super+0x4d/0xb0 cleanup_mnt+0x135/0x1f0 ... </TASK> Allocated by task 632: kasan_save_stack+0x1c/0x40 kasan_set_track+0x21/0x30 __kasan_kmalloc+0x8b/0x90 ocfs2_local_read_info+0xe3/0x9a0 [ocfs2] dquot_load_quota_sb+0x34b/0x680 dquot_load_quota_inode+0xfe/0x1a0 ocfs2_enable_quotas+0x190/0x2f0 [ocfs2] ocfs2_fill_super+0x14ef/0x2120 [ocfs2] mount_bdev+0x1be/0x200 legacy_get_tree+0x6c/0xb0 vfs_get_tree+0x3e/0x110 path_mount+0xa90/0xe10 __x64_sys_mount+0x16f/0x1a0 do_syscall_64+0x43/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc Freed by task 650: kasan_save_stack+0x1c/0x40 kasan_set_track+0x21/0x30 kasan_save_free_info+0x2a/0x50 __kasan_slab_free+0xf9/0x150 __kmem_cache_free+0x89/0x180 ocfs2_local_free_info+0x2ba/0x3f0 [ocfs2] dquot_disable+0x35f/0xa70 ocfs2_susp_quotas.isra.0+0x159/0x1a0 [ocfs2] ocfs2_remount+0x150/0x580 [ocfs2] reconfigure_super+0x1a5/0x3a0 path_mount+0xc8a/0xe10 __x64_sys_mount+0x16f/0x1a0 do_syscall_64+0x43/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc Link: https://lkml.kernel.org/r/20230522102112.9031-1-lhenriques@suse.de Signed-off-by: Luís Henriques <lhenriques@suse.de> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12lib/test_vmalloc.c: avoid garbage in page arrayLorenzo Stoakes1-1/+1
It turns out that alloc_pages_bulk_array() does not treat the page_array parameter as an output parameter, but rather reads the array and skips any entries that have already been allocated. This is somewhat unexpected and breaks this test, as we allocate the pages array uninitialised on the assumption it will be overwritten. As a result, the test was referencing uninitialised data and causing the PFN to not be valid and thus a WARN_ON() followed by a null pointer deref and panic. In addition, this is an array of pointers not of struct page objects, so we need only allocate an array with elements of pointer size. We solve both problems by simply using kcalloc() and referencing sizeof(struct page *) rather than sizeof(struct page). Link: https://lkml.kernel.org/r/20230524082424.10022-1-lstoakes@gmail.com Fixes: 869cb29a61a1 ("lib/test_vmalloc.c: add vm_map_ram()/vm_unmap_ram() test case") Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Baoquan He <bhe@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12nilfs2: fix possible out-of-bounds segment allocation in resize ioctlRyusuke Konishi1-0/+9
Syzbot reports that in its stress test for resize ioctl, the log writing function nilfs_segctor_do_construct hits a WARN_ON in nilfs_segctor_truncate_segments(). It turned out that there is a problem with the current implementation of the resize ioctl, which changes the writable range on the device (the range of allocatable segments) at the end of the resize process. This order is necessary for file system expansion to avoid corrupting the superblock at trailing edge. However, in the case of a file system shrink, if log writes occur after truncating out-of-bounds trailing segments and before the resize is complete, segments may be allocated from the truncated space. The userspace resize tool was fine as it limits the range of allocatable segments before performing the resize, but it can run into this issue if the resize ioctl is called alone. Fix this issue by changing nilfs_sufile_resize() to update the range of allocatable segments immediately after successful truncation of segment space in case of file system shrink. Link: https://lkml.kernel.org/r/20230524094348.3784-1-konishi.ryusuke@gmail.com Fixes: 4e33f9eab07e ("nilfs2: implement resize ioctl") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+33494cd0df2ec2931851@syzkaller.appspotmail.com Closes: https://lkml.kernel.org/r/0000000000005434c405fbbafdc5@google.com Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12riscv/purgatory: remove PGO flagsRicardo Ribalda1-0/+5
If profile-guided optimization is enabled, the purgatory ends up with multiple .text sections. This is not supported by kexec and crashes the system. Link: https://lkml.kernel.org/r/20230321-kexec_clang16-v7-4-b05c520b7296@chromium.org Fixes: 930457057abe ("kernel/kexec_file.c: split up __kexec_load_puragory") Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Cc: <stable@vger.kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Young <dyoung@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Philipp Rudo <prudo@redhat.com> Cc: Ross Zwisler <zwisler@google.com> Cc: Simon Horman <horms@kernel.org> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Rix <trix@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12powerpc/purgatory: remove PGO flagsRicardo Ribalda1-0/+5
If profile-guided optimization is enabled, the purgatory ends up with multiple .text sections. This is not supported by kexec and crashes the system. Link: https://lkml.kernel.org/r/20230321-kexec_clang16-v7-3-b05c520b7296@chromium.org Fixes: 930457057abe ("kernel/kexec_file.c: split up __kexec_load_puragory") Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: <stable@vger.kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Young <dyoung@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Palmer Dabbelt <palmer@rivosinc.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Philipp Rudo <prudo@redhat.com> Cc: Ross Zwisler <zwisler@google.com> Cc: Simon Horman <horms@kernel.org> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Rix <trix@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12x86/purgatory: remove PGO flagsRicardo Ribalda1-0/+5
If profile-guided optimization is enabled, the purgatory ends up with multiple .text sections. This is not supported by kexec and crashes the system. Link: https://lkml.kernel.org/r/20230321-kexec_clang16-v7-2-b05c520b7296@chromium.org Fixes: 930457057abe ("kernel/kexec_file.c: split up __kexec_load_puragory") Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> Cc: <stable@vger.kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Young <dyoung@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Palmer Dabbelt <palmer@rivosinc.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Philipp Rudo <prudo@redhat.com> Cc: Ross Zwisler <zwisler@google.com> Cc: Simon Horman <horms@kernel.org> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Rix <trix@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12kexec: support purgatories with .text.hot sectionsRicardo Ribalda1-1/+13
Patch series "kexec: Fix kexec_file_load for llvm16 with PGO", v7. When upreving llvm I realised that kexec stopped working on my test platform. The reason seems to be that due to PGO there are multiple .text sections on the purgatory, and kexec does not supports that. This patch (of 4): Clang16 links the purgatory text in two sections when PGO is in use: [ 1] .text PROGBITS 0000000000000000 00000040 00000000000011a1 0000000000000000 AX 0 0 16 [ 2] .rela.text RELA 0000000000000000 00003498 0000000000000648 0000000000000018 I 24 1 8 ... [17] .text.hot. PROGBITS 0000000000000000 00003220 000000000000020b 0000000000000000 AX 0 0 1 [18] .rela.text.hot. RELA 0000000000000000 00004428 0000000000000078 0000000000000018 I 24 17 8 And both of them have their range [sh_addr ... sh_addr+sh_size] on the area pointed by `e_entry`. This causes that image->start is calculated twice, once for .text and another time for .text.hot. The second calculation leaves image->start in a random location. Because of this, the system crashes immediately after: kexec_core: Starting new kernel Link: https://lkml.kernel.org/r/20230321-kexec_clang16-v7-0-b05c520b7296@chromium.org Link: https://lkml.kernel.org/r/20230321-kexec_clang16-v7-1-b05c520b7296@chromium.org Fixes: 930457057abe ("kernel/kexec_file.c: split up __kexec_load_puragory") Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> Reviewed-by: Ross Zwisler <zwisler@google.com> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Reviewed-by: Philipp Rudo <prudo@redhat.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Young <dyoung@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Palmer Dabbelt <palmer@rivosinc.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Simon Horman <horms@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Rix <trix@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12mm/uffd: allow vma to merge as much as possiblePeter Xu1-2/+6
We used to not pass in the pgoff correctly when register/unregister uffd regions, it caused incorrect behavior on vma merging and can cause mergeable vmas being separate after ioctls return. For example, when we have: vma1(range 0-9, with uffd), vma2(range 10-19, no uffd) Then someone unregisters uffd on range (5-9), it should logically become: vma1(range 0-4, with uffd), vma2(range 5-19, no uffd) But with current code we'll have: vma1(range 0-4, with uffd), vma3(range 5-9, no uffd), vma2(range 10-19, no uffd) This patch allows such merge to happen correctly before ioctl returns. This behavior seems to have existed since the 1st day of uffd. Since pgoff for vma_merge() is only used to identify the possibility of vma merging, meanwhile here what we did was always passing in a pgoff smaller than what we should, so there should have no other side effect besides not merging it. Let's still tentatively copy stable for this, even though I don't see anything will go wrong besides vma being split (which is mostly not user visible). Link: https://lkml.kernel.org/r/20230517190916.3429499-3-peterx@redhat.com Fixes: 86039bd3b4e6 ("userfaultfd: add new syscall to provide memory externalization") Signed-off-by: Peter Xu <peterx@redhat.com> Reported-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12mm/uffd: fix vma operation where start addr cuts part of vmaPeter Xu1-0/+5
Patch series "mm/uffd: Fix vma merge/split", v2. This series contains two patches that fix vma merge/split for userfaultfd on two separate issues. Patch 1 fixes a regression since 6.1+ due to something we overlooked when converting to maple tree apis. The plan is we use patch 1 to replace the commit "2f628010799e (mm: userfaultfd: avoid passing an invalid range to vma_merge())" in mm-hostfixes-unstable tree if possible, so as to bring uffd vma operations back aligned with the rest code again. Patch 2 fixes a long standing issue that vma can be left unmerged even if we can for either uffd register or unregister. Many thanks to Lorenzo on either noticing this issue from the assert movement patch, looking at this problem, and also provided a reproducer on the unmerged vma issue [1]. [1] https://gist.github.com/lorenzo-stoakes/a11a10f5f479e7a977fc456331266e0e This patch (of 2): It seems vma merging with uffd paths is broken with either register/unregister, where right now we can feed wrong parameters to vma_merge() and it's found by recent patch which moved asserts upwards in vma_merge() by Lorenzo Stoakes: https://lore.kernel.org/all/ZFunF7DmMdK05MoF@FVFF77S0Q05N.cambridge.arm.com/ It's possible that "start" is contained within vma but not clamped to its start. We need to convert this into either "cannot merge" case or "can merge" case 4 which permits subdivision of prev by assigning vma to prev. As we loop, each subsequent VMA will be clamped to the start. This patch will eliminate the report and make sure vma_merge() calls will become legal again. One thing to mention is that the "Fixes: 29417d292bd0" below is there only to help explain where the warning can start to trigger, the real commit to fix should be 69dbe6daf104. Commit 29417d292bd0 helps us to identify the issue, but unfortunately we may want to keep it in Fixes too just to ease kernel backporters for easier tracking. Link: https://lkml.kernel.org/r/20230517190916.3429499-1-peterx@redhat.com Link: https://lkml.kernel.org/r/20230517190916.3429499-2-peterx@redhat.com Fixes: 69dbe6daf104 ("userfaultfd: use maple tree iterator to iterate VMAs") Signed-off-by: Peter Xu <peterx@redhat.com> Reported-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Closes: https://lore.kernel.org/all/ZFunF7DmMdK05MoF@FVFF77S0Q05N.cambridge.arm.com/ Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12radix-tree: move declarations to headerArnd Bergmann4-6/+15
The xarray.c file contains the only call to radix_tree_node_rcu_free(), and it comes with its own extern declaration for it. This means the function definition causes a missing-prototype warning: lib/radix-tree.c:288:6: error: no previous prototype for 'radix_tree_node_rcu_free' [-Werror=missing-prototypes] Instead, move the declaration for this function to a new header that can be included by both, and do the same for the radix_tree_node_cachep variable that has the same underlying problem but does not cause a warning with gcc. [zhangpeng.00@bytedance.com: fix building radix tree test suite] Link: https://lkml.kernel.org/r/20230521095450.21332-1-zhangpeng.00@bytedance.com Link: https://lkml.kernel.org/r/20230516194212.548910-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12nilfs2: fix incomplete buffer cleanup in nilfs_btnode_abort_change_key()Ryusuke Konishi1-2/+10
A syzbot fault injection test reported that nilfs_btnode_create_block, a helper function that allocates a new node block for b-trees, causes a kernel BUG for disk images where the file system block size is smaller than the page size. This was due to unexpected flags on the newly allocated buffer head, and it turned out to be because the buffer flags were not cleared by nilfs_btnode_abort_change_key() after an error occurred during a b-tree update operation and the buffer was later reused in that state. Fix this issue by using nilfs_btnode_delete() to abandon the unused preallocated buffer in nilfs_btnode_abort_change_key(). Link: https://lkml.kernel.org/r/20230513102428.10223-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+b0a35a5c1f7e846d3b09@syzkaller.appspotmail.com Closes: https://lkml.kernel.org/r/000000000000d1d6c205ebc4d512@google.com Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12io_uring/io-wq: don't clear PF_IO_WORKER on exitJens Axboe1-3/+0
A recent commit gated the core dumping task exit logic on current->flags remaining consistent in terms of PF_{IO,USER}_WORKER at task exit time. This exposed a problem with the io-wq handling of that, which explicitly clears PF_IO_WORKER before calling do_exit(). The reasons for this manual clear of PF_IO_WORKER is historical, where io-wq used to potentially trigger a sleep on exit. As the io-wq thread is exiting, it should not participate any further accounting. But these days we don't need to rely on current->flags anymore, so we can safely remove the PF_IO_WORKER clearing. Reported-by: Zorro Lang <zlang@redhat.com> Reported-by: Dave Chinner <david@fromorbit.com> Link: https://lore.kernel.org/all/ZIZSPyzReZkGBEFy@dread.disaster.area/ Fixes: f9010dbdce91 ("fork, vhost: Use CLONE_THREAD to fix freezer/ps regression") Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-12Merge tag 'for-6.4-rc6-tag' of ↵Linus Torvalds3-11/+30
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A more fixes and regression fixes: - in subpage mode, fix crash when repairing metadata at the end of a stripe - properly enable async discard when remounting from read-only to read-write - scrub regression fixes: - respect read-only scrub when attempting to do a repair - fix reporting of found errors, the stats don't get properly accounted after a stripe repair" * tag 'for-6.4-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: scrub: also report errors hit during the initial read btrfs: scrub: respect the read-only flag during repair btrfs: properly enable async discard when switching from RO->RW btrfs: subpage: fix a crash in metadata repair path
2023-06-12sctp: fix an error code in sctp_sf_eat_auth()Dan Carpenter1-1/+1
The sctp_sf_eat_auth() function is supposed to enum sctp_disposition values and returning a kernel error code will cause issues in the caller. Change -ENOMEM to SCTP_DISPOSITION_NOMEM. Fixes: 65b07e5d0d09 ("[SCTP]: API updates to suport SCTP-AUTH extensions.") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Acked-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12sctp: handle invalid error codes without calling BUG()Dan Carpenter1-1/+4
The sctp_sf_eat_auth() function is supposed to return enum sctp_disposition values but if the call to sctp_ulpevent_make_authkey() fails, it returns -ENOMEM. This results in calling BUG() inside the sctp_side_effects() function. Calling BUG() is an over reaction and not helpful. Call WARN_ON_ONCE() instead. This code predates git. Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12ipvlan: fix bound dev checking for IPv6 l3s modeHangbin Liu1-0/+4
The commit 59a0b022aa24 ("ipvlan: Make skb->skb_iif track skb->dev for l3s mode") fixed ipvlan bonded dev checking by updating skb skb_iif. This fix works for IPv4, as in raw_v4_input() the dif is from inet_iif(skb), which is skb->skb_iif when there is no route. But for IPv6, the fix is not enough, because in ipv6_raw_deliver() -> raw_v6_match(), the dif is inet6_iif(skb), which is returns IP6CB(skb)->iif instead of skb->skb_iif if it's not a l3_slave. To fix the IPv6 part issue. Let's set IP6CB(skb)->iif to correct ifindex. BTW, ipvlan handles NS/NA specifically. Since it works fine, I will not reset IP6CB(skb)->iif when addr->atype is IPVL_ICMPV6. Fixes: c675e06a98a4 ("ipvlan: decouple l3s mode dependencies from other modes") Link: https://bugzilla.redhat.com/show_bug.cgi?id=2196710 Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12wifi: mac80211: fragment per STA profile correctlyBenjamin Berg3-5/+6
When fragmenting the ML per STA profile, the element ID should be IEEE80211_MLE_SUBELEM_PER_STA_PROFILE rather than WLAN_EID_FRAGMENT. Change the helper function to take the to be used element ID and pass the appropriate value for each of the fragmentation levels. Fixes: 81151ce462e5 ("wifi: mac80211: support MLO authentication/association with one link") Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230611121219.9b5c793d904b.I7dad952bea8e555e2f3139fbd415d0cd2b3a08c3@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-12net: ethtool: correct MAX attribute value for statsJakub Kicinski1-1/+1
When compiling YNL generated code compiler complains about array-initializer-out-of-bounds. Turns out the MAX value for STATS_GRP uses the value for STATS. This may lead to random corruptions in user space (kernel itself doesn't use this value as it never parses stats). Fixes: f09ea6fb1272 ("ethtool: add a new command for reading standard stats") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-11cifs: fix max_credits implementationShyam Prasad N2-4/+30
The current implementation of max_credits on the client does not work because the CreditRequest logic for several commands does not take max_credits into account. Still, we can end up asking the server for more credits, depending on the number of credits in flight. For this, we need to limit the credits while parsing the responses too. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-11cifs: fix sockaddr comparison in iface_cmpShyam Prasad N4-37/+88
iface_cmp used to simply do a memcmp of the two provided struct sockaddrs. The comparison needs to do more based on the address family. Similar logic was already present in cifs_match_ipaddr. Doing something similar now. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-11smb/client: print "Unknown" instead of bogus link speed valueEnzo Matsumiya1-1/+46
The virtio driver for Linux guests will not set a link speed to its paravirtualized NICs. This will be seen as -1 in the ethernet layer, and when some servers (e.g. samba) fetches it, it's converted to an unsigned value (and multiplied by 1000 * 1000), so in client side we end up with: 1) Speed: 4294967295000000 bps in DebugData. This patch introduces a helper that returns a speed string (in Mbps or Gbps) if interface speed is valid (>= SPEED_10 and <= SPEED_800000), or "Unknown" otherwise. The reason to not change the value in iface->speed is because we don't know the real speed of the HW backing the server NIC, so let's keep considering these as the fastest NICs available. Also print "Capabilities: None" when the interface doesn't support any. Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-11cifs: print all credit counters in DebugDataShyam Prasad N1-3/+8
Output of /proc/fs/cifs/DebugData shows only the per-connection counter for the number of credits of regular type. i.e. the credits reserved for echo and oplocks are not displayed. There have been situations recently where having this info would have been useful. This change prints the credit counters of all three types: regular, echo, oplocks. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-11cifs: fix status checks in cifs_tree_connectShyam Prasad N2-8/+10
The ordering of status checks at the beginning of cifs_tree_connect is wrong. As a result, a tcon which is good may stay marked as needing reconnect infinitely. Fixes: 2f0e4f034220 ("cifs: check only tcon status on tcon related functions") Cc: stable@vger.kernel.org # 6.3 Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-11smb: remove obsolete comment鑫华1-1/+1
Because do_gettimeofday has been removed and replaced by ktime_get_real_ts64, So just remove the comment as it's not needed now. Signed-off-by: 鑫华 <jixianghua@xfusion.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-11blk-cgroup: Flush stats before releasing blkcg_gqMing Lei1-9/+31
As noted by Michal, the blkg_iostat_set's in the lockless list hold reference to blkg's to protect against their removal. Those blkg's hold reference to blkcg. When a cgroup is being destroyed, cgroup_rstat_flush() is only called at css_release_work_fn() which is called when the blkcg reference count reaches 0. This circular dependency will prevent blkcg and some blkgs from being freed after they are made offline. It is less a problem if the cgroup to be destroyed also has other controllers like memory that will call cgroup_rstat_flush() which will clean up the reference count. If block is the only controller that uses rstat, these offline blkcg and blkgs may never be freed leaking more and more memory over time. To prevent this potential memory leak: - flush blkcg per-cpu stats list in __blkg_release(), when no new stat can be added - add global blkg_stat_lock for covering concurrent parent blkg stat update - don't grab bio->bi_blkg reference when adding the stats into blkcg's per-cpu stat list since all stats are guaranteed to be consumed before releasing blkg instance, and grabbing blkg reference for stats was the most fragile part of original patch Based on Waiman's patch: https://lore.kernel.org/linux-block/20221215033132.230023-3-longman@redhat.com/ Fixes: 3b8cc6298724 ("blk-cgroup: Optimize blkcg_rstat_flush()") Cc: stable@vger.kernel.org Reported-by: Jay Shin <jaeshin@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: mkoutny@suse.com Cc: Yosry Ahmed <yosryahmed@google.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20230609234249.1412858-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-11Linux 6.4-rc6v6.4-rc6Linus Torvalds1-1/+1
2023-06-11IB/isert: Fix incorrect release of isert connectionSaravanan Vajravel1-2/+0
The ib_isert module is releasing the isert connection both in isert_wait_conn() handler as well as isert_free_conn() handler. In isert_wait_conn() handler, it is expected to wait for iSCSI session logout operation to complete. It should free the isert connection only in isert_free_conn() handler. When a bunch of iSER target is cleared, this issue can lead to use-after-free memory issue as isert conn is twice released Fixes: b02efbfc9a05 ("iser-target: Fix implicit termination of connections") Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/20230606102531.162967-4-saravanan.vajravel@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-11IB/isert: Fix possible list corruption in CMA handlerSaravanan Vajravel1-0/+4
When ib_isert module receives connection error event, it is releasing the isert session and removes corresponding list node but it doesn't take appropriate mutex lock to remove the list node. This can lead to linked list corruption Fixes: bd3792205aae ("iser-target: Fix pending connections handling in target stack shutdown sequnce") Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com> Link: https://lore.kernel.org/r/20230606102531.162967-3-saravanan.vajravel@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-11IB/isert: Fix dead lock in ib_isertSaravanan Vajravel1-2/+8
- When a iSER session is released, ib_isert module is taking a mutex lock and releasing all pending connections. As part of this, ib_isert is destroying rdma cm_id. To destroy cm_id, rdma_cm module is sending CM events to CMA handler of ib_isert. This handler is taking same mutex lock. Hence it leads to deadlock between ib_isert & rdma_cm modules. - For fix, created local list of pending connections and release the connection outside of mutex lock. Calltrace: --------- [ 1229.791410] INFO: task kworker/10:1:642 blocked for more than 120 seconds. [ 1229.791416] Tainted: G OE --------- - - 4.18.0-372.9.1.el8.x86_64 #1 [ 1229.791418] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1229.791419] task:kworker/10:1 state:D stack: 0 pid: 642 ppid: 2 flags:0x80004000 [ 1229.791424] Workqueue: ib_cm cm_work_handler [ib_cm] [ 1229.791436] Call Trace: [ 1229.791438] __schedule+0x2d1/0x830 [ 1229.791445] ? select_idle_sibling+0x23/0x6f0 [ 1229.791449] schedule+0x35/0xa0 [ 1229.791451] schedule_preempt_disabled+0xa/0x10 [ 1229.791453] __mutex_lock.isra.7+0x310/0x420 [ 1229.791456] ? select_task_rq_fair+0x351/0x990 [ 1229.791459] isert_cma_handler+0x224/0x330 [ib_isert] [ 1229.791463] ? ttwu_queue_wakelist+0x159/0x170 [ 1229.791466] cma_cm_event_handler+0x25/0xd0 [rdma_cm] [ 1229.791474] cma_ib_handler+0xa7/0x2e0 [rdma_cm] [ 1229.791478] cm_process_work+0x22/0xf0 [ib_cm] [ 1229.791483] cm_work_handler+0xf4/0xf30 [ib_cm] [ 1229.791487] ? move_linked_works+0x6e/0xa0 [ 1229.791490] process_one_work+0x1a7/0x360 [ 1229.791491] ? create_worker+0x1a0/0x1a0 [ 1229.791493] worker_thread+0x30/0x390 [ 1229.791494] ? create_worker+0x1a0/0x1a0 [ 1229.791495] kthread+0x10a/0x120 [ 1229.791497] ? set_kthread_struct+0x40/0x40 [ 1229.791499] ret_from_fork+0x1f/0x40 [ 1229.791739] INFO: task targetcli:28666 blocked for more than 120 seconds. [ 1229.791740] Tainted: G OE --------- - - 4.18.0-372.9.1.el8.x86_64 #1 [ 1229.791741] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1229.791742] task:targetcli state:D stack: 0 pid:28666 ppid: 5510 flags:0x00004080 [ 1229.791743] Call Trace: [ 1229.791744] __schedule+0x2d1/0x830 [ 1229.791746] schedule+0x35/0xa0 [ 1229.791748] schedule_preempt_disabled+0xa/0x10 [ 1229.791749] __mutex_lock.isra.7+0x310/0x420 [ 1229.791751] rdma_destroy_id+0x15/0x20 [rdma_cm] [ 1229.791755] isert_connect_release+0x115/0x130 [ib_isert] [ 1229.791757] isert_free_np+0x87/0x140 [ib_isert] [ 1229.791761] iscsit_del_np+0x74/0x120 [iscsi_target_mod] [ 1229.791776] lio_target_np_driver_store+0xe9/0x140 [iscsi_target_mod] [ 1229.791784] configfs_write_file+0xb2/0x110 [ 1229.791788] vfs_write+0xa5/0x1a0 [ 1229.791792] ksys_write+0x4f/0xb0 [ 1229.791794] do_syscall_64+0x5b/0x1a0 [ 1229.791798] entry_SYSCALL_64_after_hwframe+0x65/0xca Fixes: bd3792205aae ("iser-target: Fix pending connections handling in target stack shutdown sequnce") Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com> Link: https://lore.kernel.org/r/20230606102531.162967-2-saravanan.vajravel@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-11Merge tag 'x86_urgent_for_v6.4_rc6' of ↵Linus Torvalds1-9/+9
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Borislav Petkov: - Set up the kernel CS earlier in the boot process in case EFI boots the kernel after bypassing the decompressor and the CS descriptor used ends up being the EFI one which is not mapped in the identity page table, leading to early SEV/SNP guest communication exceptions resulting in the guest crashing * tag 'x86_urgent_for_v6.4_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/head/64: Switch to KERNEL_CS as soon as new GDT is installed
2023-06-11Merge tag '6.4-rc5-smb3-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds6-56/+62
Pull smb server fixes from Steve French: "Five smb3 server fixes, all also for stable: - Fix four slab out of bounds warnings: improve checks for protocol id, and for small packet length, and for create context parsing, and for negotiate context parsing - Fix for incorrect dereferencing POSIX ACLs" * tag '6.4-rc5-smb3-server-fixes' of git://git.samba.org/ksmbd: ksmbd: validate smb request protocol id ksmbd: check the validation of pdu_size in ksmbd_conn_handler_loop ksmbd: fix posix_acls and acls dereferencing possible ERR_PTR() ksmbd: fix out-of-bound read in parse_lease_state() ksmbd: fix out-of-bound read in deassemble_neg_contexts()
2023-06-11RDMA/mlx5: Fix affinity assignmentMark Bloch4-12/+18
The cited commit aimed to ensure that Virtual Functions (VFs) assign a queue affinity to a Queue Pair (QP) to distribute traffic when the LAG master creates a hardware LAG. If the affinity was set while the hardware was not in LAG, the firmware would ignore the affinity value. However, this commit unintentionally assigned an affinity to QPs on the LAG master's VPORT even if the RDMA device was not marked as LAG-enabled. In most cases, this was not an issue because when the hardware entered hardware LAG configuration, the RDMA device of the LAG master would be destroyed and a new one would be created, marked as LAG-enabled. The problem arises when a user configures Equal-Cost Multipath (ECMP). In ECMP mode, traffic can be directed to different physical ports based on the queue affinity, which is intended for use by VPORTS other than the E-Switch manager. ECMP mode is supported only if both E-Switch managers are in switchdev mode and the appropriate route is configured via IP. In this configuration, the RDMA device is not destroyed, and we retain the RDMA device that is not marked as LAG-enabled. To ensure correct behavior, Send Queues (SQs) opened by the E-Switch manager through verbs should be assigned strict affinity. This means they will only be able to communicate through the native physical port associated with the E-Switch manager. This will prevent the firmware from assigning affinity and will not allow the SQs to be remapped in case of failover. Fixes: 802dcc7fc5ec ("RDMA/mlx5: Support TX port affinity for VF drivers in LAG mode") Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://lore.kernel.org/r/425b05f4da840bc684b0f7e8ebf61aeb5cef09b0.1685960567.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-11IB/uverbs: Fix to consider event queue closing also upon non-blocking modeYishai Hadas1-7/+5
Fix ib_uverbs_event_read() to consider event queue closing also upon non-blocking mode. Once the queue is closed (e.g. hot-plug flow) all the existing events are cleaned-up as part of ib_uverbs_free_event_queue(). An application that uses the non-blocking FD mode should get -EIO in that case to let it knows that the device was removed already. Otherwise, it can loose the indication that the device was removed and won't recover. As part of that, refactor the code to have a single flow with regards to 'is_closed' for both blocking and non-blocking modes. Fixes: 14e23bd6d221 ("RDMA/core: Fix locking in ib_uverbs_event_read") Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://lore.kernel.org/r/97b00116a1e1e13f8dc4ec38a5ea81cf8c030210.1685960567.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-11RDMA/uverbs: Restrict usage of privileged QKEYsEdward Srouji1-1/+6
According to the IB specification rel-1.6, section 3.5.3: "QKEYs with the most significant bit set are considered controlled QKEYs, and a HCA does not allow a consumer to arbitrarily specify a controlled QKEY." Thus, block non-privileged users from setting such a QKEY. Cc: stable@vger.kernel.org Fixes: bc38a6abdd5a ("[PATCH] IB uverbs: core implementation") Signed-off-by: Edward Srouji <edwards@nvidia.com> Link: https://lore.kernel.org/r/c00c809ddafaaf87d6f6cb827978670989a511b3.1685960567.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-11RDMA/cma: Always set static rate to 0 for RoCEMark Zhang2-25/+2
Set static rate to 0 as it should be discovered by path query and has no meaning for RoCE. This also avoid of using the rtnl lock and ethtool API, which is a bottleneck when try to setup many rdma-cm connections at the same time, especially with multiple processes. Fixes: 3c86aa70bf67 ("RDMA/cm: Add RDMA CM support for IBoE devices") Signed-off-by: Mark Zhang <markzhang@nvidia.com> Link: https://lore.kernel.org/r/f72a4f8b667b803aee9fa794069f61afb5839ce4.1685960567.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-11RDMA/mlx5: Fix Q-counters query in LAG modePatrisious Haddad1-1/+6
Previously we used the core device associated to the IB device in order to do the Q-counters query to the FW, but in LAG mode it is possible that the core device isn't the one that created this VF. Hence instead of using the core device to query the Q-counters we use the ESW core device which is guaranteed to be that of the VF. Fixes: d22467a71ebe ("RDMA/mlx5: Expand switchdev Q-counters to expose representor statistics") Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Link: https://lore.kernel.org/r/778d7d7a24892348d0bdef17d2e5f9e044717e86.1685960567.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-11RDMA/mlx5: Remove vport Q-counters dependency on normal Q-countersPatrisious Haddad1-18/+40
Previously the Q-counters initialization assumed that the vport Q-counters structures and the normal Q-counters structures are identical in size, and hence when a Q-counter was added to normal Q-counters structure but not to the vport Q-counters struct it would lead to that counter name being NULL in switchdev mode, which could cause the kernel crash below. Currently break the dependency between those two structure and always use the appropriate struct size, in order to remove the assumption that both structure sizes are equal. BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 20c64a067 P4D 20c64a067 PUD 20152b067 PMD 0 Oops: 0000 [#1] SMP CPU: 19 PID: 11717 Comm: devlink Tainted: G OE 6.2.0_mlnx #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:strlen+0x0/0x20 Code: 66 2e 0f 1f 84 00 00 00 00 00 48 01 fe eb 0f 0f b6 07 38 d0 74 10 48 83 c7 01 84 c0 74 05 48 39 f7 75 ec 31 c0 c3 48 89 f8 c3 <80> 3f 00 48 89 f8 74 10 48 83 c7 01 80 3f 00 75 f7 48 29 c7 48 89 RSP: 0018:ffffc9000318b618 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000002c00 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: ffff888211918110 R09: ffff888211918000 R10: 000000000000001e R11: ffff888211918000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ffff8881038ec250 FS: 00007fa53342fe80(0000) GS:ffff88885fcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000002042b2003 CR4: 0000000000770ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> kernfs_name_hash+0x12/0x80 kernfs_find_ns+0x35/0xb0 kernfs_remove_by_name_ns+0x46/0xc0 remove_files.isra.1+0x30/0x70 internal_create_group+0x253/0x380 internal_create_groups.part.4+0x3e/0xa0 setup_port+0x27a/0x8c0 [ib_core] ib_setup_port_attrs+0x9d/0x300 [ib_core] ib_register_device+0x48e/0x550 [ib_core] __mlx5_ib_add+0x2b/0x80 [mlx5_ib] mlx5_ib_vport_rep_load+0x141/0x360 [mlx5_ib] mlx5_esw_offloads_rep_load+0x48/0xa0 [mlx5_core] esw_offloads_enable+0x41e/0xd10 [mlx5_core] mlx5_eswitch_enable_locked+0x1e3/0x340 [mlx5_core] ? __cond_resched+0x15/0x30 mlx5_devlink_eswitch_mode_set+0x204/0x3c0 [mlx5_core] devlink_nl_cmd_eswitch_set_doit+0x8d/0x100 genl_family_rcv_msg_doit.isra.19+0xea/0x110 genl_rcv_msg+0x19b/0x290 ? devlink_nl_cmd_region_read_dumpit+0x760/0x760 ? devlink_nl_cmd_port_param_get_doit+0x30/0x30 ? devlink_put+0x50/0x50 ? genl_get_cmd_both+0x60/0x60 netlink_rcv_skb+0x54/0x100 genl_rcv+0x24/0x40 netlink_unicast+0x1be/0x2a0 netlink_sendmsg+0x361/0x4d0 sock_sendmsg+0x30/0x40 __sys_sendto+0x11a/0x150 ? handle_mm_fault+0x101/0x2b0 ? do_user_addr_fault+0x21d/0x720 __x64_sys_sendto+0x24/0x30 do_syscall_64+0x34/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7fa533611cba Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 c3 0f 1f 44 00 00 55 48 83 ec 30 44 89 4c RSP: 002b:00007ffdb6a898a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000daab00 RCX: 00007fa533611cba RDX: 0000000000000038 RSI: 0000000000daab00 RDI: 0000000000000003 RBP: 0000000000daa910 R08: 00007fa533822000 R09: 000000000000000c R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001 </TASK> Modules linked in: rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_ib(OE) mlx5_core(OE) mlxdevm(OE) ib_uverbs(OE) ib_core(OE) mlx_compat(OE) mlxfw(OE) memtrack(OE) pci_hyperv_intf nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_filter iptable_nat dns_resolver nf_nat br_netfilter nfs bridge stp llc lockd grace fscache netfs rfkill overlay iTCO_wdt iTCO_vendor_support kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel i2c_i801 sunrpc lpc_ich sha512_ssse3 pcspkr i2c_smbus mfd_core drm sch_fq_codel i2c_core ip_tables fuse crc32c_intel serio_raw virtio_net net_failover failover [last unloaded: mlxfw] CR2: 0000000000000000 ---[ end trace 0000000000000000 ]--- RIP: 0010:strlen+0x0/0x20 Code: 66 2e 0f 1f 84 00 00 00 00 00 48 01 fe eb 0f 0f b6 07 38 d0 74 10 48 83 c7 01 84 c0 74 05 48 39 f7 75 ec 31 c0 c3 48 89 f8 c3 <80> 3f 00 48 89 f8 74 10 48 83 c7 01 80 3f 00 75 f7 48 29 c7 48 89 RSP: 0018:ffffc9000318b618 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000002c00 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: ffff888211918110 R09: ffff888211918000 R10: 000000000000001e R11: ffff888211918000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ffff8881038ec250 FS: 00007fa53342fe80(0000) GS:ffff88885fcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000002042b2003 CR4: 0000000000770ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Kernel panic - not syncing: Fatal exception Kernel Offset: disabled ---[ end Kernel panic - not syncing: Fatal exception ]--- Fixes: d22467a71ebe ("RDMA/mlx5: Expand switchdev Q-counters to expose representor statistics") Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Link: https://lore.kernel.org/r/016777b7f16eb6bb178999ff59097d0c0f91f68a.1685960567.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-11RDMA/mlx5: Fix Q-counters per vport allocationPatrisious Haddad1-8/+16
Previously Q-counters data was being allocated over the PF for all of the available vports, however that isn't necessary. Since each VF or SF has a Q-counter allocated for itself. So we only need to allocate two counters data structures, one for the device counters, and one for all the other vports to expose the representors, since they only need to read from it in order to determine mainly counters numbers and names, so they can all share. This in turn also solves a bug we previously had where we couldn't switch the device to switchdev mode when there were more than 128 SF/VFs configured, since that is the maximum amount of Q-counters available for a single port Fixes: d22467a71ebe ("RDMA/mlx5: Expand switchdev Q-counters to expose representor statistics") Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Link: https://lore.kernel.org/r/f54671df16e2227a069b229b33b62cd9ee24c475.1685960567.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-11RDMA/mlx5: Create an indirect flow table for steering anchorMark Bloch3-7/+296
A misbehaved user can create a steering anchor that points to a kernel flow table and then destroy the anchor without freeing the associated STC. This creates a problem as the kernel can't destroy the flow table since there is still a reference to it. As a result, this can exhaust all available flow table resources, preventing other users from using the RDMA device. To prevent this problem, a solution is implemented where a special flow table with two steering rules is created when a user creates a steering anchor for the first time. The rules include one that drops all traffic and another that points to the kernel flow table. If the steering anchor is destroyed, only the rule pointing to the kernel's flow table is removed. Any traffic reaching the special flow table after that is dropped. Since the special flow table is not destroyed when the steering anchor is destroyed, any issues are prevented from occurring. The remaining resources are only destroyed when the RDMA device is destroyed, which happens after all DEVX objects are freed, including the STCs, thus mitigating the issue. Fixes: 0c6ab0ca9a66 ("RDMA/mlx5: Expose steering anchor to userspace") Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Link: https://lore.kernel.org/r/b4a88a871d651fa4e8f98d552553c1cfe9ba2cd6.1685960567.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-11RDMA/mlx5: Initiate dropless RQ for RAW Ethernet functionsMaher Sanalla1-0/+3
Delay drop data is initiated for PFs that have the capability of rq_delay_drop and are in roce profile. However, PFs with RAW ethernet profile do not initiate delay drop data on function load, causing kernel panic if delay drop struct members are accessed later on in case a dropless RQ is created. Thus, stage the delay drop initialization as part of RAW ethernet PF loading process. Fixes: b5ca15ad7e61 ("IB/mlx5: Add proper representors support") Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Link: https://lore.kernel.org/r/2e9d386785043d48c38711826eb910315c1de141.1685960567.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-10Merge tag 'i2c-for-6.4-rc6' of ↵Linus Torvalds8-7/+37
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Biggest news is that Andi Shyti steps in for maintaining the controller drivers. Thank you very much! Other than that, one new driver maintainer and the rest is usual driver bugfixes. at24 has a Kconfig dependecy fix" * tag 'i2c-for-6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: MAINTAINERS: Add entries for Renesas RZ/V2M I2C driver eeprom: at24: also select REGMAP i2c: sprd: Delete i2c adapter in .remove's error path i2c: mv64xxx: Fix reading invalid status value in atomic mode i2c: designware: fix idx_write_cnt in read loop i2c: mchp-pci1xxxx: Avoid cast to incompatible function type i2c: img-scb: Fix spelling mistake "innacurate" -> "inaccurate" MAINTAINERS: Add myself as I2C host drivers maintainer
2023-06-10Merge tag 'soundwire-6.4-fixes' of ↵Linus Torvalds3-5/+23
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire Pull soundwire fixes from Vinod Koul: "Core fix for missing flag clear, error patch handling in qcom driver and BIOS quirk for HP Spectre x360: - HP Spectre x360 soundwire DMI quirk - Error path handling for qcom driver - Core fix for missing clear of alloc_slave_rt" * tag 'soundwire-6.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: soundwire: stream: Add missing clear of alloc_slave_rt soundwire: qcom: add proper error paths in qcom_swrm_startup() soundwire: dmi-quirks: add new mapping for HP Spectre x360
2023-06-10Merge tag 'arm-fixes-6.4-2' of ↵Linus Torvalds82-226/+479
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "Most of the changes this time are for the Qualcomm Snapdragon platforms. There are bug fixes for error handling in Qualcomm icc-bwmon, rpmh-rsc, ramp_controller and rmtfs driver as well as the AMD tee firmware driver and a missing initialization in the Arm ff-a firmware driver. The Qualcomm RPMh and EDAC drivers need some rework to work correctly on all supported chips. The DT fixes include: - i.MX8 fixes for gpio, pinmux and clock settings - ADS touchscreen gpio polarity settings in several machines - Address dtb warnings for caches, panel and input-enable properties on Qualcomm platforms - Incorrect data on qualcomm platforms fir SA8155P power domains, SM8550 LLCC, SC7180-lite SDRAM frequencies and SM8550 soundwire - Remoteproc firmware paths are corrected for Sony Xperia 10 IV" * tag 'arm-fixes-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (36 commits) firmware: arm_ffa: Set handle field to zero in memory descriptor ARM: dts: Fix erroneous ADS touchscreen polarities arm64: dts: imx8mn-beacon: Fix SPI CS pinmux arm64: dts: imx8-ss-dma: assign default clock rate for lpuarts arm64: dts: imx8qm-mek: correct GPIOs for USDHC2 CD and WP signals EDAC/qcom: Get rid of hardcoded register offsets EDAC/qcom: Remove superfluous return variable assignment in qcom_llcc_core_setup() arm64: dts: qcom: sm8550: Use the correct LLCC register scheme dt-bindings: cache: qcom,llcc: Fix SM8550 description arm64: dts: qcom: sc7180-lite: Fix SDRAM freq for misidentified sc7180-lite boards arm64: dts: qcom: sm8550: use uint16 for Soundwire interval soc: qcom: rpmhpd: Add SA8155P power domains arm64: dts: qcom: Split out SA8155P and use correct RPMh power domains dt-bindings: power: qcom,rpmpd: Add SA8155P soc: qcom: Rename ice to qcom_ice to avoid module name conflict soc: qcom: rmtfs: Fix error code in probe() soc: qcom: ramp_controller: Fix an error handling path in qcom_ramp_controller_probe() ARM: dts: at91: sama7g5ek: fix debounce delay property for shdwc ARM: at91: pm: fix imbalanced reference counter for ethernet devices arm64: dts: qcom: sm6375-pdx225: Fix remoteproc firmware paths ...
2023-06-10bnx2x: fix page fault following EEH recoveryDavid Christensen1-2/+7
In the last step of the EEH recovery process, the EEH driver calls into bnx2x_io_resume() to re-initialize the NIC hardware via the function bnx2x_nic_load(). If an error occurs during bnx2x_nic_load(), OS and hardware resources are released and an error code is returned to the caller. When called from bnx2x_io_resume(), the return code is ignored and the network interface is brought up unconditionally. Later attempts to send a packet via this interface result in a page fault due to a null pointer reference. This patch checks the return code of bnx2x_nic_load(), prints an error message if necessary, and does not enable the interface. Signed-off-by: David Christensen <drc@linux.vnet.ibm.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-10Merge tag 'nf-23-06-08' of ↵David S. Miller4-18/+103
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf netfilter pull request 23-06-08 Pablo Neira Ayuso says: ==================== The following patchset contains Netfilter fixes for net: 1) Add commit and abort set operation to pipapo set abort path. 2) Bail out immediately in case of ENOMEM in nfnetlink batch. 3) Incorrect error path handling when creating a new rule leads to dangling pointer in set transaction list. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-10netlabel: fix shift wrapping bug in netlbl_catmap_setlong()Dmitry Mastykin1-1/+2
There is a shift wrapping bug in this code on 32-bit architectures. NETLBL_CATMAP_MAPTYPE is u64, bitmap is unsigned long. Every second 32-bit word of catmap becomes corrupted. Signed-off-by: Dmitry Mastykin <dmastykin@astralinux.ru> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-10Merge branch 'octeontx2-af-fixes'David S. Miller1-5/+2
Naveen Mamindlapalli says: ==================== RVU NIX AF driver fixes This patch series contains few fixes to the RVU NIX AF driver. The first patch modifies the driver check to validate whether the req count for contiguous and non-contiguous arrays is less than the maximum limit. The second patch fixes HW lbk interface link credit programming. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-10octeontx2-af: fix lbk link credits on cn10kNithin Dabilpuram1-4/+0
Fix LBK link credits on CN10K to be same as CN9K i.e 16 * MAX_LBK_DATA_RATE instead of current scheme of calculation based on LBK buf length / FIFO size. Fixes: 6e54e1c5399a ("octeontx2-af: cn10K: Add MTU configuration") Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com> Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-10octeontx2-af: fixed resource availability checkSatha Rao1-1/+2
txschq_alloc response have two different arrays to store continuous and non-continuous schedulers of each level. Requested count should be checked for each array separately. Fixes: 5d9b976d4480 ("octeontx2-af: Support fixed transmit scheduler topology") Signed-off-by: Satha Rao <skoteshwar@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-10Merge branch '100GbE' of ↵Jakub Kicinski2-7/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-06-08 (ice) This series contains updates to ice driver only. Simon Horman stops null pointer dereference for GNSS error path. Kamil fixes memory leak when downing interface when XDP is enabled. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: Fix XDP memory leak when NIC is brought up and down ice: Don't dereference NULL in ice_gnss_read error path ==================== Link: https://lore.kernel.org/r/20230608200051.451752-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10iavf: remove mask from iavf_irq_enable_queues()Ahmed Zaki3-11/+8
Enable more than 32 IRQs by removing the u32 bit mask in iavf_irq_enable_queues(). There is no need for the mask as there are no callers that select individual IRQs through the bitmask. Also, if the PF allocates more than 32 IRQs, this mask will prevent us from using all of them. Modify the comment in iavf_register.h to show that the maximum number allowed for the IRQ index is 63 as per the iAVF standard 1.0 [1]. link: [1] https://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/ethernet-adaptive-virtual-function-hardware-spec.pdf Fixes: 5eae00c57f5e ("i40evf: main driver core") Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20230608200226.451861-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10Merge branch 'selftests-mptcp-skip-tests-not-supported-by-old-kernels-part-2'Jakub Kicinski8-44/+135
Matthieu Baerts says: ==================== selftests: mptcp: skip tests not supported by old kernels (part 2) After a few years of increasing test coverage in the MPTCP selftests, we realised [1] the last version of the selftests is supposed to run on old kernels without issues. Supporting older versions is not that easy for this MPTCP case: these selftests are often validating the internals by checking packets that are exchanged, when some MIB counters are incremented after some actions, how connections are getting opened and closed in some cases, etc. In other words, it is not limited to the socket interface between the userspace and the kernelspace. In addition to that, the current MPTCP selftests run a lot of different sub-tests but the TAP13 protocol used in the selftests don't support sub-tests: one failure in sub-tests implies that the whole selftest is seen as failed at the end because sub-tests are not tracked. It is then important to skip sub-tests not supported by old kernels. To minimise the modifications and reduce the complexity to support old versions, the idea is to look at external signs and skip the whole selftests or just some sub-tests before starting them. This cannot be applied in all cases. This second part focuses on marking different sub-tests as skipped if some MPTCP features are not supported. A few techniques are used here: - Before starting some tests: - Check if a file (sysctl knob) is present: that's what patch 13/14 is doing for the userspace PM feature. - Check if a symbol is present in /proc/kallsyms: patch 1/14 adds some helpers in mptcp_lib.sh to ease its use. Then these helpers are used in patches 2, 3, 4, 10, 11 and 14/14. - Set a flag and get the status to check if a feature is supported: patch 8/14 is doing that with the 'fullmesh' flag. - After having launched the tests: - Retrieve the counters after a test and check if they are different than 0. Similar to the check with the flag, that's not ideal but in this case, the counters were already present before the introduction of MPTCP but they have been supported by MPTCP sockets only later. Patches 5 and 6/14 are using this technique. Before skipping tests, SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var value is checked: if it is set to 1, the test is marked as "failed" instead of "skipped". MPTCP public CI expects to have all features supported and it sets this env var to 1 to catch regressions in these new checks. Patches 7/14 and 9/14 are a bit different because they don't skip tests: - Patch 7/14 retrieves the default values instead of using hardcoded ones because these default values have been modified at some points. Then the comparisons are done with the default values. - patch 9/14 relaxes the expected returned size from MPTCP's getsockopt because the different structures gathering various info can get new fields and get bigger over time. We cannot expect that the userspace is using the same structure as the kernel. Patch 12/14 marks the test as "skipped" instead of "failed" if the "ip" tool is not available. In this second part, the "mptcp_join" selftest is not modified yet. This will come soon after in the third part with quite a few patches. Link: https://lore.kernel.org/stable/CA+G9fYtDGpgT4dckXD-y-N92nqUxuvue_7AtDdBcHrbOMsDZLg@mail.gmail.com/ [1] Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 ==================== Link: https://lore.kernel.org/r/20230608-upstream-net-20230608-mptcp-selftests-support-old-kernels-part-2-v1-0-20997a6fd841@tessares.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10selftests: mptcp: userspace pm: skip PM listener events tests if unavailableMatthieu Baerts1-0/+6
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the new listener events linked to the path-manager introduced by commit f8c9dfbd875b ("mptcp: add pm listener events"). It is possible to look for "mptcp_event_pm_listener" in kallsyms to know in advance if the kernel supports this feature and skip these sub-tests if the feature is not supported. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 6c73008aa301 ("selftests: mptcp: listener test for userspace PM") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10selftests: mptcp: userspace pm: skip if not supportedMatthieu Baerts1-0/+5
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the MPTCP Userspace PM introduced by commit 4638de5aefe5 ("mptcp: handle local addrs announced by userspace PMs"). We can skip all these tests if the feature is not supported simply by looking for the MPTCP pm_type's sysctl knob. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 259a834fadda ("selftests: mptcp: functional tests for the userspace PM type") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10selftests: mptcp: userspace pm: skip if 'ip' tool is unavailableMatthieu Baerts1-1/+1
When a required tool is missing, the return code 4 (SKIP) should be returned instead of 1 (FAIL). Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 259a834fadda ("selftests: mptcp: functional tests for the userspace PM type") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10selftests: mptcp: sockopt: skip TCP_INQ checks if not supportedMatthieu Baerts1-2/+12
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is TCP_INQ cmsg support introduced in commit 2c9e77659a0c ("mptcp: add TCP_INQ cmsg support"). It is possible to look for "mptcp_ioctl" in kallsyms because it was needed to introduce the mentioned feature. We can skip these tests and not set TCPINQ option if the feature is not supported. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 5cbd886ce2a9 ("selftests: mptcp: add TCP_INQ support") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10selftests: mptcp: sockopt: skip getsockopt checks if not supportedMatthieu Baerts1-0/+6
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the getsockopt(SOL_MPTCP) to get info about the MPTCP connections introduced by commit 55c42fa7fa33 ("mptcp: add MPTCP_INFO getsockopt") and the following ones. It is possible to look for "mptcp_diag_fill_info" in kallsyms because it is introduced by the mentioned feature. So we can know in advance if the feature is supported and skip the sub-test if not. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: ce9979129a0b ("selftests: mptcp: add mptcp getsockopt test cases") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10selftests: mptcp: sockopt: relax expected returned sizeMatthieu Baerts1-6/+12
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the getsockopt(SOL_MPTCP) to get info about the MPTCP connections introduced by commit 55c42fa7fa33 ("mptcp: add MPTCP_INFO getsockopt") and the following ones. We cannot guess in advance which sizes the kernel will returned: older kernel can returned smaller sizes, e.g. recently the tcp_info structure has been modified in commit 71fc704768f6 ("tcp: add rcv_wnd and plb_rehash to TCP_INFO") where a new field has been added. The userspace can also expect a smaller size if it is compiled with old uAPI kernel headers. So for these sizes, we can only check if they are above a certain threshold, 0 for the moment. We can also only compared sizes with the ones set by the kernel. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: ce9979129a0b ("selftests: mptcp: add mptcp getsockopt test cases") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10selftests: mptcp: pm nl: skip fullmesh flag checks if not supportedMatthieu Baerts1-5/+10
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the fullmesh flag that can be given to the MPTCP in-kernel path-manager and introduced in commit 2843ff6f36db ("mptcp: remote addresses fullmesh"). If the flag is not visible in the dump after having set it, we don't check the content. Note that if we expect to have this feature and SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var is set to 1, we always check the content to avoid regressions. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 6da1dfdd037e ("selftests: mptcp: add set_flags tests in pm_netlink.sh") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10selftests: mptcp: pm nl: remove hardcoded default limitsMatthieu Baerts1-5/+7
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the checks of the default limits returned by the MPTCP in-kernel path-manager. The default values have been modified by commit 72bcbc46a5c3 ("mptcp: increase default max additional subflows to 2"). Instead of comparing with hardcoded values, we can get the default one and compare with them. Note that if we expect to have the latest version, we continue to check the hardcoded values to avoid unexpected behaviour changes. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: eedbc685321b ("selftests: add PM netlink functional tests") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10selftests: mptcp: diag: skip inuse tests if not supportedMatthieu Baerts1-1/+1
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the reporting of the MPTCP sockets being used, introduced by commit c558246ee73e ("mptcp: add statistics for mptcp socket in use"). Similar to the parent commit, it looks like there is no good pre-check to do here, i.e. dedicated function available in kallsyms. Instead, we try to get info and if nothing is returned, the test is marked as skipped. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: e04a30f78809 ("selftest: mptcp: add test for mptcp socket in use") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10selftests: mptcp: diag: skip listen tests if not supportedMatthieu Baerts1-25/+17
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the listen diag dump support introduced by commit 4fa39b701ce9 ("mptcp: listen diag dump support"). It looks like there is no good pre-check to do here, i.e. dedicated function available in kallsyms. Instead, we try to get info if nothing is returned, the test is marked as skipped. That's not ideal because something could be wrong with the feature and instead of reporting an error, the test could be marked as skipped. If we know in advanced that the feature is supposed to be supported, the tester can set SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var to 1: in this case the test will report an error instead of marking the test as skipped if nothing is returned. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: f2ae0fa68e28 ("selftests/mptcp: add diag listen tests") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10selftests: mptcp: connect: skip TFO tests if not supportedMatthieu Baerts1-0/+5
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the support of TCP_FASTOPEN socket option with MPTCP connections introduced by commit 4ffb0a02346c ("mptcp: add TCP_FASTOPEN sock option"). It is possible to look for "mptcp_fastopen_" in kallsyms to know if the feature is supported or not. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: ca7ae8916043 ("selftests: mptcp: mptfo Initiator/Listener") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10selftests: mptcp: connect: skip disconnect tests if not supportedMatthieu Baerts1-0/+5
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the full support of disconnections from the userspace introduced by commit b29fcfb54cd7 ("mptcp: full disconnect implementation"). It is possible to look for "mptcp_pm_data_reset" in kallsyms because a preparation patch added it to ease the introduction of the mentioned feature. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 05be5e273c84 ("selftests: mptcp: add disconnect tests") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10selftests: mptcp: connect: skip transp tests if not supportedMatthieu Baerts1-0/+10
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the support of IP(V6)_TRANSPARENT socket option with MPTCP connections introduced by commit c9406a23c116 ("mptcp: sockopt: add SOL_IP freebind & transparent options"). It is possible to look for "__ip_sock_set_tos" in kallsyms because IP(V6)_TRANSPARENT socket option support has been added after TOS support which came with the required infrastructure in MPTCP sockopt code. To support TOS, the following function has been exported (T). Not great but better than checking for a specific kernel version. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 5fb62e9cd3ad ("selftests: mptcp: add tproxy test case") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>