aboutsummaryrefslogtreecommitdiffstats
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2024-05-13Merge tag 'soc-arm-6.10' of ↵Linus Torvalds7-38/+124
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC code changes from Arnd Bergmann: "The code changes are fairly minimal, there is a bit of conversion of the old orion5x platform to modern gpio descriptors, the Kconfig entry for the added EN7581 platform and a sysfs change for the i.MX PMU device" * tag 'soc-arm-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: arm64: add Airoha EN7581 platform ARM: orion5x: Convert TS409 board to GPIO descriptors for LEDs ARM: orion5x: Convert Net2big board to GPIO descriptors for LEDs ARM: orion5x: Convert MV2120 board to GPIO descriptors for LEDs ARM: orion5x: Convert DNS323 board to GPIO descriptors for LEDs ARM: orion5x: Convert D2Net board to GPIO descriptors for LEDs ARM: imx: Assign parents for mmdc event_source devices
2024-05-13Merge tag 'soc-drivers-6.10' of ↵Linus Torvalds8-20/+21
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC driver updates from Arnd Bergmann: "As usual, these are updates for drivers that are specific to certain SoCs or firmware running on them. Notable updates include - The new STMicroelectronics STM32 "firewall" bus driver that is used to provide a barrier between different parts of an SoC - Lots of updates for the Qualcomm platform drivers, in particular SCM, which gets a rewrite of its initialization code - Firmware driver updates for Arm FF-A notification interrupts and indirect messaging, SCMI firmware support for pin control and vendor specific interfaces, and TEE firmware interface changes across multiple TEE drivers - A larger cleanup of the Mediatek CMDQ driver and some related bits - Kconfig changes for riscv drivers to prepare for adding Kanaan k230 support - Multiple minor updates for the TI sysc bus driver, memory controllers, hisilicon hccs and more" * tag 'soc-drivers-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (103 commits) firmware: qcom: uefisecapp: Allow on sc8180x Primus and Flex 5G soc: qcom: pmic_glink: Make client-lock non-sleeping dt-bindings: soc: qcom,wcnss: fix bluetooth address example soc/tegra: pmc: Add EQOS wake event for Tegra194 and Tegra234 bus: stm32_firewall: fix off by one in stm32_firewall_get_firewall() bus: etzpc: introduce ETZPC firewall controller driver firmware: arm_ffa: Avoid queuing work when running on the worker queue bus: ti-sysc: Drop legacy idle quirk handling bus: ti-sysc: Drop legacy quirk handling for smartreflex bus: ti-sysc: Drop legacy quirk handling for uarts bus: ti-sysc: Add a description and copyrights bus: ti-sysc: Move check for no-reset-on-init soc: hisilicon: kunpeng_hccs: replace MAILBOX dependency with PCC soc: hisilicon: kunpeng_hccs: Add the check for obtaining complete port attribute firmware: arm_ffa: Fix memory corruption in ffa_msg_send2() bus: rifsc: introduce RIFSC firewall controller driver of: property: fw_devlink: Add support for "access-controller" soc: mediatek: mtk-socinfo: Correct the marketing name for MT8188GV soc: mediatek: mtk-socinfo: Add entry for MT8395AV/ZA Genio 1200 soc: mediatek: mtk-mutex: Add support for MT8188 VPPSYS ...
2024-05-13Merge tag 'soc-dt-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds544-5000/+23503
Pull SoC devicetree updates from Arnd Bergmann: "The updates this time are a bit smaller than most times, mainly because it is not totally dominated by new Qualcomm hardware support. Instead, we larger than average updates for Rockchips, NXP, Allwinner and TI. The only two new SoCs this time are both from NXP and are minor variants of already supported ones. The updates for aspeed, amlogic and mediatek came a little late, so I'm saving those for part 2 in a few days if everything turns out fine. New machines this time contain: - two Broadcom SoC based wireless routers from Asus - Five allwinner based consumer devices for gaming, set-top-box and eboot reader applications - Three older phones based on Qualcomm chips, plus the more recent Sony Xperia 1 V - 14 industrial and embedded boards based on NXP i.MX6, i.MX8, layerscape and s32g3 SoCs - six rockchips boards including another handheld game console and a few single-board computers On top of these, we have the usual cleanups for dtc warnings and updates to add more features to already merged machines" * tag 'soc-dt-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (612 commits) arm64: dts: marvell: espressobin-ultra: fix Ethernet Switch unit address arm64: dts: marvell: turris-mox: drop unneeded flash address/size-cells arm64: dts: marvell: eDPU: drop redundant address/size-cells arm64: dts: qcom: pm6150: correct USB VBUS regulator compatible arm64: dts: rockchip: add rk3588 pcie and php IOMMUs arm64: dts: rockchip: enable onboard spi flash for rock-3a arm64: dts: rockchip: add USB-C support to rk3588s-orangepi-5 arm64: dts: rockchip: Enable GPU on Orange Pi 5 arm64: dts: rockchip: enable GPU on khadas-edge2 arm64: dts: rockchip: Add USB3 on Edgeble NCM6A-IO board arm64: dts: rockchip: Support poweroff on Edgeble Neural Compute Module arm64: dts: rockchip: Add Radxa ROCK 3C dt-bindings: arm: rockchip: add Radxa ROCK 3C arm64: dts: exynos: gs101: specify empty clocks for remaining pinctrl arm64: dts: exynos: gs101: specify bus clock for pinctrl_hsi2 arm64: dts: exynos: gs101: specify bus clock for pinctrl_peric[01] arm64: dts: exynos: gs101: specify bus clock for pinctrl (far) alive arm64: dts: Add/fix /memory node unit-addresses arm64: dts: qcom: qcs404: fix bluetooth device address arm64: dts: qcom: sc8280xp-x13s: enable USB MP and fingerprint reader ...
2024-05-13Merge tag 's390-6.10-1' of ↵Linus Torvalds47-455/+698
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Alexander Gordeev: - Store AP Query Configuration Information in a static buffer - Rework the AP initialization and add missing cleanups to the error path - Swap IRQ and AP bus/device registration to avoid race conditions - Export prot_virt_guest symbol - Introduce AP configuration changes notifier interface to facilitate modularization of the AP bus - Add CONFIG_AP kernel configuration option to allow modularization of the AP bus - Rework CONFIG_ZCRYPT_DEBUG kernel configuration option description and dependency and rename it to CONFIG_AP_DEBUG - Convert sprintf() and snprintf() to sysfs_emit() in CIO code - Adjust indentation of RELOCS command build step - Make crypto performance counters upward compatible - Convert make_page_secure() and gmap_make_secure() to use folio - Rework channel-utilization-block (CUB) handling in preparation of introducing additional CUBs - Use attribute groups to simplify registration, removal and extension of measurement-related channel-path sysfs attributes - Add a per-channel-path binary "ext_measurement" sysfs attribute that provides access to extended channel-path measurement data - Export measurement data for all channel-measurement-groups (CMG), not only for a specific ones. This enables support of new CMG data formats in userspace without the need for kernel changes - Add a per-channel-path sysfs attribute "speed_bps" that provides the operating speed in bits per second or 0 if the operating speed is not available - The CIO tracepoint subchannel-type field "st" is incorrectly set to the value of subchannel-enabled SCHIB "ena" field. Fix that - Do not forcefully limit vmemmap starting address to MAX_PHYSMEM_BITS - Consider the maximum physical address available to a DCSS segment (512GB) when memory layout is set up - Simplify the virtual memory layout setup by reducing the size of identity mapping vs vmemmap overlap - Swap vmalloc and Lowcore/Real Memory Copy areas in virtual memory. This will allow to place the kernel image next to kernel modules - Move everyting KASLR related from <asm/setup.h> to <asm/page.h> - Put virtual memory layout information into a structure to improve code generation - Currently __kaslr_offset is the kernel offset in both physical and virtual memory spaces. Uncouple these offsets to allow uncoupling of the addresses spaces - Currently the identity mapping base address is implicit and is always set to zero. Make it explicit by putting into __identity_base persistent boot variable and use it in proper context - Introduce .amode31 section start and end macros AMODE31_START and AMODE31_END - Introduce OS_INFO entries that do not reference any data in memory, but rather provide only values - Store virtual memory layout in OS_INFO. It is read out by makedumpfile, crash and other tools - Store virtual memory layout in VMCORE_INFO. It is read out by crash and other tools when /proc/kcore device is used - Create additional PT_LOAD ELF program header that covers kernel image only, so that vmcore tools could locate kernel text and data when virtual and physical memory spaces are uncoupled - Uncouple physical and virtual address spaces - Map kernel at fixed location when KASLR mode is disabled. The location is defined by CONFIG_KERNEL_IMAGE_BASE kernel configuration value. - Rework deployment of kernel image for both compressed and uncompressed variants as defined by CONFIG_KERNEL_UNCOMPRESSED kernel configuration value - Move .vmlinux.relocs section in front of the compressed kernel. The interim section rescue step is avoided as result - Correct modules thunk offset calculation when branch target is more than 2GB away - Kernel modules contain their own set of expoline thunks. Now that the kernel modules area is less than 4GB away from kernel expoline thunks, make modules use kernel expolines. Also make EXPOLINE_EXTERN the default if the compiler supports it - userfaultfd can insert shared zeropages into processes running VMs, but that is not allowed for s390. Fallback to allocating a fresh zeroed anonymous folio and insert that instead - Re-enable shared zeropages for non-PV and non-skeys KVM guests - Rename hex2bitmap() to ap_hex2bitmap() and export it for external use - Add ap_config sysfs attribute to provide the means for setting or displaying adapters, domains and control domains assigned to a vfio-ap mediated device in a single operation - Make vfio_ap_mdev_link_queue() ignore duplicate link requests - Add write support to ap_config sysfs attribute to allow atomic update a vfio-ap mediated device state - Document ap_config sysfs attribute - Function os_info_old_init() is expected to be called only from a regular kdump kernel. Enable it to be called from a stand-alone dump kernel - Address gcc -Warray-bounds warning and fix array size in struct os_info - s390 does not support SMBIOS, so drop unneeded CONFIG_DMI checks - Use unwinder instead of __builtin_return_address() with ftrace to prevent returning of undefined values - Sections .hash and .gnu.hash are only created when CONFIG_PIE_BUILD kernel is enabled. Drop these for the case CONFIG_PIE_BUILD is disabled - Compile kernel with -fPIC and link with -no-pie to allow kpatch feature always succeed and drop the whole CONFIG_PIE_BUILD option-enabled code - Add missing virt_to_phys() converter for VSIE facility and crypto control blocks * tag 's390-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (54 commits) Revert "s390: Relocate vmlinux ELF data to virtual address space" KVM: s390: vsie: Use virt_to_phys for crypto control block s390: Relocate vmlinux ELF data to virtual address space s390: Compile kernel with -fPIC and link with -no-pie s390: vmlinux.lds.S: Drop .hash and .gnu.hash for !CONFIG_PIE_BUILD s390/ftrace: Use unwinder instead of __builtin_return_address() s390/pci: Drop unneeded reference to CONFIG_DMI s390/os_info: Fix array size in struct os_info s390/os_info: Initialize old os_info in standalone dump kernel docs: Update s390 vfio-ap doc for ap_config sysfs attribute s390/vfio-ap: Add write support to sysfs attr ap_config s390/vfio-ap: Ignore duplicate link requests in vfio_ap_mdev_link_queue s390/vfio-ap: Add sysfs attr, ap_config, to export mdev state s390/ap: Externalize AP bus specific bitmap reading function s390/mm: Re-enable the shared zeropage for !PV and !skeys KVM guests mm/userfaultfd: Do not place zeropages when zeropages are disallowed s390/expoline: Make modules use kernel expolines s390/nospec: Correct modules thunk offset calculation s390/boot: Do not rescue .vmlinux.relocs section s390/boot: Rework deployment of the kernel image ...
2024-05-13sh: setup: Add missing forward declaration for sh_fdt_init()Geert Uytterhoeven1-0/+1
arch/sh/kernel/setup.c:244:12: warning: no previous prototype for 'sh_fdt_init' [-Wmissing-prototypes] Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Link: https://lore.kernel.org/r/7e3ea09e706a075bceb6bfd172990676e79be1c2.1715606232.git.geert+renesas@glider.be Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2024-05-13sh: smp: Protect setup_profiling_timer() by CONFIG_PROFILINGGeert Uytterhoeven1-0/+2
arch/sh/kernel/smp.c:326:5: warning: no previous prototype for 'setup_profiling_timer' [-Wmissing-prototypes] The function is unconditionally defined in smp.c, but conditionally declared in <linux/profile.h>. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Link: https://lore.kernel.org/r/effa5eecbd2389c6661974e91bb834db210989ea.1715606232.git.geert+renesas@glider.be Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2024-05-13sh: of-generic: Add missing #include <asm/clock.h>Geert Uytterhoeven1-0/+2
arch/sh/boards/of-generic.c:146:20: warning: no previous prototype for 'arch_init_clk_ops' [-Wmissing-prototypes] Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Link: https://lore.kernel.org/r/942621553ed82e3331e2e91485b643892d2d08bc.1715606232.git.geert+renesas@glider.be Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2024-05-13Merge branch 'topic/kdump-hotplug' into nextMichael Ellerman11-327/+676
Merge our topic branch containing kdump hotplug changes, more detail from the original cover letter: Commit 247262756121 ("crash: add generic infrastructure for crash hotplug support") added a generic infrastructure that allows architectures to selectively update the kdump image component during CPU or memory add/remove events within the kernel itself. This patch series adds crash hotplug handler for PowerPC and enable support to update the kdump image on CPU/Memory add/remove events. Among the 6 patches in this series, the first two patches make changes to the generic crash hotplug handler to assist PowerPC in adding support for this feature. The last four patches add support for this feature. The following section outlines the problem addressed by this patch series, along with the current solution, its shortcomings, and the proposed resolution. Problem: ======== Due to CPU/Memory hotplug or online/offline events the elfcorehdr (which describes the CPUs and memory of the crashed kernel) and FDT (Flattened Device Tree) of kdump image becomes outdated. Consequently, attempting dump collection with an outdated elfcorehdr or FDT can lead to failed or inaccurate dump collection. Going forward CPU hotplug or online/offline events are referred as CPU/Memory add/remove events. Existing solution and its shortcoming: ====================================== The current solution to address the above issue involves monitoring the CPU/memory add/remove events in userspace using udev rules and whenever there are changes in CPU and memory resources, the entire kdump image is loaded again. The kdump image includes kernel, initrd, elfcorehdr, FDT, purgatory. Given that only elfcorehdr and FDT get outdated due to CPU/Memory add/remove events, reloading the entire kdump image is inefficient. More importantly, kdump remains inactive for a substantial amount of time until the kdump reload completes. Proposed solution: ================== Instead of initiating a full kdump image reload from userspace on CPU/Memory hotplug and online/offline events, the proposed solution aims to update only the necessary kdump image component within the kernel itself.
2024-05-13Merge branch 'topic/ppc-kvm' into nextMichael Ellerman3-7/+3
Merge our KVM topic branch.
2024-05-13Merge branches 'arm/renesas', 'arm/smmu', 'x86/amd', 'core' and 'x86/vt-d' ↵Joerg Roedel8-35/+16
into next
2024-05-13sh: dreamcast: Fix GAPS PCI bridge addressingArtur Rojek2-1/+5
The G2-to-PCI bridge chip found in SEGA Dreamcast assumes P2 area relative addresses. Set the appropriate IOPORT base offset. Tested-by: Paul Cercueil <paul@crapouillou.net> Signed-off-by: Artur Rojek <contact@artur-rojek.eu> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Link: https://lore.kernel.org/r/20240511191614.68561-2-contact@artur-rojek.eu Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2024-05-12riscv, bpf: make some atomic operations fully orderedPuranjay Mohan1-10/+10
The BPF atomic operations with the BPF_FETCH modifier along with BPF_XCHG and BPF_CMPXCHG are fully ordered but the RISC-V JIT implements all atomic operations except BPF_CMPXCHG with relaxed ordering. Section 8.1 of the "The RISC-V Instruction Set Manual Volume I: Unprivileged ISA" [1], titled, "Specifying Ordering of Atomic Instructions" says: | To provide more efficient support for release consistency [5], each | atomic instruction has two bits, aq and rl, used to specify additional | memory ordering constraints as viewed by other RISC-V harts. and | If only the aq bit is set, the atomic memory operation is treated as | an acquire access. | If only the rl bit is set, the atomic memory operation is treated as a | release access. | | If both the aq and rl bits are set, the atomic memory operation is | sequentially consistent. Fix this by setting both aq and rl bits as 1 for operations with BPF_FETCH and BPF_XCHG. [1] https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf Fixes: dd642ccb45ec ("riscv, bpf: Implement more atomic operations for RV64") Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Reviewed-by: Pu Lehui <pulehui@huawei.com> Link: https://lore.kernel.org/r/20240505201633.123115-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12riscv, bpf: Fix typo in commentXiao Wang1-2/+2
We can use either "instruction" or "insn" in the comment. Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Pu Lehui <pulehui@huawei.com> Link: https://lore.kernel.org/r/20240507111618.437121-1-xiao.w.wang@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12s390/bpf: Emit a barrier for BPF_FETCH instructionsIlya Leoshkevich1-2/+6
BPF_ATOMIC_OP() macro documentation states that "BPF_ADD | BPF_FETCH" should be the same as atomic_fetch_add(), which is currently not the case on s390x: the serialization instruction "bcr 14,0" is missing. This applies to "and", "or" and "xor" variants too. s390x is allowed to reorder stores with subsequent fetches from different addresses, so code relying on BPF_FETCH acting as a barrier, for example: stw [%r0], 1 afadd [%r1], %r2 ldxw %r3, [%r4] may be broken. Fix it by emitting "bcr 14,0". Note that a separate serialization instruction is not needed for BPF_XCHG and BPF_CMPXCHG, because COMPARE AND SWAP performs serialization itself. Fixes: ba3b86b9cef0 ("s390/bpf: Implement new atomic ops") Reported-by: Puranjay Mohan <puranjay12@gmail.com> Closes: https://lore.kernel.org/bpf/mb61p34qvq3wf.fsf@kernel.org/ Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20240507000557.12048-1-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12bpf, arm64: inline bpf_get_smp_processor_id() helperPuranjay Mohan3-0/+28
Inline calls to bpf_get_smp_processor_id() helper in the JIT by emitting a read from struct thread_info. The SP_EL0 system register holds the pointer to the task_struct and thread_info is the first member of this struct. We can read the cpu number from the thread_info. Here is how the ARM64 JITed assembly changes after this commit: ARM64 JIT =========== BEFORE AFTER -------- ------- int cpu = bpf_get_smp_processor_id(); int cpu = bpf_get_smp_processor_id(); mov x10, #0xfffffffffffff4d0 mrs x10, sp_el0 movk x10, #0x802b, lsl #16 ldr w7, [x10, #24] movk x10, #0x8000, lsl #32 blr x10 add x7, x0, #0x0 Performance improvement using benchmark[1] ./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc +---------------+-------------------+-------------------+--------------+ | Name | Before | After | % change | |---------------+-------------------+-------------------+--------------| | glob-arr-inc | 23.380 ± 1.675M/s | 25.893 ± 0.026M/s | + 10.74% | | arr-inc | 23.928 ± 0.034M/s | 25.213 ± 0.063M/s | + 5.37% | | hash-inc | 12.352 ± 0.005M/s | 12.609 ± 0.013M/s | + 2.08% | +---------------+-------------------+-------------------+--------------+ [1] https://github.com/anakryiko/linux/commit/8dec900975ef Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240502151854.9810-5-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12arm64, bpf: add internal-only MOV instruction to resolve per-CPU addrsPuranjay Mohan4-0/+38
Support an instruction for resolving absolute addresses of per-CPU data from their per-CPU offsets. This instruction is internal-only and users are not allowed to use them directly. They will only be used for internal inlining optimizations for now between BPF verifier and BPF JITs. Since commit 7158627686f0 ("arm64: percpu: implement optimised pcpu access using tpidr_el1"), the per-cpu offset for the CPU is stored in the tpidr_el1/2 register of that CPU. To support this BPF instruction in the ARM64 JIT, the following ARM64 instructions are emitted: mov dst, src // Move src to dst, if src != dst mrs tmp, tpidr_el1/2 // Move per-cpu offset of the current cpu in tmp. add dst, dst, tmp // Add the per cpu offset to the dst. To measure the performance improvement provided by this change, the benchmark in [1] was used: Before: glob-arr-inc : 23.597 ± 0.012M/s arr-inc : 23.173 ± 0.019M/s hash-inc : 12.186 ± 0.028M/s After: glob-arr-inc : 23.819 ± 0.034M/s arr-inc : 23.285 ± 0.017M/s hash-inc : 12.419 ± 0.011M/s [1] https://github.com/anakryiko/linux/commit/8dec900975ef Signed-off-by: Puranjay Mohan <puranjay12@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240502151854.9810-4-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12riscv, bpf: inline bpf_get_smp_processor_id()Puranjay Mohan1-0/+26
Inline the calls to bpf_get_smp_processor_id() in the riscv bpf jit. RISCV saves the pointer to the CPU's task_struct in the TP (thread pointer) register. This makes it trivial to get the CPU's processor id. As thread_info is the first member of task_struct, we can read the processor id from TP + offsetof(struct thread_info, cpu). RISCV64 JIT output for `call bpf_get_smp_processor_id` ====================================================== Before After -------- ------- auipc t1,0x848c ld a5,32(tp) jalr 604(t1) mv a5,a0 Benchmark using [1] on Qemu. ./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc +---------------+------------------+------------------+--------------+ | Name | Before | After | % change | |---------------+------------------+------------------+--------------| | glob-arr-inc | 1.077 ± 0.006M/s | 1.336 ± 0.010M/s | + 24.04% | | arr-inc | 1.078 ± 0.002M/s | 1.332 ± 0.015M/s | + 23.56% | | hash-inc | 0.494 ± 0.004M/s | 0.653 ± 0.001M/s | + 32.18% | +---------------+------------------+------------------+--------------+ NOTE: This benchmark includes changes from this patch and the previous patch that implemented the per-cpu insn. [1] https://github.com/anakryiko/linux/commit/8dec900975ef Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20240502151854.9810-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12riscv, bpf: add internal-only MOV instruction to resolve per-CPU addrsPuranjay Mohan1-0/+24
Support an instruction for resolving absolute addresses of per-CPU data from their per-CPU offsets. This instruction is internal-only and users are not allowed to use them directly. They will only be used for internal inlining optimizations for now between BPF verifier and BPF JITs. RISC-V uses generic per-cpu implementation where the offsets for CPUs are kept in an array called __per_cpu_offset[cpu_number]. RISCV stores the address of the task_struct in TP register. The first element in task_struct is struct thread_info, and we can get the cpu number by reading from the TP register + offsetof(struct thread_info, cpu). Once we have the cpu number in a register we read the offset for that cpu from address: &__per_cpu_offset + cpu_number << 3. Then we add this offset to the destination register. To measure the improvement from this change, the benchmark in [1] was used on Qemu: Before: glob-arr-inc : 1.127 ± 0.013M/s arr-inc : 1.121 ± 0.004M/s hash-inc : 0.681 ± 0.052M/s After: glob-arr-inc : 1.138 ± 0.011M/s arr-inc : 1.366 ± 0.006M/s hash-inc : 0.676 ± 0.001M/s [1] https://github.com/anakryiko/linux/commit/8dec900975ef Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20240502151854.9810-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12ARC: Add eBPF JIT supportShahab Vahedi6-0/+4602
This will add eBPF JIT support to the 32-bit ARCv2 processors. The implementation is qualified by running the BPF tests on a Synopsys HSDK board with "ARC HS38 v2.1c at 500 MHz" as the 4-core CPU. The test_bpf.ko reports 2-10 fold improvements in execution time of its tests. For instance: test_bpf: #33 tcpdump port 22 jited:0 704 1766 2104 PASS test_bpf: #33 tcpdump port 22 jited:1 120 224 260 PASS test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:0 238 PASS test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:1 23 PASS test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:0 2034681 PASS test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:1 1020022 PASS Deployment and structure ------------------------ The related codes are added to "arch/arc/net": - bpf_jit.h -- The interface that a back-end translator must provide - bpf_jit_core.c -- Knows how to handle the input eBPF byte stream - bpf_jit_arcv2.c -- The back-end code that knows the translation logic The bpf_int_jit_compile() at the end of bpf_jit_core.c is the entrance to the whole process. Normally, the translation is done in one pass, namely the "normal pass". In case some relocations are not known during this pass, some data (arc_jit_data) is allocated for the next pass to come. This possible next (and last) pass is called the "extra pass". 1. Normal pass # The necessary pass 1a. Dry run # Get the whole JIT length, epilogue offset, etc. 1b. Emit phase # Allocate memory and start emitting instructions 2. Extra pass # Only needed if there are relocations to be fixed 2a. Patch relocations Support status -------------- The JIT compiler supports BPF instructions up to "cpu=v4". However, it does not yet provide support for: - Tail calls - Atomic operations - 64-bit division/remainder - BPF_PROBE_MEM* (exception table) The result of "test_bpf" test suite on an HSDK board is: hsdk-lnx# insmod test_bpf.ko test_suite=test_bpf test_bpf: Summary: 863 PASSED, 186 FAILED, [851/851 JIT'ed] All the failing test cases are due to the ones that were not JIT'ed. Categorically, they can be represented as: .-----------.------------.-------------. | test type | opcodes | # of cases | |-----------+------------+-------------| | atomic | 0xC3, 0xDB | 149 | | div64 | 0x37, 0x3F | 22 | | mod64 | 0x97, 0x9F | 15 | `-----------^------------+-------------| | (total) 186 | `-------------' Setup: build config ------------------- The following configs must be set to have a working JIT test: CONFIG_BPF_JIT=y CONFIG_BPF_JIT_ALWAYS_ON=y CONFIG_TEST_BPF=m The following options are not necessary for the tests module, but are good to have: CONFIG_DEBUG_INFO=y # prerequisite for below CONFIG_DEBUG_INFO_BTF=y # so bpftool can generate vmlinux.h CONFIG_FTRACE=y # CONFIG_BPF_SYSCALL=y # all these options lead to CONFIG_KPROBE_EVENTS=y # having CONFIG_BPF_EVENTS=y CONFIG_PERF_EVENTS=y # Some BPF programs provide data through /sys/kernel/debug: CONFIG_DEBUG_FS=y arc# mount -t debugfs debugfs /sys/kernel/debug Setup: elfutils --------------- The libdw.{so,a} library that is used by pahole for processing the final binary must come from elfutils 0.189 or newer. The support for ARCv2 [1] has been added since that version. [1] https://sourceware.org/git/?p=elfutils.git;a=commit;h=de3d46b3e7 Setup: pahole ------------- The line below in linux/scripts/Makefile.btf must be commented out: pahole-flags-$(call test-ge, $(pahole-ver), 121) += --btf_gen_floats Or else, the build will fail: $ make V=1 ... BTF .btf.vmlinux.bin.o pahole -J --btf_gen_floats \ -j --lang_exclude=rust \ --skip_encoding_btf_inconsistent_proto \ --btf_gen_optimized .tmp_vmlinux.btf Complex, interval and imaginary float types are not supported Encountered error while encoding BTF. ... BTFIDS vmlinux ./tools/bpf/resolve_btfids/resolve_btfids vmlinux libbpf: failed to find '.BTF' ELF section in vmlinux FAILED: load BTF from vmlinux: No data available This is due to the fact that the ARC toolchains generate "complex float" DIE entries in libgcc and at the moment, pahole can't handle such entries. Running the tests ----------------- host$ scp /bld/linux/lib/test_bpf.ko arc: arc # sysctl net.core.bpf_jit_enable=1 arc # insmod test_bpf.ko test_suite=test_bpf ... test_bpf: #1048 Staggered jumps: JMP32_JSLE_X jited:1 697811 PASS test_bpf: Summary: 863 PASSED, 186 FAILED, [851/851 JIT'ed] Acknowledgments --------------- - Claudiu Zissulescu for his unwavering support - Yuriy Kolerov for testing and troubleshooting - Vladimir Isaev for the pahole workaround - Sergey Matyukevich for paving the road by adding the interpreter support Signed-off-by: Shahab Vahedi <shahab@synopsys.com> Link: https://lore.kernel.org/r/20240430145604.38592-1-list+bpf@vahedi.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12Merge tag 'for-linus-6.9' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-1/+1
Pull kvm fix from Paolo Bonzini: - Fix NULL pointer read on s390 in ioctl(KVM_CHECK_EXTENSION) for /dev/kvm * tag 'for-linus-6.9' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: s390: Check kvm pointer when testing KVM_CAP_S390_HPAGE_1M
2024-05-12Merge tag 'x86_urgent_for_v6.9' of ↵Linus Torvalds2-9/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Add a new PCI ID which belongs to a new AMD CPU family 0x1a - Ensure that that last level cache ID is set in all cases, in the AMD CPU topology parsing code, in order to prevent invalid scheduling domain CPU masks * tag 'x86_urgent_for_v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/topology/amd: Ensure that LLC ID is initialized x86/amd_nb: Add new PCI IDs for AMD family 0x1a
2024-05-12Merge tag 'kvm-x86-misc-6.10' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini6-30/+52
KVM x86 misc changes for 6.10: - Advertise the max mappable GPA in the "guest MAXPHYADDR" CPUID field, which is unused by hardware, so that KVM can communicate its inability to map GPAs that set bits 51:48 due to lack of 5-level paging. Guest firmware is expected to use the information to safely remap BARs in the uppermost GPA space, i.e to avoid placing a BAR at a legal, but unmappable, GPA. - Use vfree() instead of kvfree() for allocations that always use vcalloc() or __vcalloc(). - Don't completely ignore same-value writes to immutable feature MSRs, as doing so results in KVM failing to reject accesses to MSR that aren't supposed to exist given the vCPU model and/or KVM configuration. - Don't mark APICv as being inhibited due to ABSENT if APICv is disabled KVM-wide to avoid confusing debuggers (KVM will never bother clearing the ABSENT inhibit, even if userspace enables in-kernel local APIC).
2024-05-12Merge tag 'kvm-x86-mmu-6.10' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini2-29/+66
KVM x86 MMU changes for 6.10: - Process TDP MMU SPTEs that are are zapped while holding mmu_lock for read after replacing REMOVED_SPTE with '0' and flushing remote TLBs, which allows vCPU tasks to repopulate the zapped region while the zapper finishes tearing down the old, defunct page tables. - Fix a longstanding, likely benign-in-practice race where KVM could fail to detect a write from kvm_mmu_track_write() to a shadowed GPTE if the GPTE is first page table being shadowed.
2024-05-12Merge tag 'kvm-x86-vmx-6.10' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini5-16/+34
KVM VMX changes for 6.10: - Clear vmcs.EXIT_QUALIFICATION when synthesizing an EPT Misconfig VM-Exit to L1, as per the SDM. - Move kvm_vcpu_arch's exit_qualification into x86_exception, as the field is used only when synthesizing nested EPT violation, i.e. it's not the vCPU's "real" exit_qualification, which is tracked elsewhere. - Add a sanity check to assert that EPT Violations are the only sources of nested PML Full VM-Exits.
2024-05-12Merge tag 'kvmarm-6.10-1' of ↵Paolo Bonzini48-754/+1249
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for Linux 6.10 - Move a lot of state that was previously stored on a per vcpu basis into a per-CPU area, because it is only pertinent to the host while the vcpu is loaded. This results in better state tracking, and a smaller vcpu structure. - Add full handling of the ERET/ERETAA/ERETAB instructions in nested virtualisation. The last two instructions also require emulating part of the pointer authentication extension. As a result, the trap handling of pointer authentication has been greattly simplified. - Turn the global (and not very scalable) LPI translation cache into a per-ITS, scalable cache, making non directly injected LPIs much cheaper to make visible to the vcpu. - A batch of pKVM patches, mostly fixes and cleanups, as the upstreaming process seems to be resuming. Fingers crossed! - Allocate PPIs and SGIs outside of the vcpu structure, allowing for smaller EL2 mapping and some flexibility in implementing more or less than 32 private IRQs. - Purge stale mpidr_data if a vcpu is created after the MPIDR map has been created. - Preserve vcpu-specific ID registers across a vcpu reset. - Various minor cleanups and improvements.
2024-05-11csky: Emulate one-byte cmpxchgPaul E. McKenney2-0/+11
Use the new cmpxchg_emu_u8() to emulate one-byte cmpxchg() on csky. [ paulmck: Apply kernel test robot feedback. ] [ paulmck: Drop two-byte support per Arnd Bergmann feedback. ] Co-developed-by: Yujie Liu <yujie.liu@intel.com> Signed-off-by: Yujie Liu <yujie.liu@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: Yujie Liu <yujie.liu@intel.com> Reviewed-by: Guo Ren <guoren@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: <linux-csky@vger.kernel.org>
2024-05-10Merge tag 'loongarch-kvm-6.10' of ↵Paolo Bonzini213-879/+2156
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD LoongArch KVM changes for v6.10 1. Add ParaVirt IPI support. 2. Add software breakpoint support. 3. Add mmio trace events support.
2024-05-10Merge branch 'kvm-sev-es-ghcbv2' into HEADPaolo Bonzini4-14/+111
While the main additions from GHCB protocol version 1 to version 2 revolve mostly around SEV-SNP support, there are a number of changes applicable to SEV-ES guests as well. Pluck a handful patches from the SNP hypervisor patchset for GHCB-related changes that are also applicable to SEV-ES. A KVM_SEV_INIT2 field lets userspace can control the maximum GHCB protocol version advertised to guests and manage compatibility across kernels/versions.
2024-05-10Merge branch 'kvm-coco-pagefault-prep' into HEADPaolo Bonzini6-98/+174
A combination of prep work for TDX and SNP, and a clean up of the page fault path to (hopefully) make it easier to follow the rules for private memory, noslot faults, writes to read-only slots, etc.
2024-05-10Merge branch 'kvm-vmx-ve' into HEADPaolo Bonzini11-41/+152
Allow a non-zero value for non-present SPTE and removed SPTE, so that TDX can set the "suppress VE" bit.
2024-05-10x86/topology/amd: Ensure that LLC ID is initializedThomas Gleixner1-9/+7
The original topology evaluation code initialized cpu_data::topo::llc_id with the die ID initialy and then eventually overwrite it with information gathered from a CPUID leaf. The conversion analysis failed to spot that particular detail and omitted this initial assignment under the assumption that each topology evaluation path will set it up. That assumption is mostly correct, but turns out to be wrong in case that the CPUID leaf 0x80000006 does not provide a LLC ID. In that case, LLC ID is invalid and as a consequence the setup of the scheduling domain CPU masks is incorrect which subsequently causes the scheduler core to complain about it during CPU hotplug: BUG: arch topology borken the CLS domain not a subset of the MC domain Cure it by reusing legacy_set_llc() and assigning the die ID if the LLC ID is invalid after all possible parsers have been tried. Fixes: f7fb3b2dd92c ("x86/cpu: Provide an AMD/HYGON specific topology parser") Reported-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Link: https://lore.kernel.org/r/PUZPR04MB63168AC442C12627E827368581292@PUZPR04MB6316.apcprd04.prod.outlook.com
2024-05-10arm64: defconfig: enable Airoha platformDaniel Danzberger1-0/+1
Enables the ARCH_AIROHA config by default. Signed-off-by: Daniel Danzberger <dd@embedd.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/65737ca5506371ef84c3a055e68d280f314e3b41.1709975956.git.lorenzo@kernel.org Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-10arm64: add Airoha EN7581 platformDaniel Danzberger1-0/+7
Introduce the Kconfig entry for the Airoha EN7581 multicore architecture available in the Airoha EN7581 evaluation board. Signed-off-by: Daniel Danzberger <dd@embedd.com> Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/d52d95db313e6a58ba997ba2181faf78a1014bcc.1709975956.git.lorenzo@kernel.org Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-10Merge branch 'for-next/errata' into for-next/coreWill Deacon8-0/+92
* for-next/errata: arm64: errata: Add workaround for Arm errata 3194386 and 3312417 arm64: cputype: Add Neoverse-V3 definitions arm64: cputype: Add Cortex-X4 definitions arm64: barrier: Restore spec_bar() macro
2024-05-10parisc/math-emu: Remove unused struct 'exc_reg'Dr. David Alan Gilbert1-6/+0
This has been here since pre-git. Build tested. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Helge Deller <deller@gmx.de>
2024-05-10x86/amd_nb: Add new PCI IDs for AMD family 0x1aShyam Sundar S K1-0/+1
Add the new PCI Device IDs to the MISC IDs list to support new generation of AMD 1Ah family 70h Models of processors. [ bp: Massage commit message. ] Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240510111829.969501-1-Shyam-sundar.S-k@amd.com
2024-05-10powerpc/85xx: fix compile error without CONFIG_CRASH_DUMPHari Bathini1-3/+6
Since commit 5c4233cc0920 ("powerpc/kdump: Split KEXEC_CORE and CRASH_DUMP dependency"), crashing_cpu is not available without CONFIG_CRASH_DUMP. Fix compile error on 64-BIT 85xx owing to this change. Fixes: 5c4233cc0920 ("powerpc/kdump: Split KEXEC_CORE and CRASH_DUMP dependency") Cc: stable@vger.kernel.org # v6.9+ Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Closes: https://lore.kernel.org/all/fa247ae4-5825-4dbe-a737-d93b7ab4d4b9@xenosoft.de/ Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240510080757.560159-1-hbathini@linux.ibm.com
2024-05-10arm64: errata: Add workaround for Arm errata 3194386 and 3312417Mark Rutland6-0/+84
Cortex-X4 and Neoverse-V3 suffer from errata whereby an MSR to the SSBS special-purpose register does not affect subsequent speculative instructions, permitting speculative store bypassing for a window of time. This is described in their Software Developer Errata Notice (SDEN) documents: * Cortex-X4 SDEN v8.0, erratum 3194386: https://developer.arm.com/documentation/SDEN-2432808/0800/ * Neoverse-V3 SDEN v6.0, erratum 3312417: https://developer.arm.com/documentation/SDEN-2891958/0600/ To workaround these errata, it is necessary to place a speculation barrier (SB) after MSR to the SSBS special-purpose register. This patch adds the requisite SB after writes to SSBS within the kernel, and hides the presence of SSBS from EL0 such that userspace software which cares about SSBS will manipulate this via prctl(PR_GET_SPECULATION_CTRL, ...). Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240508081400.235362-5-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-05-10arm64: cputype: Add Neoverse-V3 definitionsMark Rutland1-0/+2
Add cputype definitions for Neoverse-V3. These will be used for errata detection in subsequent patches. These values can be found in Table B-249 ("MIDR_EL1 bit descriptions") in issue 0001-04 of the Neoverse-V3 TRM, which can be found at: https://developer.arm.com/documentation/107734/0001/?lang=en Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240508081400.235362-4-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-05-10arm64: cputype: Add Cortex-X4 definitionsMark Rutland1-0/+2
Add cputype definitions for Cortex-X4. These will be used for errata detection in subsequent patches. These values can be found in Table B-249 ("MIDR_EL1 bit descriptions") in issue 0002-05 of the Cortex-X4 TRM, which can be found at: https://developer.arm.com/documentation/102484/0002/?lang=en Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240508081400.235362-3-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-05-10arm64: barrier: Restore spec_bar() macroMark Rutland1-0/+4
Upcoming errata workarounds will need to use SB from C code. Restore the spec_bar() macro so that we can use SB. This is effectively a revert of commit: 4f30ba1cce36d413 ("arm64: barrier: Remove spec_bar() macro") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240508081400.235362-2-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-05-10Merge 6.9-rc7 into usb-nextGreg Kroah-Hartman117-377/+510
We want the USB fixes in here as well, and resolve a merge conflict in drivers/usb/dwc3/core.c Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-10arm64: Properly clean up iommu-dma remnantsRobin Murphy2-9/+0
Thanks to the somewhat asymmetrical nature, while removing iommu_setup_dma_ops() from the arch_setup_dma_ops() flow, I managed to forget that arm64's teardown path was also specific to iommu-dma. Clean that up to match, otherwise probe deferral will lead to the arch code erroneously removing DMA ops set elsewhere. Reported-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/linux-iommu/Zi_LV28TR-P-PzXi@eriador.lumag.spb.ru/ Fixes: b67483b3c44e ("iommu/dma: Centralise iommu_setup_dma_ops()") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Acked-by: Will Deacon <will@kernel.org> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/d4cc20cbb0c45175e98dd76bf187e2ad6421296d.1714472573.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-05-10powerpc/fadump: pass additional parameters when fadump is activeHari Bathini3-0/+40
Append the additional parameters passed/set in the dedicated parameter area (RTAS_FADUMP_PARAM_AREA) to bootargs in fadump capture kernel. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240509115755.519982-4-hbathini@linux.ibm.com
2024-05-10powerpc/fadump: setup additional parameters for dump capture kernelHari Bathini5-9/+133
For fadump case, passing additional parameters to dump capture kernel helps in minimizing the memory footprint for it and also provides the flexibility to disable components/modules, like hugepages, that are hindering the boot process of the special dump capture environment. Set up a dedicated parameter area to be passed to the capture kernel. This area type is defined as RTAS_FADUMP_PARAM_AREA. Sysfs attribute '/sys/kernel/fadump/bootargs_append' is exported to the userspace to specify the additional parameters to be passed to the capture kernel Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240509115755.519982-3-hbathini@linux.ibm.com
2024-05-10powerpc/pseries/fadump: add support for multiple boot memory regionsHari Bathini5-120/+197
Currently, fadump on pseries assumes a single boot memory region even though f/w supports more than one boot memory region. Add support for more boot memory regions to make the implementation flexible for any enhancements that introduce other region types. For this, rtas memory structure for fadump is updated to have multiple boot memory regions instead of just one. Additionally, methods responsible for creating the fadump memory structure during both the first and second kernel boot have been modified to take these multiple boot memory regions into account. Also, a new callback has been added to the fadump_ops structure to get the maximum boot memory regions supported by the platform. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240509115755.519982-2-hbathini@linux.ibm.com
2024-05-10kbuild: use $(src) instead of $(srctree)/$(src) for source directoryMasahiro Yamada37-51/+43
Kbuild conventionally uses $(obj)/ for generated files, and $(src)/ for checked-in source files. It is merely a convention without any functional difference. In fact, $(obj) and $(src) are exactly the same, as defined in scripts/Makefile.build: src := $(obj) When the kernel is built in a separate output directory, $(src) does not accurately reflect the source directory location. While Kbuild resolves this discrepancy by specifying VPATH=$(srctree) to search for source files, it does not cover all cases. For example, when adding a header search path for local headers, -I$(srctree)/$(src) is typically passed to the compiler. This introduces inconsistency between upstream and downstream Makefiles because $(src) is used instead of $(srctree)/$(src) for the latter. To address this inconsistency, this commit changes the semantics of $(src) so that it always points to the directory in the source tree. Going forward, the variables used in Makefiles will have the following meanings: $(obj) - directory in the object tree $(src) - directory in the source tree (changed by this commit) $(objtree) - the top of the kernel object tree $(srctree) - the top of the kernel source tree Consequently, $(srctree)/$(src) in upstream Makefiles need to be replaced with $(src). Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-05-09Merge branch kvm-arm64/mpidr-reset into kvmarm-master/nextMarc Zyngier3-31/+38
* kvm-arm64/mpidr-reset: : . : Fixes for CLIDR_EL1 and MPIDR_EL1 being accidentally mutable across : a vcpu reset, courtesy of Oliver. From the cover letter: : : "For VM-wide feature ID registers we ensure they get initialized once for : the lifetime of a VM. On the other hand, vCPU-local feature ID registers : get re-initialized on every vCPU reset, potentially clobbering the : values userspace set up. : : MPIDR_EL1 and CLIDR_EL1 are the only registers in this space that we : allow userspace to modify for now. Clobbering the value of MPIDR_EL1 has : some disastrous side effects as the compressed index used by the : MPIDR-to-vCPU lookup table assumes MPIDR_EL1 is immutable after KVM_RUN. : : Series + reproducer test case to address the problem of KVM wiping out : userspace changes to these registers. Note that there are still some : differences between VM and vCPU scoped feature ID registers from the : perspective of userspace. We do not allow the value of VM-scope : registers to change after KVM_RUN, but vCPU registers remain mutable." : . KVM: selftests: arm64: Test vCPU-scoped feature ID registers KVM: selftests: arm64: Test that feature ID regs survive a reset KVM: selftests: arm64: Store expected register value in set_id_regs KVM: selftests: arm64: Rename helper in set_id_regs to imply VM scope KVM: arm64: Only reset vCPU-scoped feature ID regs once KVM: arm64: Reset VM feature ID regs from kvm_reset_sys_regs() KVM: arm64: Rename is_id_reg() to imply VM scope Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-09KVM: arm64: Only reset vCPU-scoped feature ID regs onceOliver Upton3-13/+26
The general expecation with feature ID registers is that they're 'reset' exactly once by KVM for the lifetime of a vCPU/VM, such that any userspace changes to the CPU features / identity are honored after a vCPU gets reset (e.g. PSCI_ON). KVM handles what it calls VM-scoped feature ID registers correctly, but feature ID registers local to a vCPU (CLIDR_EL1, MPIDR_EL1) get wiped after every reset. What's especially concerning is that a potentially-changing MPIDR_EL1 breaks MPIDR compression for indexing mpidr_data, as the mask of useful bits to build the index could change. This is absolutely no good. Avoid resetting vCPU feature ID registers more than once. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502233529.1958459-4-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-09KVM: arm64: Reset VM feature ID regs from kvm_reset_sys_regs()Oliver Upton1-17/+10
A subsequent change to KVM will expand the range of feature ID registers that get special treatment at reset. Fold the existing ones back in to kvm_reset_sys_regs() to avoid the need for an additional table walk. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502233529.1958459-3-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-09KVM: arm64: Rename is_id_reg() to imply VM scopeOliver Upton1-5/+6
The naming of some of the feature ID checks is ambiguous. Rephrase the is_id_reg() helper to make its purpose slightly clearer. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502233529.1958459-2-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski28-128/+148
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c 35d92abfbad8 ("net: hns3: fix kernel crash when devlink reload during initialization") 2a1a1a7b5fd7 ("net: hns3: add command queue trace for hns3") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-09Merge tag 'net-6.9-rc8' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bluetooth and IPsec. The bridge patch is actually a follow-up to a recent fix in the same area. We have a pending v6.8 AF_UNIX regression; it should be solved soon, but not in time for this PR. Current release - regressions: - eth: ks8851: Queue RX packets in IRQ handler instead of disabling BHs - net: bridge: fix corrupted ethernet header on multicast-to-unicast Current release - new code bugs: - xfrm: fix possible bad pointer derferencing in error path Previous releases - regressionis: - core: fix out-of-bounds access in ops_init - ipv6: - fix potential uninit-value access in __ip6_make_skb() - fib6_rules: avoid possible NULL dereference in fib6_rule_action() - tcp: use refcount_inc_not_zero() in tcp_twsk_unique(). - rtnetlink: correct nested IFLA_VF_VLAN_LIST attribute validation - rxrpc: fix congestion control algorithm - bluetooth: - l2cap: fix slab-use-after-free in l2cap_connect() - msft: fix slab-use-after-free in msft_do_close() - eth: hns3: fix kernel crash when devlink reload during initialization - eth: dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21 family Previous releases - always broken: - xfrm: preserve vlan tags for transport mode software GRO - tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets - eth: hns3: keep using user config after hardware reset" * tag 'net-6.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits) net: dsa: mv88e6xxx: read cmode on mv88e6320/21 serdes only ports net: dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21 family net: hns3: fix kernel crash when devlink reload during initialization net: hns3: fix port vlan filter not disabled issue net: hns3: use appropriate barrier function after setting a bit value net: hns3: release PTP resources if pf initialization failed net: hns3: change type of numa_node_mask as nodemask_t net: hns3: direct return when receive a unknown mailbox message net: hns3: using user configure after hardware reset net/smc: fix neighbour and rtable leak in smc_ib_find_route() ipv6: prevent NULL dereference in ip6_output() hsr: Simplify code for announcing HSR nodes timer setup ipv6: fib6_rules: avoid possible NULL dereference in fib6_rule_action() dt-bindings: net: mediatek: remove wrongly added clocks and SerDes rxrpc: Only transmit one ACK per jumbo packet received rxrpc: Fix congestion control algorithm selftests: test_bridge_neigh_suppress.sh: Fix failures due to duplicate MAC ipv6: Fix potential uninit-value access in __ip6_make_skb() net: phy: marvell-88q2xxx: add support for Rev B1 and B2 appletalk: Improve handling of broadcast packets ...
2024-05-09blk-throttle: remove CONFIG_BLK_DEV_THROTTLING_LOWYu Kuai1-1/+0
One the one hand, it's marked EXPERIMENTAL since 2017, and looks like there are no users since then, and no testers and no developers, it's just not active at all. On the other hand, even if the config is disabled, there are still many fields in throtl_grp and throtl_data and many functions that are only used for throtl low. At last, currently blk-throtl is initialized during disk initialization, and destroyed during disk removal, and it exposes many functions to be called directly from block layer. Remove throtl low to make code much more cleaner and follow up work much easier. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20240509121107.3195568-2-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-05-09Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linuxLinus Torvalds1-0/+4
Pull ARM fix from Russell King: - clear stale KASan stack poison when a CPU resumes * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux: ARM: 9381/1: kasan: clear stale stack poison
2024-05-09Merge branch 'for-next/tlbi' into for-next/coreWill Deacon1-22/+31
* for-next/tlbi: arm64: tlb: Allow range operation for MAX_TLBI_RANGE_PAGES arm64: tlb: Improve __TLBI_VADDR_RANGE() arm64: tlb: Fix TLBI RANGE operand
2024-05-09Merge branch 'for-next/perf' into for-next/coreWill Deacon5-121/+134
* for-next/perf: (41 commits) arm64: Add USER_STACKTRACE support drivers/perf: hisi: hns3: Actually use devm_add_action_or_reset() drivers/perf: hisi: hns3: Fix out-of-bound access when valid event group drivers/perf: hisi_pcie: Fix out-of-bound access when valid event group perf/arm-spe: Assign parents for event_source device perf/arm-smmuv3: Assign parents for event_source device perf/arm-dsu: Assign parents for event_source device perf/arm-dmc620: Assign parents for event_source device perf/arm-ccn: Assign parents for event_source device perf/arm-cci: Assign parents for event_source device perf/alibaba_uncore: Assign parents for event_source device perf/arm_pmu: Assign parents for event_source devices perf/imx_ddr: Assign parents for event_source devices perf/qcom: Assign parents for event_source devices Documentation: qcom-pmu: Use /sys/bus/event_source/devices paths perf/riscv: Assign parents for event_source devices perf/thunderx2: Assign parents for event_source devices Documentation: thunderx2-pmu: Use /sys/bus/event_source/devices paths perf/xgene: Assign parents for event_source devices Documentation: xgene-pmu: Use /sys/bus/event_source/devices paths ...
2024-05-09Merge branch 'for-next/mm' into for-next/coreWill Deacon4-78/+157
* for-next/mm: arm64/mm: Fix pud_user_accessible_page() for PGTABLE_LEVELS <= 2 arm64/mm: Add uffd write-protect support arm64/mm: Move PTE_PRESENT_INVALID to overlay PTE_NG arm64/mm: Remove PTE_PROT_NONE bit arm64/mm: generalize PMD_PRESENT_INVALID for all levels arm64: mm: Don't remap pgtables for allocate vs populate arm64: mm: Batch dsb and isb when populating pgtables arm64: mm: Don't remap pgtables per-cont(pte|pmd) block
2024-05-09Merge branch 'for-next/misc' into for-next/coreWill Deacon9-43/+48
* for-next/misc: arm64: simplify arch_static_branch/_jump function arm64: Add the arm64.no32bit_el0 command line option arm64: defer clearing DAIF.D arm64: assembler: update stale comment for disable_step_tsk arm64/sysreg: Update PIE permission encodings arm64: Add Neoverse-V2 part arm64: Remove unnecessary irqflags alternative.h include
2024-05-09Merge branch 'for-next/kbuild' into for-next/coreWill Deacon3-3/+15
* for-next/kbuild: arm64: boot: Support Flat Image Tree arm64: Add BOOT_TARGETS variable
2024-05-09arm64/mm: Fix pud_user_accessible_page() for PGTABLE_LEVELS <= 2Ryan Roberts1-0/+1
The recent change to use pud_valid() as part of the implementation of pud_user_accessible_page() fails to build when PGTABLE_LEVELS <= 2 because pud_valid() is not defined in that case. Fix this by defining pud_valid() to false for this case. This means that pud_user_accessible_page() will correctly always return false for this config. Fixes: f0f5863a0fb0 ("arm64/mm: Remove PTE_PROT_NONE bit") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202405082221.43rfWxz5-lkp@intel.com/ Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://lore.kernel.org/r/20240509122844.563320-1-ryan.roberts@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-05-08Merge tag 'riscv-dt-for-v6.10-take2' of ↵Arnd Bergmann8-711/+692
https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/dt-late RISC-V Devicetrees for v6.10 Microchip: A simple addition of a power-monitor on the Icicle dev board, as the binding for it is now in mainline. StarFive: Support for the Milk-V Mars. This board is incredibly similar to the VisionFive v2 that is already supported, with only the really ethernet configuration being slightly different. Emil requested that a common dtsi file, so my fixes branch is pulled into for-next to avoid an annoying conflict between moved content and some erroneously added nodes that were removed as fixes this cycle. T-Head: Re-ordering of some nodes to match the DTS coding style on the th1520. Signed-off-by: Conor Dooley <conor.dooley@microchip.com> * tag 'riscv-dt-for-v6.10-take2' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux: riscv: dts: microchip: add pac1934 power-monitor to icicle riscv: dts: thead: Fix node ordering in TH1520 device tree riscv: dts: starfive: add Milkv Mars board device tree riscv: dts: starfive: introduce a common board dtsi for jh7110 based boards riscv: dts: starfive: visionfive 2: add "disable-wp" for tfcard riscv: dts: starfive: visionfive 2: add tf cd-gpios riscv: dts: starfive: visionfive 2: use cpus label for timebase freq riscv: dts: starfive: visionfive 2: update sound and codec dt node name dt-bindings: riscv: starfive: add Milkv Mars board riscv: dts: starfive: add 'cpus' label to jh7110 and jh7100 soc dtsi riscv: dts: starfive: visionfive 2: Remove non-existing I2S hardware riscv: dts: starfive: visionfive 2: Remove non-existing TDM hardware riscv: dts: starfive: Remove PMIC interrupt info for Visionfive 2 board Link: https://lore.kernel.org/r/20240508-crafter-cement-4f54e4182270@spud Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-08sparc/leon: Remove on-stack cpumask varDawei Li1-4/+3
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. Use cpumask_subset() and cpumask_first_and() to avoid the need for a temporary cpumask on the stack. Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Reviewed-by: Andreas Larsson <andreas@gaisler.com> Tested-by: Andreas Larsson <andreas@gaisler.com> Link: https://lore.kernel.org/r/20240424025548.3765250-6-dawei.li@shingroup.cn Signed-off-by: Andreas Larsson <andreas@gaisler.com>
2024-05-08sparc/pci_msi: Remove on-stack cpumask varDawei Li1-4/+1
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. @cpumask of irq_set_affinity() is read-only and free of change, drop unneeded cpumask var. Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Reviewed-by: Andreas Larsson <andreas@gaisler.com> Link: https://lore.kernel.org/r/20240424025548.3765250-5-dawei.li@shingroup.cn Signed-off-by: Andreas Larsson <andreas@gaisler.com>
2024-05-08sparc/of: Remove on-stack cpumask varDawei Li1-4/+1
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. @cpumask of irq_set_affinity() is read-only and free of change, drop unneeded cpumask var. Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Reviewed-by: Andreas Larsson <andreas@gaisler.com> Link: https://lore.kernel.org/r/20240424025548.3765250-4-dawei.li@shingroup.cn Signed-off-by: Andreas Larsson <andreas@gaisler.com>
2024-05-08sparc/irq: Remove on-stack cpumask varDawei Li1-7/+3
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. - Both 2 arguments of cpumask_equal() is constant and free of change, no need to allocate extra cpumask variables. - Merge cpumask_and(), cpumask_first() and cpumask_empty() into cpumask_first_and(). Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Reviewed-by: Andreas Larsson <andreas@gaisler.com> Link: https://lore.kernel.org/r/20240424025548.3765250-3-dawei.li@shingroup.cn Signed-off-by: Andreas Larsson <andreas@gaisler.com>
2024-05-08sparc/srmmu: Remove on-stack cpumask varDawei Li1-28/+12
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. Use cpumask_any_but() to avoid the need for a temporary cpumask on the stack and simplify code. Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Reviewed-by: Andreas Larsson <andreas@gaisler.com> Link: https://lore.kernel.org/r/20240424025548.3765250-2-dawei.li@shingroup.cn Signed-off-by: Andreas Larsson <andreas@gaisler.com>
2024-05-08Merge tag 'soc-fixes-6.9-3' of ↵Linus Torvalds1-17/+13
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "These are a couple of last minute fixes that came in over the previous week, addressing: - A pin configuration bug on a qualcomm board that caused issues with ethernet and mmc - Two minor code fixes for misleading console output in the microchip firmware driver - A build warning in the sifive cache driver" * tag 'soc-fixes-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: firmware: microchip: clarify that sizes and addresses are in hex firmware: microchip: don't unconditionally print validation success arm64: dts: qcom: sa8155p-adp: fix SDHC2 CD pin configuration cache: sifive_ccache: Silence unused variable warning
2024-05-08m68k: defconfig: Update defconfigs for v6.9-rc1Geert Uytterhoeven12-36/+12
- Enable trimming of unused exported kernel symbols, - Drop CONFIG_IP_NF_ARPTABLES=m (auto-enabled since commit 4654467dc7e111e8 ("netfilter: arptables: allow xtables-nft only builds")), - Drop CONFIG_STRING_SELFTEST=m (replaced by auto-modular CONFIG_STRING_KUNIT_TEST in commit 29d8568849fe5937 ("string: Convert selftest to KUnit")), - Drop CONFIG_TEST_STRING_HELPERS=m (replaced by auto-modular CONFIG_STRING_HELPERS_KUNIT_TEST in commit fb57550fcbd86839 ("string: Convert helpers selftest to KUnit")). Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/e17b3ac60832a3ff92d25d1a05bf814e8f15d0c5.1711475325.git.geert@linux-m68k.org
2024-05-08m68k: Move ARCH_HAS_CPU_CACHE_ALIASINGGeert Uytterhoeven1-1/+1
Move the recently added ARCH_HAS_CPU_CACHE_ALIASING to restore alphabetical sort order. Fixes: 8690bbcf3b7010b3 ("Introduce cpu_dcache_is_aliasing() across all architectures") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/r/4574ad6cc1117e4b5d29812c165bf7f6e5b60773.1714978406.git.geert@linux-m68k.org
2024-05-08m68k: mac: Fix reboot hang on Mac IIciFinn Thain1-18/+18
Calling mac_reset() on a Mac IIci does reset the system, but what follows is a POST failure that requires a manual reset to resolve. Avoid that by using the 68030 asm implementation instead of the C implementation. Apparently the SE/30 has a similar problem as it has used the asm implementation since before git. This patch extends that solution to other systems with a similar ROM. After this patch, the only systems still using the C implementation are 68040 systems where adb_type is either MAC_ADB_IOP or MAC_ADB_II. This implies a 1 MiB Quadra ROM. This now includes the Quadra 900/950, which previously fell through to the "should never get here" catch-all. Reported-and-tested-by: Stan Johnson <userm57@yahoo.com> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Finn Thain <fthain@linux-m68k.org> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/480ebd1249d229c6dc1f3f1c6d599b8505483fd8.1714797072.git.fthain@linux-m68k.org Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2024-05-08m68k: Fix spinlock race in kernel thread creationMichael Schmitz1-1/+3
Context switching does take care to retain the correct lock owner across the switch from 'prev' to 'next' tasks. This does rely on interrupts remaining disabled for the entire duration of the switch. This condition is guaranteed for normal process creation and context switching between already running processes, because both 'prev' and 'next' already have interrupts disabled in their saved copies of the status register. The situation is different for newly created kernel threads. The status register is set to PS_S in copy_thread(), which does leave the IPL at 0. Upon restoring the 'next' thread's status register in switch_to() aka resume(), interrupts then become enabled prematurely. resume() then returns via ret_from_kernel_thread() and schedule_tail() where run queue lock is released (see finish_task_switch() and finish_lock_switch()). A timer interrupt calling scheduler_tick() before the lock is released in finish_task_switch() will find the lock already taken, with the current task as lock owner. This causes a spinlock recursion warning as reported by Guenter Roeck. As far as I can ascertain, this race has been opened in commit 533e6903bea0 ("m68k: split ret_from_fork(), simplify kernel_thread()") but I haven't done a detailed study of kernel history so it may well predate that commit. Interrupts cannot be disabled in the saved status register copy for kernel threads (init will complain about interrupts disabled when finally starting user space). Disable interrupts temporarily when switching the tasks' register sets in resume(). Note that a simple oriw 0x700,%sr after restoring sr is not enough here - this leaves enough of a race for the 'spinlock recursion' warning to still be observed. Tested on ARAnyM and qemu (Quadra 800 emulation). Fixes: 533e6903bea0 ("m68k: split ret_from_fork(), simplify kernel_thread()") Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/all/07811b26-677c-4d05-aeb4-996cd880b789@roeck-us.net Signed-off-by: Michael Schmitz <schmitzmic@gmail.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/20240411033631.16335-1-schmitzmic@gmail.com Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2024-05-08m68k: Let GENERIC_IOMAP depend on HAS_IOPORTNiklas Schnelle1-1/+1
In a future patch HAS_IOPORT=n will disable inb()/outb() and friends at compile time. With that choosing dynamically between I/O port and MMIO access via GNERIC_IOMAP will not work. So only select GENERIC_IOMAP when HAS_IOPORT is selected. Co-developed-by: Arnd Bergmann <arnd@kernel.org> Signed-off-by: Arnd Bergmann <arnd@kernel.org> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/20240403122851.38808-2-schnelle@linux.ibm.com Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2024-05-08Merge branch kvm-arm64/misc-6.10 into kvmarm-master/nextMarc Zyngier6-47/+113
* kvm-arm64/misc-6.10: : . : Misc fixes and updates targeting 6.10 : : - Improve boot-time diagnostics when the sysreg tables : are not correctly sorted : : - Allow FFA_MSG_SEND_DIRECT_REQ in the FFA proxy : : - Fix duplicate XNX field in the ID_AA64MMFR1_EL1 : writeable mask : : - Allocate PPIs and SGIs outside of the vcpu structure, allowing : for smaller EL2 mapping and some flexibility in implementing : more or less than 32 private IRQs. : : - Use bitmap_gather() instead of its open-coded equivalent : : - Make protected mode use hVHE if available : : - Purge stale mpidr_data if a vcpu is created after the MPIDR : map has been created : . KVM: arm64: Destroy mpidr_data for 'late' vCPU creation KVM: arm64: Use hVHE in pKVM by default on CPUs with VHE support KVM: arm64: Fix hvhe/nvhe early alias parsing KVM: arm64: Convert kvm_mpidr_index() to bitmap_gather() KVM: arm64: vgic: Allocate private interrupts on demand KVM: arm64: Remove duplicated AA64MMFR1_EL1 XNX KVM: arm64: Remove FFA_MSG_SEND_DIRECT_REQ from the denylist KVM: arm64: Improve out-of-order sysreg table diagnostics Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-08KVM: arm64: Destroy mpidr_data for 'late' vCPU creationOliver Upton1-9/+41
A particularly annoying userspace could create a vCPU after KVM has computed mpidr_data for the VM, either by racing against VGIC initialization or having a userspace irqchip. In any case, this means mpidr_data no longer fully describes the VM, and attempts to find the new vCPU with kvm_mpidr_to_vcpu() will fail. The fix is to discard mpidr_data altogether, as it is only a performance optimization and not required for correctness. In all likelihood KVM will recompute the mappings when KVM_RUN is called on the new vCPU. Note that reads of mpidr_data are not guarded by a lock; promote to RCU to cope with the possibility of mpidr_data being invalidated at runtime. Fixes: 54a8006d0b49 ("KVM: arm64: Fast-track kvm_mpidr_to_vcpu() when mpidr_data is available") Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240508071952.2035422-1-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-08bpf, arm64: Add support for lse atomics in bpf_arenaPuranjay Mohan1-8/+40
When LSE atomics are available, BPF atomic instructions are implemented as single ARM64 atomic instructions, therefore it is easy to enable these in bpf_arena using the currently available exception handling setup. LL_SC atomics use loops and therefore would need more work to enable in bpf_arena. Enable LSE atomics based instructions in bpf_arena and use the bpf_jit_supports_insn() callback to reject atomics in bpf_arena if LSE atomics are not available. All atomics and arena_atomics selftests are passing: [root@ip-172-31-2-216 bpf]# ./test_progs -a atomics,arena_atomics #3/1 arena_atomics/add:OK #3/2 arena_atomics/sub:OK #3/3 arena_atomics/and:OK #3/4 arena_atomics/or:OK #3/5 arena_atomics/xor:OK #3/6 arena_atomics/cmpxchg:OK #3/7 arena_atomics/xchg:OK #3 arena_atomics:OK #10/1 atomics/add:OK #10/2 atomics/sub:OK #10/3 atomics/and:OK #10/4 atomics/or:OK #10/5 atomics/xor:OK #10/6 atomics/cmpxchg:OK #10/7 atomics/xchg:OK #10 atomics:OK Summary: 2/14 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20240426161116.441-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-08x86/irq: Use existing helper for pending vector checkJacob Pan1-7/+1
lapic_vector_set_in_irr() is already available, use it for checking pending vectors at the local APIC. No functional change. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Imran Khan <imran.f.khan@oracle.com> Link: https://lore.kernel.org/r/20240506175612.1141095-1-jacob.jun.pan@linux.intel.com
2024-05-08perf/x86/cstate: Remove unused 'struct perf_cstate_msr'Ingo Molnar1-6/+0
Use of this structure was removed in: 8f2a28c5859b ("perf/x86/cstate: Use new probe function") Remove the now stale type as well. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: linux-kernel@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-05-08x86/pci/ce4100: Remove unused 'struct sim_reg_op'Dr. David Alan Gilbert1-6/+0
'struct sim_reg_op' wasn't ever used since it was introduced 14 years ago via: 91d8037f563e ("ce4100: Add PCI register emulation for CE4100") Remove it. [ mingo: Improved the changelog. ] Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240507232348.46677-1-linux@treblig.org
2024-05-08Merge tag 'qcom-arm64-for-6.10-2' of ↵Arnd Bergmann5-4/+171
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into soc/dt A few more Qualcomm Arm64 DeviceTree updates for v6.10 This corrects the obviously broken compatible of the USB VBUS regulator in PM6150. It clears the odd-looking default address on QCS404 EVB, with the expectation that a proper address is provides by other means. The newly added SM8650 GPU node is corrected with a missing memory region. The third DWC3 instance on SC8280XP is added, and enabled on Lenovo Thinkpad X13s to give working fingerprint sensor. * tag 'qcom-arm64-for-6.10-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: arm64: dts: qcom: pm6150: correct USB VBUS regulator compatible arm64: dts: qcom: qcs404: fix bluetooth device address arm64: dts: qcom: sc8280xp-x13s: enable USB MP and fingerprint reader arm64: dts: qcom: sc8280xp: Add USB DWC3 Multiport controller arm64: dts: qcom: sm8650: Fix GPU cx_mem size Link: https://lore.kernel.org/r/20240508021820.206441-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-08Merge tag 'qcom-arm64-defconfig-for-6.10-2' of ↵Arnd Bergmann1-0/+1
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into soc/defconfig One more Qualcomm Arm64 defconfig update for v6.10 This enables the SM6115 interconnect provider, to make it possible to boot boards on this SoC. * tag 'qcom-arm64-defconfig-for-6.10-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: arm64: defconfig: select INTERCONNECT_QCOM_SM6115 as built-in Link: https://lore.kernel.org/r/20240508021312.206121-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-08KVM: arm64: Use hVHE in pKVM by default on CPUs with VHE supportWill Deacon1-1/+1
The early command line parsing treats "kvm-arm.mode=protected" as an alias for "id_aa64mmfr1.vh=0", forcing the use of nVHE so that the host kernel runs at EL1 with the pKVM hypervisor at EL2. With the introduction of hVHE support in ad744e8cb346 ("arm64: Allow arm64_sw.hvhe on command line"), the hypervisor can run using the EL2+0 translation regime. This is interesting for unusual CPUs that have VH stuck to 1, but also because it opens the possibility of a hypervisor "userspace" in the distant future which could be used to isolate vCPU contexts in the hypervisor (see Marc's talk from KVM Forum 2022 [1]). Repaint the "kvm-arm.mode=protected" alias to map to "arm64_sw.hvhe=1", which will use hVHE on CPUs that support it and remain with nVHE otherwise. [1] https://www.youtube.com/watch?v=1F_Mf2j9eIo Signed-off-by: Will Deacon <will@kernel.org> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240501163400.15838-3-will@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-08KVM: arm64: Fix hvhe/nvhe early alias parsingWill Deacon1-1/+1
Booting a kernel with "arm64_sw.hvhe=1 kvm-arm.mode=nvhe" on the command-line results in KVM initialising using hVHE, whereas one might expect the latter option to override the former. Fix this by adding "arm64_sw.hvhe=0" to the alias expansion for "kvm-arm.mode=nvhe". Signed-off-by: Will Deacon <will@kernel.org> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240501163400.15838-2-will@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-07mm: fix race between __split_huge_pmd_locked() and GUP-fastRyan Roberts4-1/+7
__split_huge_pmd_locked() can be called for a present THP, devmap or (non-present) migration entry. It calls pmdp_invalidate() unconditionally on the pmdp and only determines if it is present or not based on the returned old pmd. This is a problem for the migration entry case because pmd_mkinvalid(), called by pmdp_invalidate() must only be called for a present pmd. On arm64 at least, pmd_mkinvalid() will mark the pmd such that any future call to pmd_present() will return true. And therefore any lockless pgtable walker could see the migration entry pmd in this state and start interpretting the fields as if it were present, leading to BadThings (TM). GUP-fast appears to be one such lockless pgtable walker. x86 does not suffer the above problem, but instead pmd_mkinvalid() will corrupt the offset field of the swap entry within the swap pte. See link below for discussion of that problem. Fix all of this by only calling pmdp_invalidate() for a present pmd. And for good measure let's add a warning to all implementations of pmdp_invalidate[_ad](). I've manually reviewed all other pmdp_invalidate[_ad]() call sites and believe all others to be conformant. This is a theoretical bug found during code review. I don't have any test case to trigger it in practice. Link: https://lkml.kernel.org/r/20240501143310.1381675-1-ryan.roberts@arm.com Link: https://lore.kernel.org/all/0dd7827a-6334-439a-8fd0-43c98e6af22b@arm.com/ Fixes: 84c3fc4e9c56 ("mm: thp: check pmd migration entry in common path") Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@kernel.org> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-07KVM: SEV: Allow per-guest configuration of GHCB protocol versionMichael Roth3-4/+33
The GHCB protocol version may be different from one guest to the next. Add a field to track it for each KVM instance and extend KVM_SEV_INIT2 to allow it to be configured by userspace. Now that all SEV-ES support for GHCB protocol version 2 is in place, go ahead and default to it when creating SEV-ES guests through the new KVM_SEV_INIT2 interface. Keep the older KVM_SEV_ES_INIT interface restricted to GHCB protocol version 1. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240501071048.2208265-5-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: SEV: Add GHCB handling for termination requestsMichael Roth1-0/+9
GHCB version 2 adds support for a GHCB-based termination request that a guest can issue when it reaches an error state and wishes to inform the hypervisor that it should be terminated. Implement support for that similarly to GHCB MSR-based termination requests that are already available to SEV-ES guests via earlier versions of the GHCB protocol. See 'Termination Request' in the 'Invoking VMGEXIT' section of the GHCB specification for more details. Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240501071048.2208265-4-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: SEV: Add GHCB handling for Hypervisor Feature Support requestsBrijesh Singh2-0/+16
Version 2 of the GHCB specification introduced advertisement of features that are supported by the Hypervisor. Now that KVM supports version 2 of the GHCB specification, bump the maximum supported protocol version. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240501071048.2208265-3-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: SEV: Add support to handle AP reset MSR protocolTom Lendacky3-10/+53
Add support for AP Reset Hold being invoked using the GHCB MSR protocol, available in version 2 of the GHCB specification. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240501071048.2208265-2-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: x86: Explicitly zero kvm_caps during vendor module loadSean Christopherson1-0/+7
Zero out all of kvm_caps when loading a new vendor module to ensure that KVM can't inadvertently rely on global initialization of a field, and add a comment above the definition of kvm_caps to call out that all fields needs to be explicitly computed during vendor module load. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240423165328.2853870-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: x86: Fully re-initialize supported_mce_cap on vendor module loadSean Christopherson1-3/+2
Effectively reset supported_mce_cap on vendor module load to ensure that capabilities aren't unintentionally preserved across module reload, e.g. if kvm-intel.ko added a module param to control LMCE support, or if someone somehow managed to load a vendor module that doesn't support LMCE after loading and unloading kvm-intel.ko. Practically speaking, this bug is a non-issue as kvm-intel.ko doesn't have a module param for LMCE, and there is no system in the world that supports both kvm-intel.ko and kvm-amd.ko. Fixes: c45dcc71b794 ("KVM: VMX: enable guest access to LMCE related MSRs") Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240423165328.2853870-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: x86: Fully re-initialize supported_vm_types on vendor module loadSean Christopherson1-1/+2
Recompute the entire set of supported VM types when a vendor module is loaded, as preserving supported_vm_types across vendor module unload and reload can result in VM types being incorrectly treated as supported. E.g. if a vendor module is loaded with TDP enabled, unloaded, and then reloaded with TDP disabled, KVM_X86_SW_PROTECTED_VM will be incorrectly retained. Ditto for SEV_VM and SEV_ES_VM and their respective module params in kvm-amd.ko. Fixes: 2a955c4db1dd ("KVM: x86: Add supported_vm_types to kvm_caps") Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240423165328.2853870-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07Merge tag 'kvm-riscv-6.10-1' of https://github.com/kvm-riscv/linux into HEADPaolo Bonzini19-99/+448
KVM/riscv changes for 6.10 - Support guest breakpoints using ebreak - Introduce per-VCPU mp_state_lock and reset_cntx_lock - Virtualize SBI PMU snapshot and counter overflow interrupts - New selftests for SBI PMU and Guest ebreak
2024-05-07riscv: dts: microchip: add pac1934 power-monitor to icicleConor Dooley1-0/+32
The binding for this landed in v6.9, add the description. In the off-chance that there were people carrying local patches for this based on the driver shipped on the Microchip website (or vendor kernel) both the binding and sysfs filenames changed during upstreaming. Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-05-07RISC-V: add Milkv Mars board devicetreeConor Dooley6-684/+633
The Milkv Mars is a development board based on the Starfive JH7110 SoC. The board features: - JH7110 SoC - 1/2/4/8 GiB LPDDR4 DRAM - AXP15060 PMIC - 40 pin GPIO header - 3x USB 3.0 host port - 1x USB 2.0 host port - 1x M.2 E-Key - 1x eMMC slot - 1x MicroSD slot - 1x QSPI Flash - 1x 1Gbps Ethernet port - 1x HDMI port - 1x 2-lane DSI and 1x 4-lane DSI - 1x 2-lane CSI I fixed up some nits Emil pointed out. This merges fixes into for-next to avoid messing around with some nodes that were removed as fixes this cycle. Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-05-07riscv: dts: thead: Fix node ordering in TH1520 device treeThomas Bonnefille1-27/+27
According to the device tree coding style, nodes shall be ordered by unit address in ascending order. Signed-off-by: Thomas Bonnefille <thomas.bonnefille@bootlin.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-05-07KVM: x86/mmu: Sanity check that __kvm_faultin_pfn() doesn't create noslot pfnsSean Christopherson1-1/+1
WARN if __kvm_faultin_pfn() generates a "no slot" pfn, and gracefully handle the unexpected behavior instead of continuing on with dangerous state, e.g. tdp_mmu_map_handle_target_level() _only_ checks fault->slot, and so could install a bogus PFN into the guest. The existing code is functionally ok, because kvm_faultin_pfn() pre-checks all of the cases that result in KVM_PFN_NOSLOT, but it is unnecessarily unsafe as it relies on __gfn_to_pfn_memslot() getting the _exact_ same memslot, i.e. not a re-retrieved pointer with KVM_MEMSLOT_INVALID set. And checking only fault->slot would fall apart if KVM ever added a flag or condition that forced emulation, similar to how KVM handles writes to read-only memslots. Cc: David Matlack <dmatlack@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Message-ID: <20240228024147.41573-17-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: x86/mmu: Initialize kvm_page_fault's pfn and hva to error valuesSean Christopherson1-0/+3
Explicitly set "pfn" and "hva" to error values in kvm_mmu_do_page_fault() to harden KVM against using "uninitialized" values. In quotes because the fields are actually zero-initialized, and zero is a legal value for both page frame numbers and virtual addresses. E.g. failure to set "pfn" prior to creating an SPTE could result in KVM pointing at physical address '0', which is far less desirable than KVM generating a SPTE with reserved PA bits set and thus effectively killing the VM. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Message-ID: <20240228024147.41573-16-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: x86/mmu: Set kvm_page_fault.hva to KVM_HVA_ERR_BAD for "no slot" faultsSean Christopherson1-0/+1
Explicitly set fault->hva to KVM_HVA_ERR_BAD when handling a "no slot" fault to ensure that KVM doesn't use a bogus virtual address, e.g. if there *was* a slot but it's unusable (APIC access page), or if there really was no slot, in which case fault->hva will be '0' (which is a legal address for x86). Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Message-ID: <20240228024147.41573-15-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: x86/mmu: Handle no-slot faults at the beginning of kvm_faultin_pfn()Sean Christopherson1-12/+17
Handle the "no memslot" case at the beginning of kvm_faultin_pfn(), just after the private versus shared check, so that there's no need to repeatedly query whether or not a slot exists. This also makes it more obvious that, except for private vs. shared attributes, the process of faulting in a pfn simply doesn't apply to gfns without a slot. Opportunistically stuff @fault's metadata in kvm_handle_noslot_fault() so that it doesn't need to be duplicated in all paths that invoke kvm_handle_noslot_fault(), and to minimize the probability of not stuffing the right fields. Leave the existing handle behind, but convert it to a WARN, to guard against __kvm_faultin_pfn() unexpectedly nullifying fault->slot. Cc: David Matlack <dmatlack@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Message-ID: <20240228024147.41573-14-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: x86/mmu: Move slot checks from __kvm_faultin_pfn() to kvm_faultin_pfn()Sean Christopherson1-43/+44
Move the checks related to the validity of an access to a memslot from the inner __kvm_faultin_pfn() to its sole caller, kvm_faultin_pfn(). This allows emulating accesses to the APIC access page, which don't need to resolve a pfn, even if there is a relevant in-progress mmu_notifier invalidation. Ditto for accesses to KVM internal memslots from L2, which KVM also treats as emulated MMIO. More importantly, this will allow for future cleanup by having the "no memslot" case bail from kvm_faultin_pfn() very early on. Go to rather extreme and gross lengths to make the change a glorified nop, e.g. call into __kvm_faultin_pfn() even when there is no slot, as the related code is very subtle. E.g. fault->slot can be nullified if it points at the APIC access page, some flows in KVM x86 expect fault->pfn to be KVM_PFN_NOSLOT, while others check only fault->slot, etc. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Message-ID: <20240228024147.41573-13-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: x86/mmu: Explicitly disallow private accesses to emulated MMIOSean Christopherson1-0/+5
Explicitly detect and disallow private accesses to emulated MMIO in kvm_handle_noslot_fault() instead of relying on kvm_faultin_pfn_private() to perform the check. This will allow the page fault path to go straight to kvm_handle_noslot_fault() without bouncing through __kvm_faultin_pfn(). Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20240228024147.41573-12-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: x86/mmu: Don't force emulation of L2 accesses to non-APIC internal slotsSean Christopherson1-4/+13
Allow mapping KVM's internal memslots used for EPT without unrestricted guest into L2, i.e. allow mapping the hidden TSS and the identity mapped page tables into L2. Unlike the APIC access page, there is no correctness issue with letting L2 access the "hidden" memory. Allowing these memslots to be mapped into L2 fixes a largely theoretical bug where KVM could incorrectly emulate subsequent _L1_ accesses as MMIO, and also ensures consistent KVM behavior for L2. If KVM is using TDP, but L1 is using shadow paging for L2, then routing through kvm_handle_noslot_fault() will incorrectly cache the gfn as MMIO, and create an MMIO SPTE. Creating an MMIO SPTE is ok, but only because kvm_mmu_page_role.guest_mode ensure KVM uses different roots for L1 vs. L2. But vcpu->arch.mmio_gfn will remain valid, and could cause KVM to incorrectly treat an L1 access to the hidden TSS or identity mapped page tables as MMIO. Furthermore, forcing L2 accesses to be treated as "no slot" faults doesn't actually prevent exposing KVM's internal memslots to L2, it simply forces KVM to emulate the access. In most cases, that will trigger MMIO, amusingly due to filling vcpu->arch.mmio_gfn, but also because vcpu_is_mmio_gpa() unconditionally treats APIC accesses as MMIO, i.e. APIC accesses are ok. But the hidden TSS and identity mapped page tables could go either way (MMIO or access the private memslot's backing memory). Alternatively, the inconsistent emulator behavior could be addressed by forcing MMIO emulation for L2 access to all internal memslots, not just to the APIC. But that's arguably less correct than letting L2 access the hidden TSS and identity mapped page tables, not to mention that it's *extremely* unlikely anyone cares what KVM does in this case. From L1's perspective there is R/W memory at those memslots, the memory just happens to be initialized with non-zero data. Making the memory disappear when it is accessed by L2 is far more magical and arbitrary than the memory existing in the first place. The APIC access page is special because KVM _must_ emulate the access to do the right thing (emulate an APIC access instead of reading/writing the APIC access page). And despite what commit 3a2936dedd20 ("kvm: mmu: Don't expose private memslots to L2") said, it's not just necessary when L1 is accelerating L2's virtual APIC, it's just as important (likely *more* imporant for correctness when L1 is passing through its own APIC to L2. Fixes: 3a2936dedd20 ("kvm: mmu: Don't expose private memslots to L2") Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Message-ID: <20240228024147.41573-11-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: x86/mmu: Move private vs. shared check above slot validity checksSean Christopherson1-5/+15
Prioritize private vs. shared gfn attribute checks above slot validity checks to ensure a consistent userspace ABI. E.g. as is, KVM will exit to userspace if there is no memslot, but emulate accesses to the APIC access page even if the attributes mismatch. Fixes: 8dd2eee9d526 ("KVM: x86/mmu: Handle page fault for private memory") Cc: Yu Zhang <yu.c.zhang@linux.intel.com> Cc: Chao Peng <chao.p.peng@linux.intel.com> Cc: Fuad Tabba <tabba@google.com> Cc: Michael Roth <michael.roth@amd.com> Cc: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Message-ID: <20240228024147.41573-10-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: x86/mmu: WARN and skip MMIO cache on private, reserved page faultsSean Christopherson1-0/+3
WARN and skip the emulated MMIO fastpath if a private, reserved page fault is encountered, as private+reserved should be an impossible combination (KVM should never create an MMIO SPTE for a private access). Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20240228024147.41573-9-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: x86/mmu: check for invalid async page faults involving private memoryPaolo Bonzini2-7/+12
Right now the error code is not used when an async page fault is completed. This is not a problem in the current code, but it is untidy. For protected VMs, we will also need to check that the page attributes match the current state of the page, because asynchronous page faults can only occur on shared pages (private pages go through kvm_faultin_pfn_private() instead of __gfn_to_pfn_memslot()). Start by piping the error code from kvm_arch_setup_async_pf() to kvm_arch_async_page_ready() via the architecture-specific async page fault data. For now, it can be used to assert that there are no async page faults on private memory. Extracted from a patch by Isaku Yamahata. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: x86/mmu: Use synthetic page fault error code to indicate private faultsSean Christopherson3-2/+21
Add and use a synthetic, KVM-defined page fault error code to indicate whether a fault is to private vs. shared memory. TDX and SNP have different mechanisms for reporting private vs. shared, and KVM's software-protected VMs have no mechanism at all. Usurp an error code flag to avoid having to plumb another parameter to kvm_mmu_page_fault() and friends. Alternatively, KVM could borrow AMD's PFERR_GUEST_ENC_MASK, i.e. set it for TDX and software-protected VMs as appropriate, but that would require *clearing* the flag for SEV and SEV-ES VMs, which support encrypted memory at the hardware layer, but don't utilize private memory at the KVM layer. Opportunistically add a comment to call out that the logic for software- protected VMs is (and was before this commit) broken for nested MMUs, i.e. for nested TDP, as the GPA is an L2 GPA. Punt on trying to play nice with nested MMUs as there is a _lot_ of functionality that simply doesn't work for software-protected VMs, e.g. all of the paths where KVM accesses guest memory need to be updated to be aware of private vs. shared memory. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20240228024147.41573-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: x86/mmu: WARN if upper 32 bits of legacy #PF error code are non-zeroSean Christopherson1-0/+7
WARN if bits 63:32 are non-zero when handling an intercepted legacy #PF, as the error code for #PF is limited to 32 bits (and in practice, 16 bits on Intel CPUS). This behavior is architectural, is part of KVM's ABI (see kvm_vcpu_events.error_code), and is explicitly documented as being preserved for intecerpted #PF in both the APM: The error code saved in EXITINFO1 is the same as would be pushed onto the stack by a non-intercepted #PF exception in protected mode. and even more explicitly in the SDM as VMCS.VM_EXIT_INTR_ERROR_CODE is a 32-bit field. Simply drop the upper bits if hardware provides garbage, as spurious information should do no harm (though in all likelihood hardware is buggy and the kernel is doomed). Handling all upper 32 bits in the #PF path will allow moving the sanity check on synthetic checks from kvm_mmu_page_fault() to npf_interception(), which in turn will allow deriving PFERR_PRIVATE_ACCESS from AMD's PFERR_GUEST_ENC_MASK without running afoul of the sanity check. Note, this is also why Intel uses bit 15 for SGX (highest bit on Intel CPUs) and AMD uses bit 31 for RMP (highest bit on AMD CPUs); using the highest bit minimizes the probability of a collision with the "other" vendor, without needing to plumb more bits through microcode. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Message-ID: <20240228024147.41573-7-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: x86/mmu: Pass full 64-bit error code when handling page faultsIsaku Yamahata3-5/+4
Plumb the full 64-bit error code throughout the page fault handling code so that KVM can use the upper 32 bits, e.g. SNP's PFERR_GUEST_ENC_MASK will be used to determine whether or not a fault is private vs. shared. Note, passing the 64-bit error code to FNAME(walk_addr)() does NOT change the behavior of permission_fault() when invoked in the page fault path, as KVM explicitly clears PFERR_IMPLICIT_ACCESS in kvm_mmu_page_fault(). Continue passing '0' from the async #PF worker, as guest_memfd and thus private memory doesn't support async page faults. Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> [mdr: drop references/changes on rebase, update commit message] Signed-off-by: Michael Roth <michael.roth@amd.com> [sean: drop truncation in call to FNAME(walk_addr)(), rewrite changelog] Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240228024147.41573-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: x86: Move synthetic PFERR_* sanity checks to SVM's #NPF handlerSean Christopherson3-11/+18
Move the sanity check that hardware never sets bits that collide with KVM- define synthetic bits from kvm_mmu_page_fault() to npf_interception(), i.e. make the sanity check #NPF specific. The legacy #PF path already WARNs if _any_ of bits 63:32 are set, and the error code that comes from VMX's EPT Violatation and Misconfig is 100% synthesized (KVM morphs VMX's EXIT_QUALIFICATION into error code flags). Add a compile-time assert in the legacy #PF handler to make sure that KVM- define flags are covered by its existing sanity check on the upper bits. Opportunistically add a description of PFERR_IMPLICIT_ACCESS, since we are removing the comment that defined it. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com> Message-ID: <20240228024147.41573-8-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: x86: Define more SEV+ page fault error bits/flags for #NPFSean Christopherson1-0/+4
Define more #NPF error code flags that are relevant to SEV+ (mostly SNP) guests, as specified by the APM: * Bit 31 (RMP): Set to 1 if the fault was caused due to an RMP check or a VMPL check failure, 0 otherwise. * Bit 34 (ENC): Set to 1 if the guest’s effective C-bit was 1, 0 otherwise. * Bit 35 (SIZEM): Set to 1 if the fault was caused by a size mismatch between PVALIDATE or RMPADJUST and the RMP, 0 otherwise. * Bit 36 (VMPL): Set to 1 if the fault was caused by a VMPL permission check failure, 0 otherwise. Note, the APM is *extremely* misleading, and strongly implies that the above flags can _only_ be set for #NPF exits from SNP guests. That is a lie, as bit 34 (C-bit=1, i.e. was encrypted) can be set when running _any_ flavor of SEV guest on SNP capable hardware. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20240228024147.41573-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: x86: Remove separate "bit" defines for page fault error code masksSean Christopherson2-25/+12
Open code the bit number directly in the PFERR_* masks and drop the intermediate PFERR_*_BIT defines, as having to bounce through two macros just to see which flag corresponds to which bit is quite annoying, as is having to define two macros just to add recognition of a new flag. Use ternary operator to derive the bit in permission_fault(), the one function that actually needs the bit number as part of clever shifting to avoid conditional branches. Generally the compiler is able to turn it into a conditional move, and if not it's not really a big deal. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20240228024147.41573-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07KVM: x86/mmu: Exit to userspace with -EFAULT if private fault hits emulationSean Christopherson2-8/+19
Exit to userspace with -EFAULT / KVM_EXIT_MEMORY_FAULT if a private fault triggers emulation of any kind, as KVM doesn't currently support emulating access to guest private memory. Practically speaking, private faults and emulation are already mutually exclusive, but there are many flow that can result in KVM returning RET_PF_EMULATE, and adding one last check to harden against weird, unexpected combinations and/or KVM bugs is inexpensive. Suggested-by: Yan Zhao <yan.y.zhao@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20240228024147.41573-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-08KVM: PPC: Book3S HV nestedv2: Fix an error handling path in ↵Christophe JAILLET1-2/+2
gs_msg_ops_kvmhv_nestedv2_config_fill_info() The return value of kvmppc_gse_put_buff_info() is not assigned to 'rc' and 'rc' is uninitialized at this point. So the error handling can not work. Assign the expected value to 'rc' to fix the issue. Fixes: 19d31c5f1157 ("KVM: PPC: Add support for nestedv2 guests") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/a7ed4cc12e0a0bbd97fac44fe6c222d1c393ec95.1706441651.git.christophe.jaillet@wanadoo.fr
2024-05-08KVM: PPC: code cleanup for kvmppc_book3s_irqprio_deliverKunwu Chan1-4/+0
This part was commented from commit 2f4cf5e42d13 ("Add book3s.c") in about 14 years before. If there are no plans to enable this part code in the future, we can remove this dead code. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240125083348.533883-1-chentao@kylinos.cn
2024-05-08KVM: PPC: Book3S HV nestedv2: Cancel pending DEC exceptionVaibhav Jain1-1/+1
This reverts commit 180c6b072bf3 ("KVM: PPC: Book3S HV nestedv2: Do not cancel pending decrementer exception") [1] which prevented canceling a pending HDEC exception for nestedv2 KVM guests. It was done to avoid overhead of a H_GUEST_GET_STATE hcall to read the 'DEC expiry TB' register which was higher compared to handling extra decrementer exceptions. However recent benchmarks indicate that overhead of not handling 'DECR' expiry for Nested KVM Guest(L2) is higher and results in much larger exits to Pseries Host(L1) as indicated by the Unixbench-arithoh bench[2] Metric | Current upstream | Revert [1] | Difference % ======================================================================== arithoh-count (10) | 3244831634 | 3403089673 | +04.88% kvm_hv:kvm_guest_exit | 513558 | 152441 | -70.32% probe:kvmppc_gsb_recv | 28060 | 28110 | +00.18% N=1 As indicated by the data above that reverting [1] results in substantial reduction in number of L2->L1 exits with only slight increase in number of H_GUEST_GET_STATE hcalls to read the value of 'DEC expiry TB'. This results in an overall ~4% improvement of arithoh[2] throughput. [1] commit 180c6b072bf3 ("KVM: PPC: Book3S HV nestedv2: Do not cancel pending decrementer exception") [2] https://github.com/kdlucas/byte-unixbench/ Fixes: 180c6b072bf3 ("KVM: PPC: Book3S HV nestedv2: Do not cancel pending decrementer exception") Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240415035731.103097-1-vaibhav@linux.ibm.com
2024-05-08powerpc/xmon: Check cpu id in commands "c#", "dp#" and "dx#"Greg Kurz1-3/+3
All these commands end up peeking into the PACA using the user originated cpu id as an index. Check the cpu id is valid in order to prevent xmon to crash. Instead of printing an error, this follows the same behavior as the "lp s #" command : ignore the buggy cpu id parameter and fall back to the #-less version of the command. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/161531347060.252863.10490063933688958044.stgit@bahia.lan
2024-05-08powerpc/code-patching: Use dedicated memory routines for patchingBenjamin Gray1-4/+27
The patching page set up as a writable alias may be in quadrant 0 (userspace) if the temporary mm path is used. This causes sanitiser failures if so. Sanitiser failures also occur on the non-mm path because the plain memset family is instrumented, and KASAN treats the patching window as poisoned. Introduce locally defined patch_* variants of memset that perform an uninstrumented lower level set, as well as detecting write errors like the original single patch variant does. copy_to_user() is not correct here, as the PTE makes it a proper kernel page (the EAA is privileged access only, RW). It just happens to be in quadrant 0 because that's the hardware's mechanism for using the current PID vs PID 0 in translations. Importantly, it's incorrect to allow user page accesses. Now that the patching memsets are used, we also propagate a failure up to the caller as the single patch variant does. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240325052815.854044-2-bgray@linux.ibm.com
2024-05-08powerpc/code-patching: Test patch_instructions() during bootBenjamin Gray1-0/+92
patch_instructions() introduces new behaviour with a couple of variations. Test each case of * a repeated 32-bit instruction, * a repeated 64-bit instruction (ppc64), and * a copied sequence of instructions for both on a single page and when it crosses a page boundary. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240325052815.854044-1-bgray@linux.ibm.com
2024-05-08powerpc64/kasan: Pass virtual addresses to kasan_init_phys_region()Benjamin Gray2-2/+2
The kasan_init_phys_region() function maps shadow pages necessary for the ranges of the linear map backed by physical pages. Currently kasan_init_phys_region() is being passed physical addresses, but kasan_mem_to_shadow() expects virtual addresses. It works right now because the lower bits (12:64) of the kasan_mem_to_shadow() calculation are the same for the real and virtual addresses, so the actual PTE value is the same in the end. But virtual addresses are the intended input, so fix it. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240212045020.70364-1-bgray@linux.ibm.com
2024-05-08powerpc: rename SPRN_HID2 define to SPRN_HID2_750FXMatthias Schiffer5-9/+13
This register number is hardware-specific, rename it for clarity. FIXME comments are added in a few places where it seems like the wrong register is used. As I can't test this, only the rename is done with no functional change. Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240124105031.45734-1-matthias.schiffer@ew.tq-group.com
2024-05-08powerpc: Fix typosBjorn Helgaas29-40/+40
Fix typos, most reported by "codespell arch/powerpc". Only touches comments, no code changes. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240103231605.1801364-8-helgaas@kernel.org
2024-05-08powerpc/eeh: Fix spelling of the word "auxillary" and update commentGhanshyam Agrawal2-4/+4
Fix spelling of the word "auxillary" in arch/powerpc/kernel/eeh_pe.c and arch/powerpc/include/asm/eeh.h. Also update the eeh_set_pe_aux_size() comment to include the units. Signed-off-by: Ghanshyam Agrawal <ghanshyam1898@gmail.com> [mpe: Squash into one commit] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/2ab034609285b21c309cd8ab26c937c846d37ee7.1703756365.git.ghanshyam1898@gmail.com
2024-05-07powerpc/Makefile: Remove bits related to the previous use of -mcmodel=largeNaveen N Rao9-21/+1
All supported compilers today (gcc v5.1+ and clang v11+) have support for -mcmodel=medium. As such, NO_MINIMAL_TOC is no longer being set. Remove NO_MINIMAL_TOC as well as the fallback to -mminimal-toc. Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Naveen N Rao <naveen@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240110141237.3179199-1-naveen@kernel.org
2024-05-07powerpc/pseries/pci: Code cleanupKunwu Chan1-27/+0
This part was commented in about 19 years before. If there are no plans to enable this part code in the future, we can remove this dead code. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240126025030.577795-1-chentao@kylinos.cn
2024-05-07powerpc/cell: Code cleanup for spufs_mfc_flushKunwu Chan1-16/+4
This part was commented from commit a33a7d7309d7 ("[PATCH] spufs: implement mfc access for PPE-side DMA") in about 18 years before. If there are no plans to enable this part code in the future, we can remove this dead code. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240126021258.574916-1-chentao@kylinos.cn
2024-05-07powerpc/iommu: Code cleanup for cell/iommu.cKunwu Chan1-17/+0
This part was commented from commit 165785e5c0be ("[POWERPC] Cell iommu support") in about 17 years before. If there are no plans to enable this part code in the future, we can remove this dead code. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240125082637.532826-1-chentao@kylinos.cn
2024-05-07powerpc: Fix preserved memory size for int-vectorsGUO Zihua1-1/+9
The first 32k of memory is reserved for interrupt vectors, however for powerpc64 this might not be enough. Fix this by reserving the maximum size between 32k and the real size of interrupt vectors. Signed-off-by: GUO Zihua <guozihua@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240113080509.1598290-1-guozihua@huawei.com
2024-05-07powerpc: dts: fsl: rename ifc node name to be memory-controllerLi Yang5-5/+5
Update the node name to be align with binding document. Signed-off-by: Li Yang <leoyang.li@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240119203911.3143928-4-Frank.Li@nxp.com
2024-05-07powerpc: dts: mpc85xx: remove "simple-bus" compatible from ifc nodeLi Yang9-9/+9
Update dts to match dts binding document. Signed-off-by: Li Yang <leoyang.li@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240119203911.3143928-3-Frank.Li@nxp.com
2024-05-07powerpc: dts: p1010rdb: fix INTx interrupt issue on P1010RDB-PBXiaowei Bao3-16/+32
Due to the INTA is shared with the active-low PHY2 interrupt on P1010RDB-PA board, so configure P1010RDB-PA's INTA with polarity as active-low, the P1010RDB-PB board is used separately, so configure P1010RDB-PB's INTA with polarity as active-high. The INTX in P1010RDB-PB do not work because of the pcie@0 node fixup will be overwrited by p1010si-post.dtsi file, so we move the pcie@0 node fixup to p1010rdb-pb.dts and p1010rdb-pb_36b.dts. Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com> Signed-off-by: Li Yang <leoyang.li@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240119203911.3143928-2-Frank.Li@nxp.com
2024-05-07powerpc: dts: add power management nodes to FSL chipsRan Wang14-12/+83
Enable Power Management feature on device tree, including MPC8536, MPC8544, MPC8548, MPC8572, P1010, P1020, P1021, P1022, P2020, P2041, P3041, T104X, T1024. Signed-off-by: Zhao Chenhui <chenhui.zhao@freescale.com> Signed-off-by: Ran Wang <ran.wang_1@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240119203911.3143928-1-Frank.Li@nxp.com
2024-05-07powerpc/rtas: Add kernel-doc comments to smp_startup_cpu()Yang Li1-0/+1
This commit adds kernel-doc style comments with complete parameter descriptions for the function smp_startup_cpu(). Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240408053109.96360-2-yang.lee@linux.alibaba.com
2024-05-07powerpc: Fix kernel-doc comments in fsl_gtm.cYang Li1-3/+3
Fix some function names in kernel-doc comments. Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240408053109.96360-1-yang.lee@linux.alibaba.com
2024-05-07powerpc: boot: Fix kernel-doc param for partial_decompressYang Li1-1/+1
Fix the kernel-doc annotation for the 'skip' parameter in the partial_decompress() function by adding a missing underscore and colon. Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240408083916.123369-1-yang.lee@linux.alibaba.com
2024-05-07powerpc: remove unused *_syscall_64.o variables in MakefileMasahiro Yamada1-3/+0
Commit ab1a517d55b0 ("powerpc/syscall: Rename syscall_64.c into interrupt.c") missed to update these three lines: GCOV_PROFILE_syscall_64.o := n KCOV_INSTRUMENT_syscall_64.o := n UBSAN_SANITIZE_syscall_64.o := n To restore the original behavior, we could replace them with: GCOV_PROFILE_interrupt.o := n KCOV_INSTRUMENT_interrupt.o := n UBSAN_SANITIZE_interrupt.o := n However, nobody has noticed the functional change in the past three years, so they were unneeded. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240216135517.2002749-1-masahiroy@kernel.org
2024-05-07powerpc/bpf/32: Fix failing test_bpf testsChristophe Leroy2-31/+110
Recent additions in BPF like cpu v4 instructions, test_bpf module exhibits the following failures: test_bpf: #82 ALU_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times) test_bpf: #83 ALU_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times) test_bpf: #84 ALU64_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times) test_bpf: #85 ALU64_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times) test_bpf: #86 ALU64_MOVSX | BPF_W jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times) test_bpf: #165 ALU_SDIV_X: -6 / 2 = -3 jited:1 ret 2147483645 != -3 (0x7ffffffd != 0xfffffffd)FAIL (1 times) test_bpf: #166 ALU_SDIV_K: -6 / 2 = -3 jited:1 ret 2147483645 != -3 (0x7ffffffd != 0xfffffffd)FAIL (1 times) test_bpf: #169 ALU_SMOD_X: -7 % 2 = -1 jited:1 ret 1 != -1 (0x1 != 0xffffffff)FAIL (1 times) test_bpf: #170 ALU_SMOD_K: -7 % 2 = -1 jited:1 ret 1 != -1 (0x1 != 0xffffffff)FAIL (1 times) test_bpf: #172 ALU64_SMOD_K: -7 % 2 = -1 jited:1 ret 1 != -1 (0x1 != 0xffffffff)FAIL (1 times) test_bpf: #313 BSWAP 16: 0x0123456789abcdef -> 0xefcd eBPF filter opcode 00d7 (@2) unsupported jited:0 301 PASS test_bpf: #314 BSWAP 32: 0x0123456789abcdef -> 0xefcdab89 eBPF filter opcode 00d7 (@2) unsupported jited:0 555 PASS test_bpf: #315 BSWAP 64: 0x0123456789abcdef -> 0x67452301 eBPF filter opcode 00d7 (@2) unsupported jited:0 268 PASS test_bpf: #316 BSWAP 64: 0x0123456789abcdef >> 32 -> 0xefcdab89 eBPF filter opcode 00d7 (@2) unsupported jited:0 269 PASS test_bpf: #317 BSWAP 16: 0xfedcba9876543210 -> 0x1032 eBPF filter opcode 00d7 (@2) unsupported jited:0 460 PASS test_bpf: #318 BSWAP 32: 0xfedcba9876543210 -> 0x10325476 eBPF filter opcode 00d7 (@2) unsupported jited:0 320 PASS test_bpf: #319 BSWAP 64: 0xfedcba9876543210 -> 0x98badcfe eBPF filter opcode 00d7 (@2) unsupported jited:0 222 PASS test_bpf: #320 BSWAP 64: 0xfedcba9876543210 >> 32 -> 0x10325476 eBPF filter opcode 00d7 (@2) unsupported jited:0 273 PASS test_bpf: #344 BPF_LDX_MEMSX | BPF_B eBPF filter opcode 0091 (@5) unsupported jited:0 432 PASS test_bpf: #345 BPF_LDX_MEMSX | BPF_H eBPF filter opcode 0089 (@5) unsupported jited:0 381 PASS test_bpf: #346 BPF_LDX_MEMSX | BPF_W eBPF filter opcode 0081 (@5) unsupported jited:0 505 PASS test_bpf: #490 JMP32_JA: Unconditional jump: if (true) return 1 eBPF filter opcode 0006 (@1) unsupported jited:0 261 PASS test_bpf: Summary: 1040 PASSED, 10 FAILED, [924/1038 JIT'ed] Fix them by adding missing processing. Fixes: daabb2b098e0 ("bpf/tests: add tests for cpuv4 instructions") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/91de862dda99d170697eb79ffb478678af7e0b27.1709652689.git.christophe.leroy@csgroup.eu
2024-05-07x86/numa: Fix SRAT lookup of CFMWS ranges with numa_fill_memblks()Robert Richter2-4/+2
For configurations that have the kconfig option NUMA_KEEP_MEMINFO disabled, numa_fill_memblks() only returns with NUMA_NO_MEMBLK (-1). SRAT lookup fails then because an existing SRAT memory range cannot be found for a CFMWS address range. This causes the addition of a duplicate numa_memblk with a different node id and a subsequent page fault and kernel crash during boot. Fix this by making numa_fill_memblks() always available regardless of NUMA_KEEP_MEMINFO. As Dan suggested, the fix is implemented to remove numa_fill_memblks() from sparsemem.h and alos using __weak for the function. Note that the issue was initially introduced with [1]. But since phys_to_target_node() was originally used that returned the valid node 0, an additional numa_memblk was not added. Though, the node id was wrong too, a message is seen then in the logs: kernel/numa.c: pr_info_once("Unknown target node for memory at 0x%llx, assuming node 0\n", [1] commit fd49f99c1809 ("ACPI: NUMA: Add a node and memblk for each CFMWS not in SRAT") Suggested-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/all/66271b0072317_69102944c@dwillia2-xfh.jf.intel.com.notmuch/ Fixes: 8f1004679987 ("ACPI/NUMA: Apply SRAT proximity domain to entire CFMWS window") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-07Merge tag 'aspeed-6.10-devicetree' of ↵Arnd Bergmann20-829/+4452
git://git.kernel.org/pub/scm/linux/kernel/git/joel/bmc into soc/dt-late ASPEED device tree updates for 6.10 - New and removed machines: * IBM System1 AST2600 BMC, a x86 server * ASUS X4TF AST2600 BMC, a x86 server * ASRock SPC621D8HM3 AST2500 BMC, a Intel Xeon system * ASRock E3C256D4I AST2500 BMC, a Intel Xeon system * Add ASRock X570D4U's AST2500 BMC, an AMD Ryzen 5000 system * Facebook Harma's AST2600 BMC * Facebook Cloudripper is removed - Updates to machines merged this cycle, as well as bonnell, yosemite4, minerva and others * tag 'aspeed-6.10-devicetree' of git://git.kernel.org/pub/scm/linux/kernel/git/joel/bmc: (52 commits) ARM: dts: aspeed: Add ASRock E3C256D4I BMC dt-bindings: arm: aspeed: document ASRock E3C256D4I dt-bindings: trivial-devices: add isil,isl69269 ARM: dts: aspeed: x4tf: Add dts for asus x4tf project dt-bindings: arm: aspeed: add ASUS X4TF board ARM: dts: aspeed: Remove Facebook Cloudripper dts ARM: dts: aspeed: drop unused ref_voltage ADC property ARM: dts: aspeed: harma: correct Mellanox multi-host property ARM: dts: aspeed: yosemitev2: correct Mellanox multi-host property ARM: dts: aspeed: yosemite4: correct Mellanox multi-host property ARM: dts: aspeed: greatlakes: correct Mellanox multi-host property ARM: dts: aspeed: Modify I2C bus configuration ARM: dts: aspeed: Disable unused ADC channels for Asrock X570D4U BMC ARM: dts: aspeed: Modify GPIO table for Asrock X570D4U BMC ARM: dts: aspeed: yosemite4: set bus13 frequency to 100k ARM: dts: Aspeed: Bonnell: Fix NVMe LED labels ARM: dts: aspeed: yosemite4: Enable ipmb device for OCP debug card ARM: dts: aspeed: ahe50dc: Update lm25066 regulator name ARM: dts: aspeed: Add vendor prefixes to lm25066 compat strings ARM: dts: aspeed: asrock: Use MAC address from FRU EEPROM ... Link: https://lore.kernel.org/r/CACPK8Xd2Qc9MQUJ-8GuRjmyU50oMHpmmHPHLqAh9W_1Gyqi2ug@mail.gmail.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-07Merge tag 'amlogic-arm64-dt-for-v6.10' of ↵Arnd Bergmann12-4/+1012
https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux into soc/dt-late Amlogic ARM64 DT changes for v6.10: - New Boards: - MNT Reform 2 CM4 adapter with a BPI-CM4 Module - AV400 (Amlogic A5) - BA400 (Amlogic A4) - Initial Amlogic A4 & A5 support - MIPI DSI support for G12A, G12B & SM1 SoCs - Overlay for Khadas TS050 panel for the Khadas VIM3/VIM3L - Amlogic T7 reset controller * tag 'amlogic-arm64-dt-for-v6.10' of https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux: arm64: dts: amlogic: Add Amlogic T7 reset controller arm64: dts: add support for A5 based Amlogic AV400 arm64: dts: add support for A4 based Amlogic BA400 dt-bindings: serial: amlogic,meson-uart: Add compatible string for A4 dt-bindings: arm: amlogic: add A5 support dt-bindings: arm: amlogic: add A4 support arm64: dts: meson: fix S4 power-controller node arm64: dts: amlogic: meson-g12b-bananapi-cm4: add support for MNT Reform2 with CM4 adaper arm64: meson: khadas-vim3l: add TS050 DSI panel overlay arm64: meson: g12-common: add the MIPI DSI nodes dt-bindings: arm: amlogic: Document the MNT Reform 2 CM4 adapter with a BPI-CM4 Module Link: https://lore.kernel.org/r/6030a450-fb4e-4e27-a5c3-6f4e0f326d9a@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-07ARM: 9393/1: mm: Use conditionals for CFI branchesLinus Walleij20-0/+42
Commit 9385/2 introduced a few branches inside function prototypes when using CFI in order to deal with the situation where CFI inserts a few bytes of function information in front of the symbol. This is not good for older CPUs where every cycle counts. Commit 9386/2 alleviated the situation a bit by using aliases for the cache functions with identical signatures. This leaves the coherent cache flush functions *_coherent_kern_range() with these branches to the corresponing *_coherent_user_range() around, since their return type differ and they therefore cannot be aliased. Solve this by a simple ifdef so at least we can use fallthroughs when compiling without CFI enabled. Link: https://lore.kernel.org/linux-arm-kernel/Zi+e9M%2Ff5b%2FSto9H@shell.armlinux.org.uk/ Suggested-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2024-05-07Merge tag 'riscv-config-for-v6.10' of ↵Arnd Bergmann6-20/+19
https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/drivers RISC-V SoC Kconfig Updates for v6.10 A few different bits of SoC-related Kconfig work. The first part of this is shared with the DT updates - the modification of all SOC_CANAAN users to SOC_CANAAN_K210 to split the existing m-mode nommu k210 away from the k230 that is able to be used in a "common" kernel. The other thing here is the removal of most of the SOC_VENDOR options, with their ARCH_VENDOR equivalents that've been waiting in the wings for 1 year+ now made visible. Due a lapse on my part when originally adding the ARCH_VENDOR stuff, the Microchip transition isn't complete - the _POLARFIRE was a mistake to keep as there's gonna be non-PolarFire RISC-V stuff from Microchip soonTM. Signed-off-by: Conor Dooley <conor.dooley@microchip.com> * tag 'riscv-config-for-v6.10' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux: riscv: config: enable ARCH_CANAAN in defconfig RISC-V: drop SOC_VIRT for ARCH_VIRT RISC-V: drop SOC_SIFIVE for ARCH_SIFIVE RISC-V: drop SOC_MICROCHIP_POLARFIRE for ARCH_MICROCHIP RISC-V: Drop unused SOC_CANAAN reset: k210: Deprecate SOC_CANAAN and use SOC_CANAAN_K210 pinctrl: k210: Deprecate SOC_CANAAN and use SOC_CANAAN_K210 clk: k210: Deprecate SOC_CANAAN and use SOC_CANAAN_K210 soc: canaan: Deprecate SOC_CANAAN and use SOC_CANAAN_K210 for K210 riscv: Kconfig.socs: Split ARCH_CANAAN and SOC_CANAAN_K210 Link: https://lore.kernel.org/r/20240503-mardi-underling-3d81a9f97329@spud Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-07Merge tag 'amlogic-defconfig-for-v6.10' of ↵Arnd Bergmann1-0/+1
https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux into soc/defconfig Amlogic defconfig changes for v6.10: - Enable Khadas TS050 driver as module * tag 'amlogic-defconfig-for-v6.10' of https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux: arm64: defconfig: enable Khadas TS050 panel as module Link: https://lore.kernel.org/r/13bf8bc4-1cb7-4b94-8c98-9d1cdae5e1f8@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-07Merge tag 'mvebu-arm-6.10-1' of ↵Arnd Bergmann5-38/+116
git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into soc/arm mvebu arm for 6.10 (part 1) Decrease the usage of global GPIO numbers for LEDs for Orion5x boards * tag 'mvebu-arm-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu: ARM: orion5x: Convert TS409 board to GPIO descriptors for LEDs ARM: orion5x: Convert Net2big board to GPIO descriptors for LEDs ARM: orion5x: Convert MV2120 board to GPIO descriptors for LEDs ARM: orion5x: Convert DNS323 board to GPIO descriptors for LEDs ARM: orion5x: Convert D2Net board to GPIO descriptors for LEDs Link: https://lore.kernel.org/r/87h6fcndxj.fsf@BLaptop.bootlin.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-07Merge tag 'samsung-dt64-6.10-2' of ↵Arnd Bergmann2-0/+170
https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into soc/dt Samsung DTS ARM64 changes for v6.10, part two Few changes exclusively for Google GS101: 1. Add HSI0 and HSI2 clock controllers (CMUs). 2. Add USB 3.1 Dual Role Device (DRD) support. 3. Add UFS (Universal Flash Storage) support. 4. Document bus clocks in pin controllers necessary for accessing registers. * tag 'samsung-dt64-6.10-2' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux: arm64: dts: exynos: gs101: specify empty clocks for remaining pinctrl arm64: dts: exynos: gs101: specify bus clock for pinctrl_hsi2 arm64: dts: exynos: gs101: specify bus clock for pinctrl_peric[01] arm64: dts: exynos: gs101: specify bus clock for pinctrl (far) alive arm64: dts: exynos: gs101: enable ufs, phy on oriole & define ufs regulator arm64: dts: exynos: gs101: Add ufs and ufs-phy dt nodes arm64: dts: exynos: gs101: Add the hsi2 sysreg node dt-bindings: soc: google: exynos-sysreg: add dedicated hsi2 sysreg compatible arm64: dts: exynos: gs101-oriole: enable USB on this board arm64: dts: exynos: gs101: add USB & USB-phy nodes arm64: dts: exynos: gs101: enable cmu-hsi2 clock controller arm64: dts: exynos: gs101: enable cmu-hsi0 clock controller dt-bindings: clock: google,gs101-clock: add HSI2 clock management unit dt-bindings: clock: google,gs101-clock: add HSI0 clock management unit Link: https://lore.kernel.org/r/20240504121233.7589-1-krzysztof.kozlowski@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-07Merge tag 'v6.10-rockchip-dts64-2' of ↵Arnd Bergmann8-0/+879
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into soc/dt Radxa Rock 3C board. More gpu+usb enablement on rk3588 boards as well as two new iommus on rk3588. * tag 'v6.10-rockchip-dts64-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: arm64: dts: rockchip: add rk3588 pcie and php IOMMUs arm64: dts: rockchip: enable onboard spi flash for rock-3a arm64: dts: rockchip: add USB-C support to rk3588s-orangepi-5 arm64: dts: rockchip: Enable GPU on Orange Pi 5 arm64: dts: rockchip: enable GPU on khadas-edge2 arm64: dts: rockchip: Add USB3 on Edgeble NCM6A-IO board arm64: dts: rockchip: Support poweroff on Edgeble Neural Compute Module arm64: dts: rockchip: Add Radxa ROCK 3C dt-bindings: arm: rockchip: add Radxa ROCK 3C Link: https://lore.kernel.org/r/13810480.dW097sEU6C@diego Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-07Merge tag 'mvebu-dt64-6.10-1' of ↵Arnd Bergmann8-67/+88
git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into soc/dt mvebu dt64 for 6.10 (part 1) Few dts fix for dt validation * tag 'mvebu-dt64-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu: arm64: dts: marvell: espressobin-ultra: fix Ethernet Switch unit address arm64: dts: marvell: turris-mox: drop unneeded flash address/size-cells arm64: dts: marvell: eDPU: drop redundant address/size-cells arm64: dts: marvell: cn9130-crb: drop unneeded "status" arm64: dts: marvell: cn9130-crb: drop wrong unit-addresses arm64: dts: marvell: cn9130-db: drop wrong unit-addresses arm64: dts: marvell: cn9131-db: drop unneeded flash address/size-cells arm64: dts: marvell: cn9130-db: drop unneeded flash address/size-cells arm64: dts: marvell: ap80x: fix IOMMU unit address Link: https://lore.kernel.org/r/87jzk8ndyy.fsf@BLaptop.bootlin.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-07Merge tag 'qcom-arm64-fixes-for-6.9-2' of ↵Arnd Bergmann1-17/+13
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes One more Qualcomm Arm64 DeviceTree fix for v6.9 On ths SA8155P automotive platform, the wrong gpio controller is defined for the SD-card detect pin, which depending on probe ordering of things cause ethernet to be broken. The card detect pin reference is corrected to solve this problem. * tag 'qcom-arm64-fixes-for-6.9-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: arm64: dts: qcom: sa8155p-adp: fix SDHC2 CD pin configuration Link: https://lore.kernel.org/r/20240427153817.1430382-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-07m68k: Avoid CONFIG_COLDFIRE switch in uapi headerThomas Huth1-1/+1
We should not use any CONFIG switches in uapi headers since these only work during kernel compilation. They are not defined for userspace. Let's use the __mcoldfire__ switch from the compiler here instead. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2024-05-06printk: Remove redundant CONFIG_BASE_FULLYoann Congal22-22/+22
CONFIG_BASE_FULL is equivalent to !CONFIG_BASE_SMALL and is enabled by default: CONFIG_BASE_SMALL is the special case to take care of. So, remove CONFIG_BASE_FULL and move the config choice to CONFIG_BASE_SMALL (which defaults to 'n') For defconfigs explicitely disabling BASE_FULL, explicitely enable BASE_SMALL. For defconfigs explicitely enabling BASE_FULL, drop it as it is the default. Signed-off-by: Yoann Congal <yoann.congal@smile.fr> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240505080343.1471198-4-yoann.congal@smile.fr Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-05-06printk: Change type of CONFIG_BASE_SMALL to boolYoann Congal1-3/+3
CONFIG_BASE_SMALL is currently a type int but is only used as a boolean. So, change its type to bool and adapt all usages: CONFIG_BASE_SMALL == 0 becomes !IS_ENABLED(CONFIG_BASE_SMALL) and CONFIG_BASE_SMALL != 0 becomes IS_ENABLED(CONFIG_BASE_SMALL). Reviewed-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Yoann Congal <yoann.congal@smile.fr> Link: https://lore.kernel.org/r/20240505080343.1471198-3-yoann.congal@smile.fr Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-05-06LoongArch: KVM: Add mmio trace events supportBibo Mao2-6/+22
Add mmio trace events support, currently generic mmio events KVM_TRACE_MMIO_WRITE/xxx_READ/xx_READ_UNSATISFIED are added here. Also vcpu id field is added for all kvm trace events, since perf KVM tool parses vcpu id information for kvm entry event. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-05-06LoongArch: KVM: Add software breakpoint supportBibo Mao7-3/+40
When VM runs in kvm mode, system will not exit to host mode when executing a general software breakpoint instruction such as INSN_BREAK, trap exception happens in guest mode rather than host mode. In order to debug guest kernel on host side, one mechanism should be used to let VM exit to host mode. Here a hypercall instruction with a special code is used for software breakpoint usage. VM exits to host mode and kvm hypervisor identifies the special hypercall code and sets exit_reason with KVM_EXIT_DEBUG. And then let qemu handle it. Idea comes from ppc kvm, one api KVM_REG_LOONGARCH_DEBUG_INST is added to get the hypercall code. VMM needs get sw breakpoint instruction with this api and set the corresponding sw break point for guest kernel. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-05-06LoongArch: KVM: Add PV IPI support on guest sideBibo Mao8-2/+197
PARAVIRT config option and PV IPI is added for the guest side, function pv_ipi_init() is used to add IPI sending and IPI receiving hooks. This function firstly checks whether system runs in VM mode, and if kernel runs in VM mode, it will call function kvm_para_available() to detect the current hypervirsor type (now only KVM type detection is supported). The paravirt functions can work only if current hypervisor type is KVM, since there is only KVM supported on LoongArch now. PV IPI uses virtual IPI sender and virtual IPI receiver functions. With virtual IPI sender, IPI message is stored in memory rather than emulated HW. IPI multicast is also supported, and 128 vcpus can received IPIs at the same time like X86 KVM method. Hypercall method is used for IPI sending. With virtual IPI receiver, HW SWI0 is used rather than real IPI HW. Since VCPU has separate HW SWI0 like HW timer, there is no trap in IPI interrupt acknowledge. Since IPI message is stored in memory, there is no trap in getting IPI message. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-05-06LoongArch: KVM: Add PV IPI support on host sideBibo Mao6-2/+211
On LoongArch system, IPI hw uses iocsr registers. There are one iocsr register access on IPI sending, and two iocsr access on IPI receiving for the IPI interrupt handler. In VM mode all iocsr accessing will cause VM to trap into hypervisor. So with one IPI hw notification there will be three times of trap. In this patch PV IPI is added for VM, hypercall instruction is used for IPI sender, and hypervisor will inject an SWI to the destination vcpu. During the SWI interrupt handler, only CSR.ESTAT register is written to clear irq. CSR.ESTAT register access will not trap into hypervisor, so with PV IPI supported, there is one trap with IPI sender, and no trap with IPI receiver, there is only one trap with IPI notification. Also this patch adds IPI multicast support, the method is similar with x86. With IPI multicast support, IPI notification can be sent to at most 128 vcpus at one time. It greatly reduces the times of trapping into hypervisor. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-05-06LoongArch: KVM: Add vcpu mapping from physical cpuidBibo Mao4-0/+129
Physical CPUID is used for interrupt routing for irqchips such as ipi, msgint and eiointc interrupt controllers. Physical CPUID is stored at the CSR register LOONGARCH_CSR_CPUID, it can not be changed once vcpu is created and the physical CPUIDs of two vcpus cannot be the same. Different irqchips have different size declaration about physical CPUID, the max CPUID value for CSR LOONGARCH_CSR_CPUID on Loongson-3A5000 is 512, the max CPUID supported by IPI hardware is 1024, while for eiointc irqchip is 256, and for msgint irqchip is 65536. The smallest value from all interrupt controllers is selected now, and the max cpuid size is defines as 256 by KVM which comes from the eiointc irqchip. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-05-06LoongArch: KVM: Add cpucfg area for kvm hypervisorBibo Mao3-17/+50
Instruction cpucfg can be used to get processor features. And there is a trap exception when it is executed in VM mode, and also it can be used to provide cpu features to VM. On real hardware cpucfg area 0 - 20 is used by now. Here one specified area 0x40000000 -- 0x400000ff is used for KVM hypervisor to provide PV features, and the area can be extended for other hypervisors in future. This area will never be used for real HW, it is only used by software. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-05-06LoongArch: KVM: Add hypercall instruction emulationBibo Mao3-1/+38
On LoongArch system, there is a hypercall instruction special for virtualization. When system executes this instruction on host side, there is an illegal instruction exception reported, however it will trap into host when it is executed in VM mode. When hypercall is emulated, A0 register is set with value KVM_HCALL_INVALID_CODE, rather than inject EXCCODE_INE invalid instruction exception. So VM can continue to executing the next code. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-05-06LoongArch/smp: Refine some ipi functions on LoongArch platformBibo Mao7-71/+63
Refine the ipi handling on LoongArch platform, there are three modifications: 1. Add generic function get_percpu_irq(), replacing some percpu irq functions such as get_ipi_irq()/get_pmc_irq()/get_timer_irq() with get_percpu_irq(). 2. Change definition about parameter action called by function loongson_send_ipi_single() and loongson_send_ipi_mask(), and it is defined as decimal encoding format at ipi sender side. Normal decimal encoding is used rather than binary bitmap encoding for ipi action, ipi hw sender uses decimal encoding code, and ipi receiver will get binary bitmap encoding, the ipi hw will convert it into bitmap in ipi message buffer. 3. Add a structure smp_ops on LoongArch platform so that pv ipi can be used later. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-05-06x86/alternatives: Remove alternative_input_2()Borislav Petkov (AMD)1-14/+0
It is unused. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240506122848.20326-1-bp@kernel.org
2024-05-06powerpc/64: Set _IO_BASE to POISON_POINTER_DELTA not 0 for CONFIG_PCI=nMichael Ellerman1-1/+1
There is code that builds with calls to IO accessors even when CONFIG_PCI=n, but the actual calls are guarded by runtime checks. If not those calls would be faulting, because the page at virtual address zero is (usually) not mapped into the kernel. As Arnd pointed out, it is possible a large port value could cause the address to be above mmap_min_addr which would then access userspace, which would be a bug. To avoid any such issues, set _IO_BASE to POISON_POINTER_DELTA. That is a value chosen to point into unmapped space between the kernel and userspace, so any access will always fault. Note that on 32-bit POISON_POINTER_DELTA is 0, so the patch only has an effect on 64-bit. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240503075619.394467-2-mpe@ellerman.id.au
2024-05-06powerpc/io: Avoid clang null pointer arithmetic warningsMichael Ellerman1-12/+12
With -Wextra clang warns about pointer arithmetic using a null pointer. When building with CONFIG_PCI=n, that triggers a warning in the IO accessors, eg: In file included from linux/arch/powerpc/include/asm/io.h:672: linux/arch/powerpc/include/asm/io-defs.h:23:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 23 | DEF_PCI_AC_RET(inb, u8, (unsigned long port), (port), pio, port) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ... linux/arch/powerpc/include/asm/io.h:591:53: note: expanded from macro '__do_inb' 591 | #define __do_inb(port) readb((PCI_IO_ADDR)_IO_BASE + port); | ~~~~~~~~~~~~~~~~~~~~~ ^ That is because when CONFIG_PCI=n, _IO_BASE is defined as 0. Although _IO_BASE is defined as plain 0, the cast (PCI_IO_ADDR) converts it to void * before the addition with port happens. Instead the addition can be done first, and then the cast. The resulting value will be the same, but avoids the warning, and also avoids void pointer arithmetic which is apparently non-standard. Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Closes: https://lore.kernel.org/all/CA+G9fYtEh8zmq8k8wE-8RZwW-Qr927RLTn+KqGnq1F=ptaaNsA@mail.gmail.com Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240503075619.394467-1-mpe@ellerman.id.au
2024-05-06powerpc/bpf: enable kfunc callHari Bathini2-10/+61
Currently, bpf jit code on powerpc assumes all the bpf functions and helpers to be part of core kernel text. This is false for kfunc case, as function addresses may not be part of core kernel text area. So, add support for addresses that are not within core kernel text area too, to enable kfunc support. Emit instructions based on whether the function address is within core kernel text address or not, to retain optimized instruction sequence where possible. In case of PCREL, as a bpf function that is not within core kernel text area is likely to go out of range with relative addressing on kernel base, use PC relative addressing. If that goes out of range, load the full address with PPC_LI64(). With addresses that are not within core kernel text area supported, override bpf_jit_supports_kfunc_call() to enable kfunc support. Also, override bpf_jit_supports_far_kfunc_call() to enable 64-bit pointers, as an address offset can be more than 32-bit long on PPC64. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240502173205.142794-2-hbathini@linux.ibm.com
2024-05-06powerpc/64/bpf: fix tail calls for PCREL addressingHari Bathini1-14/+16
With PCREL addressing, there is no kernel TOC. So, it is not setup in prologue when PCREL addressing is used. But the number of instructions to skip on a tail call was not adjusted accordingly. That resulted in not so obvious failures while using tailcalls. 'tailcalls' selftest crashed the system with the below call trace: bpf_test_run+0xe8/0x3cc (unreliable) bpf_prog_test_run_skb+0x348/0x778 __sys_bpf+0xb04/0x2b00 sys_bpf+0x28/0x38 system_call_exception+0x168/0x340 system_call_vectored_common+0x15c/0x2ec Also, as bpf programs are always module addresses and a bpf helper in general is a core kernel text address, using PC relative addressing often fails with "out of range of pcrel address" error. Switch to using kernel base for relative addressing to handle this better. Fixes: 7e3a68be42e1 ("powerpc/64: vmlinux support building with PCREL addresing") Cc: stable@vger.kernel.org # v6.4+ Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240502173205.142794-1-hbathini@linux.ibm.com
2024-05-06powerpc/dexcr: Add DEXCR prctl interfaceBenjamin Gray2-0/+111
Now that we track a DEXCR on a per-task basis, individual tasks are free to configure it as they like. The interface is a pair of getter/setter prctl's that work on a single aspect at a time (multiple aspects at once is more difficult if there are different rules applied for each aspect, now or in future). The getter shows the current state of the process config, and the setter allows setting/clearing the aspect. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> [mpe: Account for PR_RISCV_SET_ICACHE_FLUSH_CTX, shrink some longs lines] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240417112325.728010-5-bgray@linux.ibm.com
2024-05-06alpha: drop pre-EV56 supportArnd Bergmann11-301/+15
All EV4 machines are already gone, and the remaining EV5 based machines all support the slightly more modern EV56 generation as well. Debian only supports EV56 and later. Drop both of these and build kernels optimized for EV56 and higher when the "generic" options is selected, tuning for an out-of-order EV6 pipeline, same as Debian userspace. Since this was the only supported architecture without 8-bit and 16-bit stores, common kernel code no longer has to worry about aligning struct members, and existing workarounds from the block and tty layers can be removed. The alpha memory management code no longer needs an abstraction for the differences between EV4 and EV5+. Link: https://lists.debian.org/debian-alpha/2023/05/msg00009.html Acked-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-06x86/microcode: Remove unused struct cpu_info_ctxDr. David Alan Gilbert1-5/+0
This looks unused since 2071c0aeda22 ("x86/microcode: Simplify init path even more") Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240506004300.770564-1-linux@treblig.org
2024-05-05mm/arm64: override clear_young_dirty_ptes() batch helperLance Yang2-0/+84
The per-pte get_and_clear/modify/set approach would result in unfolding/refolding for contpte mappings on arm64. So we need to override clear_young_dirty_ptes() for arm64 to avoid it. Link: https://lkml.kernel.org/r/20240418134435.6092-3-ioworker0@gmail.com Signed-off-by: Lance Yang <ioworker0@gmail.com> Suggested-by: Barry Song <21cnbao@gmail.com> Suggested-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jeff Xie <xiehuan09@gmail.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Peter Xu <peterx@redhat.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05mm/page_table_check: support userfault wr-protect entriesPeter Xu1-17/+1
Allow page_table_check hooks to check over userfaultfd wr-protect criteria upon pgtable updates. The rule is no co-existance allowed for any writable flag against userfault wr-protect flag. This should be better than c2da319c2e, where we used to only sanitize such issues during a pgtable walk, but when hitting such issue we don't have a good chance to know where does that writable bit came from [1], so that even the pgtable walk exposes a kernel bug (which is still helpful on triaging) but not easy to track and debug. Now we switch to track the source. It's much easier too with the recent introduction of page table check. There are some limitations with using the page table check here for userfaultfd wr-protect purpose: - It is only enabled with explicit enablement of page table check configs and/or boot parameters, but should be good enough to track at least syzbot issues, as syzbot should enable PAGE_TABLE_CHECK[_ENFORCED] for x86 [1]. We used to have DEBUG_VM but it's now off for most distros, while distros also normally not enable PAGE_TABLE_CHECK[_ENFORCED], which is similar. - It conditionally works with the ptep_modify_prot API. It will be bypassed when e.g. XEN PV is enabled, however still work for most of the rest scenarios, which should be the common cases so should be good enough. - Hugetlb check is a bit hairy, as the page table check cannot identify hugetlb pte or normal pte via trapping at set_pte_at(), because of the current design where hugetlb maps every layers to pte_t... For example, the default set_huge_pte_at() can invoke set_pte_at() directly and lose the hugetlb context, treating it the same as a normal pte_t. So far it's fine because we have huge_pte_uffd_wp() always equals to pte_uffd_wp() as long as supported (x86 only). It'll be a bigger problem when we'll define _PAGE_UFFD_WP differently at various pgtable levels, because then one huge_pte_uffd_wp() per-arch will stop making sense first.. as of now we can leave this for later too. This patch also removes commit c2da319c2e altogether, as we have something better now. [1] https://lore.kernel.org/all/000000000000dce0530615c89210@google.com/ Link: https://lkml.kernel.org/r/20240417212549.2766883-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05arm: mm: drop VM_FAULT_BADMAP/VM_FAULT_BADACCESSKefeng Wang1-15/+15
If bad map or access, directly set code to SEGV_MAPRR or SEGV_ACCERR, also set fault to 0 and goto error handling, which make us to drop the arch's special vm fault reason. [akpm@linux-foundation.org: coding-style cleanups] Link: https://lkml.kernel.org/r/20240411130925.73281-3-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Aishwarya TCV <aishwarya.tcv@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Cristian Marussi <cristian.marussi@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05arm64: mm: drop VM_FAULT_BADMAP/VM_FAULT_BADACCESSKefeng Wang1-23/+20
Patch series "mm: remove arch's private VM_FAULT_BADMAP/BADACCESS", v2. Directly set SEGV_MAPRR or SEGV_ACCERR for arm/arm64 to remove the last two arch's private vm_fault reasons. This patch (of 2): If bad map or access, directly set si_code to SEGV_MAPRR or SEGV_ACCERR, also set fault to 0 and goto error handling, which make us to drop the arch's special vm fault reason. Link: https://lkml.kernel.org/r/20240411130925.73281-1-wangkefeng.wang@huawei.com Link: https://lkml.kernel.org/r/20240411130925.73281-2-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Aishwarya TCV <aishwarya.tcv@arm.com> Cc: Cristian Marussi <cristian.marussi@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05xtensa/mm: convert check_tlb_entry() to sanity check foliosDavid Hildenbrand1-5/+6
We want to limit the use of page_mapcount() to the places where it is absolutely necessary. So let's convert check_tlb_entry() to perform sanity checks on folios instead of pages. This essentially already happened: page_count() is mapped to folio_ref_count(), and page_mapped() to folio_mapped() internally. However, we would have printed the page_mapount(), which does not really match what page_mapped() would have checked. Let's simply print the folio mapcount to avoid using page_mapcount(). For small folios there is no change. Link: https://lkml.kernel.org/r/20240409192301.907377-17-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Chris Zankel <chris@zankel.net> Cc: Hugh Dickins <hughd@google.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Richard Chang <richardycc@google.com> Cc: Rich Felker <dalias@libc.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05sh/mm/cache: use folio_mapped() in copy_from_user_page()David Hildenbrand1-1/+1
We want to limit the use of page_mapcount() to the places where it is absolutely necessary. We're already using folio_mapped in copy_user_highpage() and copy_to_user_page() for a similar purpose so ... let's also simply use it for copy_from_user_page(). There is no change for small folios. Likely we won't stumble over many large folios on sh in that code either way. Link: https://lkml.kernel.org/r/20240409192301.907377-13-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Chris Zankel <chris@zankel.net> Cc: Hugh Dickins <hughd@google.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Richard Chang <richardycc@google.com> Cc: Rich Felker <dalias@libc.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05mm: pass VMA instead of MM to follow_pte()David Hildenbrand2-6/+3
... and centralize the VM_IO/VM_PFNMAP sanity check in there. We'll now also perform these sanity checks for direct follow_pte() invocations. For generic_access_phys(), we might now check multiple times: nothing to worry about, really. Link: https://lkml.kernel.org/r/20240410155527.474777-3-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Sean Christopherson <seanjc@google.com> [KVM] Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Fei Li <fei1.li@intel.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Yonghua Huang <yonghua.huang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05Merge tag 'powerpc-6.9-4' of ↵Linus Torvalds3-8/+15
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix incorrect delay handling in the plpks (keystore) code - Fix a panic when an LPAR boots with a frozen PE Thanks to Andrew Donnellan, Gaurav Batra, Nageswara R Sastry, and Nayna Jain. * tag 'powerpc-6.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/pseries/iommu: LPAR panics during boot up with a frozen PE powerpc/pseries: make max polling consistent for longer H_CALLs
2024-05-05Merge tag 'x86-urgent-2024-05-05' of ↵Linus Torvalds9-67/+64
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 fixes from Ingo Molnar: - Remove the broken vsyscall emulation code from the page fault code - Fix kexec crash triggered by certain SEV RMP table layouts - Fix unchecked MSR access error when disabling the x2APIC via iommu=off * tag 'x86-urgent-2024-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Remove broken vsyscall emulation code from the page fault code x86/apic: Don't access the APIC when disabling x2APIC x86/sev: Add callback to apply RMP table fixups for kexec x86/e820: Add a new e820 table update helper
2024-05-05ARM: orion5x: Convert TS409 board to GPIO descriptors for LEDsLinus Walleij1-8/+17
This makes the LEDs on the TS409 Orion5x board use GPIO descriptors instead of hardcoded GPIOs from the global numberspace. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2024-05-05ARM: orion5x: Convert Net2big board to GPIO descriptors for LEDsLinus Walleij1-4/+17
This makes the LEDs on the Net2big Orion5x board use GPIO descriptors instead of hardcoded GPIOs from the global numberspace. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2024-05-05ARM: orion5x: Convert MV2120 board to GPIO descriptors for LEDsLinus Walleij1-8/+21
This makes the LEDs on the MV2120 Orion5x board use GPIO descriptors instead of hardcoded GPIOs from the global numberspace. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2024-05-05ARM: orion5x: Convert DNS323 board to GPIO descriptors for LEDsLinus Walleij1-15/+48
This makes the LEDs on the D-Link DNS323 Orion5x board use GPIO descriptors instead of hardcoded GPIOs from the global numberspace. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2024-05-05ARM: orion5x: Convert D2Net board to GPIO descriptors for LEDsLinus Walleij1-3/+13
This makes the LEDs on the D2Net Orion5x board use GPIO descriptors instead of hardcoded GPIOs from the global numberspace. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2024-05-05arm64: dts: marvell: espressobin-ultra: fix Ethernet Switch unit addressKrzysztof Kozlowski1-34/+64
The Espressobin Ultra DTS includes Espressobin DTSI which defines ethernet-switch@1 node. The Ultra DTS overrides "reg" to 3, but that leaves still old unit address which conflicts with the new phy@1 node (W=1 dtc warning): armada-3720-espressobin.dtsi:148.29-203.4: Warning (unique_unit_address_if_enabled): /soc/internal-regs@d0000000/mdio@32004/ethernet-switch@1: duplicate unit-address (also used in node /soc/internal-regs@d0000000/mdio@32004/ethernet-phy@1) Fix this by deleting ethernet-switch@1 node and merging original node with code from Ultra DTS into new ethernet-switch@3. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2024-05-05arm64: dts: marvell: turris-mox: drop unneeded flash address/size-cellsKrzysztof Kozlowski1-2/+0
Flash node uses single "partition" node to describe partitions, so remove deprecated address/size-cells properties to also fix dtc W=1 warnings: armada-3720-turris-mox.dts:218.10-255.4: Warning (avoid_unnecessary_addr_size): /soc/internal-regs@d0000000/spi@10600/flash@0: unnecessary #address-cells/#size-cells without "ranges" or child "reg" property Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Marek Behún <kabel@kernel.org> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2024-05-05arm64: dts: marvell: eDPU: drop redundant address/size-cellsKrzysztof Kozlowski1-2/+0
The ethernet-switch node does not have children with unit addresses, so address/size-cells are not really correct, as reported by dtc W=1 warning: armada-3720-eDPU.dts:26.19-60.4: Warning (avoid_unnecessary_addr_size): /soc/internal-regs@d0000000/mdio@32004/switch@0: unnecessary #address-cells/#size-cells without "ranges" or child "reg" property This probably also fixes dtbs_check warning, but I could not find it, so not sure about that. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2024-05-05Revert "s390: Relocate vmlinux ELF data to virtual address space"Alexander Gordeev1-2/+2
This reverts commit 9ecaa2e94e602a3cbcbfe182535f6297f7630b98. In case CONFIG_MODULES kernel option is not defined the build fails with the following linker error: block/partitions/ibm.o: in function `ibm_partition': ibm.c:(.text+0x8bc): relocation truncated to fit: R_390_PLT32DBL against undefined symbol `dasd_biodasdinfo' Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-05-04arm64: zynqmp: Add resets property for UART nodesManikanta Guntupalli1-0/+2
Add resets property for UART0 and UART1 nodes Signed-off-by: Manikanta Guntupalli <manikanta.guntupalli@amd.com> Link: https://lore.kernel.org/r/20240425062358.1347684-3-manikanta.guntupalli@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-04perf: Use device_show_string() helper for sysfs attributesLukas Wunner1-10/+3
Deduplicate sysfs ->show() callbacks which expose a string at a static memory location. Use the newly introduced device_show_string() helper in the driver core instead by declaring those sysfs attributes with DEVICE_STRING_ATTR_RO(). No functional change intended. Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://lore.kernel.org/r/3a297850312b4ecb62d6872121de04496900f502.1713608122.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-04driver core: Add device_show_string() helper for sysfs attributesLukas Wunner1-10/+0
For drivers wishing to expose an unsigned long, int or bool at a static memory location in sysfs, the driver core provides ready-made helpers such as device_show_ulong() to be used as ->show() callback. Some drivers need to expose a string and so far they all provide their own ->show() implementation. arch/powerpc/perf/hv-24x7.c went so far as to create a device_show_string() helper but kept it private. Make it public for reuse by other drivers. The pattern seems to be sufficiently frequent to merit a public helper. Add a DEVICE_STRING_ATTR_RO() macro in line with the existing DEVICE_ULONG_ATTR() and similar macros to ease declaration of string attributes. Signed-off-by: Lukas Wunner <lukas@wunner.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2e3eaaf2600bb55c0415c23ba301e809403a7aa2.1713608122.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-03x86/pci: Remove OLPC dead codeKunwu Chan1-3/+0
3ef0e1f8cad0 ("x86: olpc: add One Laptop Per Child architecture support") added a commented-out EHCI config section that has never been used. Remove this dead code. Link: https://lore.kernel.org/r/20240125030623.513902-1-chentao@kylinos.cn Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-05-03arm64: dts: qcom: pm6150: correct USB VBUS regulator compatibleKrzysztof Kozlowski1-2/+2
The first part of the compatible of USB VBUS node misses ending quote, thus we have one long compatible consisting of two compatible strings leading to dtbs_check warnings: sc7180-idp.dtb: usb-vbus-regulator@1100: compatible:0: 'qcom,pm6150-vbus-reg,\n qcom,pm8150b-vbus-reg' does not match '^[a-zA-Z0-9][a-zA-Z0-9,+\\-._/]+$' sc7180-idp.dtb: /soc@0/spmi@c440000/pmic@0/usb-vbus-regulator@1100: failed to match any schema with compatible: ['qcom,pm6150-vbus-reg,\n qcom,pm8150b-vbus-reg'] Reported-by: Rob Herring <robh@kernel.org> Fixes: f81c2f01cad6 ("arm64: dts: qcom: pm6150: define USB-C related blocks") Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Link: https://lore.kernel.org/r/20240330091311.6224-2-krzysztof.kozlowski@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-05-03alpha: cabriolet: remove EV5 CPU supportArnd Bergmann4-44/+4
The sys_cabriolet.c file includes support for multiple evaluation boards. pc164 and lx164 are for ev56 CPUs, while the eb164 is now the last supported machine that only supports ev5 but not ev56. Acked-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-03alpha: remove LCA and APECS based machinesArnd Bergmann18-2940/+21
APECS is the DECchip 21071x chipset for the EV4 and EV45 generation, while LCA is the integrated I/O support on the corresponding low-cost alpha machines of that generation. All of these CPUs lack the BWX extension for byte and word access, so drop the chipset support and all associated machines. Acked-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-03alpha: sable: remove early machine supportArnd Bergmann6-333/+11
The sable family (Alphaserver 2000 and 2100) comes in variants for EV4, EV45, EV5 and EV56. Drop support for the earlier ones that lack support for the BWX extension but keep the later 'gamma' variant around since that works with EV56 CPUs. Acked-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-03alpha: remove DECpc AXP150 (Jensen) supportArnd Bergmann8-770/+8
This is one of the hackiest Alpha machines, and the only one without PCI support. Removing this allows cleaning up code in eise and tty drivers in addition to the architecture code. Acked-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-03alpha: trim the unused stuff from asm-offsets.cAl Viro1-20/+1
Out of 21 constants, only 6 are used... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-03alpha: jensen, t2 - make __EXTERN_INLINE same as for the restAl Viro2-2/+2
We want io.h primitives (readb(), etc.) to be extern inline. However, that requires the backing out-of-line implementation somewhere, preferably kept in sync with the inline ones. The way it's done is __EXTERN_INLINE macro that defaults to extern inline, but can be overridden in compilation unit where the out-of-line instance will be. That works, but it's brittle - we *must* make sure that asm/io.h is the very first include in such compilation units. There'd been a bunch of bugs of that sort in the past. Another issue is the choice of overriding definition for __EXTERN_INLINE; it must be either 'inline' or empty. Either will do for compilation purposes - inline void foo(...) {...} (without extern or static) is going to generate out-of-line instance. The difference is that 'definition without a prototype' heuristics trigger on void foo(void) { ... } but not on inline void foo(void) { ... } Most of the overrides go for 'inline'; in two cases (sys_jensen and core_t2) __EXTERN_INLINE is defined as empty. Without -Wmissing-prototypes it didn't matter, but now that we have that thing always on... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-03alpha: core_lca: take the unused functions outAl Viro1-44/+0
the only user had been drivers/char/h8.c, and that got taken out and shot back in 2004... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-03alpha: missing includesAl Viro16-14/+45
... and missing externs in proto.h Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-03alpha: sys_sio: fix misspelled ifdefsAl Viro1-2/+2
definitions of avanti_mv and noname_mv (and associated ALIAS_MV) are conditional upon the wrong thing - it should be CONFIG_ALPHA_{AVANTI,NONAME}_CH, not CONFIG_ALPHA_{AVANTI,NONAME}. The former is a system type; the latter is for the bits shared by AVANTI with XL and NONAME with ALPHA_BOOK1 resp. We want all those machine vectors defined (but not aliased - see ALIAS_MV() definition for details) for GENERIC build; for system-specfic builds we want only one mv, so avanti_mv should *not* be there for XL; it certainly should not be have alpha_mv aliased to it on such config - xl_mv will be there and alpha_mv can't be aliased to both of those. The same goes for Noname vs. Alphabook1. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-03alpha: don't make functions public without a reasonAl Viro7-14/+16
if it's really used only inside the same source file, make it static... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-03alpha: add clone3() supportAl Viro2-1/+2
Since clone3() needs the full register state saved for copying into the child, it needs the same kind of wrapper as fork(), vfork() and clone(). Exact same wrapper works, actually... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>