path: root/kexec
Age | Commit message | Author | Files | Lines
2024-04-26 | ARM: Fix add_buffer_phys_virt() align issue (HEAD, master, main) | Haiqing Bai | 1 | -1/+4
When "CONFIG_ARM_LPAE" is enabled, the MMU uses a 3-level page table and "SECTION_SIZE" is defined as (1 << 21), but 'add_buffer_phys_virt()' hardcodes the section size to (1 << 20). Suggested-By: fredrik.markstrom@gmail.com Signed-off-by: Haiqing Bai <Haiqing.Bai@windriver.com> Signed-off-by: Alexander Kanavin <alex@linutronix.de> Signed-off-by: Simon Horman <horms@kernel.org>
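The effect of the two section sizes on alignment can be sketched as follows. This is only an illustration, not the kexec-tools code: the helper names `section_size()` and `align_up()` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Section size seen by the ARM MMU: 2 MiB with LPAE (3-level tables),
 * 1 MiB otherwise. Hypothetical helper for illustration only. */
static uint64_t section_size(bool lpae)
{
	return lpae ? (UINT64_C(1) << 21) : (UINT64_C(1) << 20);
}

/* Round addr up to the next multiple of align (align is a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}
```

With a hardcoded 1 MiB size, an address such as 0x200001 rounds up to 0x300000, which is not 2 MiB aligned; with the LPAE section size it rounds up to 0x400000.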
2024-03-15 | kexec_file: add kexec_file flag to support debug printing | Baoquan He | 2 | -0/+2
This adds KEXEC_FILE_DEBUG to kexec_file_flags so that it can be passed to the kernel when '-d' is given with the kexec_file_load interface. With that flag set, the kernel can enable debug message printing. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Simon Horman <horms@kernel.org>
2024-01-25 | kexec: don't use kexec_file_load on XEN | Jiri Bohac | 2 | -0/+5
Since commit 29fe5067ed07 ("kexec: make -a the default") kexec tries the kexec_file_load syscall first and only falls back to kexec_load on selected error codes. This effectively breaks kexec on XEN, unless -c is specified to force the kexec_load syscall. The XEN-specific functions (xen_kexec_load / xen_kexec_unload) are only called from my_load / k_unload, i.e. the kexec_load code path. With -p (panic kernel) kexec_file_load on XEN fails with -EADDRNOTAVAIL (crash kernel reservation is ignored by the kernel on XEN), which is not in the list of return codes that cause the fallback to kexec_load. Without -p kexec_file_load actually leads to a kernel oops on v6.4.0 (needs to be debugged separately). Signed-off-by: Jiri Bohac <jbohac@suse.cz> Fixes: 29fe5067ed07 ("kexec: make -a the default") Signed-off-by: Simon Horman <horms@kernel.org>
2023-12-02 | LoongArch: Load vmlinux.efi to the link address | WANG Rui | 1 | -3/+7
Currently, kexec loads vmlinux.efi to address 0 instead of the link address. This causes kexec to fail to boot the new vmlinux.efi on qemu. pei_loongarch_load: kernel_segment: 0000000000000000 pei_loongarch_load: kernel_entry: 00000000013f1000 pei_loongarch_load: image_size: 0000000001ca0000 pei_loongarch_load: text_offset: 0000000000200000 pei_loongarch_load: phys_offset: 0000000000000000 pei_loongarch_load: PE format: yes loongarch_load_other_segments:333: command_line: kexec console=ttyS0,115200 kexec_load: entry = 0x13f1000 flags = 0x1020000 nr_segments = 2 segment[0].buf = 0x7fffeea38010 segment[0].bufsz = 0x1b55200 segment[0].mem = (nil) segment[0].memsz = 0x1ca0000 segment[1].buf = 0x5555570940b0 segment[1].bufsz = 0x200 segment[1].mem = 0x1ca0000 segment[1].memsz = 0x4000 This patch constrains the range of the kernel segment by `hole_min` and `hole_max` to place vmlinux.efi exactly at the link address. pei_loongarch_load: kernel_segment: 0000000000200000 pei_loongarch_load: kernel_entry: 00000000013f1000 pei_loongarch_load: image_size: 0000000001ca0000 pei_loongarch_load: text_offset: 0000000000200000 pei_loongarch_load: phys_offset: 0000000000000000 pei_loongarch_load: PE format: yes loongarch_load_other_segments:339: command_line: kexec console=ttyS0,115200 kexec_load: entry = 0x13f1000 flags = 0x1020000 nr_segments = 2 segment[0].buf = 0x7ffff2028010 segment[0].bufsz = 0x1b55200 segment[0].mem = 0x200000 segment[0].memsz = 0x1ca0000 segment[1].buf = 0x555557498098 segment[1].bufsz = 0x200 segment[1].mem = 0x1ea0000 segment[1].memsz = 0x4000 Signed-off-by: WANG Rui <wangrui@loongson.cn> Reviewed-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Simon Horman <horms@kernel.org>
2023-12-02 | LoongArch: Fix an issue with relocatable vmlinux | WANG Rui | 1 | -2/+3
Normally vmlinux for LoongArch is of ET_EXEC type, while if built with CONFIG_RELOCATABLE (this is PIE) and Clang, it will be of ET_DYN type. Meanwhile, physical address field of segments in vmlinux has actually the same value as virtual address field. Similar to arm64, this patch allows to unconditionally skip the check on LoongArch. Link: https://github.com/ClangBuiltLinux/linux/issues/1963 Signed-off-by: WANG Rui <wangrui@loongson.cn> Reviewed-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Simon Horman <horms@kernel.org>
2023-12-02 | m68k: fix getrandom() use with uclibc | Laurent Vivier | 1 | -0/+1
With uclibc, getrandom() is only defined with _GNU_SOURCE, fix that: kexec/arch/m68k/bootinfo.c: In function 'bootinfo_add_rng_seed': kexec/arch/m68k/bootinfo.c:231:13: warning: implicit declaration of function 'getrandom'; did you mean 'srandom'? [-Wimplicit-function-declaration] 231 | if (getrandom(bi->rng_seed.data, RNG_SEED_LEN, GRND_NONBLOCK) != RNG_SEED_LEN) { | ^~~~~~~~~ | srandom kexec/arch/m68k/bootinfo.c:231:56: error: 'GRND_NONBLOCK' undeclared (first use in this function) 231 | if (getrandom(bi->rng_seed.data, RNG_SEED_LEN, GRND_NONBLOCK) != RNG_SEED_LEN) { | ^~~~~~~~~~~~~ Fixes: b9de05184816 ("m68k: pass rng seed via BI_RNG_SEED") Cc: Jason@zx2c4.com Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Simon Horman <horms@kernel.org>
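A minimal sketch of the pattern this fix relies on: `_GNU_SOURCE` is defined before any header is included so that `getrandom()` and `GRND_NONBLOCK` are declared. The helper name `fill_seed()` is hypothetical and not taken from kexec-tools.

```c
#define _GNU_SOURCE /* must come before every #include */
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/* Fill buf with len random bytes without blocking.
 * Returns 0 on success, -1 if fewer than len bytes were obtained. */
static int fill_seed(unsigned char *buf, size_t len)
{
	ssize_t n = getrandom(buf, len, GRND_NONBLOCK);

	return (n == (ssize_t)len) ? 0 : -1;
}
```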
2023-12-02 | lzma: Relax memory limit for lzma decompressor | WANG Rui | 1 | -1/+1
The kexec cannot load LZMA compressed vmlinuz.efi on LoongArch. Try LZMA decompression. lzma_decompress_file: read on /tmp/Image4yyfhM of 65536 bytes failed pez_prepare: decompressed size 8563960 pez_prepare: done Cannot load vmlinuz.efi The root cause is that lzma decompressor requires more memory usage, which exceeds the current 64M limit. Reported-by: Huacai Chen <chenhuacai@kernel.org> Signed-off-by: WANG Rui <wangrui@loongson.cn> Signed-off-by: Simon Horman <horms@kernel.org>
2023-11-16 | kexec: ppc64: print help to stdout instead of stderr | Aditya Gupta | 3 | -15/+15
Currently 'kexec --help' on powerpc64 prints the generic help/usage to stdout, and the powerpc64 specific options to stderr. That is, if the stdout of 'kexec --help' is redirected to some file, some of the help options will not be redirected, and are instead printed on the terminal/stderr: [root@machine kexec-tools]# kexec --help > /tmp/out --command-line=<Command line> command line to append. --append=<Command line> same as --command-line. --ramdisk=<filename> Initial RAM disk. --initrd=<filename> same as --ramdisk. --devicetreeblob=<filename> Specify device tree blob file. Not applicable while using --kexec-file-syscall. --dtb=<filename> same as --devicetreeblob. elf support is still broken --elf64-core-headers Prepare core headers in ELF64 format --dt-no-old-root Do not reuse old kernel root= param. while creating flatten device tree. Fix this inconsistency by writing powerpc64 specific options to stdout, similar to the generic 'kexec --help'. With the proposed changes, it is like this (nothing printed to stderr): [root@machine kexec-tools]# ./build/sbin/kexec --help > /tmp/out Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Aditya Gupta <adityag@linux.ibm.com> Signed-off-by: Simon Horman <horms@kernel.org>
2023-10-11 | kexec/loongarch64: fix 'make dist' file loss issue | Ming Wang | 1 | -0/+1
The Makefile omits the iomem.h file, causing the archive file generated by 'make dist' to lose iomem.h. This patch is used to fix this problem. Signed-off-by: Ming Wang <wangming01@loongson.cn> Signed-off-by: Simon Horman <horms@kernel.org>
2023-10-04 | kexec: provide a memfd_create() wrapper if not present in libc | Julien Olivain | 1 | -0/+11
Commit 714fa115 "kexec/arm64: Simplify the code for zImage" introduced a use of the memfd_create() system call, included in version kexec-tools v2.0.27. This system call was introduced in kernel commit [1], first included in kernel v3.17 (released on 2014-10-05). The memfd_create() glibc wrapper function was added much later in commit [2], first included in glibc version 2.27 (released on 2018-02-01). This direct use of memfd_create() introduced a requirement on kernel >= 3.17 and glibc >= 2.27. There are old toolchains like [3] for example (which ships gcc 7.3.1, glibc 2.25 and includes kernel v4.10 headers), that can still be used to build newer kernels. Even if such toolchains can be seen as outdated, they are still claimed as supported by recent kernels. For example, kernel v6.5.5 has a requirement on gcc version 5.1 and greater. See [4]. Moreover, kexec-tools <= 2.0.26 could be compiled using recent toolchains with alternative libc (e.g. uclibc-ng, musl) which do not provide the memfd_create() wrapper. When compiling kexec-tools v2.0.27 with a toolchain not providing the memfd_create() syscall wrapper, the compilation fails with the message: kexec/kexec.c: In function 'copybuf_memfd': kexec/kexec.c:645:7: warning: implicit declaration of function 'memfd_create'; did you mean 'SYS_memfd_create'? [-Wimplicit-function-declaration] fd = memfd_create("kernel", MFD_ALLOW_SEALING); ^~~~~~~~~~~~ SYS_memfd_create kexec/kexec.c:645:30: error: 'MFD_ALLOW_SEALING' undeclared (first use in this function); did you mean '_PC_ALLOC_SIZE_MIN'? fd = memfd_create("kernel", MFD_ALLOW_SEALING); ^~~~~~~~~~~~~~~~~ _PC_ALLOC_SIZE_MIN In order to let kexec-tools compile in a wider range of configurations, this commit adds a memfd_create() function check in the autoconf configure script, and adds a system call wrapper which will be used if the function is not available. With this commit, the environment requirement is relaxed to only kernel >= v3.17.
Note: this issue was found in kexec-tools integration in Buildroot [5] using the command "utils/test-pkg -a -p kexec", which tests many toolchain/arch combinations. [1] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=9183df25fe7b194563db3fec6dc3202a5855839c [2] https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=59d2cbb1fe4b8601d5cbd359c3806973eab6c62d [3] https://releases.linaro.org/components/toolchain/binaries/7.3-2018.05/aarch64-linux-gnu/gcc-linaro-7.3.1-2018.05-x86_64_aarch64-linux-gnu.tar.xz [4] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/Documentation/process/changes.rst?h=v6.5.5#n32 [5] https://buildroot.org/ Signed-off-by: Julien Olivain <ju.o@free.fr> Signed-off-by: Simon Horman <horms@kernel.org>
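The fallback described above can be sketched roughly as below, assuming a Linux target whose headers define `SYS_memfd_create`. The name `memfd_create_compat()` is hypothetical (chosen here to avoid clashing with a libc that does provide `memfd_create()`), and the configure-time check itself is not shown.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

#ifndef MFD_ALLOW_SEALING
#define MFD_ALLOW_SEALING 0x0002U /* value from <linux/memfd.h> */
#endif

/* Invoke the raw system call for toolchains whose libc lacks the
 * memfd_create() wrapper. Returns a file descriptor, or -1 with errno set. */
static int memfd_create_compat(const char *name, unsigned int flags)
{
	return (int)syscall(SYS_memfd_create, name, flags);
}
```

A caller would use it exactly like the libc function, for example `memfd_create_compat("kernel", MFD_ALLOW_SEALING)`.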
2023-10-04 | crashdump/x86: set the elfcorehdr segment size for hotplug | Eric DeVolder | 1 | -0/+8
For hotplug, the elfcorehdr segment must be sized appropriately to allow a growing number of CPUs or memory regions. Use the size reported by the kernel via /sys/kernel/crash_elfcorehdr_size. Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Signed-off-by: Simon Horman <horms@kernel.org>
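Reading such a size from a sysfs node could look like the sketch below. It assumes the node holds a single unsigned decimal number; `read_size_node()` is a hypothetical helper, and the path is passed in by the caller rather than hardcoded.

```c
#include <stdio.h>

/* Read one unsigned decimal value from a sysfs-style text file.
 * Returns 0 on success, -1 if the file is missing or malformed. */
static int read_size_node(const char *path, unsigned long *out)
{
	FILE *fp = fopen(path, "r");
	int ok;

	if (!fp)
		return -1;
	ok = (fscanf(fp, "%lu", out) == 1);
	fclose(fp);
	return ok ? 0 : -1;
}
```

Returning -1 for a missing node lets the caller keep its default segment size on kernels without hotplug support.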
2023-10-04 | crashdump/x86: identify elfcorehdr segment for hotplug | Eric DeVolder | 1 | -0/+3
Identify the segment containing the elfcorehdr buffer so that it can be excluded from the purgatory checksum/digest, if hotplug support is in effect. Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Signed-off-by: Simon Horman <horms@kernel.org>
2023-10-04 | crashdump: exclude elfcorehdr segment from digest for hotplug | Eric DeVolder | 2 | -0/+9
To allow direct modification of the elfcorehdr by the kernel, in response to CPU and memory hot un/plug and/or online/offline events, the buffer containing the elfcorehdr must be excluded from the purgatory checksum/digest. If the elfcorehdr is not excluded from the purgatory checksum/digest, then at panic time, the checksum/digest check fails (due to the elfcorehdr having been modified), and the kdump capture kernel does not start. Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Signed-off-by: Simon Horman <horms@kernel.org>
2023-10-04 | crashdump: setup general hotplug support | Eric DeVolder | 1 | -0/+18
To allow direct modification of the elfcorehdr by the kernel, in response to CPU and memory hot un/plug and/or online/offline events, the following conditions must be met: - the elfcorehdr buffer must be excluded from the purgatory checksum/digest, and - the elfcorehdr segment must be large enough, and - the kernel must be notified that it can modify the elfcorehdr. Excluding the elfcorehdr buffer from the digest occurs in patch "crashdump: exclude elfcorehdr segment from digest for hotplug". If this is not done, a change to the elfcorehdr will cause the purgatory check at panic time to fail, and the kdump capture kernel does not start. For hotplug, the size of the elfcorehdr segment is obtained from the kernel via the /sys/kernel/crash_elfcorehdr_size node. The KEXEC_UPDATE_ELFCOREHDR flag indicates to the kernel that it can make direct modifications to the elfcorehdr. Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Signed-off-by: Simon Horman <horms@kernel.org>
2023-10-04 | crashdump: introduce the hotplug command line options | Eric DeVolder | 3 | -1/+18
Introduce the --hotplug command line option, which indicates to the kernel that the kdump image is set up to permit the kernel to directly modify the elfcorehdr in response to CPU and memory hotplug and/or online/offline events. This option is only meaningful for the kexec_load() syscall. For the kexec_file_load() syscall, this option is a no-op as the kernel handles all aspects of loading the kdump image. This is the command line processing and documentation. Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Signed-off-by: Simon Horman <horms@kernel.org>
2023-10-04 | kexec: define KEXEC_UPDATE_ELFCOREHDR | Eric DeVolder | 1 | -0/+1
The Linux kernel defines this flag to indicate that the kexec_load()'ed image is set up so that the kernel may directly modify the elfcorehdr (and not cause the purgatory digest checksum to fail) in response to CPU or memory hot un/plug and/or on/offline events. Define this flag to match/mirror the kernel flag. Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Signed-off-by: Simon Horman <horms@kernel.org>
2023-10-04 | kexec: update manpage with explicit mention of clean kexec | Hari Bathini | 1 | -2/+9
While the manpage does mention kexec boot with a clean shutdown, it is not explicit about it. Make it explicit. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Simon Horman <horms@kernel.org>
2023-09-20 | zboot: add loongarch kexec_load support | Dave Young | 5 | -0/+97
Copy arm64 code and change for loongarch so that the kexec -c can load a zboot image. Note: probe zboot image first otherwise the pei-loongarch file type will be used. Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@kernel.org>
2023-09-20 | zboot: enable arm64 kexec_load for zboot image | Dave Young | 4 | -5/+28
The kexec_file_load support for zboot kernel images already decompresses the vmlinuz, so the kexec_load code just reads the decompressed kernel fd into a new buffer and uses it directly. Signed-off-by: Dave Young <dyoung@redhat.com> Tested-by: Baoquan He <bhe@redhat.com> Reviewed-by: Pingfan Liu <piliu@redhat.com> Signed-off-by: Simon Horman <horms@kernel.org>
2023-08-11 | arm64: Hook up the ZBOOT support as vmlinuz | Jeremy Linton | 3 | -1/+9
Add the previously defined _probe() and _usage() routines to the kexec file types table, and build the new module. It should be noted that this "vmlinuz" support reuses the "Image" support to actually load the resulting image after it has been decompressed to a temporary file. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Pingfan Liu <piliu@redhat.com> Signed-off-by: Simon Horman <horms@kernel.org>
2023-08-11 | arm64: Add ZBOOT PE containing compressed image support | Jeremy Linton | 2 | -0/+111
The kernel EFI stub ZBOOT feature creates a PE that contains a compressed Linux kernel image. The stub, when run in a valid UEFI environment, then decompresses the resulting image and executes it. Support these image formats with kexec as well to avoid having to keep an alternate kernel image around. This patch adds the _probe(), _load() and usage() routines needed for kexec to understand this format. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> [Modified by Pingfan to export kernel fd with load method] Signed-off-by: Pingfan Liu <piliu@redhat.com> Signed-off-by: Simon Horman <horms@kernel.org>
2023-08-11 | kexec/zboot: Add arch independent zboot support | Jeremy Linton | 2 | -0/+132
The Linux kernel CONFIG_ZBOOT option creates self-decompressing PE kernel images. This means that kexec should have a generic understanding of the format, which may be used by multiple arches. So let's add an arch independent validation and decompression routine. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> [Modified by Pingfan to export kernel fd] Signed-off-by: Pingfan Liu <piliu@redhat.com> [Corrected indentation] Signed-off-by: Simon Horman <horms@kernel.org>
2023-08-11 | kexec: Introduce a member kernel_fd in kexec_info | Pingfan Liu | 2 | -0/+9
Utilize the image load interface to export the kernel fd, which points to the uncompressed kernel and will be passed to kexec_file_load. Credit goes to Dave Young, who contributed the original code. Signed-off-by: Pingfan Liu <piliu@redhat.com> Co-authored-by: Dave Young <dyoung@redhat.com> Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@kernel.org>
2023-08-11 | kexec/arm64: Simplify the code for zImage | Pingfan Liu | 6 | -254/+26
zimage_probe() uncompresses the kernel and performs some checks, similar to image_probe(). On closer inspection, the uncompressing has already been done before the image probe is called. What is missing is an fd pointing to the uncompressed kernel image. This patch creates a memfd based on the result produced by slurp_decompress_file(), and thereby simplifies the logic of the probe for aarch64. Credit goes to Dave Young, who contributed the original code. Signed-off-by: Pingfan Liu <piliu@redhat.com> Co-authored-by: Dave Young <dyoung@redhat.com> Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@kernel.org>
2023-03-08 | LoongArch: kdump: Set up kernel image segment | Youling Tang | 5 | -1/+40
On LoongArch, we can use the same kernel image as 1st kernel when 3f89765d622 ("LoongArch: kdump: Add single kernel image implementation") is merged, but we have to modify the entry point as well as segments addresses in the kernel elf header (or PE format vmlinux.efi) in order to load them into correct places. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Simon Horman <horms@kernel.org>
2023-03-08 | kexec: __NR_kexec_file_load is set to undefined on LoongArch | Youling Tang | 1 | -1/+1
After the merge of 29fe5067ed07 ("kexec: make -a the default"), kexec cannot be used on LoongArch without adding "-c". The kexec_file_load system call is currently not implemented on architectures such as LoongArch, so kexec has to go through kexec_load there. Set __NR_kexec_file_load to undefined on unsupported architectures, so that is_kexec_file_load_implemented returns EFALLBACK and kexec then falls back to kexec_load. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Simon Horman <horms@kernel.org>
2023-03-03 | ppc64: Add elf-ppc64 file types/options and an arch specific flag to man page | Gautam Menghani | 1 | -0/+35
Document the elf-ppc64 file options and the "--dt-no-old-root" arch specific flag in the man page. Signed-off-by: Gautam Menghani <gautam@linux.vnet.ibm.com> Signed-off-by: Simon Horman <horms@kernel.org>
2023-02-23 | x86: add devicetree support | Julian Winkler | 7 | -4/+43
Since the Linux kernel has dropped support for the simple firmware interface (SFI), the only way to boot newer versions on the Intel MID platform is using devicetree. Signed-off-by: Julian Winkler <julian.winkler1@web.de> Signed-off-by: Simon Horman <horms@kernel.org>
2023-02-07 | kexec: make -a the default | Ahelenia Ziemiańska | 2 | -6/+6
AFAICT, there's no downside to this, and running into this each time I want to kexec (and, presumably, a significant chunk of the population, since lockdown is quite popular) on some machines, then going to the manual, then finding out I want the /auto/ flag(!) is quite annoying: # kexec -l /boot/vmlinuz-6.1.0-3-amd64 --initrd /boot/initrd.img-6.1.0-3-amd64 --reuse-cmdline kexec_load failed: Operation not permitted entry = 0x46eff7760 flags = 0x3e0000 nr_segments = 7 segment[0].buf = 0x557cd303efa0 segment[0].bufsz = 0x70 segment[0].mem = 0x100000 segment[0].memsz = 0x1000 segment[1].buf = 0x557cd3046fe0 segment[1].bufsz = 0x190 segment[1].mem = 0x101000 segment[1].memsz = 0x1000 segment[2].buf = 0x557cd303f6e0 segment[2].bufsz = 0x30 segment[2].mem = 0x102000 segment[2].memsz = 0x1000 segment[3].buf = 0x7f658fa37010 segment[3].bufsz = 0x12a51b5 segment[3].mem = 0x46a55a000 segment[3].memsz = 0x12a6000 segment[4].buf = 0x7f6590ce1210 segment[4].bufsz = 0x7e99e0 segment[4].mem = 0x46b800000 segment[4].memsz = 0x377c000 segment[5].buf = 0x557cd3039350 segment[5].bufsz = 0x42fa segment[5].mem = 0x46eff2000 segment[5].memsz = 0x5000 segment[6].buf = 0x557cd3032000 segment[6].bufsz = 0x70e0 segment[6].mem = 0x46eff7000 segment[6].memsz = 0x9000 Closes: https://bugs.debian.org/1030248 Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Signed-off-by: Simon Horman <horms@kernel.org>
2023-02-07 | ppc64: add --reuse-cmdline parameter support | Sourabh Jain | 2 | -3/+26
An option to copy the command line arguments from running kernel to kexec'd kernel. This option works for both kexec and kdump. In case --append=<args> or --command-line=<args> is provided along with --reuse-cmdline parameter then args listed against append and command-line parameter will be combined with command line argument from running kernel. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Simon Horman <horms@kernel.org>
2022-11-18 | m68k: pass rng seed via BI_RNG_SEED | Jason A. Donenfeld | 3 | -0/+34
In order to pass fresh entropy to kexec'd kernels, use BI_RNG_SEED for passing a seed, with the same semantics that kexec-tools currently uses for i386's setup_data. Link: https://git.kernel.org/torvalds/c/dc63a086daee92c63e3 Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Simon Horman <horms@kernel.org>
2022-10-10 | LoongArch: Remove redundant cmdline parameters when using --reuse-cmdline option | Youling Tang | 3 | -1/+4
In LoongArch, when using the --reuse-cmdline option to reuse the current command line, it may lead to redundancy (like kexec, initrd command line arguments). In order to avoid the possible impact of initrd removal on other architectures, remove_parameter will be called in a specific architecture for processing. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Simon Horman <horms@kernel.org>
2022-10-10 | LoongArch: PE format image loading support | Youling Tang | 6 | -0/+229
The LoongArch kernel will mainly use the vmlinux.efi image in PE format, so add support for it. I tested this on a LoongArch 3A5000 machine and it works as expected. kexec: $ sudo kexec -l /boot/vmlinux.efi --reuse-cmdline $ sudo kexec -e kdump: $ sudo kexec -p /boot/vmlinux-kdump.efi --reuse-cmdline --append="nr_cpus=1" # echo c > /proc/sysrq-trigger Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Simon Horman <horms@kernel.org>
2022-10-10 | LoongArch: Add kexec/kdump support | Youling Tang | 11 | -0/+849
Add 64-bit support for the LoongArch architecture. For the time being, the quick restart function (kexec) is supported; that is, the "kexec -l" and "kexec -e" commands can be used normally. The crash dump function is also supported: the "kexec -p" operation can be performed successfully, and the vmcore file can be generated. I tested this on a LoongArch 3A5000 machine and it works as expected. kexec: $ sudo kexec -l /boot/vmlinux --reuse-cmdline $ sudo kexec -e kdump: $ sudo kexec -p /boot/vmlinux-kdump --reuse-cmdline --append="nr_cpus=1" # echo c > /proc/sysrq-trigger Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Simon Horman <horms@kernel.org>
2022-10-05 | ppc64: remove rma_top limit | Sourabh Jain | 1 | -2/+0
Restricting the kexec tool to allocate holes for kexec segments below 768MB may not be relevant now, since the first memory block size can be 1024MB or more. Removing the rma_top restriction gives more space to find holes for kexec segments, and the existing in-place checks make sure that kexec segment allocation doesn't cross the first memory block, because every kexec segment has to be within the first memory block for the kdump kernel to boot properly. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Acked-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Simon Horman <horms@kernel.org>
2022-07-19 | kexec-tools: Remove duplicate ultoa() definitions and redefine it | Tiezhu Yang | 6 | -95/+16
There exist duplicate ultoa() definitions in many archs, remove them, and also redefine ultoa() in kexec/kexec.h to make it more readable. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Simon Horman <horms@kernel.org>
2022-07-15 | i386: pass rng seed via setup_data | Jason A. Donenfeld | 1 | -0/+25
Linux ≥5.20 expects an RNG seed via setup_data as of the upstream commit in the link below. That commit adjusts kexec_file_load to pass SETUP_RNG_SEED. kexec-tools should follow suit, so add more or less the same code here. Link: https://git.kernel.org/tip/tip/c/68b8e9713c8 Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Simon Horman <horms@kernel.org>
2022-06-26 | kexec-tools: mips: Pass initrd parameter via cmdline | Hui Li | 3 | -3/+80
Under loongson platform, use command: kexec -l vmlinux... --append="root=UUID=28e1..." --initrd=... kexec -e quick restart failed like this: ******************************************************************** [ 3.420791] VFS: Cannot open root device "UUID=6462a8a4-02fb-49..." [ 3.431262] Please append a correct "root=" boot option; ... ... ... ... [ 3.543175] 0801 4194304 sda1 554e69cc-01 [ 3.543175] [ 3.549494] 0802 62914560 sda2 554e69cc-02 [ 3.549495] [ 3.555818] 0803 8388608 sda3 554e69cc-03 [ 3.555819] [ 3.562139] 0804 174553229 sda4 554e69cc-04 [ 3.562139] [ 3.568463] 0b00 1048575 sr0 [ 3.568464] driver: sr [ 3.574524] Kernel panic - not syncing: VFS: Unable to mount root fs... [ 3.582750] ---[ end Kernel panic - not syncing: VFS:... ******************************************************************* The kernel cannot parse the UUID, the UUID is parsed in the initrd. For compatibility with previous platforms, loongson platform obtain initrd parameter through cmdline in kernel, the kernel supports use cmdline to parse initrd. But under the mips architecture, kexec-tools pass the initrd through DTB. Made the following modifications: (1) in kexec/arch/mips/kexec-elf-mips.c Add patch_initrd_info(), at runtime to distinguish different cpu, only for loongson cpu, add initrd parameter to cmdline. (2) in kexec/arch/mips/crashdump-mips.c Because loongson uses a different page_offset, it should be modified to ensure that crashdump functionality is correct and reliable. (3) in kexec/arch/mips/crashdump-mips.h Added platform-specific page_offset macro definition. Signed-off-by: Hui Li <lihui@loongson.cn> Signed-off-by: Simon Horman <horms@kernel.org>
2022-04-29 | arm64/crashdump-arm64: increase CRASH_MAX_MEMORY_RANGES to 32k | abuehaze14 | 1 | -1/+1
On ARM64 based VMs, hotplugging more than 31GB of memory will cause kdump to fail loading because it hits the CRASH_MAX_MEMORY_RANGES limit, which is currently 32 on ARM64, given that the memory block size is 1GB. This patch raises CRASH_MAX_MEMORY_RANGES to 32K, similar to what we have on x86. This should allow kdump to work until the VM has 32TB, which should be enough for a long time. Signed-off-by: Hazem Mohamed Abuelfotoh <abuehaze@amazon.com> Acked-by: Baoquan He <bhe@redhat.com> Signed-off-by: Simon Horman <horms@kernel.org>
2022-04-01 | arm64: fix static data relocations in machine_apply_elf_rel() | Pingfan Liu | 1 | -3/+2
For 'static data relocations', the value should be assigned directly instead of being patched into an instruction (OR ops). Signed-off-by: Pingfan Liu <piliu@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2022-04-01 | kexec/elf: assign one to align if sh_addralign equals zero | Pingfan Liu | 1 | -3/+7
According to ELF specification, if sh_addralign equals zero or one, then the section has no alignment requirement on the start address. (I.e. it can be aligned on 1 byte) Since modern cpu asks the .text, .data, .bss to be aligned on the machine word boundary at least, so in elf_rel_load(), sh_addralign can not be zero, and align = shdr->sh_addralign; ... bufsz = _ALIGN(bufsz, align); will not render a result of 'bufsz = 0'. But it had better have a check on the case of 'sh_addralign == 0' regardless of the assumption of machine word alignment. This patch has no functional change. Signed-off-by: Pingfan Liu <piliu@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
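The guard described here can be sketched as follows. `ALIGN_UP` is a stand-in written for this example and assumed to behave like the `_ALIGN` macro mentioned above; `section_align()` and `aligned_bufsz()` are hypothetical names.

```c
#include <stddef.h>

/* Round x up to a multiple of a (a must be a power of two). With a == 0
 * the mask becomes 0 and the result collapses to 0, which is the case
 * the check guards against. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

/* ELF treats sh_addralign values of 0 and 1 as "no alignment constraint". */
static size_t section_align(size_t sh_addralign)
{
	return sh_addralign ? sh_addralign : 1;
}

static size_t aligned_bufsz(size_t bufsz, size_t sh_addralign)
{
	size_t align = section_align(sh_addralign);

	return ALIGN_UP(bufsz, align);
}
```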
2022-04-01 | arm64/crashdump-arm64: explicit type conversion to suppress compiler warning | Pingfan Liu | 1 | -1/+1
elf_info.page_offset is 'unsigned long long', while get_page_offset() has the input param as a type of 'unsigned long *'. It demands explicit type casting to mute the compiler warning. Signed-off-by: Pingfan Liu <piliu@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2022-04-01 | arm64/kexec-arm64: add support for R_AARCH64_MOVW_UABS_G* rela | Pingfan Liu | 1 | -0/+40
Build kexec-tools with clang(clang version 13.0.1 (Fedora 13.0.1-1.fc36)). Then when kexec loads kernel, it runs into the error message "machine_apply_elf_rel: ERROR Unknown type: 264". This is caused by the following reloc type in purgatory/purgatory.ro, which is not supported yet. R_AARCH64_MOVW_UABS_G0_NC R_AARCH64_MOVW_UABS_G1_NC R_AARCH64_MOVW_UABS_G2_NC R_AARCH64_MOVW_UABS_G3 Adding code to support these relocs, so kexec can work smoothly. Signed-off-by: Pingfan Liu <piliu@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
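As a rough illustration of what applying these relocations involves (not the kexec-tools implementation): each R_AARCH64_MOVW_UABS_Gn type places one 16-bit slice of the target value into the imm16 field, instruction bits [20:5], of a MOVZ/MOVK. The function name and the `group` parameter are hypothetical, and overflow checking is omitted.

```c
#include <stdint.h>

/* Patch the imm16 field (bits [20:5]) of a MOVZ/MOVK instruction with
 * bits [16*group+15 : 16*group] of value; group is 0..3 for G0..G3. */
static uint32_t patch_movw_uabs(uint32_t insn, uint64_t value, unsigned group)
{
	uint32_t imm16 = (uint32_t)((value >> (16 * group)) & 0xffff);

	insn &= ~(UINT32_C(0xffff) << 5); /* clear any previous immediate */
	return insn | (imm16 << 5);
}
```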
2022-04-01 | arm64/kexec-arm64: use enum to organize the reloc type | Pingfan Liu | 1 | -41/+15
More and more reloc type need to be supported on aarch64. Using enum to organize them to shorten the #ifdef macro list. Signed-off-by: Pingfan Liu <piliu@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2022-04-01 | arm64/kexec-arm64: add support for R_AARCH64_LDST128_ABS_LO12_NC rela | Pingfan Liu | 1 | -0/+16
GCC 12 has some changes which affect the generated AArch64 code of kexec-tools. As a result, machine_apply_elf_rel() on AArch64 encounters a new rel type, R_AARCH64_LDST128_ABS_LO12_NC. This makes loading the kernel fail with the message "machine_apply_elf_rel: ERROR Unknown type: 299". Citing from objdump -rDSl purgatory/purgatory.ro 0000000000000f80 <sha256_starts>: sha256_starts(): f80: 90000001 adrp x1, 0 <verify_sha256_digest> f80: R_AARCH64_ADR_PREL_PG_HI21 .text+0xfa0 f84: a9007c1f stp xzr, xzr, [x0] f88: 3dc00021 ldr q1, [x1] f88: R_AARCH64_LDST128_ABS_LO12_NC .text+0xfa0 f8c: 90000001 adrp x1, 0 <verify_sha256_digest> f8c: R_AARCH64_ADR_PREL_PG_HI21 .text+0xfb0 f90: 3dc00020 ldr q0, [x1] f90: R_AARCH64_LDST128_ABS_LO12_NC .text+0xfb0 f94: ad008001 stp q1, q0, [x0, #16] f98: d65f03c0 ret f9c: d503201f nop fa0: 6a09e667 .inst 0x6a09e667 ; undefined fa4: bb67ae85 .inst 0xbb67ae85 ; undefined fa8: 3c6ef372 .inst 0x3c6ef372 ; undefined fac: a54ff53a ld3w {z26.s-z28.s}, p5/z, [x9, #-3, mul vl] fb0: 510e527f sub wsp, w19, #0x394 fb4: 9b05688c madd x12, x4, x5, x26 fb8: 1f83d9ab .inst 0x1f83d9ab ; undefined fbc: 5be0cd19 .inst 0x5be0cd19 ; undefined Here, gcc generates code that performs loads and stores using the 128-bit floating-point registers, so the new rel type R_AARCH64_LDST128_ABS_LO12_NC has to be handled. Make machine_apply_elf_rel() cope with this new reloc, so kexec-tools can work smoothly. Signed-off-by: Pingfan Liu <piliu@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
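A minimal sketch of how such a relocation is typically applied, under the assumption that the instruction is a 128-bit LDR/STR with an unsigned scaled offset: bits [11:4] of the target address go into the imm12 field, instruction bits [21:10]. The function name is hypothetical and this is not the kexec-tools code.

```c
#include <stdint.h>

/* Encode the low 12 bits of target, scaled by the 16-byte access size,
 * into the imm12 field (bits [21:10]) of a 128-bit LDR/STR. */
static uint32_t patch_ldst128_lo12(uint32_t insn, uint64_t target)
{
	uint32_t imm12 = (uint32_t)((target & 0xfff) >> 4);

	insn &= ~(UINT32_C(0xfff) << 10); /* clear any previous offset */
	return insn | (imm12 << 10);
}
```

For the `ldr q1, [x1]` instruction 0x3dc00021 in the listing, a target whose low 12 bits are 0xfa0 yields an immediate of 0xfa.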
2022-03-30 | kexec-tools: fix leak FILE pointer. | Lichen Liu | 1 | -1/+4
Close fp if file size is smaller than 13 bytes. Fixes: dcfcc73c73e6 ("kexec-tools: Determine if the image is lzma commpressed") Signed-off-by: Lichen Liu <lichliu@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2022-03-29 | kexec-tools: Determine if the image is lzma commpressed | Lichen Liu | 1 | -0/+36
Currently there are 2 functions for decompressing compressed image. The zlib_decompress_file() will determine if the image is compressed by gzip before read, but lzma_decompress_file() will not. This can cause misleading information to be printed when the image is not compressed by lzma and debug option is used: ]# kexec -d -s -l /boot/vmlinuz-5.14.10-300.fc35.x86_64 \ --initrd /boot/initramfs-5.14.10-300.fc35.x86_64.img \ --reuse-cmdline Try gzip decompression. Try LZMA decompression. lzma_decompress_file: read on /boot/vmlinuz-5.14.10-300.fc35.x86_64 of 65536 bytes failed Add a helper function is_lzma_file() to help behave consistently. Signed-off-by: Lichen Liu <lichliu@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
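The kind of pre-check such a helper performs might look like the sketch below. It is a heuristic written for illustration and is an assumption, not the actual is_lzma_file() logic: the legacy .lzma format has no magic number, only a 13-byte header (1 properties byte, 4-byte dictionary size, 8-byte uncompressed size), so this version accepts only the common default properties byte 0x5D with a dictionary size that is a multiple of 64 KiB.

```c
#include <stddef.h>
#include <stdint.h>

#define LZMA_ALONE_HEADER_LEN 13

/* Heuristic: return 1 if buf could start a legacy .lzma stream, else 0.
 * 0x5D encodes lc=3, lp=0, pb=2; the two zero bytes are the low bytes of
 * the little-endian dictionary size. */
static int looks_like_lzma(const uint8_t *buf, size_t len)
{
	if (len < LZMA_ALONE_HEADER_LEN)
		return 0;
	return buf[0] == 0x5D && buf[1] == 0x00 && buf[2] == 0x00;
}
```

A file shorter than 13 bytes or one starting with the gzip magic (0x1f 0x8b) is rejected before any decompression is attempted.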
2022-03-23kexec-xen: Allow xen_kexec_exec() to return in case of Live UpdateRaphael Ning3-9/+28
Currently, my_exec() does not expect the Xen KEXEC_CMD_kexec hypercall to return on success, because it assumes that the hypercall always triggers an immediate reboot. However, for Live Update, the hypercall merely schedules the kexec operation and returns; the actual reboot happens asynchronously. [1] Therefore, rework the Xen code path of my_exec() such that it does not treat a successfully processed Live Update request as an error. Also, rephrase the comment above the function to remove ambiguity. [1] https://lists.xen.org/archives/html/xen-devel/2021-05/msg00286.html Signed-off-by: Raphael Ning <raphning@amazon.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2022-03-23kexec-tools: print error if kexec_file_load failsHari Bathini1-0/+1
Commit 4f77da634035 ("kexec-tools: Fix kexec_file_load(2) error handling") introduced EFALLBACK for scenarios where falling back to the kexec_load syscall is likely to work, and stopped printing an error message for these scenarios. However, the error message for other failure scenarios was inadvertently dropped as well. Restore the error message for such cases. Fixes: 4f77da634035 ("kexec-tools: Fix kexec_file_load(2) error handling") Cc: Petr Tesarik <ptesarik@suse.com> Reported-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Petr Tesarik <ptesarik@suse.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2022-02-01kexec-tools: mips: Concatenate --reuse-cmdline and --appendTiezhu Yang1-2/+6
Use concat_cmdline() to concatenate the --append string and the --reuse-cmdline string; otherwise only one of the two options takes effect. This is similar to commit 8b42c99aa3bc ("Fix --reuse-cmdline so it is usable."). Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Simon Horman <horms@verge.net.au>
2022-02-01kexec-tools: mips: Add some debug infoTiezhu Yang1-0/+7
Use dbgprintf() to print command_line, initrd and dtb in arch_process_options() for debugging. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Simon Horman <horms@verge.net.au>
2022-01-24arm64: fix PAGE_OFFSET calc for flipped mmKairui Song1-1/+13
Since kernel commit 14c127c957c1 ('arm64: mm: Flip kernel VA space'), the memory layout on arm64 has changed, and kexec-tools can no longer get the right PAGE_OFFSET based on the _text symbol.

Prior to that, the kimage (_text) lies below PAGE_OFFSET with this layout:

  0 -> VA_START                               : Userspace
  VA_START -> VA_START + 256M                 : BPF JIT, Modules
  VA_START + 256M -> PAGE_OFFSET - (~GB misc) : Vmalloc (KERNEL _text HERE)
  PAGE_OFFSET -> ...                          : * Linear map *

And here we have:

  VA_START    = -1UL << VA_BITS
  PAGE_OFFSET = -1UL << (VA_BITS - 1)
  _text       < -1UL << (VA_BITS - 1)

The kernel image lies somewhere between VA_START and PAGE_OFFSET, so we just calculate VA_BITS by getting the highest unset bit of the _text symbol address, and shift by one bit less than VA_BITS to get the page offset. This works as long as KASLR doesn't put the kernel in too high a location (which is commented inline).

And after that commit, the kernel layout has changed:

  0 -> PAGE_OFFSET                   : Userspace
  PAGE_OFFSET -> PAGE_END            : * Linear map *
  PAGE_END -> PAGE_END + 128M        : bpf jit region
  PAGE_END + 128M -> PAGE_END + 256MB : modules
  PAGE_END + 256M -> ...             : vmalloc (KERNEL _text HERE)

Here we have:

  PAGE_OFFSET = -1UL << VA_BITS
  PAGE_END    = -1UL << (VA_BITS - 1)
  _text       > -1UL << (VA_BITS - 1)

The kernel image now lies above PAGE_END, so we have to shift one more bit to get VA_BITS, and shift by exactly VA_BITS for PAGE_OFFSET.

We can simply check whether "_text > -1UL << (VA_BITS - 1)" is true to judge which layout is being used and shift the page offset accordingly.

Signed-off-by: Kairui Song <kasong@tencent.com>
(rebased and stripped by Pingfan)
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
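The layout check described in this message can be sketched as follows. This is only an illustration under the assumption that VA_BITS is already known; the function name guess_page_offset() is hypothetical and the sketch is not the code from this commit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: choose PAGE_OFFSET from the _text address and a
 * known VA_BITS, using the two layouts listed in the commit message.
 */
static uint64_t guess_page_offset(uint64_t text_sym_addr, int va_bits)
{
    uint64_t va_mid;

    assert(va_bits > 1 && va_bits < 64);

    /* -1UL << (VA_BITS - 1): old PAGE_OFFSET, new PAGE_END. */
    va_mid = UINT64_MAX << (va_bits - 1);

    if (text_sym_addr < va_mid)
        return va_mid;              /* old layout: _text below PAGE_OFFSET */

    return UINT64_MAX << va_bits;   /* flipped layout: _text above PAGE_END */
}
```

For VA_BITS = 48, a _text of 0xffff000008080000 selects the old PAGE_OFFSET 0xffff800000000000, while a _text of 0xffff800010000000 selects the flipped PAGE_OFFSET 0xffff000000000000.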
2022-01-24arm64: read VA_BITS from kcore for 52-bits VA kernelPingfan Liu1-4/+30
phys_to_virt() calculates a virtual address. page_offset is an important factor in that calculation and is expected to be accurate. Since the arm64 kernel exposes va_bits through vmcore, use it. Signed-off-by: Pingfan Liu <piliu@redhat.com> Reviewed-by: Philipp Rudo <prudo@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2022-01-24arm64/crashdump: unify routine to get page_offsetPingfan Liu3-26/+6
There are two functions to get page_offset: get_kernel_page_offset() and get_page_offset(). Since get_kernel_page_offset() does not follow the kernel formula, remove it. Unify them in order to introduce the 52-bit VA kernel more easily in the coming patch. Signed-off-by: Pingfan Liu <piliu@redhat.com> Reviewed-by: Philipp Rudo <prudo@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2022-01-24arm64: make phys_offset signedPingfan Liu2-7/+7
After kernel commit 7bc1a0f9e176 ("arm64: mm: use single quantity to represent the PA to VA translation"), phys_offset can be negative when running a 52-bit kernel on 48-bit hardware. So change phys_offset from unsigned to signed. Signed-off-by: Pingfan Liu <piliu@redhat.com> Reviewed-by: Philipp Rudo <prudo@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2022-01-14s390: add support for --reuse-cmdlineSven Schnelle2-4/+15
--reuse-cmdline reads the command line of the currently running kernel from /proc/cmdline and uses that for the kernel that should be kexec'd. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2022-01-14use slurp_proc_file() in get_command_line()Sven Schnelle1-16/+10
This way the size of the command line that get_command_line() can handle is no longer fixed. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2022-01-14add slurp_proc_file()Sven Schnelle1-0/+51
slurp_file() cannot be used to read proc files, as they are returning a size of zero in stat(). Add a function slurp_proc_file() which is similar to slurp_file(), but doesn't require the size of the file to be known. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
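The idea of reading a file whose size stat() reports as zero can be sketched like this. It is a minimal illustration and not the slurp_proc_file() added by this commit: the name read_all_unknown_size() and the descriptor-based interface are assumptions. The buffer is simply grown until read() reports end of file.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: read everything from fd into a malloc()ed buffer
 * without knowing the size in advance. Returns NULL on error; on success
 * the caller owns the buffer and *out_len holds the number of bytes read.
 */
static char *read_all_unknown_size(int fd, size_t *out_len)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);

    assert(out_len != NULL);
    if (buf == NULL)
        return NULL;

    for (;;) {
        ssize_t n;

        if (len == cap) {
            /* Buffer is full: double it before reading more. */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }

        n = read(fd, buf + len, cap - len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry */
            free(buf);
            return NULL;
        }
        if (n == 0)
            break;              /* end of file */
        len += (size_t)n;
    }

    *out_len = len;
    return buf;
}
```

Because the loop never consults st_size, the same approach works for /proc files, pipes and regular files alike.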
2022-01-14s390: use KEXEC_ALL_OPTIONSSven Schnelle1-8/+2
KEXEC_ALL_OPTIONS can be used instead of defining the same array several times. This makes the code easier to maintain when new options are added. Suggested-by: Alexander Egorenkov <egorenar@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2022-01-14s390: add variable command line sizeSven Schnelle3-34/+55
Newer s390 kernels support a command line size longer than 896 bytes. Such kernels contain a new member in the parameter area, which might be utilized by tools like kexec. Older kernels have the location initialized to zero, so we check whether there's a non-zero number present and use that. If there isn't, we fall back to the legacy command line size of 896 bytes. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2022-01-14arm64: support more than one crash kernel regionsChen Zhou3-47/+86
When crashkernel is reserved above 4G in memory, kernel should reserve some amount of low memory for swiotlb and some DMA buffers. So there may be two crash kernel regions, one is below 4G, the other is above 4G. Currently, there is only one crash kernel region on arm64, and pass "linux,usable-memory-range = <BASE SIZE>" property to crash dump kernel. Now, we pass "linux,usable-memory-range = <BASE1 SIZE1 BASE2 SIZE2>" to crash dump kernel to support two crash kernel regions and load crash kernel high. Make the low memory region as the second range "BASE2 SIZE2" to keep compatibility with existing user-space and older kdump kernels. Signed-off-by: Chen Zhou <chenzhou10@huawei.com> Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2022-01-14s390: handle R_390_PLT32DBL reloc entries in machine_apply_elf_rel()Alexander Egorenkov1-1/+2
Starting with gcc 11.3, the C compiler will generate PLT-relative function calls even if they are local and do not require it. Later on during linking, the linker will replace all PLT-relative calls to local functions with PC-relative ones. Unfortunately, the purgatory code of kexec/kdump is not being linked as a regular executable or shared library would have been, and therefore, all PLT-relative addresses remain in the generated purgatory object code unresolved. This in turn lets kexec-tools fail with "Unknown rela relocation: 0x14 0x73c0901c" for such relocation types.

Furthermore, the clang C compiler has always behaved as described above, and this commit should fix the purgatory code built with the latter.

Because the purgatory code is not a regular executable or shared library, contains only calls to local functions and has no PLT, all R_390_PLT32DBL relocation entries can be resolved just like a R_390_PC32DBL one.

* https://refspecs.linuxfoundation.org/ELF/zSeries/lzsabi0_zSeries/x1633.html#AEN1699

Relocation entries of purgatory code generated with gcc 11.3
------------------------------------------------------------

$ readelf -r purgatory/purgatory.o

Relocation section '.rela.text' at offset 0x6e8 contains 27 entries:
  Offset        Info          Type            Sym. Value        Sym. Name + Addend
  00000000000c  000300000013  R_390_PC32DBL   0000000000000000  .data + 2
  00000000001a  001000000014  R_390_PLT32DBL  0000000000000000  sha256_starts + 2
  000000000030  001100000014  R_390_PLT32DBL  0000000000000000  sha256_update + 2
  000000000046  001200000014  R_390_PLT32DBL  0000000000000000  sha256_finish + 2
  000000000050  000300000013  R_390_PC32DBL   0000000000000000  .data + 102
  00000000005a  001300000014  R_390_PLT32DBL  0000000000000000  memcmp + 2
  ...
  000000000118  001600000014  R_390_PLT32DBL  0000000000000000  setup_arch + 2
  00000000011e  000300000013  R_390_PC32DBL   0000000000000000  .data + 2
  00000000012c  000f00000014  R_390_PLT32DBL  0000000000000000  verify_sha256_digest + 2
  000000000142  001700000014  R_390_PLT32DBL  0000000000000000  post_verification[...] + 2

Relocation entries of purgatory code generated with gcc 11.2
------------------------------------------------------------

$ readelf -r purgatory/purgatory.o

Relocation section '.rela.text' at offset 0x6e8 contains 27 entries:
  Offset        Info          Type            Sym. Value        Sym. Name + Addend
  00000000000e  000300000013  R_390_PC32DBL   0000000000000000  .data + 2
  00000000001c  001000000013  R_390_PC32DBL   0000000000000000  sha256_starts + 2
  000000000036  001100000013  R_390_PC32DBL   0000000000000000  sha256_update + 2
  000000000048  001200000013  R_390_PC32DBL   0000000000000000  sha256_finish + 2
  000000000052  000300000013  R_390_PC32DBL   0000000000000000  .data + 102
  00000000005c  001300000013  R_390_PC32DBL   0000000000000000  memcmp + 2
  ...
  00000000011a  001600000013  R_390_PC32DBL   0000000000000000  setup_arch + 2
  000000000120  000300000013  R_390_PC32DBL   0000000000000000  .data + 122
  000000000130  000f00000013  R_390_PC32DBL   0000000000000000  verify_sha256_digest + 2
  000000000146  001700000013  R_390_PC32DBL   0000000000000000  post_verification[...] + 2

Corresponding s390 kernel discussion:
* https://lore.kernel.org/linux-s390/20211208105801.188140-1-egorenar@linux.ibm.com/T/#u

Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reported-by: Tao Liu <ltao@redhat.com>
Suggested-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
[hca@linux.ibm.com: changed commit message as requested by Philipp Rudo]
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
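Resolving R_390_PLT32DBL "just like" R_390_PC32DBL can be sketched as below. This is an illustration, not the machine_apply_elf_rel() change itself: the function name apply_s390_rela() and its parameters are hypothetical. The relocation numbers 19 (0x13) and 20 (0x14) match the Info column above, and the sketch assumes the usual rule for these types, namely that half of (S + A - P) is stored as a 32-bit big-endian field.

```c
#include <assert.h>
#include <stdint.h>

#ifndef R_390_PC32DBL
#define R_390_PC32DBL  19   /* 0x13: PC-relative, offset shifted by 1 */
#endif
#ifndef R_390_PLT32DBL
#define R_390_PLT32DBL 20   /* 0x14: PLT-relative, offset shifted by 1 */
#endif

/*
 * Hypothetical helper: apply one relocation.
 * loc:   where the 32-bit field lives in the loaded image
 * place: run-time address of that field (P)
 * value: symbol value plus addend (S + A)
 * Returns 0 on success, -1 for an unhandled relocation type.
 */
static int apply_s390_rela(unsigned int type, unsigned char *loc,
                           uint64_t place, uint64_t value)
{
    switch (type) {
    case R_390_PLT32DBL:    /* no PLT in purgatory: treat as PC-relative */
    case R_390_PC32DBL: {
        int64_t delta = (int64_t)(value - place);
        uint32_t field;

        assert(delta % 2 == 0);         /* targets are halfword aligned */
        field = (uint32_t)(delta / 2);

        /* s390 is big-endian; store the bytes explicitly. */
        loc[0] = (unsigned char)(field >> 24);
        loc[1] = (unsigned char)(field >> 16);
        loc[2] = (unsigned char)(field >> 8);
        loc[3] = (unsigned char)field;
        return 0;
    }
    default:
        return -1;
    }
}
```

Sharing one case body keeps the two types from drifting apart, which is reasonable only because purgatory has no PLT and calls only local functions.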
2021-12-15arm64/crashdump: deduce paddr of _text based on kernel code sizePingfan Liu1-3/+11
kexec-tools commit 61b8c79b0fb7 ("arm64/crashdump-arm64: deduce the paddr of _text") tries to deduce the paddr of _text, but only partially succeeds. That commit relies on the statement "The Image must be placed text_offset bytes from a 2MB aligned base address anywhere in usable system RAM and called there" in linux/Documentation/arm64/booting.rst, plus the text_offset field being zero. In practice, however, some boot loaders do not obey the convention and still boot the kernel successfully.

Revisiting kernel commit e2a073dde921 ("arm64: omit [_text, _stext) from permanent kernel mapping"), the kernel code size changes from

  (unsigned long)__init_begin - (unsigned long)_text

to

  (unsigned long)__init_begin - (unsigned long)_stext

and it should be a better factor for deciding which label starts the "Kernel code" in /proc/iomem.

Signed-off-by: Pingfan Liu <piliu@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2021-10-20arm: kdump: Add DT properties to crash dump kernel's DTBGeert Uytterhoeven3-2/+147
Pass the following properties to the crash dump kernel, to provide a modern DT interface between kexec and the crash dump kernel:

  - linux,elfcorehdr: ELF core header segment, similar to the "elfcorehdr=" kernel parameter.
  - linux,usable-memory-range: Usable memory reserved for the crash dump kernel. This makes the memory reservation explicit, so Linux no longer needs to mask the program counter, and rely on the "mem=" kernel parameter to obtain the start and size of usable memory.

For backwards compatibility, the "elfcorehdr=" and "mem=" kernel parameters are still appended to the kernel command line.

Loosely based on the ARM64 version by Akashi Takahiro.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms@verge.net.au>
2021-10-20kexec-tools: multiboot2: Correct BASIC_MEMINFO memory unitsTu Dinh1-2/+2
mem_lower and mem_upper are measured in kilobytes. Signed-off-by: Simon Horman <horms@verge.net.au>
2021-10-05Add some necessary free() callsKai Song1-2/+8
free() should be called before the function exits abnormally. Signed-off-by: Kai Song <songkai01@inspur.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-10-05Add some necessary fclose() callsKai Song4-1/+8
fclose should be called before function exits Signed-off-by: Kai Song <songkai01@inspur.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-09-24ppc64: Fix memory leak problem in zImage_ppc64_load()Kai Song1-0/+7
When the function exits abnormally, ph should be freed. Signed-off-by: Kai Song <songkai01@inspur.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-09-24i386: Remove unused local variable in get_kernel_page_offset()Kai Song1-1/+0
In get_kernel_page_offset(), the local variable kv is unused, so remove it. Signed-off-by: Kai Song <songkai01@inspur.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-09-14multiboot2: Accept x86-64 imagesZhaofeng Li2-4/+6
Signed-off-by: Zhaofeng Li <hello@zhaofeng.li> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-09-14multiboot2: Avoid first 0x500 bytesZhaofeng Li1-1/+1
In some cases, add_buffer will actually try to allocate the buffer at 0x0, which may not be acceptable by some kernels. Let's avoid the first 0x500 bytes so we don't screw up the IVT and BDA. Signed-off-by: Zhaofeng Li <hello@zhaofeng.li> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-09-14multiboot2: Use rel_min and rel_max for buffer destinationsZhaofeng Li1-2/+2
This would segfault if mhi.rel_tag didn't exist. Signed-off-by: Zhaofeng Li <hello@zhaofeng.li> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-09-14multiboot2: Correct MBI size calculationZhaofeng Li1-4/+13
tag_load_base_addr is dependent on rel_tag, and tag_framebuffer was not accounted for. Signed-off-by: Zhaofeng Li <hello@zhaofeng.li> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-09-14x86: Consolidate elf_x86_probe routinesZhaofeng Li3-33/+40
Signed-off-by: Zhaofeng Li <hello@zhaofeng.li> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-09-13Refer FDT tokens with symbolic namesSourabh Jain1-10/+11
Replace hardcoded FDT structure block tokens with proper names to improve code readability. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-06-06arm64/crashdump-arm64: deduce the paddr of _textPingfan Liu1-2/+10
Since kernel commit e2a073dde921 ("arm64: omit [_text, _stext) from permanent kernel mapping"), the physical address of 'Kernel code' in /proc/iomem is mapped from _stext instead of _text. For compatibility, it is better to deduce the paddr of _text even though it is no longer available through /proc/iomem. This can be achieved by using the fact that _text is aligned on 2MB. Signed-off-by: Pingfan Liu <piliu@redhat.com> Cc: Simon Horman <horms@verge.net.au> To: kexec@lists.infradead.org Signed-off-by: Simon Horman <horms@verge.net.au>
2021-05-02kexec-tools: Remove duplicate definition of ramdiskPetr Tesarik1-1/+0
The ramdisk variable is defined in kexec/arch/ppc/kexec-ppc.c. This other definition is not needed and breaks build with -fno-common. Signed-off-by: Petr Tesarik <ptesarik@suse.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-04-17arm: do not copy magic 4 bytes of appended DTB in zImageAlexander Egorenkov1-1/+11
If the passed zImage happens to have a DTB appended, then the magic 4 bytes of the DTB are copied together with the kernel image. This leads to failed kexec boots because the decompressor finds the aforementioned DTB magic and falsely tries to replace the DTB passed in the register r2 with the non-existent appended one. Signed-off-by: Alexander Egorenkov <egorenar-dev@posteo.net> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-04-17kexec: Remove the error prone kernel_version functionEric W. Biederman6-85/+11
During kexec there are two kernel versions at play: the version of the running kernel and the version of the kernel that will be booted. On powerpc it appears people have been using the version of the running kernel to attempt to detect properties of the kernel to be booted, which is just wrong. Since the Linux kernel version being detected is no longer supported, just remove that buggy and confused code. On x86_64 the kernel_version is used to compute the starting virtual address of the running kernel so a proper core dump may be generated. Using the kernel_version stopped working a while ago when the starting virtual address became randomized. The old code was kept for the case where the kernel was not built with randomization support, but there is nothing in reading /proc/kcore that won't work to detect the starting virtual address even there. In fact /proc/kcore must have the starting virtual address or a debugger cannot make sense of the running kernel. So just make computing the starting virtual address on x86_64 unconditional, with a hard-coded fallback just in case something goes wrong. Doing something with kernel_version() has become important as recent stable kernels have seen the minor version grow beyond 255. Just removing kernel_version() looks like the best option. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Simon Horman <horms@verge.net.au>
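Why a version component above 255 is a problem can be shown with a small sketch. It assumes the conventional (major << 16) + (minor << 8) + patch packing; whether the removed kernel_version() used exactly this form is an assumption, and pack_version() is a hypothetical name. The last component here is what the message above calls the minor version.

```c
#include <assert.h>

/*
 * Hypothetical illustration of byte-per-component version packing.
 * Only 8 bits are reserved for the last component, so larger values
 * carry into the component above it.
 */
static long pack_version(unsigned int major, unsigned int minor,
                         unsigned int patch)
{
    /* patch is deliberately left unchecked to show the overflow. */
    assert(major < 256 && minor < 256);

    return ((long)major << 16) + ((long)minor << 8) + (long)patch;
}
```

Under this packing, 4.9.256 and 4.10.0 produce the same number, so comparisons based on the packed value can no longer tell the two apart.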
2021-04-07Shrink segments to fit alignment instead of throwing them awayHongyan Xia1-3/+12
We risk throwing an entire large chunk away if it is just slightly unaligned which then causes the crash kernel to run out of RAM. Keep them and shrink them to alignment. Signed-off-by: Hongyan Xia <hongyxia@amazon.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-04-07Fix where the real mode interrupt vector endsHongyan Xia1-2/+8
The real mode ends at 0x400, not 0x100. The code intentionally excludes the IVT as RAM, so use the correct address. Also, 0x100 is not 1K aligned and will be rejected by add_memmap(). We have observed problems that after a multiboot2 kexec, the next kexec will throw away such unaligned chunks, losing memory for the next next kernel. In some corner cases, such loss of memory can actually cause OOM during boot. Signed-off-by: Hongyan Xia <hongyxia@amazon.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-04-02crashdump/x86: increase CRASH_MAX_MEMORY_RANGES to 32kDavid Hildenbrand2-3/+6
virtio-mem in Linux adds/removes individual memory blocks (e.g., 128 MB each). Linux merges adjacent memory blocks added by virtio-mem devices, but we can still end up with a very sparse memory layout when unplugging memory in corner cases. Let's increase the maximum number of crash memory ranges from ~2k to 32k. 32k should be sufficient for a very long time. e_phnum field in the header is 16 bits wide, so we can fit a maximum of ~64k entries in there, shared with other entries (i.e., CPU). Therefore, using up to 32k memory ranges is fine. (if we ever need more than ~64k, we can switch to the sh_info field) Move the temporary xen ranges off the stack, dynamically allocating memory for them. Note: We don't have to increase MAX_MEMORY_RANGES, because virtio-mem added memory is driver managed and always detected and added by a driver in the kexec'ed kernel; for ordinary kexec, we must not expose these ranges in the firmware-provided memmap. Cc: Simon Horman <horms@verge.net.au> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-04-02crashdump/x86: iterate only over actual crash memory rangesDavid Hildenbrand1-1/+1
No need to iterate over empty entries. Cc: Simon Horman <horms@verge.net.au> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-04-02crashdump/x86: dump any kind of "System RAM"David Hildenbrand1-2/+8
Traditionally, we had "System RAM" only on the top level of the kernel resource tree (-> /proc/iomem). Nowadays, we can also have "System RAM" on lower levels of the tree -- driver-managed device memory that is always detected and added via drivers. Current examples are memory added via dax/kmem ("System RAM (kmem)") and virtio-mem ("System RAM (virtio_mem)"). Note that in some kernel versions "System RAM (kmem)" was exposed as "System RAM", but similarly, on lower levels of the resource tree. Let's add anything that contains "System RAM" to the elf core header, so it will be dumped for kexec_load(). Handling kexec_file_load() in the kernel is similarly getting fixed [1]. Loading a kdump kernel via "kexec -p -c" ... will now result in the kdump kernel also dumping dax/kmem and virtio-mem added System RAM. Note: We only want to dump this memory, we don't want to add this memory to the memmap of an ordinary kexec'ed kernel ("fast system reboot"). [1] https://lkml.kernel.org/r/20210322160200.19633-1-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Simon Horman <horms@verge.net.au>
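Matching "anything that contains System RAM" at any nesting level can be sketched like this. The function parse_system_ram() is a hypothetical name and not the kexec-tools parser; the sketch assumes /proc/iomem lines of the form "start-end : name", with nested entries indented by spaces.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: if the /proc/iomem line names any kind of
 * "System RAM", store its range and return 1; otherwise return 0.
 * Leading indentation (nested resources) is accepted.
 */
static int parse_system_ram(const char *line,
                            unsigned long long *start,
                            unsigned long long *end)
{
    const char *name;

    assert(line != NULL && start != NULL && end != NULL);

    /* The resource name follows the first ':' on the line. */
    name = strchr(line, ':');
    if (name == NULL || strstr(name, "System RAM") == NULL)
        return 0;

    /* The leading space in the format skips any indentation. */
    if (sscanf(line, " %llx-%llx", start, end) != 2)
        return 0;

    return 1;
}
```

Using a substring match instead of an exact comparison is what lets entries such as "System RAM (kmem)" and "System RAM (virtio_mem)" be picked up as well.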
2021-04-02kexec-xen: Use correct image type for Live UpdateRaphael Ning1-9/+14
Unlike xen_kexec_load(), xen_kexec_unload() and xen_kexec_status() fail to distinguish between normal kexec and Xen Live Update image types. Fix that by introducing a new helper function that maps internal flags to KEXEC_TYPE_*, and using it throughout kexec-xen.c. Signed-off-by: Raphael Ning <raphning@amazon.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-04-02kexec: Make --status work with normal kexec imagesRaphael Ning1-1/+3
According to kexec(8) manpage, --status (-S) works with both normal kexec (loaded by -l) and crash kernel (loaded by -p) image types, and defaults to the latter. However, the implementation does not match the description: `kexec -l -S` queries the -p image type as if -l were not specified. This is because there is no internal flag defined for the normal kexec type, and -S treats the zero flag as the trigger for the default behaviour (-p). Fix that by making sure the default behaviour for -S is not applied when the -l option is present. Signed-off-by: Raphael Ning <raphning@amazon.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-04-02kexec: Fix description of --status exit codeRaphael Ning2-4/+5
On both Linux and Xen, an exit code of 0 from `kexec --status` indicates that the kexec image being queried is NOT loaded, which is contrary to what the man page and usage() say. Signed-off-by: Raphael Ning <raphning@amazon.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-04-02kexec: Use %llu/%llx and casts to format uint64_tGeert Uytterhoeven1-1/+2
When compiling for 32-bit:

  kexec/kexec.c: In function ‘cmdline_add_liveupdate’:
  kexec/kexec.c:1192:30: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ {aka ‘long long unsigned int’} [-Wformat=]
   1192 | sprintf(buf, " liveupdate=%luM@0x%lx", lu_sizeM, lu_start);
        |                           ~~^          ~~~~~~~~
        |                             |          |
        |                             |          uint64_t {aka long long unsigned int}
        |                             long unsigned int
        |                           %llu
  kexec/kexec.c:1192:37: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘uint64_t’ {aka ‘long long unsigned int’} [-Wformat=]
   1192 | sprintf(buf, " liveupdate=%luM@0x%lx", lu_sizeM, lu_start);
        |                                  ~~^             ~~~~~~~~
        |                                    |             |
        |                                    |             uint64_t {aka long long unsigned int}
        |                                    long unsigned int
        |                                  %llx

Indeed, "uint64_t" is "unsigned long long" on 32-bit platforms, and "unsigned long" on 64-bit platforms. Fix this by casting to "unsigned long long", and formatting using "%llu" or "%llx".

Fixes: b13984c6f9ec7fdd ("kexec: Introduce --load-live-update for xen")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms@verge.net.au>
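The cast-and-format approach described here can be sketched as below. The wrapper format_liveupdate() is hypothetical and uses snprintf() with an explicit size for the sake of a self-contained example; only the format string and the variable names lu_sizeM and lu_start come from the warning text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical wrapper: format the liveupdate parameter portably.
 * Casting to unsigned long long makes the arguments match %llu/%llx
 * regardless of how uint64_t is defined on the target.
 */
static int format_liveupdate(char *buf, size_t size,
                             uint64_t lu_sizeM, uint64_t lu_start)
{
    assert(buf != NULL && size > 0);

    return snprintf(buf, size, " liveupdate=%lluM@0x%llx",
                    (unsigned long long)lu_sizeM,
                    (unsigned long long)lu_start);
}
```

The PRIu64 and PRIx64 macros from inttypes.h would be an alternative; the cast matches the approach this commit describes.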
2021-04-02mips: Fix the increased mem parameter sizeYouling Tang1-1/+1
The added "mem=size@start" parameter actually corresponds to "crashkernel=YM@XM", but 1 byte is missing when calculating the size, so 1 byte should be added. For example, when using crashkernel=108M@64M (110592K@65536K): Without this patch: the mem parameter added is: mem=110591K@65536K With this patch: the mem parameter added is: mem=110592K@65536K Fixes: 0eac64052636 ("kexec: mips: Fix mem parameters") Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Simon Horman <horms@verge.net.au>
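The off-by-one can be reproduced with a short sketch. The helper crash_mem_size_kib() is a hypothetical name, and the sketch assumes the crash region is tracked with an inclusive end address, which is what the missing byte in the message suggests.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: size in KiB of a region given its first and
 * last byte addresses. The "+ 1" accounts for the inclusive end.
 */
static uint64_t crash_mem_size_kib(uint64_t start, uint64_t end_inclusive)
{
    assert(end_inclusive >= start);

    return (end_inclusive - start + 1) >> 10;
}
```

For crashkernel=108M@64M, start is 64M and the last byte is 172M - 1, so the helper yields 110592K, whereas omitting the "+ 1" rounds down to 110591K as in the example above.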
2021-04-02i386: fix build on pre 4.4 kernelsFederico Pellegrin1-0/+4
kexec build will fail on older kernels (pre 4.4) as the define VIDEO_CAPABILITY_64BIT_BASE was not present at that time. This patch adds it, as per linux/include/uapi/linux/screen_info.h, if not present. Signed-off-by: Federico Pellegrin <fede@evolware.org> Reviewed-by: Kairui Song <kasong@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-04-02mips: Fix typo in commentYouling Tang1-1/+1
Fix typo in comment. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-04-02mips: Add '--reuse-cmdline' optional parameter supportYouling Tang2-5/+11
This patch adds an option "--reuse-cmdline" for people that are lazy in typing --append="$(cat /proc/cmdline)", which will directly use the command line of the currently running system. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-12-09kexec: mips: Fix mem parametersJinyang He1-0/+29
"mem=" is useful for indicating the usable memory region when the capture kernel boots. Otherwise, the capture kernel will overwrite the memory of the panicked kernel. Although it can be added by the user, adding "mem=" in software is a better way. What's more, "mem=" should cover the elfcorehdr range, since the elfcorehdr memory should be managed by the kernel. Fixes: 7bd251654aad ("kexec-tools: mips: Remove commandline parameter "mem"") Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-12-09dt-ops: fix memory leak when new_node malloc failsqiuguorui11-1/+2
In function dtb_set_property, when malloc new_node fails, we need to free new_dtb before return. Fixes: f56cbcf4c2766 ("kexec/dt-ops.c: Fix '/chosen' v/s 'chosen' node being passed to fdt helper functions") Signed-off-by: qiuguorui1 <qiuguorui1@huawei.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-11-30zlib: fix resource leak when gzdirect failedqiuguorui11-2/+2
In function zlib_decompress_file, when gzdirect(fp) fails, we should gzclose fp before return. Fixes: d606837b56d46 ("Fix zlib/lzma decompression.") Signed-off-by: qiuguorui1 <qiuguorui1@huawei.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-11-16x86_64: allow ELFCLASS32 for x32 supportAhelenia Ziemiańska1-1/+2
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-11-16i386: fix string formatting-related warningsAhelenia Ziemiańska1-2/+14
fixed the same way as in 70cca82 "kexec: Fix snprintf related compilation warnings" Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-11-16i386/kexec-mb2-x86.c: cast ints to uintptr_t before pointers to avoid warningsAhelenia Ziemiańska1-3/+3
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-10-23arm64: Add purgatory printingMatthias Brugger2-1/+76
Add option to allow purgatory printing on arm64 hardware by passing the console name which should be used. Based on a patch by Geoff Levand. Signed-off-by: Matthias Brugger <mbrugger@suse.com> Acked-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-09-29kexec: Fix snprintf related compilation warningsBhupesh Sharma2-6/+32
This patch fixes the following snprintf related compilation warnings seen currently with gcc versions 7 and 8 when kexec is compiled with the -Wformat-truncation option:

  kexec/fs2dt.c:673:34: warning: ‘stdout-path’ directive output may be truncated writing 11 bytes into a region of size between 1 and 1024 [-Wformat-truncation=]
    snprintf(filename, MAXPATH, "%sstdout-path", pathname);
                                   ^~~~~~~~~~~
  kexec/fs2dt.c:673:3: note: ‘snprintf’ output between 12 and 1035 bytes into a destination of size 1024
    snprintf(filename, MAXPATH, "%sstdout-path", pathname);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  kexec/fs2dt.c:676:35: warning: ‘linux,stdout-path’ directive output may be truncated writing 17 bytes into a region of size between 1 and 1024 [-Wformat-truncation=]
    snprintf(filename, MAXPATH, "%slinux,stdout-path", pathname);
                                   ^~~~~~~~~~~~~~~~~
  kexec/fs2dt.c:676:4: note: ‘snprintf’ output between 18 and 1041 bytes into a destination of size 1024
    snprintf(filename, MAXPATH, "%slinux,stdout-path", pathname);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  kexec/firmware_memmap.c:132:35: warning: ‘%s’ directive output may be truncated writing 5 bytes into a region of size between 0 and 4095 [-Wformat-truncation=]
    snprintf(filename, PATH_MAX, "%s/%s", entry, "start");
                                     ^~          ~~~~~~~
  kexec/firmware_memmap.c:132:2: note: ‘snprintf’ output between 7 and 4102 bytes into a destination of size 4096
    snprintf(filename, PATH_MAX, "%s/%s", entry, "start");
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  kexec/firmware_memmap.c:142:35: warning: ‘%s’ directive output may be truncated writing 3 bytes into a region of size between 0 and 4095 [-Wformat-truncation=]
    snprintf(filename, PATH_MAX, "%s/%s", entry, "end");
                                     ^~          ~~~~~
  kexec/firmware_memmap.c:142:2: note: ‘snprintf’ output between 5 and 4100 bytes into a destination of size 4096
    snprintf(filename, PATH_MAX, "%s/%s", entry, "end");
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  kexec/firmware_memmap.c:152:35: warning: ‘%s’ directive output may be truncated writing 4 bytes into a region of size between 0 and 4095 [-Wformat-truncation=]
    snprintf(filename, PATH_MAX, "%s/%s", entry, "type");
                                     ^~          ~~~~~~
  kexec/firmware_memmap.c:152:2: note: ‘snprintf’ output between 6 and 4101 bytes into a destination of size 4096
    snprintf(filename, PATH_MAX, "%s/%s", entry, "type");
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The simplest way to address the gcc warnings and the possible truncation is to check the return value of snprintf (there are other methods, such as using 'asnprintf' or using the 'open_memstream' function to create the FILE object, but these are more intrusive), so this patch does that.

Cc: Simon Horman <horms@verge.net.au>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: kexec@lists.infradead.org
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
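Checking the snprintf() return value for truncation follows a common pattern, sketched below. The helper build_path() is a hypothetical name and not a function from this patch; the "%s/%s" format is taken from the firmware_memmap.c warnings above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: join dir and leaf into out.
 * Returns 0 on success, -1 if formatting failed or the result did not
 * fit (snprintf reports the length it would have written).
 */
static int build_path(char *out, size_t size,
                      const char *dir, const char *leaf)
{
    int ret;

    assert(out != NULL && size > 0);

    ret = snprintf(out, size, "%s/%s", dir, leaf);
    if (ret < 0 || (size_t)ret >= size)
        return -1;          /* encoding error or truncated output */

    return 0;
}
```

A caller can then skip or report the entry instead of silently opening a truncated path.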
2020-09-29kexec-tools: Add some missing free() callsYouling Tang3-8/+27
Add some missing free() calls. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-09-29kexec-tools: Fix a prompt message when crashkernel is not reservedYouling Tang1-1/+1
Where Y specifies how much memory to reserve for the dump-capture kernel and X specifies the beginning of this reserved memory. So Y should be placed before X. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Reviewed-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-09-25kexec-tools: mips: Remove commandline parameter "mem"Youling Tang1-29/+0
"mem=" indicates the memory region the new kernel can use to boot into, and it is passed to the dump-capture kernel through the kernel command-line parameter "mem=". The dump-capture kernel no longer needs this parameter, so remove "mem" and don't add "mem=" to the new kernel command line. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-09-02kexec/kexec.c: Add missing close() callYouling Tang1-0/+3
Add missing close() call. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Reviewed-by: Khalid Aziz <khalid@gonehiking.org> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-08-20MIPS: Fix compile warnnings in kexec-elf-mips.cYouling Tang1-1/+1
Fix the following warnings: kexec/arch/mips/kexec-elf-mips.c:161:41: warning: passing argument 3 of ‘dtb_set_initrd’ makes integer from pointer without a cast dtb_set_initrd(&dtb_buf, &dtb_length, initrd_buf, initrd_buf + initrd_size); ^ In file included from kexec/arch/mips/kexec-elf-mips.c:33:0: kexec/arch/mips/../../dt-ops.h:6:5: note: expected ‘off_t’ but argument is of type ‘char *’ int dtb_set_initrd(char **dtb, off_t *dtb_size, off_t start, off_t end); ^ kexec/arch/mips/kexec-elf-mips.c:161:53: warning: passing argument 4 of ‘dtb_set_initrd’ makes integer from pointer without a cast dtb_set_initrd(&dtb_buf, &dtb_length, initrd_buf, initrd_buf + initrd_size); ^ In file included from kexec/arch/mips/kexec-elf-mips.c:33:0: kexec/arch/mips/../../dt-ops.h:6:5: note: expected ‘off_t’ but argument is of type ‘char *’ int dtb_set_initrd(char **dtb, off_t *dtb_size, off_t start, off_t end); ^ Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-08-10mips: kexec-elf-mips: fix not free in elf_mips_load()Jinyang He1-0/+3
In the function elf_mips_load(), memory is allocated for crash_cmdline, but it does not appear to be freed after its last use at line 131. Signed-off-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-08-10kexec-tools: Check callback first in kexec_iomem_for_each_line()Jinyang He1-7/+8
In the function kexec_iomem_for_each_line(), it is better to check the callback first, so that the function can return directly if the callback is NULL. Signed-off-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-06-06arm: Increase zImage length after getting the tagŁukasz Stelmach1-7/+8
Increase the size of the zImage after seeking the tag, to avoid reading past the end of the supplied buffer should there be no tag in the zImage. Fixes: f57f0bf8975d24fe1e7c4936fdfb5c3b123ab75f Signed-off-by: Łukasz Stelmach <l.stelmach@samsung.com> Cc: Russell King <rmk@armlinux.org.uk> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-06-06kexec-tools: fix the unintended fallthrough when '-d' option is usedHari Bathini1-0/+1
Fixes: 28d4ab532808 ("Add generic debug option") Cc: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-04-24arm: redefine OPT_APPEND and OPT_RAMDISKŁukasz Stelmach4-6/+6
Redefine OPT_APPEND to avoid a clash with OPT_KEXEC_SYSCALL_AUTO. Redefine OPT_RAMDISK to avoid such problems in the future. Minor cleanup in HPPA too. Signed-off-by: Łukasz Stelmach <l.stelmach@samsung.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-04-07kexec-tools: s390: Reset kernel command line on syscall fallbackPetr Tesarik1-0/+1
The command line is duplicated on s390 if kexec_file_load(2) is not implemented. That's because the corresponding variable is not reset to an empty string before re-parsing the kexec command line. Fixes: 9cf721279f6c ("Reset getopt before falling back to legacy syscall") Signed-off-by: Petr Tesarik <ptesarik@suse.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-04-07kexec-xen: Introduce --exec-live-update to trigger a live updateVarad Gautam3-7/+22
This signals xen to do a KEXEC_TYPE_LIVE_UPDATE kexec operation. Signed-off-by: Varad Gautam <vrd@amazon.de> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-04-07kexec: Introduce --load-live-update for xenVarad Gautam5-10/+125
Support loading a live update image for xen from kexec userspace. For a multiboot2 Elf on a xen setup, this will: - load the Elf into KEXEC_RANGE_MA_XEN - load purgatory and modules into KEXEC_RANGE_MA_LIVEUPDATE - append the Elf cmdline with " liveupdate=<size>@<addr>" v2: define xen related symbols outside of HAVE_LIBXENCTRL Signed-off-by: Varad Gautam <vrd@amazon.de> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-04-07kexec-xen: Introduce xen_get_kexec_range to wrap xc_kexec_get_rangeVarad Gautam3-33/+41
And convert all callers of xc_kexec_get_range to use this. Allows reusing sanity checks for other KEXEC_RANGEs v2: define xen_get_kexec_range outside of HAVE_LIBXENCTRL Signed-off-by: Varad Gautam <vrd@amazon.de> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-04-01kexec-tools: Remove duplicated variable declarationsKairui Song4-7/+4
When building kexec-tools for Fedora 32, following error is observed: /usr/bin/ld: kexec/arch/x86_64/kexec-bzImage64.o:(.bss+0x0): multiple definition of `bzImage_support_efi_boot'; kexec/arch/i386/kexec-bzImage.o:(.bss+0x0): first defined here /builddir/build/BUILD/kexec-tools-2.0.20/kexec/arch/arm/../../fs2dt.h:33: multiple definition of `my_debug'; kexec/fs2dt.o:/builddir/build/BUILD/kexec-tools-2.0.20/kexec/fs2dt.h:33: first defined here /builddir/build/BUILD/kexec-tools-2.0.20/kexec/arch/arm64/kexec-arm64.h:68: multiple definition of `arm64_mem'; kexec/fs2dt.o:/builddir/build/BUILD/kexec-tools-2.0.20/././kexec/arch/arm64/kexec-arm64.h:68: first defined here /builddir/build/BUILD/kexec-tools-2.0.20/kexec/arch/arm64/kexec-arm64.h:54: multiple definition of `initrd_size'; kexec/fs2dt.o:/builddir/build/BUILD/kexec-tools-2.0.20/././kexec/arch/arm64/kexec-arm64.h:54: first defined here /builddir/build/BUILD/kexec-tools-2.0.20/kexec/arch/arm64/kexec-arm64.h:53: multiple definition of `initrd_base'; kexec/fs2dt.o:/builddir/build/BUILD/kexec-tools-2.0.20/././kexec/arch/arm64/kexec-arm64.h:53: first defined here And apparently, these variables are wrongly declared multiple times. So remove duplicated declaration. Signed-off-by: Kairui Song <kasong@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-04-01Removing condition that will never be met after calls xmalloc and xreallocLeonidas S. Barbosa1-12/+0
Hi, Looking through the kexec-tools code I found these conditions, which it seems can never be met. I am not sure whether they were intentional for the sake of explicitness; if so, please disregard this patch. When xmalloc and xrealloc fail, they call die(), which calls exit(1). Checks such as if(!memory) after these calls can never be true, since the process exits when an allocation fails. Signed-off-by: Leonidas S. Barbosa <kirotawa@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au>
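To illustrate why such checks are dead code, here is a minimal sketch of an allocate-or-exit wrapper. The name xmalloc_sketch() and the message text are assumptions for illustration; the real xmalloc() in kexec-tools reports failure through die().

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Allocate-or-exit wrapper (illustrative stand-in for xmalloc()).
 * It never returns NULL: on failure the process exits with status 1.
 */
static void *xmalloc_sketch(size_t size)
{
	void *p = malloc(size);

	if (!p) {
		fprintf(stderr, "Cannot malloc %zu bytes\n", size);
		exit(1);
	}
	return p;
}
```

Because control only returns to the caller with a non-NULL pointer, a following `if (!memory)` branch can never run.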
2020-04-01kexec: support parsing the string "Reserved" to get the correct e820 ↵Lianbo Jiang1-1/+1
reserved region When loading the kernel and initramfs for kexec, kexec-tools gets the e820 reserved regions from "/proc/iomem" in order to rebuild the e820 ranges for the kexec kernel, but "/proc/iomem" may contain the string "Reserved", which caused the parsing to fail. For example: #cat /proc/iomem|grep -i reserved 00000000-00000fff : Reserved 7f338000-7f34dfff : Reserved 7f3cd000-8fffffff : Reserved f17f0000-f17f1fff : Reserved fe000000-ffffffff : Reserved Currently, kexec-tools cannot handle the above case because memcmp() is case sensitive when comparing the string. So, let's fix this corner case and make sure that the strings "reserved" and "Reserved" in "/proc/iomem" are both parsed appropriately. Signed-off-by: Lianbo Jiang <lijiang@redhat.com> Acked-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
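A case-insensitive match of the resource name could look like the sketch below. It assumes strncasecmp() is an acceptable replacement for the memcmp() comparison; is_reserved_name() is a hypothetical helper, not the function touched by the patch.

```c
#include <strings.h>

/*
 * Return 1 if an /proc/iomem resource name starts with "reserved"
 * in any letter case, 0 otherwise.
 * is_reserved_name() is a hypothetical helper for illustration.
 */
static int is_reserved_name(const char *name)
{
	/* 8 is the length of "reserved"; strncasecmp ignores letter case. */
	return strncasecmp(name, "reserved", 8) == 0;
}
```

With this check, both "reserved" and "Reserved" entries are recognised, while names such as "System RAM" are not.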
2020-04-01kexec-tools: Reset getopt before falling back to legacy syscallPetr Tesarik1-2/+10
The modules may need to parse the arguments again after kexec_file_load(2) failed, but getopt is not reset. This change fixes the --initrd option on s390x. Without this patch, it will fail to load the initrd on kernels that do not implement kexec_file_load(2). Signed-off-by: Petr Tesarik <ptesarik@suse.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-04-01kexec-tools: Fix kexec_file_load(2) error handlingPetr Tesarik2-54/+61
The handling of kexec_file_load() error conditions needs some improvement. First, on failure, the system call itself returns -1 and sets errno. It is wrong to check the return value itself. Second, do_kexec_file_load() mixes different types of error codes (-1, return value of a load method, negative kernel error number). Let it always return one of the reason codes defined in kexec/kexec.h. Third, the caller of do_kexec_file_load() cannot know what exactly failed inside that function, so it should not check errno directly. All it needs to know is whether it makes sense to fall back to the other syscall. Add an error code for that purpose (EFALLBACK), and let do_kexec_file_load() decide. Fourth, do_kexec_file_load() should not print any error message if it returns EFALLBACK, because the fallback syscall may succeed later, and the user is confused whether the command failed, or not. Move the error message towards the end of main(). Signed-off-by: Petr Tesarik <ptesarik@suse.com> Signed-off-by: Simon Horman <horms@verge.net.au>
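The split between "failed" and "worth falling back" can be sketched as follows. The reason-code values and the choice of ENOSYS as the only fallback trigger are assumptions made for this example; the real codes are defined in kexec/kexec.h and the real list of errno values may be longer.

```c
#include <errno.h>

/* Illustrative reason codes; the real ones live in kexec/kexec.h. */
#define SKETCH_EFAILED   (-1)
#define SKETCH_EFALLBACK (-2)

/*
 * Map the outcome of a kexec_file_load()-style syscall to a reason code.
 * ret is the raw return value (0 or -1); saved_errno is errno captured
 * right after the call. Only errno says what went wrong.
 */
static int classify_file_load(long ret, int saved_errno)
{
	if (ret == 0)
		return 0;

	/* Assumption: a missing syscall is the case worth retrying. */
	if (saved_errno == ENOSYS)
		return SKETCH_EFALLBACK;

	return SKETCH_EFAILED;
}
```

The caller then only needs to compare against the fallback code to decide whether to try the other syscall, and can defer any error message until both paths have been attempted.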
2020-04-01crashdump-ppc64: crashkernel-base and crashkernel-size are big-endianThadeu Lima de Souza Cascardo1-2/+2
When reading the device-tree exported crashkernel-base and crashkernel-size, their values should be converted from big-endian to the CPU byte order. This is the output of running kexec --print-ckr-size on a little-endian ppc64 box. $ kexec --print-ckr-size 137438953472 $ kexec --print-ckr-size 536870912 Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: Simon Horman <horms@verge.net.au>
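The two values printed above differ exactly by a 64-bit byte swap: 536870912 is 0x20000000, and the same eight bytes read in the wrong order give 0x2000000000, which is 137438953472. A portable way to decode such a property is sketched below; be64_to_cpu_sketch() is a hypothetical name, not the helper used by the patch.

```c
#include <stdint.h>

/*
 * Decode a 64-bit big-endian value from 8 raw bytes.
 * Works the same on little-endian and big-endian hosts because it
 * never reinterprets the bytes as a native integer.
 */
static uint64_t be64_to_cpu_sketch(const unsigned char raw[8])
{
	uint64_t value = 0;
	int i;

	for (i = 0; i < 8; i++)
		value = (value << 8) | raw[i];	/* most significant byte first */
	return value;
}
```

Feeding it the bytes 00 00 00 00 20 00 00 00 yields 536870912 on any host.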
2020-01-03kexec: build multiboot2 for i386Chris Packham2-1/+6
This addresses the following compilation issues when building for i386. kexec/arch/i386/kexec-x86.c:39:22: error: 'multiboot2_x86_probe' undeclared here (not in a function); did you mean 'multiboot_x86_probe'? { "multiboot2-x86", multiboot2_x86_probe, multiboot2_x86_load, ^~~~~~~~~~~~~~~~~~~~ multiboot_x86_probe kexec/arch/i386/kexec-x86.c:39:44: error: 'multiboot2_x86_load' undeclared here (not in a function); did you mean 'multiboot_x86_load'? { "multiboot2-x86", multiboot2_x86_probe, multiboot2_x86_load, ^~~~~~~~~~~~~~~~~~~ multiboot_x86_load kexec/arch/i386/kexec-x86.c:40:4: error: 'multiboot2_x86_usage' undeclared here (not in a function); did you mean 'multiboot_x86_usage'? multiboot2_x86_usage }, ^~~~~~~~~~~~~~~~~~~~ multiboot_x86_usage make: *** [Makefile:114: kexec/arch/i386/kexec-x86.o] Error 1 make: *** Waiting for unfinished jobs.... Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-01-03ARM: Use mmap for zImage initrdBrandon Maier1-1/+1
We use a large initrd that maxes out our available RAM when loading kexec. The problem can be mitigated by using slurp_file_mmap(), which avoids creating a copy of the initrd. The initrd does not use free, realloc, etc, so it should be safe to use. Signed-off-by: Brandon Maier <brandon.maier@rockwellcollins.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-01-03arm64: kdump: deal with a lot of resource entries in /proc/iomemAKASHI Takahiro1-15/+10
As described in the commit ("arm64: kexec: allocate memory space avoiding reserved regions"), /proc/iomem now has a lot of "reserved" entries, and it's not just enough to have a fixed size of memory range array. With this patch, kdump is allowed to handle arbitrary number of memory ranges, using mem_regions_alloc_and_xxx() functions. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Tested-by: Bhupesh Sharma <bhsharma@redhat.com> Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-01-03arm64: kexec: allocate memory space avoiding reserved regionsAKASHI Takahiro1-59/+94
On UEFI/ACPI-only system, some memory regions, including but not limited to UEFI memory map and ACPI tables, must be preserved across kexec'ing. Otherwise, they can be corrupted and result in early failure in booting a new kernel. In recent kernels, /proc/iomem now has an extended file format like: 40000000-5871ffff : System RAM 41800000-426affff : Kernel code 426b0000-42aaffff : reserved 42ab0000-42c64fff : Kernel data 54400000-583fffff : Crash kernel 58590000-585effff : reserved 58700000-5871ffff : reserved 58720000-58b5ffff : reserved 58b60000-5be3ffff : System RAM 58b61000-58b61fff : reserved where the "reserved" entries at the top level or under System RAM (and its descendant resources) are ones of such kind and should not be regarded as usable memory ranges where several free spaces for loading kexec data will be allocated. With this patch, get_memory_ranges() will handle this format of file correctly. Note that, for safety, unknown regions, in addition to "reserved" ones, will also be excluded. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Tested-by: Bhupesh Sharma <bhsharma@redhat.com> Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-01-03kexec: add variant helper functions for handling memory regionsAKASHI Takahiro2-0/+49
mem_regions_alloc_and_add() and mem_regions_alloc_and_exclude() are functionally equivalent to, respectively, mem_regions_add() and mem_regions_exclude(), except that the former will re-allocate memory dynamically when no more entries are available in the 'ranges' array. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Tested-by: Bhupesh Sharma <bhsharma@redhat.com> Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Signed-off-by: Simon Horman <horms@verge.net.au>
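The grow-on-demand behaviour can be sketched as below. The struct layouts, the growth step, and the function name are assumptions for illustration and are deliberately named differently from the real kexec-tools types.

```c
#include <stdlib.h>

/* Illustrative types, not the kexec-tools definitions. */
struct range_sketch {
	unsigned long long start;
	unsigned long long end;
};

struct range_list_sketch {
	unsigned int size;		/* entries in use */
	unsigned int max_size;		/* entries allocated */
	struct range_sketch *ranges;
};

#define GROW_STEP 16	/* arbitrary growth increment for this sketch */

/* Append a range, enlarging the array when it is full. Returns 0 or -1. */
static int range_list_add_grow(struct range_list_sketch *list,
			       unsigned long long start,
			       unsigned long long end)
{
	if (list->size == list->max_size) {
		unsigned int new_max = list->max_size + GROW_STEP;
		struct range_sketch *grown;

		grown = realloc(list->ranges, new_max * sizeof(*grown));
		if (!grown)
			return -1;	/* old array is still valid */
		list->ranges = grown;
		list->max_size = new_max;
	}

	list->ranges[list->size].start = start;
	list->ranges[list->size].end = end;
	list->size++;
	return 0;
}
```

Starting from a zero-initialised list, callers can add any number of ranges without choosing an array size up front.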
2019-10-03kexec-tools: Fix possible out-of-bounds access in ifdownHelge Deller1-1/+2
Fix a possible out-of-bounds access in function ifdown(): ifdown.c: In function 'ifdown': ifdown.c:56:4: warning: 'strncpy' specified bound 16 equals destination size [-Wstringop-truncation] 56 | strncpy(ifr.ifr_name, ifp->if_name, IFNAMSIZ); Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Simon Horman <horms@verge.net.au>
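The usual remedy for this warning is to copy at most size - 1 bytes and terminate the buffer explicitly. The sketch below shows that pattern; copy_name() is a hypothetical helper, and whether the patch uses exactly this form is an assumption.

```c
#include <string.h>

/*
 * Copy src into dst, always leaving dst NUL-terminated.
 * strncpy() alone does not terminate dst when src is at least
 * dst_size bytes long, which is what the gcc warning is about.
 */
static void copy_name(char *dst, size_t dst_size, const char *src)
{
	if (dst_size == 0)
		return;
	strncpy(dst, src, dst_size - 1);
	dst[dst_size - 1] = '\0';
}
```

For an interface name buffer of IFNAMSIZ bytes, this keeps at most IFNAMSIZ - 1 characters of the name.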
2019-10-01kexec: add support for PARISC architectureSven Schnelle8-0/+403
This patch adds support for the parisc Architecture. kexec support for parisc is included with linux-5.4. Signed-off-by: Sven Schnelle <svens@stackframe.org> Tested-by: Helge Deller <deller@gmx.de> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-09-16kexec/arm: undefine __NR_kexec_file_load for armQuanyang Wang1-0/+4
In the upstream kernel commit 4ab65ba7a5cb ("ARM: add kexec_file_load system call number"), __NR_kexec_file_load for arm has been defined to be 401. As a result, even though kexec_file_load isn't implemented for arm, the function is_kexec_file_load_implemented() will still return true. So undef __NR_kexec_file_load for the arm architecture. Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-09-16i386/kexec-mb2-x86.c: Fix compilation warningBhupesh Sharma1-2/+0
This patch fixes the following compilation warning in 'i386/kexec-mb2-x86.c' regarding the variable 'result' which is set but not used: kexec/arch/i386/kexec-mb2-x86.c:402:6: warning: variable ‘result’ set but not used [-Wunused-but-set-variable] int result; ^~~~~~ Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-09-08Cleanup: remove the read_elf_kcore()Lianbo Jiang1-1/+1
There is no need to wrap read_elf() again, so let's invoke it directly. Remove read_elf_kcore() and clean up the redundant code. Signed-off-by: Lianbo Jiang <lijiang@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-09-03x86: Fix PAGE_OFFSET for kernels since 4.20Donald Buczek2-1/+4
Linux kernel commit d52888aa2753 ("x86/mm: Move LDT remap out of KASLR region on 5-level paging") changed the base of the direct mapping from 0xffff880000000000 to 0xffff888000000000. This was merged into v4.20-rc2. Update to new address accordingly. Signed-off-by: Simon Horman <horms@verge.net.au>
2019-07-16kexec/arm64: Add support for handling zlib compressed (Image.gz) imageBhupesh Sharma6-3/+250
Currently the kexec_file_load() support for arm64 doesn't allow handling zlib compressed (i.e. Image.gz) image. Since most distributions use 'make zinstall' rule inside 'arch/arm64/boot/Makefile' to install the arm64 Image.gz compressed file inside the boot destination directory (for e.g. /boot), currently we cannot use kexec_file_load() to load vmlinuz (or Image.gz): # file /boot/vmlinuz /boot/vmlinuz: gzip compressed data, was "Image", <..snip..>, max compression, from Unix, original size 21945120 Now, since via kexec_file_load() we pass the 'fd' of Image.gz (compressed file) via the following command line ... # kexec -s -l /boot/vmlinuz-`uname -r` --initrd=/boot/initramfs-`uname -r`.img --reuse-cmdline ... kernel returns -EINVAL error value, as it is not able to locate the magic number =0x644d5241, which is expected in the 64-byte header of the decompressed kernel image. We can fix this in user-space kexec-tools, which handles an 'Image.gz' being passed via kexec_file_load(), using an approach as follows: a). Copy the contents of Image.gz to a temporary file. b). Decompress (gunzip-decompress) the contents inside the temporary file. c). Pass the 'fd' of the temporary file to the kernel space. So basically the kernel space still gets a decompressed kernel image to load via kexec-tools I tested this patch for the following three use-cases: 1. Uncompressed Image file: #kexec -s -l Image --initrd=/boot/initramfs-`uname -r`.img --reuse-cmdline 2. Signed Image file: #kexec -s -l Image.signed --initrd=/boot/initramfs-`uname -r`.img --reuse-cmdline 3. zlib compressed Image.gz file: #kexec -s -l /boot/vmlinuz-`uname -r` --initrd=/boot/initramfs-`uname -r`.img --reuse-cmdline Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-07-16kexec/kexec-zlib.h: Add 'is_zlib_file()' helper functionBhupesh Sharma2-0/+39
This patch adds the 'is_zlib_file()' helper function, which can be used to quickly determine whether the passed kernel image is a zlib compressed kernel image. This is specifically useful for arm64 zImage (or Image.gz) support, which is introduced by later patches in this patchset. Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
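One simple way to recognise a gzip-compressed image is to look at its first two bytes, which are always 0x1f 0x8b. The sketch below uses that check as a stand-in; it is not the implementation of is_zlib_file(), which may rely on zlib itself, and has_gzip_magic() is a hypothetical name.

```c
#include <stddef.h>

/*
 * Return 1 if the buffer starts with the gzip magic bytes 0x1f 0x8b.
 * This only inspects the header; it does not validate the stream.
 */
static int has_gzip_magic(const unsigned char *buf, size_t len)
{
	return len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b;
}
```

A caller would read the first bytes of the kernel file and use the result to decide whether a decompression step is needed before passing the image on.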
2019-07-16kexec-uImage-arm64.c: Fix return value of uImage_arm64_probe()Bhupesh Sharma1-1/+12
Commit bf06cf2095e1 ("kexec/uImage: probe to identify a corrupted image") defined the 'uImage_probe_kernel()' function return values, and correspondingly 'uImage_arm64_probe()' returns the same (0 -> If the image is valid 'type' image, -1 -> If the image is corrupted and 1 -> If the image is not a uImage). This causes issues because later patches introduce zImage support for arm64, and since it is probed after uImage, the return values from 'uImage_arm64_probe()' need to be fixed to make sure that kexec will not return with an invalid error code. Now, 'uImage_arm64_probe()' returns the following values instead: 0 - valid uImage. -1 - uImage is corrupted. 1 - image is not a uImage. Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-07-16kexec/kexec.c: Add the missing close() for fd used for kexec_file_load()Bhupesh Sharma1-0/+2
In kexec/kexec.c, we open() the kernel Image file and pass this file descriptor to the kexec_file_load() system call, but never call a corresponding close(). Fix the same via this patch. Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-07-10x86: Include kexec-mb2-x86.c and multiboot2.h in distributionSimon Horman1-1/+3
Fixes: 22a2ed55132e ("x86: Support multiboot2 images") Signed-off-by: Simon Horman <horms@verge.net.au>
2019-07-03x86: re-order includes to avoid duplicate struct e820entrySimon Horman1-1/+1
xenctrl.h defines struct e820entry as: #if defined(__i386__) || defined(__x86_64__) ... #define E820_RAM 1 ... struct e820entry { uint64_t addr; uint64_t size; uint32_t type; } __attribute__((packed)); ... #endif $ dpkg-query -S /usr/include/xenctrl.h libxen-dev:amd64: /usr/include/xenctrl.h $ dpkg-query -W libxen-dev:amd64 libxen-dev:amd64 4.8.5+shim4.10.2+xsa282-1+deb9u11 ./include/x86/x86-linux.h defines struct e820entry as: #ifndef E820_RAM struct e820entry { uint64_t addr; /* start of memory segment */ uint64_t size; /* size of memory segment */ uint32_t type; /* type of memory segment */ #define E820_RAM 1 ... } __attribute__((packed)); #endif Since cedeee0a3007 ("x86: Introduce helpers for getting RSDP address") ./kexec/arch/i386/kexec-x86-common.c includes +#include "x86-linux-setup.h" #include "../../kexec-xen.h" When xenctrl.h is present the above results in: $ gcc ... In file included from kexec/arch/i386/../../kexec-xen.h:5:0, from kexec/arch/i386/kexec-x86-common.c:43: /usr/include/xenctrl.h:1271:8: error: redefinition of 'struct e820entry' struct e820entry { ^~~~~~~~~ In file included from kexec/arch/i386/x86-linux-setup.h:3:0, from kexec/arch/i386/kexec-x86-common.c:42: ./include/x86/x86-linux.h:16:8: note: originally defined here struct e820entry { ^~~~~~~~~ ... $ gcc --version | head -1 gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516 To mitigate this problem, re-order the includes so that x86-linux.h is included after xenctrl.h; struct e820entry will then only be defined once, because it is defined conditionally in x86-linux.h. In practice the definitions are the same so it should not matter which is chosen. It also seems rather unpleasant to me to need to play with include ordering. Perhaps a better solution in the longer term would be to rename the local definition of struct e820entry. Fixes: cedeee0a3007 ("x86: Introduce helpers for getting RSDP address") Signed-off-by: Simon Horman <horms@verge.net.au>
2019-07-03x86: Support multiboot2 imagesVarad Gautam6-0/+577
Add a new type `multiboot2-x86` that allows loading multiboot2 [1] images within the relocation range specified in the image header. The image is always placed at the lowest available address, regardless of the preference information. [1] https://www.gnu.org/software/grub/manual/multiboot2/multiboot.html Signed-off-by: Varad Gautam <vrd@amazon.de> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-07-03elf: Support ELF loading with relocationVarad Gautam2-65/+141
Add a helper to allow loading an image within specified address range. This will be used to load multiboot2 images later. Signed-off-by: Varad Gautam <vrd@amazon.de> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-05-31crashdump/x86: Use new introduce helper for getting RSDPKairui Song1-25/+9
Use the newly introduced helper for getting the RSDP; this ensures the RSDP is always accessible and avoids code duplication. Signed-off-by: Kairui Song <kasong@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-05-31x86: Always try to fill acpi_rsdp_addr in boot paramsKairui Song1-0/+3
Since kernel commit e6e094e053af75 ("x86/acpi, x86/boot: Take RSDP address from boot params if available"), the kernel accepts an acpi_rsdp_addr parameter in boot_params. So fill in this parameter unconditionally, ensuring that the second kernel always gets the right RSDP address and boots well on EFI systems even with EFI services disabled. Users no longer need to change the kernel cmdline to work around the missing RSDP issue. For older kernels (before 5.0), there is no change in behavior. Signed-off-by: Kairui Song <kasong@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-05-31x86: Introduce helpers for getting RSDP addressKairui Song4-2/+46
On x86 the RSDP is fundamental for booting the machine. When the second kernel is incapable of parsing the RSDP address (e.g. kexec'ing the next kernel on an EFI system with EFI services disabled), kexec should prepare the RSDP address for the second kernel. Introduce helpers for getting the RSDP from multiple sources, including boot params and EFI firmware. For the legacy BIOS interface, there is no better way to find the RSDP address than scanning the memory region and searching for it, and this will always be done by the kernel as a fallback, so there is no need to try to get the RSDP address for that case. Signed-off-by: Kairui Song <kasong@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-05-15x86: Find mounts by FS type, not nameNiklas Hambüchen1-4/+8
The name in mount invocations like mount -t debugfs debugfs /sys/kernel/debug is nothing but convention and cannot be relied upon. For example, https://www.kernel.org/doc/Documentation/filesystems/debugfs.txt recommends making the name "none" instead: mount -t debugfs none /sys/kernel/debug and many existing systems use mounts named "none" or otherwise. Using `mnt_type` instead of `mnt_fsname` allows kexec to work on such systems. This fixes another instance of `poweroff` not working on kexec'ed kernels because the lack of correctly matched mount results in EFI variables not being read and propagated. Signed-off-by: Niklas Hambüchen <mail@nh2.me> Signed-off-by: Simon Horman <horms@verge.net.au>
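Matching on the filesystem type instead of the mount name can be sketched with getmntent(). This is an illustration only: find_mount_dir_by_type() is a hypothetical helper, and taking an already opened FILE pointer is a simplification made here so the lookup is easy to exercise.

```c
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan a mounts table and copy the mount point of the first entry
 * whose filesystem type equals fstype into dir.
 * Returns 0 on success, -1 if no entry matches or dir is too small.
 */
static int find_mount_dir_by_type(FILE *mounts, const char *fstype,
				  char *dir, size_t dir_size)
{
	struct mntent *mnt;

	while ((mnt = getmntent(mounts)) != NULL) {
		int len;

		/* Compare mnt_type, not mnt_fsname: the name is free-form. */
		if (strcmp(mnt->mnt_type, fstype) != 0)
			continue;

		len = snprintf(dir, dir_size, "%s", mnt->mnt_dir);
		if (len < 0 || (size_t)len >= dir_size)
			return -1;
		return 0;
	}
	return -1;
}
```

In practice the FILE pointer would typically come from setmntent("/proc/mounts", "r") and be released with endmntent(). An entry mounted as "none" is found just as well as one mounted as "debugfs".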
2019-05-15x86: Check /proc/mounts before mtab for mountsNiklas Hambüchen1-2/+6
In many situations, especially on read-only file systems and initial ramdisks (initramfs/initrd), /etc/mtab does not exist. Before this commit, kexec would fail to read mounts on such systems in `find_mnt_by_fsname()`, such that `get_bootparam()` would not find `boot_params/data`, which would then lead to e.g. `setup_efi_data()` not being called in `setup_efi_info()`. As a result, kexec'ed kernels would not obtain EFI data and would subsequently lack an `ACPI RSDP` entry, emitting: ACPI BIOS Error (bug): A valid RSDP was not found (20180810/tbxfroot-210) and thus fail to turn off the machine on poweroff, instead printing only: reboot: System halted Previously, this problem had to be worked around by passing `acpi_rsdp=` manually. This commit obviates this workaround. See also: * https://github.com/coreos/bugs/issues/167#issuecomment-487320879 * http://lists.infradead.org/pipermail/kexec/2012-October/006924.html Signed-off-by: Niklas Hambüchen <mail@nh2.me> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-05-15xen: Avoid overlapping segments in low memoryDavid Woodhouse1-19/+54
Unlike Linux which creates a full identity mapping, Xen only maps those segments which are explicitly requested. Therefore, xen_kexec_load() silently adds in a segment from zero to 1MiB to ensure that VGA memory and other things are accessible. However, this doesn't work when there are already segments to be loaded under 1MiB, because the overlap causes Xen to reject the kexec_load. Be more careful and just infill the ranges which are required instead of naïvely adding a full 0-1MiB segment at the end of the list. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-03-06x86: Introduce a new option --reuse-video-typeKairui Song4-2/+14
After commit 060eee58 "x86: use old screen_info if needed", kexec-tools forces the use of the old screen_info and vga type if it fails to determine the current vga type. But this is not always a good idea. Kernel hangs are currently observed on some hyper-v VMs after this commit, because hyperv_fb will mimic EFI (or VESA) VGA on first boot up, but after the real driver is loaded, it will switch to a new mode that is no longer compatible with EFI/VESA VGA. Continuing to set orig_video_isVGA to the EFI/VESA VGA flag will get the wrong driver loaded, which then tries to manipulate the framebuffer in the wrong way. We can't ensure this won't happen with other framebuffer drivers, but it's a helpful feature if the framebuffer drivers just work. So this patch introduces a --reuse-video-type option to let the user decide whether the old screen_info should be used unconditionally or not. Signed-off-by: Kairui Song <kasong@redhat.com> Reviewed-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-02-05arm64: wipe old initrd addresses when patching the DTBJean-Philippe Brucker3-0/+12
When copying the DTB from the current kernel, if the user didn't pass an initrd on the command-line, make sure that the new DTB doesn't contain initrd properties with stale addresses. Otherwise the next kernel will try to unpack the initramfs from a location that contains junk, since the initial initrd is long gone: [ 49.370026] Initramfs unpacking failed: junk in compressed archive This issue used to be hidden by a successful recovery, but since commit ff1522bb7d98 ("initramfs: cleanup incomplete rootfs") in Linux, the kernel removes the default /root mountpoint after failing to load an initramfs, and cannot mount the rootfs passed on the command-line anymore. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-01-28x86: Handle 64bit framebuffer memory address properlyKairui Song1-1/+6
On an EFI system, the frame buffer address is 64-bit, so currently if the address is beyond 4G, kexec will set the wrong address due to truncation. Linux kernel commit ae2ee627dc87 ('efifb: Add support for 64-bit frame buffer addresses') added support for 64-bit frame buffer addresses: an 'ext_lfb_base' field was added to hold the upper 32 bits of the frame buffer address, and a new capability flag 'VIDEO_TYPE_CAPABILITY_64BIT_BASE' was introduced to indicate whether the extended field is used. This patch adopts this change and sets the proper extended address and capability flag when the address is beyond 4G. Signed-off-by: Kairui Song <kasong@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
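Splitting a 64-bit address into the two 32-bit fields can be sketched as below. The struct, the function name, and the flag value are assumptions for illustration; the real fields live in the kernel's screen_info layout.

```c
#include <stdint.h>

/* Assumed bit for the 64-bit-base capability flag in this sketch. */
#define SKETCH_CAP_64BIT_BASE (1u << 1)

/* Illustrative subset of the framebuffer fields, not the real struct. */
struct fb_info_sketch {
	uint32_t lfb_base;	/* lower 32 bits of the address */
	uint32_t ext_lfb_base;	/* upper 32 bits of the address */
	uint32_t capabilities;
};

/* Store a 64-bit framebuffer address without losing the upper half. */
static void set_fb_base(struct fb_info_sketch *si, uint64_t addr)
{
	si->lfb_base = (uint32_t)(addr & 0xffffffffu);
	si->ext_lfb_base = 0;
	si->capabilities &= ~SKETCH_CAP_64BIT_BASE;

	if (addr >> 32) {
		si->ext_lfb_base = (uint32_t)(addr >> 32);
		si->capabilities |= SKETCH_CAP_64BIT_BASE;
	}
}
```

For an address below 4G the extended field stays zero and the flag stays clear, so the behaviour for existing systems is unchanged.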
2019-01-28multiboot-x86: pass framebuffer information when requestedFriedemann Gerold1-10/+82
When the kernel requests video information, pass it the framebuffer information in the multiboot header, obtained from the linux framebuffer ioctls. With the arch specific --reset-vga or --console-vga options, purgatory will reset the framebuffer, so pass information for standard ega text mode. Signed-off-by: Friedemann Gerold <cinap_lenrek@felloff.net> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-01-28multiboot-x86: pass ACPI reserved memory information in memory mapFriedemann Gerold1-2/+11
Use the appropriate types for ACPI reclaim and ACPI NVS ranges in the multiboot memory map. This allows the kernel to locate ACPI tables on UEFI systems without having an explicit pointer to the RSD. Signed-off-by: Friedemann Gerold <cinap_lenrek@felloff.net> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-01-28multiboot-x86: support for non-elf kernelsFriedemann Gerold1-18/+45
Add support for non-elf multiboot kernels (such as Plan 9) by handling the MULTIBOOT_AOUT_KLUDGE bit. When the bit is clear, we are dealing with an ELF file and probe for ELF as before with elf_x86_probe(). When the bit is set, load_addr, load_end_addr, header_addr and entry_addr from the multiboot header are used to load the memory image. Signed-off-by: Friedemann Gerold <cinap_lenrek@felloff.net> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-01-15arm64: add kexec_file_load supportAKASHI Takahiro5-2/+55
With this patch, kexec_file_load() system call is supported. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Reviewed-by: Bhupesh Sharma <bhsharma@redhat.com> Tested-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-01-09arm64: Add support to read PHYS_OFFSET from 'kcore' - pt_note or pt_load (if available)Bhupesh Sharma2-9/+200
On certain arm64 platforms, it has been noticed that due to a hole at the start of physical ram exposed to kernel (i.e. it doesn't start from address 0), the kernel still calculates the 'memstart_addr' kernel variable as 0. Whereas the SYSTEM_RAM or IOMEM_RESERVED range in '/proc/iomem' would carry a first entry whose start address is non-zero (as the physical ram exposed to the kernel starts from a non-zero address). In such cases, if we rely on '/proc/iomem' entries to calculate the phys_offset, then we will have mismatch between the user-space and kernel space 'PHYS_OFFSET' value. The present 'kexec-tools' code does the same in 'get_memory_ranges_iomem_cb()' function when it makes a call to 'set_phys_offset()'. This can cause the vmcore generated via 'kexec-tools' to miss the last few bytes as the first '/proc/iomem' starts from a non-zero address. Please see [0] for the original bug-report from Yanjiang Jin. The same can be fixed in the following manner: 1. For newer kernel (>= 4.19, with commit 23c85094fe1895caefdd ["proc/kcore: add vmcoreinfo note to /proc/kcore"] available), 'kcore' contains a new PT_NOTE which carries the VMCOREINFO information. If the same is available, one should prefer the same to retrieve 'PHYS_OFFSET' value exported by the kernel as this is now the standard interface exposed by kernel for sharing machine specific details with the user-land as per the arm64 kernel maintainers (see [1]). 2. For older kernels, we can try and determine the PHYS_OFFSET value from PT_LOAD segments inside 'kcore' via some jugglery of the correct virtual and physical address combinations. As a fallback, we still support getting the PHYS_OFFSET values from '/proc/iomem', to maintain backward compatibility. Testing: ------- - Tested on my apm-mustang and qualcomm amberwing board with upstream kernel (4.20.0-rc7) for both KASLR and non-KASLR boot cases.
References: ----------- [0] https://www.spinics.net/lists/kexec/msg20618.html [1] https://www.mail-archive.com/kexec@lists.infradead.org/msg20300.html Reported-by: Yanjiang Jin <yanjiang.jin@hxt-semitech.com> Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-01-09kexec/dt-ops.c: Fix '/chosen' v/s 'chosen' node being passed to fdt helper functionsBhupesh Sharma1-4/+27
This patch fixes the incorrect 'chosen' node name being passed to various fdt helper functions inside 'kexec/dt-ops.c'. As we can see from the Linux kernel usage inside 'drivers/firmware/efi/libstub/fdt.c', we pass '/chosen' node names to fdt helper function like 'fdt_path_offset()' whereas 'chosen' to the rest of the fdt helper functions like 'fdt_subnode_offset()'. We need to replicate the same in 'kexec-tools' to fix issues being reported when we use --dtb option while invoking 'kexec'. Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-01-09kexec/dt-ops.c: Fix adding '/chosen' node for cases where it is not available in dtb passed via --dtb optionBhupesh Sharma1-2/+3
While calling 'kexec -l', in case we are passed a .dtb using --dtb option which doesn't contain a '/chosen' node, we try to create the '/chosen' node and add bootargs to this node. Currently the 'dt-ops.c' code is buggy as it passes '-FDT_ERR_NOTFOUND' to 'fdt_add_subnode()', which leads to the following error: # kexec -d --load Image --append 'debug' --dtb rk3399-sapphire.dtb <..snip..> dtb_set_property: fdt_add_subnode failed: FDT_ERR_NOTFOUND kexec: Set device tree bootargs failed. get_cells_size: #address-cells:2 #size-cells:2 cells_size_fitted: 0-0 cells_size_fitted: 0-0 setup_2nd_dtb: no kaslr-seed found This patch passes the correct nodeoffset value to 'fdt_add_subnode()', which fixes this issue: # kexec -d -l Image --append 'debug' --dtb rk3399-sapphire.dtb <..snip..> get_cells_size: #address-cells:2 #size-cells:2 cells_size_fitted: 0-0 cells_size_fitted: 0-0 setup_2nd_dtb: no kaslr-seed found Reported-by: Vicente Bergas <vicencb@gmail.com> Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-01-09kexec/kexec-arm64.c: Add error handling check against return value of 'set_bootargs()'Bhupesh Sharma1-0/+5
This patch adds missing error handling check against the return value of 'set_bootargs()' in 'kexec-arm64.c'. Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-01-09kexec/dt-ops.c: Fix check against 'fdt_add_subnode' return valueBhupesh Sharma1-1/+1
Vicenç reported (via [1]) that currently executing kexec with '--dtb' option and passing a .dtb which doesn't have a '/chosen' node leads to the following error: # kexec -d --dtb dtb_without_chosen_node.dtb --append 'cmdline' --load Image dtb_set_property: fdt_add_subnode failed: <valid offset/length> kexec: Set device tree bootargs failed. This happens because currently we check the return value of 'fdt_add_subnode()' function call in 'dt-ops.c' incorrectly: result = fdt_add_subnode(new_dtb, nodeoffset, node); if (result) { dbgprintf("%s: fdt_add_subnode failed: %s\n", _func__, fdt_strerror(result)); goto on_error; } As we can see in 'fdt_rw.c', a positive return value from 'fdt_add_subnode()' function doesn't indicate an error. We can see that the Linux kernel (see 'drivers/firmware/efi/libstub/fdt.c' for example) also checks the 'fdt_add_subnode()' function against negative return values for errors. See an example below from 'update_fdt()' function in 'drivers/firmware/efi/libstub/fdt.c': node = fdt_add_subnode(fdt, 0, "chosen"); if (node < 0) { status = node; <..snip..> goto fdt_set_fail; } This patch fixes the same in 'kexec-tools'. [1]. http://lists.infradead.org/pipermail/kexec/2018-October/021746.html Cc: Simon Horman <horms@verge.net.au> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Reported-by: Vicente Bergas <vicencb@gmail.com> Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-10-29arm64: If 'getrandom' syscall fails, don't error out - just warn and proceed.Bhupesh Sharma1-3/+14
For calculating the random 'kaslr-seed' value to be passed to the secondary kernel (kexec or kdump), we invoke the 'getrandom' syscall inside 'setup_2nd_dtb()' function. Normally on most arm64 systems this syscall doesn't fail when the initrd scriptware (which arms kdump service) invokes the same. However, recently I noticed that on the 'hp-moonshot' arm64 boards, we have an issue with the newer kernels which causes the same to fail. As a result, the kdump service fails and we are not able to use the kdump infrastructure just after boot. As expected, once the random pool is sufficiently populated and we launch the kdump service arming scripts again (manually), then the kdump service is properly enabled. Let's handle this by not erroring out if the 'getrandom' syscall fails. Rather, let's warn the user and proceed further by setting the 'kaslr-seed' value as 0 for the secondary kernel, which implies that it boots in a 'nokaslr' mode. Tested on my 'hp-moonshot' and 'qualcomm-amberwing' arm64 boards. Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-10-29x86: fix BAD_FREE in get_efi_runtime_map()Pingfan Liu1-2/+2
If the err_out label is reached, address of a stack variable is passed to free(). Fix it. Signed-off-by: Pingfan Liu <piliu@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-10-02kdump: fix an error that can not parse the e820 reserved regionLianbo Jiang1-0/+2
When kexec-tools loads the kernel and initramfs for kdump, it reads /proc/iomem and recreates the e820 ranges for the kdump kernel. But it fails to parse the e820 reserved region, because memcmp() is case sensitive when comparing the string. In fact, the entry may be spelled "Reserved" or "reserved" in /proc/iomem, so both cases have to be handled. Signed-off-by: Lianbo Jiang <lijiang@redhat.com> Reviewed-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
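One way to accept both spellings is a case-insensitive comparison, sketched below. This swaps in POSIX strcasecmp() for illustration and is not necessarily how the patch itself handles it; the function name is hypothetical, and the input is assumed to be the region name from a /proc/iomem line with the trailing newline already removed.

```c
#include <strings.h>

/* Return non-zero when a /proc/iomem region name denotes a reserved
 * region, regardless of how the kernel capitalised it. */
static int is_reserved_region(const char *name)
{
	return strcasecmp(name, "reserved") == 0;
}
```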
2018-08-24kexec: fix for "Unhandled rela relocation: R_X86_64_PLT32" errorChris Clayton1-0/+1
In response to a change in binutils, commit b21ebf2fb4c (x86: Treat R_X86_64_PLT32 as R_X86_64_PC32) was applied to the linux kernel during the 4.16 development cycle and has since been backported to earlier stable kernel series. The change results in the failure message in $SUBJECT when rebooting via kexec. Fix this by replicating the change in kexec. Signed-off-by: Chris Clayton <chris2553@googlemail.com> Acked-by: Baoquan He <bhe@redhat.com> Tested-by: Bhupesh Sharma <bhsharma@redhat.com> Acked-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-06-29arm64: error out if kernel command line is too longMunehisa Kamata1-1/+8
Currently, in arm64, kexec silently truncates kernel command line longer than COMMAND_LINE_SIZE - 1. Error out in that case as some other architectures already do that. The error message is copied from x86_64. Suggested-by: Tom Kirchner <tjk@amazon.com> Signed-off-by: Munehisa Kamata <kamatam@amazon.com> Signed-off-by: Simon Horman <horms@verge.net.au>
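A minimal sketch of such a check, assuming a 2048-byte limit. The macro name, function name and error text are hypothetical and do not reproduce the message used in kexec-tools.

```c
#include <stdio.h>
#include <string.h>

/* Assumed limit for illustration; one byte is kept for the NUL. */
#define SKETCH_COMMAND_LINE_SIZE 2048

/* Return 0 when the command line fits, -1 (with a message) otherwise,
 * instead of silently truncating it. */
static int check_command_line(const char *cmdline)
{
	if (strlen(cmdline) > SKETCH_COMMAND_LINE_SIZE - 1) {
		fprintf(stderr, "kernel command line is too long\n");
		return -1;
	}
	return 0;
}
```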
2018-06-29arm64: increase command line size to 2048Munehisa Kamata1-1/+1
Otherwise, we can hit the current 512 chars limit before hitting the Linux kernel's one, which allows 2048 chars on arm64. Signed-off-by: Munehisa Kamata <kamatam@amazon.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-06-19arm64: Add support to supply 'kaslr-seed' to secondary kernelBhupesh Sharma1-38/+100
This patch adds the support to supply 'kaslr-seed' to secondary kernel, when we do a 'kexec warm reboot to another kernel' (although the behaviour remains the same for the 'kdump' case as well) on arm64 platforms using the 'kexec_load' invocation method. Lets consider the case where the primary kernel working on the arm64 platform supports kaslr (i.e 'CONFIG_RANDOMIZE_BASE' was set to y and we have a compliant EFI firmware which supports EFI_RNG_PROTOCOL and hence can pass a non-zero (valid) seed to the primary kernel). Now the primary kernel reads the 'kaslr-seed' and wipes it to 0 and uses the seed value to randomize for e.g. the module base address offset. In the case of 'kexec_load' (or even kdump for brevity), we rely on the user-space kexec-tools to pass an appropriate dtb to the secondary kernel and since 'kaslr-seed' is wiped to 0 by the primary kernel, the secondary will essentially work with *nokaslr* as 'kaslr-seed' is set to 0 when it is passed to the secondary kernel. This can be true even in case the secondary kernel had 'CONFIG_RANDOMIZE_BASE' and 'CONFIG_RANDOMIZE_MODULE_REGION_FULL' set to y. This patch addresses this issue by first checking if the device tree provided by the firmware to the kernel supports the 'kaslr-seed' property and verifies that it is really wiped to 0. If this condition is met, it fixes up the 'kaslr-seed' property by using the getrandom() syscall to get a suitable random number. I verified this patch on my Qualcomm arm64 board and here are some test results: 1. Ensure that the primary kernel is boot'ed with 'kaslr-seed' dts property and it is really wiped to 0: [root@qualcomm-amberwing]# dtc -I dtb -O dts /sys/firmware/fdt | grep -A 10 -i chosen chosen { kaslr-seed = <0x0 0x0>; ... } 2. Now issue 'kexec_load' to load the secondary kernel (let's assume that we are using the same kernel as the secondary kernel): # kexec -l /boot/vmlinuz-`uname -r` --initrd=/boot/initramfs-`uname -r`.img --reuse-cmdline -d 3. 
Issue 'kexec -e' to warm boot to the secondary: # kexec -e 4. Now after the secondary boots, confirm that the load address of the modules is randomized in every successive boot: [root@qualcomm-amberwing]# cat /proc/modules sunrpc 524288 1 - Live 0xffff0307db190000 vfat 262144 1 - Live 0xffff0307db110000 fat 262144 1 vfat, Live 0xffff0307db090000 crc32_ce 262144 0 - Live 0xffff0307d8c70000 ... Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-05-22kexec/s390: Add support for kexec_file_loadPhilipp Rudo2-0/+49
Since kernel 4.17-rc2 s390 supports the kexec_file_load system call. Add the new system call to kexec-tools and provide the -s (--kexec-file-syscall) option for s390 to support this new feature. Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-04-19kexec: Add --no-checks optionGeoff Levand3-3/+25
Add a new option --no-checks to kexec that allows for a fast reboot by avoiding the purgatory integrity checks. This option is intended for use by kexec based bootloaders that load a new image and then immediately transfer control to it. Signed-off-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-04-19kexec-elf-rel-ppc64: Fix cast from pointer warningGeoff Levand1-1/+1
Fixes warnings like these when building kexec for powerpc (32 bit): kexec-elf-rel-ppc64.c: warning: cast from pointer to integer of different size Signed-off-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-04-19crashdump-ppc64: Fix integer truncation warningGeoff Levand1-1/+1
Fixes warnings like these when building kexec for powerpc (32 bit): crashdump-ppc64.h: warning: large integer implicitly truncated to unsigned type Signed-off-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-04-19kexec: Fix printf warningGeoff Levand1-1/+2
Fixes warnings like these when building kexec for powerpc (32 bit): kexec.c: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 2 has type ‘uint64_t’ Signed-off-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Simon Horman <horms@verge.net.au>
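A common portable way to avoid this class of warning is the PRIu64 macro from <inttypes.h>, sketched below. This illustrates the general technique and is not necessarily the change made in the patch; the helper name is hypothetical.

```c
#include <inttypes.h>
#include <stdio.h>

/* Format a uint64_t correctly on both 32-bit and 64-bit targets:
 * PRIu64 expands to the right conversion specifier for the platform. */
static int format_u64(char *buf, size_t len, uint64_t value)
{
	return snprintf(buf, len, "%" PRIu64, value);
}
```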
2018-04-19Merge branch 'master' of git://git.armlinux.org.uk/~rmk/kexec-toolsSimon Horman1-32/+91
2018-04-10Fix a segmentation fault when trying to run "kexec -p"Petr Tesarik1-0/+1
Do not fall through to "--mem-min" when "-p" option is parsed. The break statement was apparently removed by mistake... Fixes: cb434cbe6f40 ("kexec: Do not special-case the -s option") Signed-off-by: Petr Tesarik <ptesarik@suse.com> Tested-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-04-10arm64: fix an issue with kaslr-enabled vmlinuxAKASHI Takahiro1-1/+5
Normally vmlinux for arm64 is of ET_EXEC type, while if built with CONFIG_RANDOMIZE_BASE (that is KASLR), it will be of ET_DYN type. Meanwhile, the physical address field of segments in vmlinux actually has the same value as the virtual address field. Accordingly, in this case, it makes no sense to check the validity of segments against physical memory ranges and, if necessary, relocate them in elf_exec_load() on arm64. This patch skips the check unconditionally on arm64. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-03-30kexec: Document -s, -c and -a options in the man pageMichal Suchanek1-0/+20
Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-03-30kexec: Add option to fall back to KEXEC_LOAD when KEXEC_FILE_LOAD is not supportedMichal Suchanek2-6/+63
Not all architectures implement KEXEC_FILE_LOAD. However, on some architectures KEXEC_FILE_LOAD is required when secure boot is enabled in locked-down mode. Previously users had to select the KEXEC_FILE_LOAD syscall with undocumented -s option. However, if they did pass the option kexec would fail on architectures that do not support it. So add an -a option that tries KEXEC_FILE_LOAD and when it is not supported tries KEXEC_LOAD. Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-03-30kexec: Add option to revert -sMichal Suchanek2-1/+8
The undocumented -s option selects KEXEC_FILE_LOAD syscall but there is no option to select KEXEC_LOAD syscall. Add it so -s can be reverted. Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-03-30kexec: Do not special-case the -s optionMichal Suchanek1-21/+4
It is parsed separately to save a few CPU cycles when setting up other options but it just complicates the code. So fold it back and set up all flags for both KEXEC_LOAD and KEXEC_FILE_LOAD Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-03-30kexec: Fix option checks to take KEXEC_FILE_LOAD into accountMichal Suchanek1-1/+9
When kexec_file_load support was added some sanity checks were not updated. Some options are set only in the kexec_load flags so cannot be supported with kexec_file_load. On the other hand, reserved memory is needed for kdump with both kexec_load and kexec_file_load. Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-03-30kexec: Return -ENOSYS when kexec does not know how to call KEXEC_FILE_LOADMichal Suchanek1-1/+1
When the kernel does not know a syscall number it returns -ENOSYS but when kexec does not know a syscall number it returns -1. Return -ENOSYS from kexec as well. Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-03-30kexec/ppc64: leverage kexec_file_load supportHari Bathini2-0/+87
PPC64 kernel now supports kexec_file_load system call. Leverage it by enabling that support here. Note that loading crash kernel with this system call is not yet supported in the kernel and trying to load one will fail with '-ENOTSUPP' error. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-03-20ARM: Include stack and malloc space in zImage sizeRussell King1-0/+6
Include the stack and malloc space in our calculation of the zImage size, both of which must be avoided when locating the dtb. Signed-off-by: Russell King <rmk@armlinux.org.uk>
2018-03-20ARM: add further debugRussell King1-0/+11
Add further debugging of the kernel size Signed-off-by: Russell King <rmk@armlinux.org.uk>
2018-03-20ARM: read kernel size from zImageRussell King1-32/+74
Signed-off-by: Russell King <rmk@armlinux.org.uk>
2018-02-22kexec/ppc64: add support to parse ibm, dynamic-memory-v2 propertyHari Bathini4-54/+112
Add support to parse the new 'ibm,dynamic-memory-v2' property in the 'ibm,dynamic-reconfiguration-memory' node. This replaces the old 'ibm,dynamic-memory' property and is enabled in the kernel with a patch series that starts with commit 0c38ed6f6f0b ("powerpc/pseries: Enable support of ibm,dynamic-memory-v2"). All LMBs that share the same flags and are adjacent are grouped together in the newer version of the property making it compact to represent larger memory configurations. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-02-22kexec: add a helper function to add rangesHari Bathini1-62/+53
Add a helper function for adding ranges to avoid duplicating code. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-01-29x86: use old screen_info if neededDave Young1-4/+27
With modern drm/kms graphics drivers, kexec-tools does not set up screen_info correctly, so screen output only appears once those drm drivers have reinitialized after the reboot. Copying the old screen info from the original boot_params helped in my tests; it may not work in some cases, but it is no worse than before. The same approach is already used by the kernel's kexec_file_load. Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-01-25kexec-tools: Make xc_dlhandle staticEric DeVolder2-11/+18
This patch is a follow-on to commit 894bea93 "kexec-tools: Perform run-time linking of libxenctrl.so". This patch addresses feedback from Daniel Kiper. This patch implements Daniel's suggestion to make the xc_dlhandle variable static, insert missing 'extern' qualifier for the new __xc() wrappers, and correct some style issues. Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-01-24kexec-tools: Call dlclose() from within __xc_interface_close()Eric DeVolder1-0/+1
This patch is a follow-on to commit 43d3932e "kexec-tools: Perform run-time linking of libxenctrl.so". This patch addresses feedback from Daniel Kiper. In the original patch, the call to dlclose() was omitted, in contrast to the description in the commit message. This patch inserts the call. Note that this dynamic linking feature is dependent upon the proper operation of the RTLD_NODELETE flag to dlopen(), which does work as advertised on Linux (otherwise the call to dlclose() should be omitted). Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-01-24kexec-tools: Perform run-time linking of libxenctrl.soEric DeVolder6-16/+115
When kexec is utilized in a Xen environment, it has an explicit run-time dependency on libxenctrl.so. This dependency occurs during the configure stage and when building kexec-tools. When kexec is utilized in a non-Xen environment (either bare metal or KVM), the configure and build of kexec-tools omits any reference to libxenctrl.so. Thus today it is not currently possible to configure and build a *single* kexec that will work in *both* Xen and non-Xen environments, unless the libxenctrl.so is *always* present. For example, a kexec configured for Xen in a Xen environment: # ldd build/sbin/kexec linux-vdso.so.1 => (0x00007ffdeba5c000) libxenctrl.so.4.4 => /usr/lib64/libxenctrl.so.4.4 (0x00000038d8000000) libz.so.1 => /lib64/libz.so.1 (0x00000038d6c00000) libc.so.6 => /lib64/libc.so.6 (0x00000038d6000000) libdl.so.2 => /lib64/libdl.so.2 (0x00000038d6400000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00000038d6800000) /lib64/ld-linux-x86-64.so.2 (0x000055e9f8c6c000) # build/sbin/kexec -v kexec-tools 2.0.16 However, the *same* kexec executable fails in a non-Xen environment: # copy xen kexec to . # ldd ./kexec linux-vdso.so.1 => (0x00007fffa9da7000) libxenctrl.so.4.4 => not found liblzma.so.0 => /usr/lib64/liblzma.so.0 (0x0000003014e00000) libz.so.1 => /lib64/libz.so.1 (0x000000300ea00000) libc.so.6 => /lib64/libc.so.6 (0x000000300de00000) libpthread.so.0 => /lib64/libpthread.so.0 (0x000000300e200000) /lib64/ld-linux-x86-64.so.2 (0x0000558cc786c000) # ./kexec -v ./kexec: error while loading shared libraries: libxenctrl.so.4.4: cannot open shared object file: No such file or directory At Oracle we "workaround" this by having two kexec-tools packages, one for Xen and another for non-Xen environments. At Oracle, the desire is to offer a single kexec-tools package that works in either environment. 
To achieve this, kexec-tools would either have to ship with libxenctrl.so (which we have deemed as unacceptable), or we can make kexec perform run-time linking against libxenctrl.so. This patch is one possible way to alleviate the explicit run-time dependency on libxenctrl.so. This implementation utilizes a set of macros to wrap calls into libxenctrl.so so that the library can instead be dlopen() and obtain the function via dlsym() and then make the call. The advantage of this implementation is that it requires few changes to the existing kexec-tools code. The dis- advantage is that it uses macros to remap libxenctrl functions and do work under the hood. Another possible implementation worth considering is the approach taken by libvmi. Reference the following file: https://github.com/libvmi/libvmi/blob/master/libvmi/driver/xen/libxc_wrapper.h The libxc_wrapper_t structure definition that starts at line ~33 has members that are function pointers into libxenctrl.so. This structure is populated once and then later referenced/dereferenced by the callers of libxenctrl.so members. The advantage of this implementation is it is more explicit in managing the use of libxenctrl.so and its versions, but the disadvantage is it would require touching more of the kexec-tools code. The following is a list libxenctrl members utilized by kexec: Functions: xc_interface_open xc_kexec_get_range xc_interface_close xc_kexec_get_range xc_interface_open xc_get_max_cpus xc_kexec_get_range xc_version xc_kexec_exec xc_kexec_status xc_kexec_unload xc_hypercall_buffer_array_create xc__hypercall_buffer_array_alloc xc_hypercall_buffer_array_destroy xc_kexec_load xc_get_machine_memory_map Data: xc__hypercall_buffer_HYPERCALL_BUFFER_NULL These were identified by configuring and building kexec-tools with Xen support, but omitting the -lxenctrl from the LDFLAGS in the Makefile for an x86_64 build. The above libxenctrl members were referenced via these source files. 
kexec/crashdump-xen.c kexec/kexec-xen.c kexec/arch/i386/kexec-x86-common.c kexec/arch/i386/crashdump-x86.c This patch provides a wrapper around the calls to the above functions in libxenctrl.so. Every libxenctrl call must pass a xc_interface which it obtains from xc_interface_open(). So the existing code is already structured in a manner that facilitates graceful dlopen()'ing of the libxenctrl.so and the subsequent dlsym() of the required member. The patch creates a wrapper function around xc_interface_open() and xc_interface_close() to perform the dlopen() and dlclose(). For the remaining xc_ functions, this patch defines a macro of the same name which performs the dlsym() and then invokes the function. See the __xc_call() macro for details. There was one data item in libxenctrl.so that presented a unique problem, HYPERCALL_BUFFER_NULL. It was only utilized once, as set_xen_guest_handle(xen_segs[s].buf.h, HYPERCALL_BUFFER_NULL); I tried a variety of techniques but could not find a general macro-type solution without modifying xenctrl.h. So the solution was to declare a local HYPERCALL_BUFFER_NULL, and this appears to work. I admit I am not familiar with libxenctrl to state if this is a satisfactory workaround, so feedback here welcome. I can state that this allows kexec to load/unload/kexec on Xen and non-Xen environments that I've tested without issue. With this patch applied, kexec-tools can be built with Xen support and yet there is no explicit run-time dependency on libxenctrl.so. Thus it can also be deployed in non-Xen environments where libxenctrl.so is not installed. 
# ldd build/sbin/kexec linux-vdso.so.1 => (0x00007fff7dbcd000) liblzma.so.0 => /usr/lib64/liblzma.so.0 (0x00000038d9000000) libz.so.1 => /lib64/libz.so.1 (0x00000038d6c00000) libdl.so.2 => /lib64/libdl.so.2 (0x00000038d6400000) libc.so.6 => /lib64/libc.so.6 (0x00000038d6000000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00000038d6800000) /lib64/ld-linux-x86-64.so.2 (0x0000562dc0c14000) # build/sbin/kexec -v kexec-tools 2.0.16 This feature/ability is enabled with the following: ./configure --with-xen=dl The previous --with-xen=no and --with-xen=yes still work as before. Not specifying a --with-xen still defaults to --with-xen=yes. As I've introduced a new build and run-time mode, I've done an extensive matrix of both build-time and run-time checks of kexec with this patch applied. The set of build-time scenarios are: 1: configure --with-xen=no and Xen support NOT present 2: configure --with-xen=no and Xen support IS present 3: configure --with-xen=yes and Xen support NOT present 4: configure --with-xen=yes and Xen support IS present 5: configure --with-xen=dl and Xen support NOT present 6: configure --with-xen=dl and Xen support IS present Xen support present requires that configure can find both xenctrl.h and libxenctrl.so. Then for each of the six scenarios above, the corresponding kexec binary was tested on a Xen system (Oracle's OVS dom0) and a non-Xen system (Oracle Linux). There are two build-time checks: did kexec build, and did it contain libxenctrl.so? The presence of libxenctrl.so in kexec was checked via ldd. The results were: Scenario | Build | libxenctrl.so | Result 1 | pass | no | pass - see Note 1 2 | pass | no | pass - see Note 1 3 | pass | no | pass - see Note 2 4 | pass | yes | pass - see Note 3 5 | pass | no | pass - see Note 2 6 | pass | no | pass - see Note 4 Note 1: This passes since due to --with-xen=no, there will be no Xen support in kexec and therefore no libxenctrl.so a in the kexec. 
Note 2: This passes since while --with-xen=yes, the configure displays a message indicating that Xen support is disabled, and allows kexec to build (this is the same behavior as prior to this patch). And since Xen support is disabled, there is no libxenctrl.so in the kexec. Note 3: This passes since with --with-xen=yes and configure locating the xenctrl.h and libxenctrl.so, support for Xen was built into kexec. Ldd shows an explicit dependency on the library. Note 4: This passes since with --with-xen=dl and configure locating the xenctrl.h and libxencrl.so, support for Xen was built into kexec. However, this uses the new technique introduced by this patch and, as a result, ldd shows that the libxenctrl.so is not a explicit run-time dependency for kexec (rather libdl.so is now an explicit dependency). This is precisely the goal of this patch! The net effect is that there are now three "flavors" of a kexec binary (prior to this patch there were two): a) kexec with no support for Xen [scenarios 1, 2, 3, 5], b) kexec with support for Xen and libxenctrl.so as an explicit dependency [scenario 4], and c) kexec with support for Xen and libxenctrl.so is NOT an explicit dependency [scenario 6]. The run-time checks are to take each of the six scenarios above and run the corresponding kexec binary on both a Xen system and a non-Xen system. 
The test for each kexec scenario was: % service kdump stop % vi /etc/init.d/kdump change KEXEC= to /sbin/kexec-[123456] % service kdump start # If not FAILED, then below % service kdump status Kdump is operational % rm -fr /var/crash/* % echo c > /proc/sysrq-trigger # after reboot verify vmcore generated % ls -al /var/crash/<tab> The results were: Scenario | Xen environment | non-Xen environment 1 | fail - see Note 5 | pass 2 | fail - see Note 5 | pass 3 | fail - see Note 6 | pass 4 | pass | fail - see Note 7 5 | fail - see Note 6 | pass 6 | pass | pass Note 5: Due to --with-xen=no, kexec lacks support for Xen and will fail in the Xen environment. This behavior is the same as prior to this patch. Note 6: Due to the missing xenctrl.h and libxenctrl.so, kexec was built without support for Xen, and thus will fail in the Xen environment. This behavior is the same as prior to this patch. Note 7: This kexec has the explicit dependency on libxenctrl.so which prevents it from running in a non-Xen environment. This is expected as this is the original issue for which this patch is intended to address. Note that for scenarios 1, 2, 3 and 5 kexec lacks support for Xen, thus these versions are expected to "fail" in a Xen environment. On the flip side, since a non-Xen environment does not need libxenctrl.so, all but scenario 4 are expected to "pass" in a non-Xen environment. The results match these expectations! And, of course, importantly with this patch applied, it did not have an adverse impact on kexec build or run-time. Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2017-11-01 | ARM: read kernel size from zImage | Russell King | 1 | -1/+50
Read the new extension data which tells the boot agent about the requirements for booting the kernel image, such as how much RAM will be consumed by the kernel through decompression and booting. This is necessary to control the placement of the DTB and compressed RAM disk to avoid these objects being corrupted. Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Russell King <rmk@armlinux.org.uk> Signed-off-by: Simon Horman <horms@verge.net.au>
2017-11-01 | ARM: cleanup initrd and dtb handing | Russell King | 1 | -59/+57
There is no difference in the way the initrd is handled between an ATAG-based kernel and a DTB-based kernel. Therefore, this should be handled identically in both cases. Rearrange the code to achieve this. Signed-off-by: Russell King <rmk@armlinux.org.uk> Signed-off-by: Simon Horman <horms@verge.net.au>
2017-10-18 | kexec-tools: mips: Use proper page_offset for OCTEON CPUs. | David Daney | 1 | -0/+27
The OCTEON family of MIPS64 CPUs uses a PAGE_OFFSET of 0x8000000000000000ULL, which differs from other CPUs. Scan /proc/cpuinfo to see if the current system is "Octeon". If so, patch the page_offset so that usable kdump core files are produced. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Simon Horman <horms@verge.net.au>
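The detection step can be sketched as follows. This is only an illustration, not the code from the patch: the helper names `stream_mentions_octeon()` and `pick_page_offset()` are hypothetical, the default offset is left to the caller because the commit message does not state it, and lines are assumed to fit in the buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* PAGE_OFFSET value quoted in the commit message for OCTEON CPUs. */
#define OCTEON_PAGE_OFFSET 0x8000000000000000ULL

/* Hypothetical helper: returns 1 if any line of the stream contains
 * the string "Octeon", 0 otherwise. */
static int stream_mentions_octeon(FILE *fp)
{
	char line[512];

	assert(fp != NULL);
	while (fgets(line, sizeof(line), fp)) {
		if (strstr(line, "Octeon"))
			return 1;
	}
	return 0;
}

/* Hypothetical helper: choose the page offset for the running system.
 * If the cpuinfo file cannot be opened, the caller's default is kept. */
static unsigned long long pick_page_offset(const char *cpuinfo_path,
					   unsigned long long default_offset)
{
	FILE *fp = fopen(cpuinfo_path, "r");
	int octeon;

	if (!fp)
		return default_offset;
	octeon = stream_mentions_octeon(fp);
	fclose(fp);
	return octeon ? OCTEON_PAGE_OFFSET : default_offset;
}
```

A caller would pass "/proc/cpuinfo" as the path together with whatever offset it uses for non-OCTEON systems.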
2017-10-18 | kexec-tools: mips: Merge adjacent memory ranges. | David Daney | 1 | -4/+10
Some kernel versions running on MIPS split the System RAM memory regions reported in /proc/iomem. This may cause loading of the kexec kernel to fail if it crosses one of the splits. Fix by merging adjacent memory ranges that have the same type. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Simon Horman <horms@verge.net.au>
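Merging adjacent ranges of the same type can be sketched as below. This is a minimal illustration under stated assumptions rather than the patch itself: `struct mem_range` is a simplified, hypothetical stand-in for the tool's range record, the input is assumed to be sorted by start address, and `end` is assumed to be inclusive, as in /proc/iomem.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified memory range record. */
struct mem_range {
	unsigned long long start; /* first byte of the range */
	unsigned long long end;   /* last byte of the range (inclusive) */
	unsigned type;            /* e.g. RAM vs. reserved */
};

/*
 * Merge neighbouring entries in place. Two entries are merged when they
 * have the same type and the second starts exactly one byte after the
 * first ends. Returns the number of entries left in the array.
 */
static size_t merge_adjacent_ranges(struct mem_range *r, size_t n)
{
	size_t out = 0;
	size_t i;

	assert(r != NULL || n == 0);
	if (n == 0)
		return 0;

	for (i = 1; i < n; i++) {
		if (r[i].type == r[out].type &&
		    r[i].start == r[out].end + 1) {
			r[out].end = r[i].end;  /* extend the current entry */
		} else {
			r[++out] = r[i];        /* begin a new entry */
		}
	}
	return out + 1;
}
```

After merging, a kernel image that would have straddled a split in System RAM fits inside a single entry.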
2017-10-16 | kexec-tools: mips: Try to include bss in kernel vmcore file. | David Daney | 1 | -1/+4
The kernel message buffers, as well as a lot of other useful data, reside in the bss section. Without this, vmcore-dmesg cannot work, and debugging with a core dump is much more difficult. Try to add the /proc/iomem "Kernel bss" section to vmcore. If it is not found, just do what we used to do and use "Kernel data" instead. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Simon Horman <horms@verge.net.au>
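The "try bss, fall back to data" lookup can be sketched like this. It is an assumption-laden illustration, not the patch: `find_iomem_entry()` and `find_bss_or_data()` are hypothetical names, and the stream is assumed to hold /proc/iomem-style lines of the form "start-end : name".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: search an iomem-style stream for an entry with the
 * given name. Returns 0 and fills start/end on a match, -1 otherwise.
 */
static int find_iomem_entry(FILE *fp, const char *name,
			    unsigned long long *start,
			    unsigned long long *end)
{
	char line[256];

	assert(fp && name && start && end);
	rewind(fp);
	while (fgets(line, sizeof(line), fp)) {
		unsigned long long s, e;
		int consumed = 0;

		line[strcspn(line, "\n")] = '\0';
		/* %n records where the entry name begins. */
		if (sscanf(line, " %llx-%llx : %n", &s, &e, &consumed) < 2 ||
		    consumed == 0)
			continue;
		if (strcmp(line + consumed, name) == 0) {
			*start = s;
			*end = e;
			return 0;
		}
	}
	return -1;
}

/* Prefer "Kernel bss"; if it is absent, use "Kernel data" as before. */
static int find_bss_or_data(FILE *iomem, unsigned long long *start,
			    unsigned long long *end)
{
	if (find_iomem_entry(iomem, "Kernel bss", start, end) == 0)
		return 0;
	return find_iomem_entry(iomem, "Kernel data", start, end);
}
```

Keeping the old "Kernel data" lookup as the fallback means kernels that do not report a "Kernel bss" entry behave exactly as they did before.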
2017-10-16 | kexec-tools: mips: Don't set lowmem_limit to 2G for 64-bit systems. | David Daney | 2 | -3/+4
The 64-bit MIPS architecture doesn't have the same 2G limit the 32-bit version has. Set MAXMEM and lowmem_limit to 0 for 64-bit MIPS so that memory above 2G is usable in the kdump core files. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2017-08-30 | kexec-tools: ppc64: fix leak while checking for coherent device memory | Hari Bathini | 1 | -0/+1
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2017-08-28 | kexec-tools: ppc64: avoid adding coherent memory regions to crash memory ranges | Hari Bathini | 2 | -2/+63
Accelerator devices like GPU and FPGA cards contain onboard memory. This onboard memory is represented as a memory-only NUMA node, integrating it with the core memory subsystem. Since the link through which these devices are integrated to core memory goes down after a system crash, and they are meant for user workloads, avoid adding coherent device memory regions to crash memory ranges. Without this change, the makedumpfile tool tries to save inaccessible coherent device memory regions, crashing the system. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Tested-by: Pingfan Liu <piliu@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2017-08-10 | kexec-tools: powerpc: fix command line overflow error | Hari Bathini | 7 | -9/+12
Kernel commit a5980d064fe2 ("powerpc: Bump COMMAND_LINE_SIZE to 2048") bumped the powerpc command line size to 2048, but the size used here is still the default value of 512. Bump it to 2048 to fix command line overflow errors observed when the command line length is above 512 bytes. Also, get rid of the multiple definitions of the COMMAND_LINE_SIZE macro in the ppc architecture. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2017-08-04 | kexec-tools: ppc64: fix how RMA top is deduced | Hari Bathini | 1 | -16/+19
A hang was observed, in purgatory, on a machine configured with a single LPAR. This was because one of the segments was loaded outside the actual Real Memory Area (RMA) due to a wrongly deduced RMA top value. Currently, the top of the real memory area, which is crucial for loading the kexec/kdump kernel, is obtained by iterating through mem nodes and setting its value based on the base and size values of the last mem node in the iteration. That can't always be correct, as the order of iteration may not be the same and RMA base & size are always based on the first memory property. Fix this by setting the RMA top value based on the base and size values of the memory node that has the smallest base value (first memory property) among all the memory nodes. Also, correct the misnomers rmo_base and rmo_top to rma_base and rma_top respectively. While how RMA top is deduced was broken for some time, the issue may not have been seen so far, for a couple of possible reasons: 1. Only one mem node was available. 2. The first memory property has been the last node in iteration when multiple mem nodes were present. Fixes: 02f4088ffded ("kexec fix ppc64 device-tree mem node") Reported-by: Ankit Kumar <ankit@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Geoff Levand <geoff@infradead.org> Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
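The selection rule described in this message (use the node with the smallest base, regardless of iteration order) can be sketched as follows. This is an illustration only: `struct mem_node` and `find_rma()` are hypothetical, and each node is assumed to be summarised by the base and size of its first memory property.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of one device-tree memory node. */
struct mem_node {
	uint64_t base; /* base address of the node's first memory property */
	uint64_t size; /* size of that property */
};

/*
 * Derive rma_base and rma_top from the node with the smallest base,
 * independent of the order in which the nodes were visited.
 * Returns 0 on success, -1 if there are no nodes.
 */
static int find_rma(const struct mem_node *nodes, size_t n,
		    uint64_t *rma_base, uint64_t *rma_top)
{
	size_t i, lowest = 0;

	assert(rma_base != NULL && rma_top != NULL);
	if (n == 0)
		return -1;

	for (i = 1; i < n; i++) {
		if (nodes[i].base < nodes[lowest].base)
			lowest = i;
	}

	*rma_base = nodes[lowest].base;
	*rma_top = nodes[lowest].base + nodes[lowest].size;
	return 0;
}
```

With the old "last node wins" behaviour, an iteration order that visits the lowest node first would have produced a top value from a different node, which matches the failure described above.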
2017-05-22 | Handle additional e820 memmap type strings | Eric DeVolder | 1 | -0/+4
Keep pace with changes to the function e820_type_to_string() in Linux arch/x86/kernel/e820.c. With this change, the following messages from kexec are eliminated (and kexec is allowed to load):

  Unknown type (Reserved) while parsing /sys/firmware/memmap/8/type. Please report this as bug. Using RANGE_RESERVED now.
  Unknown type (Unknown E820 type) while parsing /sys/firmware/memmap/4/type. Please report this as bug. Using RANGE_RESERVED now.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
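A table-driven version of this kind of type-string handling might look like the sketch below. It is not the patch: the enum and `parse_memmap_type()` are hypothetical, only "Reserved" and "Unknown E820 type" are taken from the messages quoted above, and mapping them to a reserved range type is an assumption based on the fallback those messages mention.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical local range types; not copied from the kexec-tools headers. */
enum range_type { RANGE_RAM, RANGE_RESERVED, RANGE_ACPI };

struct type_name {
	const char *name;
	enum range_type type;
};

/* The last two strings come from the quoted messages; the others are
 * assumed examples of previously handled strings. */
static const struct type_name known_types[] = {
	{ "System RAM",        RANGE_RAM },
	{ "ACPI Tables",       RANGE_ACPI },
	{ "reserved",          RANGE_RESERVED },
	{ "Reserved",          RANGE_RESERVED },
	{ "Unknown E820 type", RANGE_RESERVED },
};

/*
 * Map a memmap type string to a range type. Returns 1 if the string is
 * recognised and 0 if not; unrecognised strings fall back to
 * RANGE_RESERVED so the caller can warn and continue.
 */
static int parse_memmap_type(const char *s, enum range_type *out)
{
	size_t i;

	assert(s != NULL && out != NULL);
	for (i = 0; i < sizeof(known_types) / sizeof(known_types[0]); i++) {
		if (strcmp(s, known_types[i].name) == 0) {
			*out = known_types[i].type;
			return 1;
		}
	}
	*out = RANGE_RESERVED;
	return 0;
}
```

The comparison is case-sensitive, which is why "reserved" and "Reserved" need separate entries.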
2017-05-22 | arm64: kdump: Add support for binary image files | Pratyush Anand | 1 | -0/+13
This patch adds support for using a binary image, i.e. arch/arm64/boot/Image, with kdump. Signed-off-by: Pratyush Anand <panand@redhat.com> [takahiro.akashi@linaro.org: a bit reworked] Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Tested-by: David Woodhouse <dwmw@amazon.co.uk> Tested-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>