24.1. AMD64 Specific Boot Options
There are many others (usually documented in driver documentation), but only the AMD64 specific ones are listed here.
24.1.1. Machine check
Please see Configurable sysfs parameters for the x86-64 machine check code for sysfs runtime tunables.
- mce=off: Disable machine check.
- mce=no_cmci: Disable CMCI (Corrected Machine Check Interrupt), which Intel processors support. Disabling it is usually not recommended, but it can be handy if your hardware is misbehaving. Note that you’ll get more problems without CMCI than with it because of the shared banks, i.e. you might get duplicated error logs.
- mce=dont_log_ce: Don’t log corrected errors. All events reported as corrected are silently cleared by the OS. Useful if you have no interest in corrected errors.
- mce=ignore_ce: Disable features for corrected errors, e.g. the polling timer and CMCI. Events reported as corrected are not cleared by the OS and remain in their error banks. Disabling these features is usually not recommended. However, if an agent that checks or clears corrected errors (e.g. the BIOS or a hardware monitoring application) conflicts with the OS’s error handling and you cannot deactivate the agent, this option will help.
- mce=no_lmce: Do not opt in to Local MCE delivery. Use the legacy method to broadcast MCEs.
- mce=bootlog: Enable logging of machine checks left over from booting. Disabled by default on AMD Fam10h and older because some BIOSes leave bogus ones. If your BIOS doesn’t do that, it is a good idea to enable this so that you log even machine check events that result in a reboot. On Intel systems it is enabled by default.
- mce=nobootlog: Disable boot machine check logging.
- mce=tolerancelevel[,monarchtimeout] (number,number)
- tolerance levels:
  - 0: always panic on uncorrected errors, log corrected errors
  - 1: panic or SIGBUS on uncorrected errors, log corrected errors
  - 2: SIGBUS or log uncorrected errors, log corrected errors
  - 3: never panic or SIGBUS, log all errors (for testing only)
- Default is 1. Can also be set using sysfs, which is preferable.
- monarchtimeout: Sets the time in us to wait for other CPUs on machine checks. 0 to disable.
- mce=bios_cmci_threshold: Don’t overwrite the BIOS-set CMCI threshold. This boot option prevents Linux from overwriting the CMCI threshold set by the BIOS. Without this option, Linux always sets the CMCI threshold to 1. Enabling this may make memory predictive failure analysis less effective if the BIOS sets thresholds for memory errors, since we will not see details for all errors.
- mce=recovery: Force-enable recoverable machine check code paths.
- nomce (for compatibility with i386): same as mce=off
Everything else is in sysfs now.
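The numeric form mce=tolerancelevel[,monarchtimeout] described above can be illustrated with a small parser. This is only a sketch for checking a command line by hand, not the kernel's own parsing code; the function name parse_mce_numeric and the TOLERANCE_LEVELS table are hypothetical helpers introduced for this example.

```python
# Meanings of the tolerance levels, copied from the list above.
TOLERANCE_LEVELS = {
    0: "always panic on uncorrected errors, log corrected errors",
    1: "panic or SIGBUS on uncorrected errors, log corrected errors",
    2: "SIGBUS or log uncorrected errors, log corrected errors",
    3: "never panic or SIGBUS, log all errors (for testing only)",
}


def parse_mce_numeric(arg):
    """Split 'mce=<tolerancelevel>[,<monarchtimeout>]' into two numbers.

    Returns (level, timeout_us); timeout_us is None when it is omitted.
    Hypothetical helper: keyword forms of mce= are rejected with ValueError.
    """
    key, sep, value = arg.partition("=")
    if key != "mce" or not sep:
        raise ValueError(f"not an mce= option: {arg!r}")
    parts = value.split(",")
    if len(parts) > 2:
        raise ValueError(f"too many fields in {arg!r}")
    level = int(parts[0])  # raises ValueError for non-numeric values
    if level not in TOLERANCE_LEVELS:
        raise ValueError(f"unknown tolerance level {level}")
    timeout_us = int(parts[1]) if len(parts) == 2 else None
    return level, timeout_us


if __name__ == "__main__":
    level, timeout_us = parse_mce_numeric("mce=2,500")
    print(level, TOLERANCE_LEVELS[level], timeout_us)
```

A monarchtimeout of 0 is passed through unchanged, since the text defines 0 as "disable" rather than as an error.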
24.1.2. APICs
- apic: Use IO-APIC. Default.
- noapic: Don’t use the IO-APIC.
- disableapic: Don’t use the local APIC.
- nolapic: Don’t use the local APIC (alias for i386 compatibility).
- pirq=...: See the IO-APIC documentation.
- noapictimer: Don’t set up the APIC timer.
- no_timer_check: Don’t check the IO-APIC timer. This can work around problems with incorrect timer initialization on some boards.
- apicpmtimer: Do APIC timer calibration using the pmtimer. Implies apicmaintimer. Useful when your PIT timer is totally broken.
24.1.3. Timing
- notsc: Deprecated, use tsc=unstable instead.
- nohpet: Don’t use the HPET timer.
24.1.4. Idle loop
- idle=poll: Don’t do power saving in the idle loop using HLT, but poll for a rescheduling event. This makes the CPUs use a lot more power, but may be useful to get slightly better performance in multiprocessor benchmarks. It also makes some profiling using performance counters more accurate. Please note that on systems with MONITOR/MWAIT support (like Intel EM64T CPUs) this option has no performance advantage over the normal idle loop. It may also interact badly with hyperthreading.
24.1.5. Rebooting
- reboot=b[ios] | t[riple] | k[bd] | a[cpi] | e[fi] [, [w]arm | [c]old]
- bios: Use the CPU reboot vector for warm reset.
- warm: Don’t set the cold reboot flag.
- cold: Set the cold reboot flag.
- triple: Force a triple fault (init).
- kbd: Use the keyboard controller. Cold reset (default).
- acpi: Use the ACPI RESET_REG in the FADT. If ACPI is not configured or the ACPI reset does not work, the reboot path attempts the reset using the keyboard controller.
- efi: Use the EFI reset_system runtime service. If EFI is not configured or the EFI reset does not work, the reboot path attempts the reset using the keyboard controller.
Using warm reset will be much faster, especially on big memory systems, because the BIOS will not go through the memory check. The disadvantage is that not all hardware will be completely reinitialized on reboot, so there may be boot problems on some systems.
- reboot=force: Don’t stop other CPUs on reboot. This can make reboot more reliable in some cases.
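The bracket notation in the reboot= syntax line suggests that each keyword can be abbreviated to its first letter. The sketch below interprets a value under that reading. It is an illustration only, not the kernel's parser: parse_reboot and the two lookup tables are hypothetical names, and only the letters that appear in the syntax line are handled.

```python
# First letter -> full keyword, taken from the syntax line
# reboot=b[ios] | t[riple] | k[bd] | a[cpi] | e[fi] [, [w]arm | [c]old]
REBOOT_TYPES = {"b": "bios", "t": "triple", "k": "kbd", "a": "acpi", "e": "efi"}
REBOOT_MODES = {"w": "warm", "c": "cold"}


def parse_reboot(value):
    """Interpret the part after 'reboot=' as a reset type and a warm/cold mode.

    Hypothetical helper: each comma-separated token is identified by its
    first letter; anything not in the syntax line raises ValueError.
    """
    result = {"type": None, "mode": None}
    for token in value.split(","):
        token = token.strip()
        if not token:
            continue
        first = token[0]
        if first in REBOOT_TYPES:
            result["type"] = REBOOT_TYPES[first]
        elif first in REBOOT_MODES:
            result["mode"] = REBOOT_MODES[first]
        else:
            raise ValueError(f"unrecognized reboot token: {token!r}")
    return result


if __name__ == "__main__":
    print(parse_reboot("a,w"))
    print(parse_reboot("kbd"))
```

Fields that are not given stay None, so a caller can distinguish "not specified" from an explicit choice.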
24.1.6. Non Executable Mappings
24.1.7. NUMA
- numa=off: Only set up a single NUMA node spanning all memory.
- numa=noacpi: Don’t parse the SRAT table for NUMA setup.
- numa=nohmat: Don’t parse the HMAT table for NUMA setup, or soft-reserved memory partitioning.
- numa=fake=<size>[MG]: If given as a memory unit, fills all system RAM with nodes of the given size, interleaved over physical nodes.
- numa=fake=<N>: If given as an integer, fills all system RAM with N fake nodes interleaved over physical nodes.
- numa=fake=<N>U: If given as an integer followed by ‘U’, divides each physical node into N emulated nodes.
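The three forms of the fake-node value described above (a memory unit, a plain integer, and an integer followed by ‘U’) can be told apart mechanically. The following is a minimal sketch of that classification, assuming that only the M and G suffixes denote memory units; the function name classify_fake_numa and the result labels are hypothetical and chosen for this example.

```python
import re

# Assumption: only M (MiB) and G (GiB) are treated as memory-unit suffixes.
UNITS = {"M": 1 << 20, "G": 1 << 30}


def classify_fake_numa(value):
    """Classify a fake-node value into one of the three forms in the list.

    Returns a (kind, number) tuple:
      ("node_size_bytes", n)          memory unit, e.g. "512M"
      ("total_nodes", n)              plain integer, e.g. "4"
      ("nodes_per_physical_node", n)  integer followed by 'U', e.g. "2U"
    """
    match = re.fullmatch(r"(\d+)([MGU]?)", value)
    if match is None:
        raise ValueError(f"unrecognized fake-node value: {value!r}")
    number, suffix = int(match.group(1)), match.group(2)
    if suffix in UNITS:
        return ("node_size_bytes", number * UNITS[suffix])
    if suffix == "U":
        return ("nodes_per_physical_node", number)
    return ("total_nodes", number)


if __name__ == "__main__":
    for example in ("512M", "4", "2U"):
        print(example, classify_fake_numa(example))
```

The sketch only classifies the value; how nodes are interleaved over physical nodes is left to the kernel.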
24.1.8. ACPI
- acpi=off: Don’t enable ACPI.
- acpi=ht: Use ACPI boot table parsing, but don’t enable the ACPI interpreter.
- acpi=force: Force ACPI on (currently not needed).
- acpi=strict: Disable out-of-spec ACPI workarounds.
- acpi_sci={edge,level,high,low}: Set up the ACPI SCI interrupt.
- acpi=noirq: Don’t route interrupts.
- acpi=nocmcff: Disable firmware first mode for corrected errors. This disables parsing the HEST CMC error source to check if firmware has set the FF flag. This may result in duplicate corrected error reports.
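24.1.9. PCI
- pci=off: Don’t use PCI.
- pci=conf1: Use conf1 access.
- pci=conf2: Use conf2 access.
- pci=rom: Assign ROMs.
- pci=assign-busses: Assign busses.
- pci=irqmask=MASK: Set PCI interrupt mask to MASK.
- pci=lastbus=NUMBER: Scan up to NUMBER busses, no matter what the mptable says.
- pci=noacpi: Don’t use ACPI to set up PCI interrupt routing.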
24.1.10. IOMMU (input/output memory management unit)
Multiple x86-64 PCI-DMA mapping implementations exist, for example:
- <kernel/dma/direct.c>: use no hardware/software IOMMU at all (e.g. because you have < 3 GB memory). Kernel boot message: “PCI-DMA: Disabling IOMMU”
- <arch/x86/kernel/amd_gart_64.c>: AMD GART based hardware IOMMU. Kernel boot message: “PCI-DMA: using GART IOMMU”
- <arch/x86_64/kernel/pci-swiotlb.c>: Software IOMMU implementation. Used e.g. if there is no hardware IOMMU in the system and one is needed because you have > 3 GB memory, or if you told the kernel to use it (iommu=soft). Kernel boot message: “PCI-DMA: Using software bounce buffering for IO (SWIOTLB)”
- <arch/x86_64/pci-calgary.c>: IBM Calgary hardware IOMMU. Used in IBM pSeries and xSeries servers. This hardware IOMMU supports DMA address mapping with memory protection, etc. Kernel boot message: “PCI-DMA: Using Calgary IOMMU”
iommu=[<size>][,noagp][,off][,force][,noforce] [,memaper[=<order>]][,merge][,fullflush][,nomerge] [,noaperture][,calgary]
General iommu options:
- off: Don’t initialize and use any kind of IOMMU.
- noforce: Don’t force hardware IOMMU usage when it is not needed (default).
- force: Force the use of the hardware IOMMU even when it is not actually needed (e.g. because < 3 GB memory).
- soft: Use software bounce buffering (SWIOTLB) (default for Intel machines). This can be used to prevent the usage of an available hardware IOMMU.
iommu options only relevant to the AMD GART hardware IOMMU:
- <size>: Set the size of the remapping area in bytes.
- allowed: Override IOMMU-off workarounds for specific chipsets.
- fullflush: Flush IOMMU on each allocation (default).
- nofullflush: Don’t use IOMMU fullflush.
- memaper[=<order>]: Allocate a separate aperture over RAM with size 32MB<<order (default: order=1, i.e. 64MB).
- merge: Do scatter-gather (SG) merging. Implies “force” (experimental).
- nomerge: Don’t do scatter-gather (SG) merging.
- noaperture: Ask the IOMMU not to touch the aperture for AGP.
- noagp: Don’t initialize the AGP driver and use full aperture.
- panic: Always panic when IOMMU overflows.
- calgary: Use the Calgary IOMMU if it is available.
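The aperture size formula quoted in the list above, 32MB<<order, is easy to evaluate for a few orders. The sketch below does only that arithmetic; aperture_bytes is a hypothetical helper name, and "MB" is taken to mean 2^20 bytes, which is an assumption about the text's units.

```python
MIB = 1 << 20  # assumption: "MB" in the text means 2**20 bytes


def aperture_bytes(order=1):
    """Return the aperture size in bytes for a given order: 32MB << order.

    The default order of 1 matches the default stated in the list (64MB).
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    return (32 * MIB) << order


if __name__ == "__main__":
    for order in range(4):
        print(order, aperture_bytes(order) // MIB, "MB")
```

Each increment of the order doubles the aperture, so order 0 gives 32MB, order 1 gives 64MB, order 2 gives 128MB, and so on.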
iommu options only relevant to the software bounce buffering (SWIOTLB) IOMMU implementation:
- swiotlb=<pages>[,force]
- <pages>: Prereserve that many 128K pages for the software IO bounce buffering.
- force: Force all IO through the software TLB.
Settings for the IBM Calgary hardware IOMMU currently found in IBM pSeries and xSeries machines
- calgary=[64k,128k,256k,512k,1M,2M,4M,8M]: Set the size of each PCI slot’s translation table when using the Calgary IOMMU. This is the size of the translation table itself in main memory. The smallest table, 64k, covers an IO space of 32MB; the largest, 8MB table, can cover an IO space of 4GB. Normally the kernel will make the right choice by itself.
- calgary=[translate_empty_slots]: Enable translation even on slots that have no devices attached to them, in case a device will be hotplugged in the future.
- calgary=[disable=<PCI bus number>]: Disable translation on a given PHB. For example, the built-in graphics adapter resides on the first bridge (PCI bus number 0); if translation (isolation) is enabled on this bridge, X servers that access the hardware directly from user space might stop working. Use this option if you have devices that are accessed from userspace directly on some PCI host bridge.
- panic: Always panic when IOMMU overflows.
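The two endpoints given above for the Calgary translation table (a 64k table covers 32MB of IO space, an 8MB table covers 4GB) imply the same ratio of covered space to table size. The sketch below derives that ratio from the first endpoint and applies it to other table sizes. Treating the relation as linear for the intermediate sizes is an assumption; the text states only the two endpoints. The names COVERAGE_RATIO and io_space_covered are hypothetical.

```python
KIB = 1 << 10
MIB = 1 << 20
GIB = 1 << 30

# Ratio implied by the smallest endpoint in the text: 64k table -> 32MB IO space.
COVERAGE_RATIO = (32 * MIB) // (64 * KIB)


def io_space_covered(table_bytes):
    """Estimate the IO space covered by a translation table of the given size.

    Assumption: coverage scales linearly with table size between the two
    endpoints quoted in the text.
    """
    return table_bytes * COVERAGE_RATIO


if __name__ == "__main__":
    for table in (64 * KIB, 1 * MIB, 8 * MIB):
        print(table // KIB, "KiB table ->", io_space_covered(table) // MIB, "MiB")
```

The largest endpoint serves as a consistency check: an 8MB table multiplied by the derived ratio gives 4GB, which matches the figure in the text.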
24.1.11. Miscellaneous
- nogbpages: Do not use GB pages for kernel direct mappings.
- gbpages: Use GB pages for kernel direct mappings.