Overview
This section presents the configuration process of a CXL Type-3 memory device,
and how it is ultimately exposed to users as either a DAX device or
normal memory pages via the kernel’s page allocator.
Portions marked with a bullet are points at which certain kernel objects are generated.
Early Boot
BIOS, Build, and Boot Parameters
EFI_MEMORY_SP
CONFIG_EFI_SOFT_RESERVE
CONFIG_MHP_DEFAULT_ONLINE_TYPE
nosoftreserve
Memory Map Creation
EFI Memory Map / E820 Consulted for Soft-Reserved
CXL Memory is set aside to be handled by the CXL driver
Soft-Reserved IO Resource created for CFMWS entry
NUMA Node Creation
Nodes created from ACPI CEDT CFMWS and SRAT Proximity domains (PXM)
Memory Tier Creation
A default memory_tier is created with all nodes.
Contiguous Memory Allocation
Any requested CMA is allocated from Online nodes
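One way to check the outcome of the memory map step from userspace is to look for Soft-Reserved ranges in /proc/iomem. The following is a minimal sketch, assuming the usual /proc/iomem line format ("start-end : name") and that such ranges are labeled "Soft Reserved"; the function name find_soft_reserved is hypothetical and only for illustration.

```python
import re

# Matches lines such as "  100000000-1ffffffff : Soft Reserved".
# Leading whitespace encodes nesting in /proc/iomem and is ignored here.
_IOMEM_LINE = re.compile(r"^\s*([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.*)$")


def find_soft_reserved(iomem_text):
    """Return (start, end, size_in_bytes) for each "Soft Reserved" entry.

    iomem_text is the full text of /proc/iomem (assumed format).
    """
    ranges = []
    for line in iomem_text.splitlines():
        match = _IOMEM_LINE.match(line)
        if match is None:
            continue
        if match.group(3).strip() != "Soft Reserved":
            continue
        start = int(match.group(1), 16)
        end = int(match.group(2), 16)
        # Resource ranges are inclusive, so the size is end - start + 1.
        ranges.append((start, end, end - start + 1))
    return ranges
```

To try it on a live system, pass in the contents of /proc/iomem, for example `find_soft_reserved(open("/proc/iomem").read())`. Unprivileged readers generally see zeroed addresses in that file, so run it as root to get meaningful ranges.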
Init Finishes, Drivers start probing
ACPI and PCI Drivers
Detect that a PCI device is CXL and mark it for probe by the CXL driver
CXL Driver Operation
Base device creation
root, port, and memdev devices created
CEDT CFMWS IO Resource creation
Decoder creation
root, switch, and endpoint decoders created
Logical device creation
memory_region and endpoint devices created
Devices are associated with each other
If the decoders are BIOS-programmed (auto-decoders), the driver validates the configuration, builds associations, and locks the configuration at probe time.
If the decoders are user-configured, validation and association happen at decoder-commit time.
Regions surfaced as DAX region
dax_region created
DAX device created via DAX driver
DAX Driver Operation
DAX driver surfaces DAX region as one of two dax device modes
kmem - dax device is converted to hotplug memory blocks
DAX kmem IO Resource creation
hmem - dax device is left as daxdev to be accessed as a file.
If hmem, journey ends here.
DAX kmem surfaces memory region to Memory Hotplug to add to page allocator as “driver managed memory”
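To see which mode a given DAX device ended up in, one option is to check which driver it is bound to in sysfs. This is a minimal sketch, assuming DAX devices are listed under /sys/bus/dax/devices and that each bound device has a `driver` symlink. The function name dax_device_drivers is hypothetical, and the code reports whatever driver name sysfs shows rather than assuming specific names.

```python
import os
from pathlib import Path


def dax_device_drivers(dax_root="/sys/bus/dax/devices"):
    """Map each DAX device name to the name of its bound driver.

    A device with no "driver" symlink is reported as None (unbound).
    Returns an empty dict if dax_root does not exist.
    """
    result = {}
    root = Path(dax_root)
    if not root.is_dir():
        return result
    for dev in sorted(root.iterdir()):
        driver_link = dev / "driver"
        if driver_link.is_symlink():
            # The symlink target ends with the driver's directory name.
            result[dev.name] = os.path.basename(os.readlink(driver_link))
        else:
            result[dev.name] = None
    return result
```

Calling `dax_device_drivers()` with no arguments inspects the running system. The root path is a parameter only so the function can be exercised against a test directory.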
Memory Hotplug
mhp component surfaces a dax device memory region as multiple memory blocks to the page allocator
blocks appear in /sys/bus/memory/devices and are linked to a NUMA node
blocks are onlined into the requested zone (NORMAL or MOVABLE)
Memory is marked “Driver Managed” to prevent kexec from using it as a region for kernel updates
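The result of the hotplug step can be inspected from userspace by walking the memory block directories. The sketch below assumes each block directory under /sys/bus/memory/devices contains a `state` file and one `nodeN` link per associated NUMA node; the function name memory_blocks is hypothetical.

```python
from pathlib import Path


def memory_blocks(mem_root="/sys/bus/memory/devices"):
    """List memory blocks with their state and associated NUMA nodes.

    Returns a list of dicts: {"name": str, "state": str, "nodes": [int]}.
    Entries are sorted by path, so "memory10" sorts before "memory2".
    """
    blocks = []
    root = Path(mem_root)
    if not root.is_dir():
        return blocks
    for blk in sorted(root.glob("memory*")):
        state_file = blk / "state"
        if not state_file.is_file():
            continue
        state = state_file.read_text().strip()
        # Node links are named "node0", "node1", ...; keep the numeric part.
        nodes = sorted(
            int(p.name[4:]) for p in blk.glob("node*") if p.name[4:].isdigit()
        )
        blocks.append({"name": blk.name, "state": state, "nodes": nodes})
    return blocks
```

Filtering the returned list by node number is a quick way to confirm that the blocks backing a CXL region were attached to the expected node and are online.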