AMDGPU DebugFS

The amdgpu driver provides a number of debugfs files to aid in debugging issues in the driver. These are usually found in /sys/kernel/debug/dri/<num>.
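Because <num> differs between systems, a small helper can locate the right directory. The following is a minimal sketch, not part of the driver: the function name find_amdgpu_dirs is hypothetical, it assumes debugfs is mounted at /sys/kernel/debug and readable (normally root only), and it assumes that the presence of amdgpu_firmware_info is a reasonable marker for an amdgpu device.

```python
from pathlib import Path

# Assumed default mount point for debugfs; adjust if mounted elsewhere.
DRI_ROOT = Path("/sys/kernel/debug/dri")


def find_amdgpu_dirs(root: Path = DRI_ROOT) -> list[Path]:
    """Return the DRI debugfs directories that look like amdgpu devices.

    A directory is treated as amdgpu if it contains amdgpu_firmware_info
    (an assumption made for this sketch, not a rule from the driver).
    """
    if not root.is_dir():
        return []
    return sorted(
        d for d in root.iterdir()
        if d.is_dir() and (d / "amdgpu_firmware_info").exists()
    )


if __name__ == "__main__":
    for directory in find_amdgpu_dirs():
        print(directory)
```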

DebugFS Files

amdgpu_benchmark

Run benchmarks using the DMA engine the driver uses for GPU memory paging. Write the number of a test to the file to run it. The results are written to the kernel log. VRAM is on-device memory (dGPUs) or a carve-out (APUs), and GTT (Graphics Translation Tables) is system memory that is accessible by the GPU. The following tests are available:

  • 1: simple test, VRAM to GTT and GTT to VRAM

  • 2: simple test, VRAM to VRAM

  • 3: GTT to VRAM, buffer size sweep, powers of 2

  • 4: VRAM to GTT, buffer size sweep, powers of 2

  • 5: VRAM to VRAM, buffer size sweep, powers of 2

  • 6: GTT to VRAM, buffer size sweep, common display sizes

  • 7: VRAM to GTT, buffer size sweep, common display sizes

  • 8: VRAM to VRAM, buffer size sweep, common display sizes
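Selecting one of the tests above amounts to writing its number to the file. The sketch below shows one way to do that. The helper name run_benchmark and the default path are assumptions (the DRM minor number varies per system), and writing to debugfs normally requires root. The results still have to be read from the kernel log afterwards.

```python
from pathlib import Path

# Assumed location; replace 0 with the DRM minor number of the GPU.
DEBUGFS_DIR = Path("/sys/kernel/debug/dri/0")

# Test numbers and descriptions, copied from the list above.
BENCHMARKS = {
    1: "simple test, VRAM to GTT and GTT to VRAM",
    2: "simple test, VRAM to VRAM",
    3: "GTT to VRAM, buffer size sweep, powers of 2",
    4: "VRAM to GTT, buffer size sweep, powers of 2",
    5: "VRAM to VRAM, buffer size sweep, powers of 2",
    6: "GTT to VRAM, buffer size sweep, common display sizes",
    7: "VRAM to GTT, buffer size sweep, common display sizes",
    8: "VRAM to VRAM, buffer size sweep, common display sizes",
}


def run_benchmark(test: int, debugfs_dir: Path = DEBUGFS_DIR) -> str:
    """Write a test number to amdgpu_benchmark and return its description."""
    if test not in BENCHMARKS:
        raise ValueError(f"unknown benchmark {test}; expected 1-8")
    (debugfs_dir / "amdgpu_benchmark").write_text(f"{test}\n")
    return BENCHMARKS[test]
```

Rejecting unknown numbers before touching the file keeps typos from reaching the driver.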

amdgpu_test_ib

Read this file to run simple IB (Indirect Buffer) tests on all kernel managed rings. IBs are command buffers, usually generated by userspace applications, which are submitted to the kernel for execution on a particular GPU engine. Reading the file runs only the simple IB tests included in the kernel. These tests are engine specific and verify that IB submission works.

amdgpu_discovery

Provides raw access to the IP discovery binary provided by the GPU. Read this file to access the raw binary. This is useful for verifying the contents of the IP discovery table. It is chip specific.

amdgpu_vbios

Provides raw access to the ROM binary image from the GPU. Read this file to access the raw binary. This is useful for verifying the contents of the video BIOS ROM. It is board specific.
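A first sanity check on the dumped image is to look for the standard PCI expansion ROM signature, the bytes 0x55 0xAA at offset 0. The sketch below assumes the image begins with such a header. The function names and the directory argument are hypothetical, and a passing check says nothing about the rest of the image.

```python
from pathlib import Path


def has_option_rom_signature(image: bytes) -> bool:
    """Return True if the image starts with the bytes 0x55 0xAA."""
    return len(image) >= 2 and image[0] == 0x55 and image[1] == 0xAA


def read_vbios(debugfs_dir: Path) -> bytes:
    """Read the raw ROM image from amdgpu_vbios in the given directory."""
    return (debugfs_dir / "amdgpu_vbios").read_bytes()


if __name__ == "__main__":
    # Assumed path; replace 0 with the DRM minor number of the GPU.
    rom = read_vbios(Path("/sys/kernel/debug/dri/0"))
    print(f"{len(rom)} bytes, signature ok: {has_option_rom_signature(rom)}")
```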

amdgpu_evict_gtt

Read this file to evict all buffers from the GTT memory pool.

amdgpu_evict_vram

Read this file to evict all buffers from the VRAM memory pool.

amdgpu_gpu_recover

Trigger a GPU reset. Read this file to trigger a reset of the entire GPU. All work currently running on the GPU will be lost.

amdgpu_ring_<name>

Provides read access to the kernel managed ring buffers for each ring <name>. These are useful for debugging problems on a particular ring. The ring buffer is how the CPU sends commands to the GPU. The CPU writes commands into the buffer and then asks the GPU engine to process it. This is the raw binary contents of the ring buffer. Use a tool like UMR to decode the rings into human readable form.
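When UMR is not at hand, the raw contents can at least be viewed as 32-bit words. The sketch below is only a hex dump and does not decode packets. It assumes little-endian 32-bit words and makes no assumption about what each word means. The function names are hypothetical.

```python
import struct


def to_dwords(raw: bytes) -> list[int]:
    """Split raw bytes into little-endian 32-bit words, dropping any tail."""
    usable = len(raw) - (len(raw) % 4)
    return [word for (word,) in struct.iter_unpack("<I", raw[:usable])]


def format_dwords(words: list[int], per_line: int = 4) -> str:
    """Format words as hex, one byte offset per line followed by the words."""
    lines = []
    for i in range(0, len(words), per_line):
        chunk = " ".join(f"{w:08x}" for w in words[i:i + per_line])
        lines.append(f"{i * 4:08x}: {chunk}")
    return "\n".join(lines)
```

For example, the bytes of a file read with open(path, "rb").read() can be passed to to_dwords and the result printed with format_dwords.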

amdgpu_mqd_<name>

Provides read access to the MQD (Memory Queue Descriptor) for ring <name> managed by the kernel driver. MQDs define the features of the ring and are used to store the ring’s state when it is not connected to hardware. The driver writes the requested ring features and metadata (GPU addresses of the ring itself and associated buffers) to the MQD, and the firmware uses the MQD to populate the hardware when the ring is mapped to a hardware slot. Only available on engines which use MQDs. This provides access to the raw MQD binary.

amdgpu_error_<name>

Provides an interface to set an error code on the DMA fences associated with ring <name>. The error code specified is propagated to all fences associated with the ring. Use this to inject a fence error into a ring.

amdgpu_pm_info

Provides human readable information about the power management features and state of the GPU. This includes current GFX clock, Memory clock, voltages, average SoC power, temperature, GFX load, Memory load, SMU feature mask, VCN power state, clock and power gating features.

amdgpu_firmware_info

Lists the firmware versions for all firmwares used by the GPU. Only entries with a non-0 version are valid. If the version is 0, the firmware is not valid for the GPU.

amdgpu_fence_info

Shows the last signalled and emitted fence sequence numbers for each kernel driver managed ring. Fences are associated with submissions to the engine. Emitted fences have been submitted to the ring and signalled fences have been signalled by the GPU. Rings whose emitted fence value is larger than their signalled fence value have outstanding work that is still being processed by the engine that owns that ring. When the emitted and signalled fence values are equal, the ring is idle.
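The idle check described above can be automated. The sketch below is illustrative only: the text layout in SAMPLE is an assumption about what the file prints and was not captured from hardware, so the regular expressions may need adjusting for a given kernel. The function names are hypothetical.

```python
import re

# Illustrative input; the exact layout is an assumption.
SAMPLE = """\
--- ring 0 (gfx) ---
Last signaled fence          0x00000010
Last emitted                 0x00000012
--- ring 1 (sdma0) ---
Last signaled fence          0x00000007
Last emitted                 0x00000007
"""

RING_RE = re.compile(r"^--- ring \d+ \((?P<name>[^)]+)\) ---$")
VALUE_RE = re.compile(
    r"^Last (?P<kind>signaled fence|emitted)\s+0x(?P<value>[0-9a-fA-F]+)$"
)


def parse_fence_info(text: str) -> dict[str, dict[str, int]]:
    """Map each ring name to its 'signalled' and 'emitted' sequence numbers."""
    rings: dict[str, dict[str, int]] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        match = RING_RE.match(line)
        if match:
            current = rings.setdefault(match["name"], {})
            continue
        match = VALUE_RE.match(line)
        if match and current is not None:
            key = "emitted" if match["kind"] == "emitted" else "signalled"
            current[key] = int(match["value"], 16)
    return rings


def busy_rings(rings: dict[str, dict[str, int]]) -> list[str]:
    """Return rings whose emitted and signalled values differ (not idle)."""
    return [
        name for name, fences in rings.items()
        if "emitted" in fences and "signalled" in fences
        and fences["emitted"] != fences["signalled"]
    ]
```

The comparison uses inequality rather than greater-than so that a ring is only reported idle when the two values match, as stated above.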

amdgpu_gem_info

Lists all of the PIDs using the GPU and the GPU buffers that they have allocated. This lists the buffer size, pool (VRAM, GTT, etc.), and buffer attributes (CPU access required, CPU cache attributes, etc.).

amdgpu_vm_info

Lists all of the PIDs using the GPU and the GPU buffers that they have allocated as well as the status of those buffers relative to that process’ GPU virtual address space (e.g., evicted, idle, invalidated, etc.).

amdgpu_sa_info

Prints out all of the suballocations (sa) by the suballocation manager in the kernel driver. Prints the GPU address, size, and fence info associated with each suballocation. The suballocations are used internally within the kernel driver for various things.

amdgpu_<pool>_mm

Prints TTM information about the memory pool <pool>.

amdgpu_vram

Provides direct access to VRAM. Used by tools like UMR to inspect objects in VRAM.

amdgpu_iomem

Provides direct access to GTT memory. Used by tools like UMR to inspect GTT memory.

amdgpu_regs_*

Provides direct access to various register apertures on the GPU. Used by tools like UMR to access GPU registers.

amdgpu_regs2

Provides an IOCTL interface used by UMR for interacting with GPU registers.

amdgpu_sensors

Provides an interface to query GPU power metrics (temperature, average power, etc.). Used by tools like UMR to query GPU power metrics.

amdgpu_gca_config

Provides an interface to query GPU details (Graphics/Compute Array config, PCI config, GPU family, etc.). Used by tools like UMR to query GPU details.

amdgpu_wave

Used to query GFX/compute wave information from the hardware. Used by tools like UMR to query GFX/compute wave information.

amdgpu_gpr

Used to query GFX/compute GPR (General Purpose Register) information from the hardware. Used by tools like UMR to query GPRs when debugging shaders.

amdgpu_gprwave

Provides an IOCTL interface used by UMR for interacting with shader waves.

amdgpu_fw_attestation

Provides an interface for reading back firmware attestation records.