.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)

===============
GPU SVM Section
===============

Agreed upon design principles
=============================

* migrate_to_ram path

  * Partial migration is supported (i.e., a subset of pages attempting to
    migrate can actually migrate, with only the faulting page guaranteed to
    migrate).
  * Driver handles mixed migrations via retry loops rather than locking.

* Eviction

  * Eviction is defined as migrating data from the GPU back to the CPU without
    a virtual address to free up GPU memory.
  * Only looking at physical memory data structures and locks as opposed to
    looking at virtual memory data structures and locks.
  * No looking at mm/vma structs or relying on those being locked.
  * The rationale for the above two points is that CPU virtual addresses can
    change at any moment, while the physical pages remain stable.
  * GPU page table invalidation, which requires a GPU virtual address, is
    handled via the notifier that has access to the GPU virtual address.

* GPU fault side

  * The mmap_read lock is only used around core MM functions which require
    this lock, and the driver should strive to take the mmap_read lock only in
    the GPU SVM layer.
  * Big retry loop to handle all races with the mmu notifier under the gpu
    pagetable locks/mmu notifier range lock/whatever we end up calling those.
  * Races (especially against concurrent eviction or migrate_to_ram) should
    not be handled on the fault side by trying to hold locks; rather, they
    should be handled using retry loops. One possible exception is holding a
    BO's dma-resv lock during the initial migration to VRAM, as this is a
    well-defined lock that can be taken underneath the mmap_read lock.
  * One possible issue with the above approach is if a driver has a strict
    migration policy requiring GPU access to occur in GPU memory. Concurrent
    CPU access could cause a livelock due to endless retries. While no current
    user (Xe) of GPU SVM has such a policy, it is likely to be added in the
    future. Ideally, this should be resolved on the core-MM side rather than
    through a driver-side lock.

* Physical memory to virtual backpointer

  * This does not work, as no pointers from physical memory to virtual memory
    should exist. mremap() is an example of the core MM updating the virtual
    address without notifying the driver of the address change; the driver
    only receives the invalidation notifier.
  * The physical memory backpointer (page->zone_device_data) should remain
    stable from allocation to page free. Safely updating this against a
    concurrent user would be very difficult unless the page is free (see the
    sketch following this list).

* GPU pagetable locking

  * Notifier lock only protects range tree, pages valid state for a range
    (rather than seqno due to wider notifiers), pagetable entries, and mmu
    notifier seqno tracking; it is not a global lock to protect against races.
  * All races handled with big retry as mentioned above.

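The following is a minimal sketch of the backpointer rule above: the only
pointer from physical memory back to driver state is page->zone_device_data,
set once when the device page is allocated and left untouched until the page
is freed. The driver_devmem_allocation structure and helper names are
hypothetical and exist only to illustrate the principle.

.. code-block:: c

    /*
     * Hedged sketch: the physical-memory backpointer is written exactly once
     * at allocation time and only read afterwards; it never points at
     * anything derived from a CPU virtual address.
     */
    struct driver_devmem_allocation {
            void *vram_block;       /* hypothetical handle to driver VRAM */
    };

    static void driver_devmem_page_init(struct page *page,
                                        struct driver_devmem_allocation *alloc)
    {
            /* Set once at allocation; never rewritten while the page is live. */
            page->zone_device_data = alloc;
    }

    static struct driver_devmem_allocation *
    driver_devmem_page_to_alloc(struct page *page)
    {
            /*
             * Safe to read without extra locking because the backpointer is
             * immutable from allocation until the page is freed.
             */
            return page->zone_device_data;
    }
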
Overview of baseline design
===========================

The GPU Shared Virtual Memory (GPU SVM) layer for the Direct Rendering Manager
(DRM) is a component of the DRM framework designed to manage shared virtual
memory between the CPU and GPU. It enables efficient data exchange and
processing for GPU-accelerated applications by allowing memory sharing and
synchronization between the CPU's and GPU's virtual address spaces.

Key GPU SVM Components:

* Notifiers:
  Used for tracking memory intervals and notifying the GPU of changes,
  notifiers are sized based on a GPU SVM initialization parameter, with a
  recommendation of 512M or larger. They maintain a Red-Black tree and a list
  of ranges that fall within the notifier interval. Notifiers are tracked
  within a GPU SVM Red-Black tree and list and are dynamically inserted or
  removed as ranges within the interval are created or destroyed.

* Ranges:
  Represent memory ranges mapped in a DRM device and managed by GPU SVM. They
  are sized based on an array of chunk sizes, which is a GPU SVM
  initialization parameter, and the CPU address space. Upon GPU fault, the
  largest aligned chunk that fits within the faulting CPU address space is
  chosen for the range size. Ranges are expected to be dynamically allocated
  on GPU fault and removed on an MMU notifier UNMAP event. As mentioned above,
  ranges are tracked in a notifier's Red-Black tree.

* Operations:
  Define the interface for driver-specific GPU SVM operations such as range
  allocation, notifier allocation, and invalidations.

* Device Memory Allocations:
  Embedded structure containing enough information for GPU SVM to migrate to /
  from device memory.

* Device Memory Operations:
  Define the interface for driver-specific device memory operations: release
  memory, populate pfns, and copy to / from device memory.

This layer provides interfaces for allocating, mapping, migrating, and
releasing memory ranges between the CPU and GPU. It handles all core memory
management interactions (DMA mapping, HMM, and migration) and provides
driver-specific virtual functions (vfuncs). This infrastructure is sufficient
to build the expected driver components for an SVM implementation as detailed
below.

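As an illustration of how the initialization parameters mentioned above
(notifier size and the array of chunk sizes) and the driver operations fit
together, below is a hedged sketch of GPU SVM setup. The exact
drm_gpusvm_init() parameter list is assumed here from the component
descriptions rather than quoted from the header, and my_device / my_vm and
their members are hypothetical; treat this as a sketch, not the authoritative
interface.

.. code-block:: c

    /* Assumed .invalidate signature; see the notifier callback example below. */
    static void driver_invalidation(struct drm_gpusvm *gpusvm,
                                    struct drm_gpusvm_notifier *notifier,
                                    const struct mmu_notifier_range *mmu_range);

    static const struct drm_gpusvm_ops driver_gpusvm_ops = {
            .invalidate = driver_invalidation,
    };

    /* Range chunk sizes: the largest aligned chunk that fits is chosen on fault. */
    static const unsigned long driver_chunk_sizes[] = { SZ_2M, SZ_64K, SZ_4K };

    int driver_svm_init(struct my_device *dev, struct my_vm *vm)
    {
            /*
             * Notifier granularity of 512M as recommended above; the
             * drm_gpusvm_init() arguments are an assumption for illustration.
             */
            return drm_gpusvm_init(&vm->gpusvm, "driver SVM", &dev->drm,
                                   current->mm, dev->devmem_owner, 0, vm->size,
                                   SZ_512M, &driver_gpusvm_ops,
                                   driver_chunk_sizes,
                                   ARRAY_SIZE(driver_chunk_sizes));
    }
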
Expected Driver Components:

* GPU page fault handler:
  Used to create ranges and notifiers based on the fault address, optionally
  migrate the range to device memory, and create GPU bindings.

* Garbage collector:
  Used to unmap and destroy GPU bindings for ranges. Ranges are expected to be
  added to the garbage collector upon an MMU_NOTIFY_UNMAP event in the
  notifier callback.

* Notifier callback:
  Used to invalidate and DMA unmap GPU bindings for ranges.

GPU SVM handles locking for core MM interactions, i.e., it locks/unlocks the
mmap lock as needed.

GPU SVM introduces a global notifier lock, which safeguards the notifier's
range RB tree and list, as well as the range's DMA mappings and sequence
number. GPU SVM manages all necessary locking and unlocking operations, except
for the recheck of a range's pages being valid (drm_gpusvm_range_pages_valid)
when the driver is committing GPU bindings. This lock corresponds to the
``driver->update`` lock mentioned in Documentation/mm/hmm.rst. Future
revisions may transition from a GPU SVM global lock to a per-notifier lock if
finer-grained locking is deemed necessary.

In addition to the locking mentioned above, the driver should implement a lock
to safeguard core GPU SVM function calls that modify state, such as
drm_gpusvm_range_find_or_insert and drm_gpusvm_range_remove. This lock is
denoted as 'driver_svm_lock' in code examples. Finer-grained driver-side
locking should also be possible for concurrent GPU fault processing within a
single GPU SVM. The 'driver_svm_lock' can be set via
drm_gpusvm_driver_set_lock() to add annotations to GPU SVM.

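A minimal sketch of this driver-side lock is shown below. The assumption that
drm_gpusvm_driver_set_lock() only registers the lock for GPU SVM's locking
annotations is taken from the paragraph above, and the driver_svm structure is
hypothetical; the examples later in this document call driver_svm_lock() /
driver_svm_unlock() without arguments for brevity.

.. code-block:: c

    struct driver_svm {
            struct drm_gpusvm gpusvm;
            struct mutex lock;      /* the 'driver_svm_lock' in the examples */
    };

    static void driver_svm_lock_init(struct driver_svm *svm)
    {
            mutex_init(&svm->lock);
            /* Assumed usage: annotate GPU SVM calls as requiring this lock. */
            drm_gpusvm_driver_set_lock(&svm->gpusvm, &svm->lock);
    }

    static void driver_svm_lock(struct driver_svm *svm)
    {
            mutex_lock(&svm->lock);
    }

    static void driver_svm_unlock(struct driver_svm *svm)
    {
            mutex_unlock(&svm->lock);
    }
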
The migration support is quite simple, allowing migration between RAM and
device memory at the range granularity. For example, GPU SVM currently does
not support mixing RAM and device memory pages within a range. This means that
upon GPU fault, the entire range can be migrated to device memory, and upon
CPU fault, the entire range is migrated to RAM. Mixed RAM and device memory
storage within a range could be added in the future if required.

The reasoning for only supporting range granularity is as follows: it
simplifies the implementation, and range sizes are driver-defined and should
be relatively small.

Partial unmapping of ranges (e.g., 1M out of 2M is unmapped by the CPU,
resulting in an MMU_NOTIFY_UNMAP event) presents several challenges, with the
main one being that a subset of the range still has CPU and GPU mappings. If
the backing store for the range is in device memory, a subset of the backing
store has references. One option would be to split the range and device memory
backing store, but the implementation for this would be quite complicated.
Given that partial unmappings are rare and driver-defined range sizes are
relatively small, GPU SVM does not support splitting of ranges.

With no support for range splitting, upon partial unmapping of a range, the
driver is expected to invalidate and destroy the entire range. If the range
has device memory as its backing, the driver is also expected to migrate any
remaining pages back to RAM.

This section provides three examples of how to build the expected driver
components: the GPU page fault handler, the garbage collector, and the
notifier callback.

The generic code provided does not include logic for complex migration
policies, optimized invalidations, fine-grained driver locking, or other
potentially required driver locking (e.g., DMA-resv locks).

1) GPU page fault handler

.. code-block:: c

    int driver_bind_range(struct drm_gpusvm *gpusvm,
                          struct drm_gpusvm_range *range)
    {
            int err = 0;

            driver_alloc_and_setup_memory_for_bind(gpusvm, range);

            drm_gpusvm_notifier_lock(gpusvm);
            if (drm_gpusvm_range_pages_valid(range))
                    driver_commit_bind(gpusvm, range);
            else
                    err = -EAGAIN;
            drm_gpusvm_notifier_unlock(gpusvm);

            return err;
    }

    int driver_gpu_fault(struct drm_gpusvm *gpusvm, unsigned long fault_addr,
                         unsigned long gpuva_start, unsigned long gpuva_end)
    {
            struct drm_gpusvm_ctx ctx = {};
            struct drm_gpusvm_range *range;
            struct drm_gpusvm_devmem *devmem;
            struct mm_struct *mm = gpusvm->mm;
            int err;

            driver_svm_lock();
    retry:
            // Always process UNMAPs first so view of GPU SVM ranges is current
            driver_garbage_collector(gpusvm);

            range = drm_gpusvm_range_find_or_insert(gpusvm, fault_addr,
                                                    gpuva_start, gpuva_end,
                                                    &ctx);
            if (IS_ERR(range)) {
                    err = PTR_ERR(range);
                    goto unlock;
            }

            if (driver_migration_policy(range)) {
                    mmap_read_lock(mm);
                    devmem = driver_alloc_devmem();
                    err = drm_gpusvm_migrate_to_devmem(gpusvm, range,
                                                       devmem, &ctx);
                    mmap_read_unlock(mm);
                    if (err)        // CPU mappings may have changed
                            goto retry;
            }

            err = drm_gpusvm_range_get_pages(gpusvm, range, &ctx);
            if (err == -EOPNOTSUPP || err == -EFAULT || err == -EPERM) {
                    // CPU mappings changed
                    if (err == -EOPNOTSUPP)
                            drm_gpusvm_range_evict(gpusvm, range);
                    goto retry;
            } else if (err) {
                    goto unlock;
            }

            err = driver_bind_range(gpusvm, range);
            if (err == -EAGAIN)     // CPU mappings changed
                    goto retry;

    unlock:
            driver_svm_unlock();
            return err;
    }

2) Garbage Collector

.. code-block:: c

    void __driver_garbage_collector(struct drm_gpusvm *gpusvm,
                                    struct drm_gpusvm_range *range)
    {
            assert_driver_svm_locked(gpusvm);

            // Partial unmap, migrate any remaining device memory pages back to RAM
            if (range->flags.partial_unmap)
                    drm_gpusvm_range_evict(gpusvm, range);

            driver_unbind_range(range);
            drm_gpusvm_range_remove(gpusvm, range);
    }

    void driver_garbage_collector(struct drm_gpusvm *gpusvm)
    {
            assert_driver_svm_locked(gpusvm);

            for_each_range_in_garbage_collector(gpusvm, range)
                    __driver_garbage_collector(gpusvm, range);
    }

3) Notifier callback

.. code-block:: c

    void driver_invalidation(struct drm_gpusvm *gpusvm,
                             struct drm_gpusvm_notifier *notifier,
                             const struct mmu_notifier_range *mmu_range)
    {
            struct drm_gpusvm_ctx ctx = { .in_notifier = true, };
            struct drm_gpusvm_range *range = NULL;

            driver_invalidate_device_pages(gpusvm, mmu_range->start,
                                           mmu_range->end);

            drm_gpusvm_for_each_range(range, notifier, mmu_range->start,
                                      mmu_range->end) {
                    drm_gpusvm_range_unmap_pages(gpusvm, range, &ctx);

                    if (mmu_range->event != MMU_NOTIFY_UNMAP)
                            continue;

                    drm_gpusvm_range_set_unmapped(range, mmu_range);
                    driver_garbage_collector_add(gpusvm, range);
            }
    }

Possible future design features
===============================

* Concurrent GPU faults

  * CPU faults are concurrent, so it makes sense to support concurrent GPU
    faults.
  * Should be possible with fine-grained locking in the driver GPU fault
    handler.
  * No expected GPU SVM changes required.

* Ranges with mixed system and device pages

  * Can be added if required to drm_gpusvm_get_pages fairly easily.

* Multi-GPU support

  * Work in progress and patches expected after initially landing on GPU SVM.
  * Ideally can be done with little to no changes to GPU SVM.

* Drop ranges in favor of radix tree

  * May be desirable for faster notifiers.

* Compound device pages

  * Nvidia, AMD, and Intel have all agreed that expensive core MM functions in
    the migrate device layer are a performance bottleneck; compound device
    pages should help increase performance by reducing the number of these
    expensive calls.

* Higher order dma mapping for migration

  * 4k dma mapping adversely affects migration performance on Intel hardware;
    higher order (2M) dma mapping should help here.

* Build common userptr implementation on top of GPU SVM
* Driver side madvise implementation and migration policies
* Pull in pending dma-mapping API changes from Leon / Nvidia when these land