.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)

===============
VM_BIND locking
===============

This document attempts to describe what's needed to get VM_BIND locking right,
including the userptr mmu_notifier locking. It also discusses some
optimizations to get rid of the looping through of all userptr mappings and
external / shared object mappings that is needed in the simplest
implementation. In addition, there is a section describing the VM_BIND locking
required for implementing recoverable pagefaults.

The DRM GPUVM set of helpers
============================

There is a set of helpers for drivers implementing VM_BIND, and this
set of helpers implements much, but not all of the locking described
in this document. In particular, it is currently lacking a userptr
implementation. This document does not intend to describe the DRM GPUVM
implementation in detail, but it is covered in :ref:`its own
documentation <drm_gpuvm>`. It is highly recommended for any driver
implementing VM_BIND to use the DRM GPUVM helpers and to extend them if
common functionality is missing.

Nomenclature
============

* ``gpu_vm``: Abstraction of a virtual GPU address space with
  meta-data. Typically one per client (DRM file-private), or one per
  execution context.
* ``gpu_vma``: Abstraction of a GPU address range within a gpu_vm with
  associated meta-data. The backing storage of a gpu_vma can either be
  a GEM object or anonymous or page-cache pages mapped also into the CPU
  address space for the process.
* ``gpu_vm_bo``: Abstracts the association of a GEM object and
  a VM. The GEM object maintains a list of gpu_vm_bos, where each gpu_vm_bo
  maintains a list of gpu_vmas.
* ``userptr gpu_vma or just userptr``: A gpu_vma, whose backing store
  is anonymous or page-cache pages as described above.
* ``revalidating``: Revalidating a gpu_vma means making the latest version
  of the backing store resident and making sure the gpu_vma's
  page-table entries point to that backing store.
* ``dma_fence``: A struct dma_fence that is similar to a struct completion
  and which tracks GPU activity. When the GPU activity is finished,
  the dma_fence signals. Please refer to the ``DMA Fences`` section of
  the :doc:`dma-buf doc </driver-api/dma-buf>`.
* ``dma_resv``: A struct dma_resv (a.k.a. reservation object) that is used
  to track GPU activity in the form of multiple dma_fences on a
  gpu_vm or a GEM object. The dma_resv contains an array / list
  of dma_fences and a lock that needs to be held when adding
  additional dma_fences to the dma_resv. The lock is of a type that
  allows deadlock-safe locking of multiple dma_resvs in arbitrary
  order. Please refer to the ``Reservation Objects`` section of the
  :doc:`dma-buf doc </driver-api/dma-buf>`.
* ``exec function``: An exec function is a function that revalidates all
  affected gpu_vmas, submits a GPU command batch and registers the
  dma_fence representing the GPU command's activity with all affected
  dma_resvs. For completeness, although not covered by this document,
  it's worth mentioning that an exec function may also be the
  revalidation worker that is used by some drivers in compute /
  long-running mode.
* ``local object``: A GEM object which is only mapped within a
  single VM. Local GEM objects share the gpu_vm's dma_resv.
* ``external object``: a.k.a. shared object: A GEM object which may be shared
  by multiple gpu_vms and whose backing storage may be shared with
  other drivers.

Locks and locking order
=======================

One of the benefits of VM_BIND is that local GEM objects share the gpu_vm's
dma_resv object and hence the dma_resv lock. So, even with a huge
number of local GEM objects, only one lock is needed to make the exec
sequence atomic.

The following locks and locking orders are used:

* The ``gpu_vm->lock`` (optionally an rwsem). Protects the gpu_vm's
  data structure keeping track of gpu_vmas. It can also protect the
  gpu_vm's list of userptr gpu_vmas. With a CPU mm analogy this would
  correspond to the mmap_lock. An rwsem allows several readers to walk
  the VM tree concurrently, but the benefit of that concurrency most
  likely varies from driver to driver.
* The ``userptr_seqlock``. This lock is taken in read mode for each
  userptr gpu_vma on the gpu_vm's userptr list, and in write mode during mmu
  notifier invalidation. This is not a real seqlock but described in
  ``mm/mmu_notifier.c`` as a "Collision-retry read-side/write-side
  'lock' a lot like a seqcount. However this allows multiple
  write-sides to hold it at once...". The read side critical section
  is enclosed by ``mmu_interval_read_begin() /
  mmu_interval_read_retry()`` with ``mmu_interval_read_begin()``
  sleeping if the write side is held.
  The write side is held by the core mm while calling mmu interval
  invalidation notifiers.
* The ``gpu_vm->resv`` lock. Protects the gpu_vm's list of gpu_vmas needing
  rebinding, as well as the residency state of all the gpu_vm's local
  GEM objects.
  Furthermore, it typically protects the gpu_vm's list of evicted and
  external GEM objects.
* The ``gpu_vm->userptr_notifier_lock``. This is an rwsem that is
  taken in read mode during exec and write mode during a mmu notifier
  invalidation. The userptr notifier lock is per gpu_vm.
* The ``gem_object->gpuva_lock``. This lock protects the GEM object's
  list of gpu_vm_bos. This is usually the same lock as the GEM
  object's dma_resv, but some drivers protect this list differently,
  see below.
* The ``gpu_vm list spinlocks``. With some implementations they are needed
  to be able to update the gpu_vm evicted- and external object
  list. For those implementations, the spinlocks are grabbed when the
  lists are manipulated. However, to avoid locking order violations
  with the dma_resv locks, a special scheme is needed when iterating
  over the lists.

.. _gpu_vma lifetime:

Protection and lifetime of gpu_vm_bos and gpu_vmas
==================================================

The GEM object's list of gpu_vm_bos and the gpu_vm_bo's list of gpu_vmas
are protected by the ``gem_object->gpuva_lock``, which is typically the
same as the GEM object's dma_resv. If the driver needs to access these
lists from within a dma_fence signalling critical section, it can instead
choose to protect them with a separate lock, which can be locked from within
the dma_fence signalling critical section. Such drivers then need to
pay additional attention to what locks need to be taken from within
the loop when iterating over the gpu_vm_bo and gpu_vma lists to avoid
locking-order violations.

The DRM GPUVM set of helpers provides lockdep asserts that this lock is
held in relevant situations and also provides a means of making itself
aware of which lock is actually used: :c:func:`drm_gem_gpuva_set_lock`.
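The choice of list lock described above can be pictured with a small
userspace model. This is only a sketch under stated assumptions: every
``sketch_*`` name is hypothetical, the "lock" is a single-threaded stand-in
that merely records whether it is held, and the assertion plays the role that
lockdep asserts play in the real helpers. It is not the DRM GPUVM
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Single-threaded stand-in for a lock: it only records whether it is held. */
struct sketch_lock {
	bool held;
};

static void sketch_lock_acquire(struct sketch_lock *lock)
{
	assert(!lock->held);
	lock->held = true;
}

static void sketch_lock_release(struct sketch_lock *lock)
{
	assert(lock->held);
	lock->held = false;
}

/* Hypothetical GEM object model, not struct drm_gem_object. */
struct sketch_gem_object {
	struct sketch_lock resv;        /* models the object's dma_resv lock */
	struct sketch_lock *gpuva_lock; /* whichever lock guards the gpu_vm_bo list */
	int num_vm_bos;                 /* stands in for the list of gpu_vm_bos */
};

static void sketch_gem_init(struct sketch_gem_object *obj)
{
	obj->resv.held = false;
	obj->gpuva_lock = &obj->resv;   /* typical case: same lock as the dma_resv */
	obj->num_vm_bos = 0;
}

/* Record that a separate, driver-chosen lock protects the list instead. */
static void sketch_gem_set_gpuva_lock(struct sketch_gem_object *obj,
				      struct sketch_lock *lock)
{
	obj->gpuva_lock = lock;
}

/* List manipulation asserts that the registered lock is held. */
static void sketch_gem_add_vm_bo(struct sketch_gem_object *obj)
{
	assert(obj->gpuva_lock->held);
	obj->num_vm_bos++;
}
```

In this model a caller takes ``obj->resv`` before ``sketch_gem_add_vm_bo()``
by default, and takes the driver's own lock instead once
``sketch_gem_set_gpuva_lock()`` has registered it.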
Each gpu_vm_bo holds a reference counted pointer to the underlying GEM
object, and each gpu_vma holds a reference counted pointer to the
gpu_vm_bo. When iterating over the GEM object's list of gpu_vm_bos and
over the gpu_vm_bo's list of gpu_vmas, the ``gem_object->gpuva_lock`` must
not be dropped, otherwise, gpu_vmas attached to a gpu_vm_bo may
disappear without notice since those are not reference-counted. A
driver may implement its own scheme to allow this at the expense of
additional complexity, but this is outside the scope of this document.

In the DRM GPUVM implementation, each gpu_vm_bo and each gpu_vma
holds a reference count on the gpu_vm itself. Due to this, and to avoid circular
reference counting, cleanup of the gpu_vm's gpu_vmas must not be done from the
gpu_vm's destructor. Drivers typically implement a gpu_vm close
function for this cleanup. The gpu_vm close function will abort gpu
execution using this VM, unmap all gpu_vmas and release page-table memory.

Revalidation and eviction of local objects
==========================================

Note that in all the code examples given below we use simplified
pseudo-code. In particular, the dma_resv deadlock avoidance algorithm
as well as reserving memory for dma_resv fences is left out.

Revalidation
____________

With VM_BIND, all local objects need to be resident when the gpu is
executing using the gpu_vm, and the objects need to have valid
gpu_vmas set up pointing to them. Typically, each gpu command buffer
submission is therefore preceded by a re-validation section:

.. code-block:: C

   dma_resv_lock(gpu_vm->resv);

   // Validation section starts here.
   for_each_gpu_vm_bo_on_evict_list(&gpu_vm->evict_list, &gpu_vm_bo) {
           validate_gem_bo(&gpu_vm_bo->gem_bo);

           // The following list iteration needs the Gem object's
           // dma_resv to be held (it protects the gpu_vm_bo's list of
           // gpu_vmas), but since local gem objects share the gpu_vm's
           // dma_resv, it is already held at this point.
           for_each_gpu_vma_of_gpu_vm_bo(&gpu_vm_bo, &gpu_vma)
                   move_gpu_vma_to_rebind_list(&gpu_vma, &gpu_vm->rebind_list);
   }

   for_each_gpu_vma_on_rebind_list(&gpu_vm->rebind_list, &gpu_vma) {
           rebind_gpu_vma(&gpu_vma);
           remove_gpu_vma_from_rebind_list(&gpu_vma);
   }
   // Validation section ends here, and job submission starts.

   add_dependencies(&gpu_job, &gpu_vm->resv);
   job_dma_fence = gpu_submit(&gpu_job);

   add_dma_fence(job_dma_fence, &gpu_vm->resv);
   dma_resv_unlock(gpu_vm->resv);

The reason for having a separate gpu_vm rebind list is that there
might be userptr gpu_vmas that are not mapping a buffer object that
also need rebinding.

Eviction
________

Eviction of one of these local objects will then look similar to the
following:

.. code-block:: C

   obj = get_object_from_lru();

   dma_resv_lock(obj->resv);
   for_each_gpu_vm_bo_of_obj(obj, &gpu_vm_bo)
           add_gpu_vm_bo_to_evict_list(&gpu_vm_bo, &gpu_vm->evict_list);

   add_dependencies(&eviction_job, &obj->resv);
   job_dma_fence = gpu_submit(&eviction_job);
   add_dma_fence(job_dma_fence, &obj->resv);

   dma_resv_unlock(obj->resv);
   put_object(obj);
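The interplay between the eviction and revalidation sequences for local
objects can be modelled in plain userspace C. This is a sketch under stated
assumptions, not driver code: every ``model_*`` name is hypothetical, each
object has a single mapping, the separate rebind list is folded into the
evict list walk, and the dma_resv is reduced to a single-threaded flag that
only records whether the lock is held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Single-threaded stand-in for a dma_resv lock: it only records ownership. */
struct model_resv {
	bool held;
};

static void model_resv_lock(struct model_resv *resv)
{
	assert(!resv->held);
	resv->held = true;
}

static void model_resv_unlock(struct model_resv *resv)
{
	assert(resv->held);
	resv->held = false;
}

struct model_vm;

/* Hypothetical local object with a single mapping in its gpu_vm. */
struct model_bo {
	struct model_resv *resv;      /* equals &vm->resv for a local object */
	struct model_vm *vm;
	bool resident;
	bool mapping_valid;           /* stands in for the gpu_vma page-table state */
	bool on_evict_list;
	struct model_bo *next_evicted;
};

struct model_vm {
	struct model_resv resv;
	struct model_bo *evict_list;  /* protected by resv */
};

static void model_vm_init(struct model_vm *vm)
{
	vm->resv.held = false;
	vm->evict_list = NULL;
}

static void model_bo_init_local(struct model_bo *bo, struct model_vm *vm)
{
	bo->resv = &vm->resv;         /* local objects share the gpu_vm's resv */
	bo->vm = vm;
	bo->resident = true;
	bo->mapping_valid = true;
	bo->on_evict_list = false;
	bo->next_evicted = NULL;
}

/* Eviction: locking the object's resv also protects the VM's evict list. */
static void model_evict(struct model_bo *bo)
{
	model_resv_lock(bo->resv);
	assert(bo->vm->resv.held);    /* same lock, hence the list is protected */
	bo->resident = false;
	bo->mapping_valid = false;
	if (!bo->on_evict_list) {
		bo->next_evicted = bo->vm->evict_list;
		bo->vm->evict_list = bo;
		bo->on_evict_list = true;
	}
	model_resv_unlock(bo->resv);
}

/* Exec: revalidate and rebind everything on the evict list, then "submit". */
static int model_exec(struct model_vm *vm)
{
	int revalidated = 0;

	model_resv_lock(&vm->resv);
	while (vm->evict_list) {
		struct model_bo *bo = vm->evict_list;

		vm->evict_list = bo->next_evicted;
		bo->next_evicted = NULL;
		bo->on_evict_list = false;
		bo->resident = true;      /* models validate_gem_bo() */
		bo->mapping_valid = true; /* models rebind_gpu_vma() */
		revalidated++;
	}
	/* Job submission and fence registration would happen here. */
	model_resv_unlock(&vm->resv);
	return revalidated;
}
```

Because ``model_evict()`` and ``model_exec()`` take the same lock, an object
cannot be put on the evict list while an exec is walking it, which is the
property the pseudo-code relies on.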
Note that since the object is local to the gpu_vm, it will share the gpu_vm's
dma_resv lock such that ``obj->resv == gpu_vm->resv``.
The gpu_vm_bos marked for eviction are put on the gpu_vm's evict list,
which is protected by ``gpu_vm->resv``. During eviction all local
objects have their dma_resv locked and, due to the above equality, also
the gpu_vm's dma_resv protecting the gpu_vm's evict list is locked.

With VM_BIND, gpu_vmas don't need to be unbound before eviction,
since the driver must ensure that the eviction blit or copy will wait
for GPU idle or depend on all previous GPU activity. Furthermore, any
subsequent attempt by the GPU to access freed memory through the
gpu_vma will be preceded by a new exec function, with a revalidation
section which will make sure all gpu_vmas are rebound. The eviction
code holding the object's dma_resv while revalidating will ensure a
new exec function may not race with the eviction.

A driver can be implemented in such a way that, on each exec function,
only a subset of vmas are selected for rebind. In this case, all vmas that are
*not* selected for rebind must be unbound before the exec
function workload is submitted.

Locking with external buffer objects
====================================

Since external buffer objects may be shared by multiple gpu_vms, they
can't share their reservation object with a single gpu_vm. Instead
they need to have a reservation object of their own. The external
objects bound to a gpu_vm using one or many gpu_vmas are therefore put on a
per-gpu_vm list which is protected by the gpu_vm's dma_resv lock or
one of the :ref:`gpu_vm list spinlocks <spinlock iteration>`. Once
the gpu_vm's reservation object is locked, it is safe to traverse the
external object list and lock the dma_resvs of all external
objects. However, if instead a list spinlock is used, a more elaborate
iteration scheme needs to be used.

At eviction time, the gpu_vm_bos of *all* the gpu_vms an external
object is bound to need to be put on their gpu_vm's evict list.
However, when evicting an external object, the dma_resvs of the
gpu_vms the object is bound to are typically not held.
Only the object’s private dma_resv can be guaranteed to be held. If there is a ww_acquire context at hand at eviction time we could grab those dma_resvs but that could cause expensive ww_mutex rollbacks. A simple option is to just mark the gpu_vm_bos of the evicted gem object with an }(hjhhhNhNubjJ)}(h ``evicted``h]hevicted}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjubhXl bool that is inspected before the next time the corresponding gpu_vm evicted list needs to be traversed. For example, when traversing the list of external objects and locking them. At that time, both the gpu_vm’s dma_resv and the object’s dma_resv is held, and the gpu_vm_bo marked evicted, can then be added to the gpu_vm’s list of evicted gpu_vm_bos. The }(hjhhhNhNubjJ)}(h ``evicted``h]hevicted}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjubh7 bool is formally protected by the object’s dma_resv.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhKhj`hhubh)}(hThe exec function becomesh]hThe exec function becomes}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM hj`hhubj)}(hXdma_resv_lock(gpu_vm->resv); // External object list is protected by the gpu_vm->resv lock. for_each_gpu_vm_bo_on_extobj_list(gpu_vm, &gpu_vm_bo) { dma_resv_lock(gpu_vm_bo.gem_obj->resv); if (gpu_vm_bo_marked_evicted(&gpu_vm_bo)) add_gpu_vm_bo_to_evict_list(&gpu_vm_bo, &gpu_vm->evict_list); } for_each_gpu_vm_bo_on_evict_list(&gpu_vm->evict_list, &gpu_vm_bo) { validate_gem_bo(&gpu_vm_bo->gem_bo); for_each_gpu_vma_of_gpu_vm_bo(&gpu_vm_bo, &gpu_vma) move_gpu_vma_to_rebind_list(&gpu_vma, &gpu_vm->rebind_list); } for_each_gpu_vma_on_rebind_list(&gpu vm->rebind_list, &gpu_vma) { rebind_gpu_vma(&gpu_vma); remove_gpu_vma_from_rebind_list(&gpu_vma); } add_dependencies(&gpu_job, &gpu_vm->resv); job_dma_fence = gpu_submit(&gpu_job)); add_dma_fence(job_dma_fence, &gpu_vm->resv); for_each_external_obj(gpu_vm, &obj) add_dma_fence(job_dma_fence, &obj->resv); dma_resv_unlock_all_resv_locks();h]hXdma_resv_lock(gpu_vm->resv); // External object list is protected by the gpu_vm->resv lock. 
for_each_gpu_vm_bo_on_extobj_list(gpu_vm, &gpu_vm_bo) { dma_resv_lock(gpu_vm_bo.gem_obj->resv); if (gpu_vm_bo_marked_evicted(&gpu_vm_bo)) add_gpu_vm_bo_to_evict_list(&gpu_vm_bo, &gpu_vm->evict_list); } for_each_gpu_vm_bo_on_evict_list(&gpu_vm->evict_list, &gpu_vm_bo) { validate_gem_bo(&gpu_vm_bo->gem_bo); for_each_gpu_vma_of_gpu_vm_bo(&gpu_vm_bo, &gpu_vma) move_gpu_vma_to_rebind_list(&gpu_vma, &gpu_vm->rebind_list); } for_each_gpu_vma_on_rebind_list(&gpu vm->rebind_list, &gpu_vma) { rebind_gpu_vma(&gpu_vma); remove_gpu_vma_from_rebind_list(&gpu_vma); } add_dependencies(&gpu_job, &gpu_vm->resv); job_dma_fence = gpu_submit(&gpu_job)); add_dma_fence(job_dma_fence, &gpu_vm->resv); for_each_external_obj(gpu_vm, &obj) add_dma_fence(job_dma_fence, &obj->resv); dma_resv_unlock_all_resv_locks();}hjsbah}(h]h ]h"]h$]h&]hhjjjj}uh1jhhhMhj`hhubh)}(hCAnd the corresponding shared-object aware eviction would look like:h]hCAnd the corresponding shared-object aware eviction would look like:}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM-hj`hhubj)}(hXobj = get_object_from_lru(); dma_resv_lock(obj->resv); for_each_gpu_vm_bo_of_obj(obj, &gpu_vm_bo) if (object_is_vm_local(obj)) add_gpu_vm_bo_to_evict_list(&gpu_vm_bo, &gpu_vm->evict_list); else mark_gpu_vm_bo_evicted(&gpu_vm_bo); add_dependencies(&eviction_job, &obj->resv); job_dma_fence = gpu_submit(&eviction_job); add_dma_fence(&obj->resv, job_dma_fence); dma_resv_unlock(&obj->resv); put_object(obj);h]hXobj = get_object_from_lru(); dma_resv_lock(obj->resv); for_each_gpu_vm_bo_of_obj(obj, &gpu_vm_bo) if (object_is_vm_local(obj)) add_gpu_vm_bo_to_evict_list(&gpu_vm_bo, &gpu_vm->evict_list); else mark_gpu_vm_bo_evicted(&gpu_vm_bo); add_dependencies(&eviction_job, &obj->resv); job_dma_fence = gpu_submit(&eviction_job); add_dma_fence(&obj->resv, job_dma_fence); dma_resv_unlock(&obj->resv); put_object(obj);}hjsbah}(h]h ]h"]h$]h&]hhjjjj}uh1jhhhM/hj`hhubj)}(h.. 
_Spinlock iteration:h]h}(h]h ]h"]h$]h&]jspinlock-iterationuh1jhMAhj`hhhhubeh}(h]$locking-with-external-buffer-objectsah ]h"]$locking with external buffer objectsah$]h&]uh1hhhhhhhhKubh)}(hhh](h)}(h;Accessing the gpu_vm's lists without the dma_resv lock heldh]h=Accessing the gpu_vm’s lists without the dma_resv lock held}(hj7hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj4hhhhhMDubh)}(hXqSome drivers will hold the gpu_vm's dma_resv lock when accessing the gpu_vm's evict list and external objects lists. However, there are drivers that need to access these lists without the dma_resv lock held, for example due to asynchronous state updates from within the dma_fence signalling critical path. In such cases, a spinlock can be used to protect manipulation of the lists. However, since higher level sleeping locks need to be taken for each list item while iterating over the lists, the items already iterated over need to be temporarily moved to a private list and the spinlock released while processing each item:h]hXuSome drivers will hold the gpu_vm’s dma_resv lock when accessing the gpu_vm’s evict list and external objects lists. However, there are drivers that need to access these lists without the dma_resv lock held, for example due to asynchronous state updates from within the dma_fence signalling critical path. In such cases, a spinlock can be used to protect manipulation of the lists. 
However, since higher level sleeping locks need to be taken for each list item while iterating over the lists, the items already iterated over need to be temporarily moved to a private list and the spinlock released while processing each item:}(hjEhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMFhj4hhubh)}(hX@code block:: C struct list_head still_in_list; INIT_LIST_HEAD(&still_in_list); spin_lock(&gpu_vm->list_lock); do { struct list_head *entry = list_first_entry_or_null(&gpu_vm->list, head); if (!entry) break; list_move_tail(&entry->head, &still_in_list); list_entry_get_unless_zero(entry); spin_unlock(&gpu_vm->list_lock); process(entry); spin_lock(&gpu_vm->list_lock); list_entry_put(entry); } while (true); list_splice_tail(&still_in_list, &gpu_vm->list); spin_unlock(&gpu_vm->list_lock);h]hX@code block:: C struct list_head still_in_list; INIT_LIST_HEAD(&still_in_list); spin_lock(&gpu_vm->list_lock); do { struct list_head *entry = list_first_entry_or_null(&gpu_vm->list, head); if (!entry) break; list_move_tail(&entry->head, &still_in_list); list_entry_get_unless_zero(entry); spin_unlock(&gpu_vm->list_lock); process(entry); spin_lock(&gpu_vm->list_lock); list_entry_put(entry); } while (true); list_splice_tail(&still_in_list, &gpu_vm->list); spin_unlock(&gpu_vm->list_lock);}hjSsbah}(h]h ]h"]h$]h&]hhuh1hhj4hhhhhMjubh)}(hXDue to the additional locking and atomic operations, drivers that *can* avoid accessing the gpu_vm's list outside of the dma_resv lock might want to avoid also this iteration scheme. Particularly, if the driver anticipates a large number of list items. For lists where the anticipated number of list items is small, where list iteration doesn't happen very often or if there is a significant additional cost associated with each iteration, the atomic operation overhead associated with this type of iteration is, most likely, negligible. 
Note that if this scheme is used, it is necessary to make sure this list iteration is protected by an outer level lock or semaphore, since list items are temporarily pulled off the list while iterating, and it is also worth mentioning that the local list ``still_in_list`` should also be considered protected by the ``gpu_vm->list_lock``, and it is thus possible that items can be removed also from the local list concurrently with list iteration.h](hBDue to the additional locking and atomic operations, drivers that }(hjahhhNhNubj7)}(h*can*h]hcan}(hjihhhNhNubah}(h]h ]h"]h$]h&]uh1j6hjaubhX avoid accessing the gpu_vm’s list outside of the dma_resv lock might want to avoid also this iteration scheme. Particularly, if the driver anticipates a large number of list items. For lists where the anticipated number of list items is small, where list iteration doesn’t happen very often or if there is a significant additional cost associated with each iteration, the atomic operation overhead associated with this type of iteration is, most likely, negligible. 
Note that if this scheme is used, it is necessary to make sure this list iteration is protected by an outer level lock or semaphore, since list items are temporarily pulled off the list while iterating, and it is also worth mentioning that the local list }(hjahhhNhNubjJ)}(h``still_in_list``h]h still_in_list}(hj{hhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjaubh, should also be considered protected by the }(hjahhhNhNubjJ)}(h``gpu_vm->list_lock``h]hgpu_vm->list_lock}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjaubhn, and it is thus possible that items can be removed also from the local list concurrently with list iteration.}(hjahhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMkhj4hhubh)}(hPlease refer to the :ref:`DRM GPUVM locking section ` and its internal :c:func:`get_next_vm_bo_from_list` function.h](hPlease refer to the }(hjhhhNhNubh)}(h4:ref:`DRM GPUVM locking section `h]h)}(hjh]hDRM GPUVM locking section}(hjhhhNhNubah}(h]h ](jstdstd-refeh"]h$]h&]uh1hhjubah}(h]h ]h"]h$]h&]refdocj refdomainjreftyperef refexplicitrefwarnjdrm_gpuvm_lockinguh1hhhhM{hjubh and its internal }(hjhhhNhNubh)}(h":c:func:`get_next_vm_bo_from_list`h]jJ)}(hjh]hget_next_vm_bo_from_list()}(hjhhhNhNubah}(h]h ](jjc-funceh"]h$]h&]uh1jIhjubah}(h]h ]h"]h$]h&]refdocj refdomainjreftypefunc refexplicitrefwarnjget_next_vm_bo_from_listuh1hhhhM{hjubh function.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhM{hj4hhubeh}(h](;accessing-the-gpu-vm-s-lists-without-the-dma-resv-lock-heldj+eh ]h"](;accessing the gpu_vm's lists without the dma_resv lock heldspinlock iterationeh$]h&]uh1hhhhhhhhMDjS}jj!sjU}j+j!subh)}(hhh](h)}(huserptr gpu_vmash]huserptr gpu_vmas}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhjhhhhhMubh)}(hX4A userptr gpu_vma is a gpu_vma that, instead of mapping a buffer object to a GPU virtual address range, directly maps a CPU mm range of anonymous- or file page-cache pages. 
A very simple approach would be to just pin the pages using pin_user_pages() at bind time and unpin them at unbind time, but this creates a Denial-Of-Service vector since a single user-space process would be able to pin down all of system memory, which is not desirable. (For special use-cases and assuming proper accounting pinning might still be a desirable feature, though). What we need to do in the general case is to obtain a reference to the desired pages, make sure we are notified using a MMU notifier just before the CPU mm unmaps the pages, dirty them if they are not mapped read-only to the GPU, and then drop the reference. When we are notified by the MMU notifier that CPU mm is about to drop the pages, we need to stop GPU access to the pages by waiting for VM idle in the MMU notifier and make sure that before the next time the GPU tries to access whatever is now present in the CPU mm range, we unmap the old pages from the GPU page tables and repeat the process of obtaining new page references. (See the :ref:`notifier example ` below). Note that when the core mm decides to laundry pages, we get such an unmap MMU notification and can mark the pages dirty again before the next GPU access. We also get similar MMU notifications for NUMA accounting which the GPU driver doesn't really need to care about, but so far it has proven difficult to exclude certain notifications.h](hXA userptr gpu_vma is a gpu_vma that, instead of mapping a buffer object to a GPU virtual address range, directly maps a CPU mm range of anonymous- or file page-cache pages. A very simple approach would be to just pin the pages using pin_user_pages() at bind time and unpin them at unbind time, but this creates a Denial-Of-Service vector since a single user-space process would be able to pin down all of system memory, which is not desirable. (For special use-cases and assuming proper accounting pinning might still be a desirable feature, though). 
What we need to do in the general case is to obtain a reference to the desired pages, make sure we are notified using a MMU notifier just before the CPU mm unmaps the pages, dirty them if they are not mapped read-only to the GPU, and then drop the reference. When we are notified by the MMU notifier that CPU mm is about to drop the pages, we need to stop GPU access to the pages by waiting for VM idle in the MMU notifier and make sure that before the next time the GPU tries to access whatever is now present in the CPU mm range, we unmap the old pages from the GPU page tables and repeat the process of obtaining new page references. (See the }(hjhhhNhNubh)}(h.:ref:`notifier example `h]h)}(hj h]hnotifier example}(hj"hhhNhNubah}(h]h ](jstdstd-refeh"]h$]h&]uh1hhjubah}(h]h ]h"]h$]h&]refdocj refdomainj,reftyperef refexplicitrefwarnjinvalidation exampleuh1hhhhMhjubhX[ below). Note that when the core mm decides to laundry pages, we get such an unmap MMU notification and can mark the pages dirty again before the next GPU access. 
We also get similar MMU notifications for NUMA accounting which the GPU driver doesn’t really need to care about, but so far it has proven difficult to exclude certain notifications.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMhjhhubh)}(hUsing a MMU notifier for device DMA (and other methods) is described in :ref:`the pin_user_pages() documentation `.h](hHUsing a MMU notifier for device DMA (and other methods) is described in }(hjHhhhNhNubh)}(hJ:ref:`the pin_user_pages() documentation `h]h)}(hjRh]h"the pin_user_pages() documentation}(hjThhhNhNubah}(h]h ](jstdstd-refeh"]h$]h&]uh1hhjPubah}(h]h ]h"]h$]h&]refdocj refdomainj^reftyperef refexplicitrefwarnjmmu-notifier-registration-caseuh1hhhhMhjHubh.}(hjHhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMhjhhubh)}(hXNow, the method of obtaining struct page references using get_user_pages() unfortunately can't be used under a dma_resv lock since that would violate the locking order of the dma_resv lock vs the mmap_lock that is grabbed when resolving a CPU pagefault. This means the gpu_vm's list of userptr gpu_vmas needs to be protected by an outer lock, which in our example below is the ``gpu_vm->lock``.h](hX}Now, the method of obtaining struct page references using get_user_pages() unfortunately can’t be used under a dma_resv lock since that would violate the locking order of the dma_resv lock vs the mmap_lock that is grabbed when resolving a CPU pagefault. 
This means the gpu_vm’s list of userptr gpu_vmas needs to be protected by an outer lock, which in our example below is the }(hjzhhhNhNubjJ)}(h``gpu_vm->lock``h]h gpu_vm->lock}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjzubh.}(hjzhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMhjhhubh)}(hLThe MMU interval seqlock for a userptr gpu_vma is used in the following way:h]hLThe MMU interval seqlock for a userptr gpu_vma is used in the following way:}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjhhubj)}(hX// Exclusive locking mode here is strictly needed only if there are // invalidated userptr gpu_vmas present, to avoid concurrent userptr // revalidations of the same userptr gpu_vma. down_write(&gpu_vm->lock); retry: // Note: mmu_interval_read_begin() blocks until there is no // invalidation notifier running anymore. seq = mmu_interval_read_begin(&gpu_vma->userptr_interval); if (seq != gpu_vma->saved_seq) { obtain_new_page_pointers(&gpu_vma); dma_resv_lock(&gpu_vm->resv); add_gpu_vma_to_revalidate_list(&gpu_vma, &gpu_vm); dma_resv_unlock(&gpu_vm->resv); gpu_vma->saved_seq = seq; } // The usual revalidation goes here. // Final userptr sequence validation may not happen before the // submission dma_fence is added to the gpu_vm's resv, from the POW // of the MMU invalidation notifier. Hence the // userptr_notifier_lock that will make them appear atomic. 
add_dependencies(&gpu_job, &gpu_vm->resv); down_read(&gpu_vm->userptr_notifier_lock); if (mmu_interval_read_retry(&gpu_vma->userptr_interval, gpu_vma->saved_seq)) { up_read(&gpu_vm->userptr_notifier_lock); goto retry; } job_dma_fence = gpu_submit(&gpu_job)); add_dma_fence(job_dma_fence, &gpu_vm->resv); for_each_external_obj(gpu_vm, &obj) add_dma_fence(job_dma_fence, &obj->resv); dma_resv_unlock_all_resv_locks(); up_read(&gpu_vm->userptr_notifier_lock); up_write(&gpu_vm->lock);h]hX// Exclusive locking mode here is strictly needed only if there are // invalidated userptr gpu_vmas present, to avoid concurrent userptr // revalidations of the same userptr gpu_vma. down_write(&gpu_vm->lock); retry: // Note: mmu_interval_read_begin() blocks until there is no // invalidation notifier running anymore. seq = mmu_interval_read_begin(&gpu_vma->userptr_interval); if (seq != gpu_vma->saved_seq) { obtain_new_page_pointers(&gpu_vma); dma_resv_lock(&gpu_vm->resv); add_gpu_vma_to_revalidate_list(&gpu_vma, &gpu_vm); dma_resv_unlock(&gpu_vm->resv); gpu_vma->saved_seq = seq; } // The usual revalidation goes here. // Final userptr sequence validation may not happen before the // submission dma_fence is added to the gpu_vm's resv, from the POW // of the MMU invalidation notifier. Hence the // userptr_notifier_lock that will make them appear atomic. 
add_dependencies(&gpu_job, &gpu_vm->resv); down_read(&gpu_vm->userptr_notifier_lock); if (mmu_interval_read_retry(&gpu_vma->userptr_interval, gpu_vma->saved_seq)) { up_read(&gpu_vm->userptr_notifier_lock); goto retry; } job_dma_fence = gpu_submit(&gpu_job)); add_dma_fence(job_dma_fence, &gpu_vm->resv); for_each_external_obj(gpu_vm, &obj) add_dma_fence(job_dma_fence, &obj->resv); dma_resv_unlock_all_resv_locks(); up_read(&gpu_vm->userptr_notifier_lock); up_write(&gpu_vm->lock);}hjsbah}(h]h ]h"]h$]h&]hhjjjj}uh1jhhhMhjhhubh)}(hXEThe code between ``mmu_interval_read_begin()`` and the ``mmu_interval_read_retry()`` marks the read side critical section of what we call the ``userptr_seqlock``. In reality, the gpu_vm's userptr gpu_vma list is looped through, and the check is done for *all* of its userptr gpu_vmas, although we only show a single one here.h](hThe code between }(hjhhhNhNubjJ)}(h``mmu_interval_read_begin()``h]hmmu_interval_read_begin()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjubh and the }(hjhhhNhNubjJ)}(h``mmu_interval_read_retry()``h]hmmu_interval_read_retry()}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjubh: marks the read side critical section of what we call the }(hjhhhNhNubjJ)}(h``userptr_seqlock``h]huserptr_seqlock}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjubh_. 
In reality, the gpu_vm’s userptr gpu_vma list is looped through, and the check is done for }(hjhhhNhNubj7)}(h*all*h]hall}(hjhhhNhNubah}(h]h ]h"]h$]h&]uh1j6hjubhB of its userptr gpu_vmas, although we only show a single one here.}(hjhhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMhjhhubh)}(hThe userptr gpu_vma MMU invalidation notifier might be called from reclaim context and, again, to avoid locking order violations, we can't take any dma_resv lock nor the gpu_vm->lock from within it.h]hThe userptr gpu_vma MMU invalidation notifier might be called from reclaim context and, again, to avoid locking order violations, we can’t take any dma_resv lock nor the gpu_vm->lock from within it.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjhhubj)}(h.. _Invalidation example:h]h}(h]h ]h"]h$]h&]jinvalidation-exampleuh1jhMhjhhhhubj)}(hXhbool gpu_vma_userptr_invalidate(userptr_interval, cur_seq) { // Make sure the exec function either sees the new sequence // and backs off or we wait for the dma-fence: down_write(&gpu_vm->userptr_notifier_lock); mmu_interval_set_seq(userptr_interval, cur_seq); up_write(&gpu_vm->userptr_notifier_lock); // At this point, the exec function can't succeed in // submitting a new job, because cur_seq is an invalid // sequence number and will always cause a retry. When all // invalidation callbacks, the mmu notifier core will flip // the sequence number to a valid one. However we need to // stop gpu access to the old pages here. 
dma_resv_wait_timeout(&gpu_vm->resv, DMA_RESV_USAGE_BOOKKEEP, false, MAX_SCHEDULE_TIMEOUT); return true; }h]hXhbool gpu_vma_userptr_invalidate(userptr_interval, cur_seq) { // Make sure the exec function either sees the new sequence // and backs off or we wait for the dma-fence: down_write(&gpu_vm->userptr_notifier_lock); mmu_interval_set_seq(userptr_interval, cur_seq); up_write(&gpu_vm->userptr_notifier_lock); // At this point, the exec function can't succeed in // submitting a new job, because cur_seq is an invalid // sequence number and will always cause a retry. When all // invalidation callbacks, the mmu notifier core will flip // the sequence number to a valid one. However we need to // stop gpu access to the old pages here. dma_resv_wait_timeout(&gpu_vm->resv, DMA_RESV_USAGE_BOOKKEEP, false, MAX_SCHEDULE_TIMEOUT); return true; }<}hj& sbah}(h]j% ah ]h"]invalidation exampleah$]h&]hhjjjj}uh1jhhhMhjhhjS}j2 j sjU}j% j subh)}(hWhen this invalidation notifier returns, the GPU can no longer be accessing the old pages of the userptr gpu_vma and needs to redo the page-binding before a new GPU submission can succeed.h]hWhen this invalidation notifier returns, the GPU can no longer be accessing the old pages of the userptr gpu_vma and needs to redo the page-binding before a new GPU submission can succeed.}(hj8 hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjhhubh)}(hhh](h)}(h1Efficient userptr gpu_vma exec_function iterationh]h1Efficient userptr gpu_vma exec_function iteration}(hjI hhhNhNubah}(h]h ]h"]h$]h&]uh1hhjF hhhhhMubh)}(hXNIf the gpu_vm's list of userptr gpu_vmas becomes large, it's inefficient to iterate through the complete lists of userptrs on each exec function to check whether each userptr gpu_vma's saved sequence number is stale. A solution to this is to put all *invalidated* userptr gpu_vmas on a separate gpu_vm list and only check the gpu_vmas present on this list on each exec function. 
This list will then lend itself very-well to the spinlock locking scheme that is :ref:`described in the spinlock iteration section `, since in the mmu notifier, where we add the invalidated gpu_vmas to the list, it's not possible to take any outer locks like the ``gpu_vm->lock`` or the ``gpu_vm->resv`` lock. Note that the ``gpu_vm->lock`` still needs to be taken while iterating to ensure the list is complete, as also mentioned in that section.h](hXIf the gpu_vm’s list of userptr gpu_vmas becomes large, it’s inefficient to iterate through the complete lists of userptrs on each exec function to check whether each userptr gpu_vma’s saved sequence number is stale. A solution to this is to put all }(hjW hhhNhNubj7)}(h *invalidated*h]h invalidated}(hj_ hhhNhNubah}(h]h ]h"]h$]h&]uh1j6hjW ubh userptr gpu_vmas on a separate gpu_vm list and only check the gpu_vmas present on this list on each exec function. This list will then lend itself very-well to the spinlock locking scheme that is }(hjW hhhNhNubh)}(hG:ref:`described in the spinlock iteration section `h]h)}(hjs h]h+described in the spinlock iteration section}(hju hhhNhNubah}(h]h ](jstdstd-refeh"]h$]h&]uh1hhjq ubah}(h]h ]h"]h$]h&]refdocj refdomainj reftyperef refexplicitrefwarnjspinlock iterationuh1hhhhMhjW ubh, since in the mmu notifier, where we add the invalidated gpu_vmas to the list, it’s not possible to take any outer locks like the }(hjW hhhNhNubjJ)}(h``gpu_vm->lock``h]h gpu_vm->lock}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjW ubh or the }(hjW hhhNhNubjJ)}(h``gpu_vm->resv``h]h gpu_vm->resv}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjW ubh lock. 
Note that the }(hjW hhhNhNubjJ)}(h``gpu_vm->lock``h]h gpu_vm->lock}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jIhjW ubhk still needs to be taken while iterating to ensure the list is complete, as also mentioned in that section.}(hjW hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMhjF hhubh)}(hIf using an invalidated userptr list like this, the retry check in the exec function trivially becomes a check for invalidated list empty.h]hIf using an invalidated userptr list like this, the retry check in the exec function trivially becomes a check for invalidated list empty.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMhjF hhubeh}(h]1efficient-userptr-gpu-vma-exec-function-iterationah ]h"]1efficient userptr gpu_vma exec_function iterationah$]h&]uh1hhjhhhhhMubeh}(h]userptr-gpu-vmasah ]h"]userptr gpu_vmasah$]h&]uh1hhhhhhhhMubh)}(hhh](h)}(hLocking at bind and unbind timeh]hLocking at bind and unbind time}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhj hhhhhMubh)}(hXAt bind time, assuming a GEM object backed gpu_vma, each gpu_vma needs to be associated with a gpu_vm_bo and that gpu_vm_bo in turn needs to be added to the GEM object's gpu_vm_bo list, and possibly to the gpu_vm's external object list. This is referred to as *linking* the gpu_vma, and typically requires that the ``gpu_vm->lock`` and the ``gem_object->gpuva_lock`` are held. When unlinking a gpu_vma the same locks should be held, and that ensures that when iterating over ``gpu_vmas`, either under the ``gpu_vm->resv`` or the GEM object's dma_resv, that the gpu_vmas stay alive as long as the lock under which we iterate is not released. 
For userptr gpu_vmas it's similarly required that during vma destroy, the outer ``gpu_vm->lock`` is held, since otherwise when iterating over the invalidated userptr list as described in the previous section, there is nothing keeping those userptr gpu_vmas alive.h](hXAt bind time, assuming a GEM object backed gpu_vma, each gpu_vma needs to be associated with a gpu_vm_bo and that gpu_vm_bo in turn needs to be added to the GEM object’s gpu_vm_bo list, and possibly to the gpu_vm’s external object list. This is referred to as }(hj hhhNhNubj7)}(h *linking*h]hlinking}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j6hj ubh. the gpu_vma, and typically requires that the }(hj hhhNhNubjJ)}(h``gpu_vm->lock``h]h gpu_vm->lock}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jIhj ubh and the }(hj hhhNhNubjJ)}(h``gem_object->gpuva_lock``h]hgem_object->gpuva_lock}(hj, hhhNhNubah}(h]h ]h"]h$]h&]uh1jIhj ubhm are held. When unlinking a gpu_vma the same locks should be held, and that ensures that when iterating over }(hj hhhNhNubjJ)}(h.``gpu_vmas`, either under the ``gpu_vm->resv``h]h*gpu_vmas`, either under the ``gpu_vm->resv}(hj> hhhNhNubah}(h]h ]h"]h$]h&]uh1jIhj ubh or the GEM object’s dma_resv, that the gpu_vmas stay alive as long as the lock under which we iterate is not released. 
For userptr gpu_vmas it’s similarly required that during vma destroy, the outer }(hj hhhNhNubjJ)}(h``gpu_vm->lock``h]h gpu_vm->lock}(hjP hhhNhNubah}(h]h ]h"]h$]h&]uh1jIhj ubh is held, since otherwise when iterating over the invalidated userptr list as described in the previous section, there is nothing keeping those userptr gpu_vmas alive.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhMhj hhubeh}(h]locking-at-bind-and-unbind-timeah ]h"]locking at bind and unbind timeah$]h&]uh1hhhhhhhhMubh)}(hhh](h)}(h5Locking for recoverable page-fault page-table updatesh]h5Locking for recoverable page-fault page-table updates}(hjs hhhNhNubah}(h]h ]h"]h$]h&]uh1hhjp hhhhhM$ubh)}(hZThere are two important things we need to ensure with locking for recoverable page-faults:h]hZThere are two important things we need to ensure with locking for recoverable page-faults:}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM&hjp hhubj;)}(hhh](j@)}(hAt the time we return pages back to the system / allocator for reuse, there should be no remaining GPU mappings and any GPU TLB must have been flushed.h]h)}(hAt the time we return pages back to the system / allocator for reuse, there should be no remaining GPU mappings and any GPU TLB must have been flushed.h]hAt the time we return pages back to the system / allocator for reuse, there should be no remaining GPU mappings and any GPU TLB must have been flushed.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM)hj ubah}(h]h ]h"]h$]h&]uh1j?hj hhhhhNubj@)}(h6The unmapping and mapping of a gpu_vma must not race. 
h]h)}(h5The unmapping and mapping of a gpu_vma must not race.h]h5The unmapping and mapping of a gpu_vma must not race.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM,hj ubah}(h]h ]h"]h$]h&]uh1j?hj hhhhhNubeh}(h]h ]h"]h$]h&]j1j2uh1j:hhhM)hjp hhubh)}(hXUSince the unmapping (or zapping) of GPU ptes is typically taking place where it is hard or even impossible to take any outer level locks we must either introduce a new lock that is held at both mapping and unmapping time, or look at the locks we do hold at unmapping time and make sure that they are held also at mapping time. For userptr gpu_vmas, the ``userptr_seqlock`` is held in write mode in the mmu invalidation notifier where zapping happens. Hence, if the ``userptr_seqlock`` as well as the ``gpu_vm->userptr_notifier_lock`` is held in read mode during mapping, it will not race with the zapping. For GEM object backed gpu_vmas, zapping will take place under the GEM object's dma_resv and ensuring that the dma_resv is held also when populating the page-tables for any gpu_vma pointing to the GEM object, will similarly ensure we are race-free.h](hXaSince the unmapping (or zapping) of GPU ptes is typically taking place where it is hard or even impossible to take any outer level locks we must either introduce a new lock that is held at both mapping and unmapping time, or look at the locks we do hold at unmapping time and make sure that they are held also at mapping time. For userptr gpu_vmas, the }(hj hhhNhNubjJ)}(h``userptr_seqlock``h]huserptr_seqlock}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jIhj ubh] is held in write mode in the mmu invalidation notifier where zapping happens. Hence, if the }(hj hhhNhNubjJ)}(h``userptr_seqlock``h]huserptr_seqlock}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jIhj ubh as well as the }(hj hhhNhNubjJ)}(h!``gpu_vm->userptr_notifier_lock``h]hgpu_vm->userptr_notifier_lock}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1jIhj ubhXB is held in read mode during mapping, it will not race with the zapping. 
For GEM object backed gpu_vmas, zapping will take place under the GEM object’s dma_resv and ensuring that the dma_resv is held also when populating the page-tables for any gpu_vma pointing to the GEM object, will similarly ensure we are race-free.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1hhhhM.hjp hhubh)}(hIf any part of the mapping is performed asynchronously under a dma-fence with these locks released, the zapping will need to wait for that dma-fence to signal under the relevant lock before starting to modify the page-table.h]hIf any part of the mapping is performed asynchronously under a dma-fence with these locks released, the zapping will need to wait for that dma-fence to signal under the relevant lock before starting to modify the page-table.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhM<hjp hhubh)}(hX3Since modifying the page-table structure in a way that frees up page-table memory might also require outer level locks, the zapping of GPU ptes typically focuses only on zeroing page-table or page-directory entries and flushing TLB, whereas freeing of page-table memory is deferred to unbind or rebind time.h]hX3Since modifying the page-table structure in a way that frees up page-table memory might also require outer level locks, the zapping of GPU ptes typically focuses only on zeroing page-table or page-directory entries and flushing TLB, whereas freeing of page-table memory is deferred to unbind or rebind time.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhMAhjp hhubeh}(h]5locking-for-recoverable-page-fault-page-table-updatesah ]h"]5locking for recoverable page-fault page-table updatesah$]h&]uh1hhhhhhhhM$ubeh}(h]vm-bind-lockingah ]h"]vm_bind lockingah$]h&]uh1hhhhhhhhKubeh}(h]h ]h"]h$]h&]sourcehuh1hcurrent_sourceN current_lineNsettingsdocutils.frontendValues)}(hN generatorN datestampN source_linkN source_urlN toc_backlinksentryfootnote_backlinksK sectnum_xformKstrip_commentsNstrip_elements_with_classesN strip_classesN report_levelK halt_levelKexit_status_levelKdebugNwarning_streamN 
tracebackinput_encoding utf-8-siginput_encoding_error_handlerstrictoutput_encodingutf-8output_encoding_error_handlerj[ error_encodingutf-8error_encoding_error_handlerbackslashreplace language_codeenrecord_dependenciesNconfigN id_prefixhauto_id_prefixid dump_settingsNdump_internalsNdump_transformsNdump_pseudo_xmlNexpose_internalsNstrict_visitorN_disable_configN_sourceh _destinationN _config_files]7/var/lib/git/docbuild/linux/Documentation/docutils.confafile_insertion_enabled raw_enabledKline_length_limitM'pep_referencesN pep_base_urlhttps://peps.python.org/pep_file_url_templatepep-%04drfc_referencesN rfc_base_url&https://datatracker.ietf.org/doc/html/ tab_widthKtrim_footnote_reference_spacesyntax_highlightlong smart_quotessmartquotes_locales]character_level_inline_markupdoctitle_xform docinfo_xformKsectsubtitle_xform image_loadinglinkembed_stylesheetcloak_email_addressessection_self_linkenvNubreporterNindirect_targets]substitution_defs}substitution_names}refnames}refids}(j]jaj+]j!aj% ]j aunameids}(j5 j2 j&j#j8j5jjjPjjOjLj]jZjjjUjRj1j.jj+jjj j j2 j% j j jm jj j- j* u nametypes}(j5 j&j8jjPjOj]jjUj1jjj j2 j jm j- uh}(j2 hj#hj5j)jj;jjjLjjZjWjjvjRjj.j`j+j4jj4j jj% j& j jF jj j j* jp u footnote_refs} citation_refs} autofootnotes]autofootnote_refs]symbol_footnotes]symbol_footnote_refs] footnotes] citations]autofootnote_startKsymbol_footnote_startK id_counter collectionsCounter}Rparse_messages]transform_messages](hsystem_message)}(hhh]h)}(hhh]h6Hyperlink target "gpu-vma-lifetime" is not referenced.}hj sbah}(h]h ]h"]h$]h&]uh1hhj ubah}(h]h ]h"]h$]h&]levelKtypeINFOsourcehlineKsuh1j ubj )}(hhh]h)}(hhh]h8Hyperlink target "spinlock-iteration" is not referenced.}hj sbah}(h]h ]h"]h$]h&]uh1hhj ubah}(h]h ]h"]h$]h&]levelKtypej sourcehlineMAuh1j ubj )}(hhh]h)}(hhh]h:Hyperlink target "invalidation-example" is not referenced.}hj sbah}(h]h ]h"]h$]h&]uh1hhj ubah}(h]h ]h"]h$]h&]levelKtypej sourcehlineMuh1j ube transformerN include_log] decorationNhhub.