.. SPDX-License-Identifier: GPL-2.0
.. _iomap_design:

..
        Dumb style notes to maintain the author's sanity:
        Please try to start sentences on separate lines so that
        sentence changes don't bleed colors in diff.
        Heading decorations are documented in sphinx.rst.

==============
Library Design
==============

.. contents:: Table of Contents
   :local:

Introduction
============

iomap is a filesystem library for handling common file operations.
The library has two layers:

 1. A lower layer that provides an iterator over ranges of file offsets.
    This layer tries to obtain mappings of each file range to storage
    from the filesystem, but the storage information is not necessarily
    required.

 2. An upper layer that acts upon the space mappings provided by the
    lower layer iterator.

The iteration can involve mappings of a file's logical offset ranges to
physical extents, but the storage layer information is not necessarily
required, e.g. for walking cached file information.
The library exports various APIs for implementing file operations such
as:

 * Pagecache reads and writes
 * Folio write faults to the pagecache
 * Writeback of dirty folios
 * Direct I/O reads and writes
 * fsdax I/O reads, writes, loads, and stores
 * FIEMAP
 * lseek ``SEEK_DATA`` and ``SEEK_HOLE``
 * swapfile activation

The origin of this library is the file I/O path that XFS once used; it
has now been extended to cover several other operations.

Who Should Read This?
=====================

The target audience for this document is filesystem, storage, and
pagecache programmers and code reviewers.

If you are working on PCI, machine architectures, or device drivers, you
are most likely in the wrong place.

How Is This Better?
===================

Unlike the classic Linux I/O model which breaks file I/O into small
units (generally memory pages or blocks) and looks up space mappings on
the basis of that unit, the iomap model asks the filesystem for the
largest space mappings that it can create for a given file operation and
initiates operations on that basis.
This strategy improves the filesystem's visibility into the size of the
operation being performed, which enables it to combat fragmentation with
larger space allocations when possible.
Larger space mappings improve runtime performance by amortizing the cost
of mapping function calls into the filesystem across a larger amount of
data.
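
To put a number on the amortization argument, the sketch below counts how many mapping lookups one operation needs for a given lookup granularity.
``mapping_lookups`` is a hypothetical helper written for this illustration; it is not part of iomap, and it assumes the operation starts on a lookup boundary.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many mapping lookups does an operation of
 * op_bytes need if each lookup covers at most bytes_per_lookup bytes?
 * Assumes the operation starts on a lookup boundary.
 */
static uint64_t mapping_lookups(uint64_t op_bytes, uint64_t bytes_per_lookup)
{
	assert(bytes_per_lookup > 0);
	/* Round up without overflowing for large op_bytes. */
	return op_bytes / bytes_per_lookup + (op_bytes % bytes_per_lookup != 0);
}
```

Under these assumptions, a 1 MiB write mapped one 4 KiB unit at a time needs 256 lookups, while a single mapping that covers the whole range needs one.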

At a high level, an iomap operation `looks like this
<https://lore.kernel.org/all/ZGbVaewzcCysclPt@dread.disaster.area/>`_:

1. For each byte in the operation range...

   1. Obtain a space mapping via ``->iomap_begin``

   2. For each sub-unit of work...

      1. Revalidate the mapping and go back to (1) above, if necessary.
         So far only the pagecache operations need to do this.

      2. Do the work

   3. Increment operation cursor

   4. Release the mapping via ``->iomap_end``, if necessary
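
The steps above can be modelled in userspace C as a rough sketch.
Every name here (``toy_mapping``, ``toy_ops``, ``toy_iterate``, and the two demo helpers) is hypothetical and only loosely modelled on ``->iomap_begin`` and ``->iomap_end``; the real iterator carries more state, and the revalidation step is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a space mapping. */
struct toy_mapping {
	uint64_t offset;	/* file offset where the mapping starts */
	uint64_t length;	/* number of bytes the mapping covers */
};

/* Hypothetical callbacks modelled on ->iomap_begin and ->iomap_end. */
struct toy_ops {
	int (*begin)(uint64_t pos, uint64_t len, struct toy_mapping *map);
	int (*end)(uint64_t pos, uint64_t len, uint64_t done,
		   struct toy_mapping *map);	/* optional, may be NULL */
};

/*
 * Walk the byte range [pos, pos + len) one mapping at a time.
 * Revalidation of a stale mapping (step 2.1) is left out of this sketch.
 */
static int toy_iterate(const struct toy_ops *ops, uint64_t pos, uint64_t len,
		       int (*work)(uint64_t pos, uint64_t len))
{
	assert(ops && ops->begin && work);

	while (len > 0) {
		struct toy_mapping map = { 0, 0 };
		uint64_t start = pos, chunk, done;
		int ret;

		/* 1. Obtain a space mapping. */
		ret = ops->begin(pos, len, &map);
		if (ret)
			return ret;
		/* The mapping must cover pos, or the loop could never advance. */
		if (map.offset > pos || pos - map.offset >= map.length)
			return -EIO;

		/* Clamp the work to the end of the mapping and of the range. */
		chunk = map.length - (pos - map.offset);
		if (chunk > len)
			chunk = len;

		/* 2. Do the work. */
		ret = work(pos, chunk);
		done = ret ? 0 : chunk;

		/* 3. Increment the operation cursor. */
		pos += done;
		len -= done;

		/* 4. Release the mapping, if necessary. */
		if (ops->end) {
			int end_ret = ops->end(start, chunk, done, &map);

			if (!ret)
				ret = end_ret;
		}
		if (ret)
			return ret;
	}
	return 0;
}

/* Demo filesystem: every 4096-byte aligned block is its own mapping. */
static int toy_begin_4k(uint64_t pos, uint64_t len, struct toy_mapping *map)
{
	(void)len;
	map->offset = pos - pos % 4096;
	map->length = 4096;
	return 0;
}

/* Demo work function: count calls and bytes processed. */
static unsigned int toy_work_calls;
static uint64_t toy_work_bytes;

static int toy_count_work(uint64_t pos, uint64_t len)
{
	(void)pos;
	toy_work_calls++;
	toy_work_bytes += len;
	return 0;
}
```

With the demo helpers, walking 10000 bytes from offset 1000 crosses three 4096-byte mappings, so the work function runs once per mapping rather than once per page or block.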

Each iomap operation will be covered in more detail below.
This library was covered previously by an `LWN article
<https://lwn.net/Articles/935934/>`_ and a `KernelNewbies page
<https://kernelnewbies.org/KernelProjects/iomap>`_.

The goal of this document is to provide a brief discussion of the
design and capabilities of iomap, followed by a more detailed catalog
of the interfaces presented by iomap.
If you change iomap, please update this design document.

File Range Iterator
===================

Definitions
-----------

 * **buffer head**: Shattered remnants of the old buffer cache.

 * ``fsblock``: The block size of a file, also known as ``i_blocksize``.

 * ``i_rwsem``: The VFS ``struct inode`` rwsemaphore.
   Processes hold this in shared mode to read file state and contents.
   Some filesystems may allow shared mode for writes.
   Processes often hold this in exclusive mode to change file state and
   contents.

 * ``invalidate_lock``: The pagecache ``struct address_space``
   rwsemaphore that protects against folio insertion and removal for
   filesystems that support punching out folios below EOF.
   Processes wishing to insert folios must hold this lock in shared
   mode to prevent removal, though concurrent insertion is allowed.
   Processes wishing to remove folios must hold this lock in exclusive
   mode to prevent insertions.
   Concurrent removals are not allowed.

 * ``dax_read_lock``: The RCU read lock that dax takes to prevent a
   device pre-shutdown hook from returning before other threads have
   released resources.

 * **filesystem mapping lock**: This synchronization primitive is
   internal to the filesystem and must protect the file mapping data
   from updates while a mapping is being sampled.
   The filesystem author must determine how this coordination should
   happen; it does not need to be an actual lock.

 * **iomap internal operation lock**: This is a general term for
   synchronization primitives that iomap functions take while holding a
   mapping.
   A specific example would be taking the folio lock while reading or
   writing the pagecache.

 * **pure overwrite**: A write operation that does not require any
   metadata or zeroing operations to perform during either submission
   or completion.
   This implies that the filesystem must have already allocated space
   on disk as ``IOMAP_MAPPED`` and the filesystem must not place any
   constraints on I/O alignment or size.
   The only constraints on I/O alignment are device level (minimum I/O
   size and alignment, typically sector size).

``struct iomap``
----------------

The filesystem communicates to the iomap iterator the mapping of
byte ranges of a file to byte ranges of a storage device with the
structure below:

.. code-block:: c

 struct iomap {
     u64                 addr;
     loff_t              offset;
     u64                 length;
     u16                 type;
     u16                 flags;
     struct block_device *bdev;
     struct dax_device   *dax_dev;
     void                *inline_data;
     void                *private;
     const struct iomap_folio_ops *folio_ops;
     u64                 validity_cookie;
 };

The fields are as follows:

 * ``offset`` and ``length`` describe the range of file offsets, in
   bytes, covered by this mapping.
   These fields must always be set by the filesystem.

 * ``type`` describes the type of the space mapping:

   * **IOMAP_HOLE**: No storage has been allocated.
     This type must never be returned in response to an ``IOMAP_WRITE``
     operation because writes must allocate and map space, and return
     the mapping.
     The ``addr`` field must be set to ``IOMAP_NULL_ADDR``.
     iomap does not support writing (whether via pagecache or direct
     I/O) to a hole.

   * **IOMAP_DELALLOC**: A promise to allocate space at a later time
     ("delayed allocation").
     If the filesystem returns ``IOMAP_F_NEW`` here and the write fails, the
     ``->iomap_end`` function must delete the reservation.
     The ``addr`` field must be set to ``IOMAP_NULL_ADDR``.

   * **IOMAP_MAPPED**: The file range maps to specific space on the
     storage device.
     The device is returned in ``bdev`` or ``dax_dev``.
     The device address, in bytes, is returned via ``addr``.

   * **IOMAP_UNWRITTEN**: The file range maps to specific space on the
     storage device, but the space has not yet been initialized.
     The device is returned in ``bdev`` or ``dax_dev``.
     The device address, in bytes, is returned via ``addr``.
     Reads from this type of mapping will return zeroes to the caller.
     For a write or writeback operation, the ioend should update the
     mapping to ``IOMAP_MAPPED``.
     Refer to the sections about ioends for more details.

   * **IOMAP_INLINE**: The file range maps to the memory buffer
     specified by ``inline_data``.
     For a write operation, the ``->iomap_end`` function presumably
     handles persisting the data.
     The ``addr`` field must be set to ``IOMAP_NULL_ADDR``.

 * ``flags`` describe the status of the space mapping.
   These flags should be set by the filesystem in ``->iomap_begin``:

   * **IOMAP_F_NEW**: The space under the mapping is newly allocated.
     Areas that will not be written to must be zeroed.
     If a write fails and the mapping is a space reservation, the
     reservation must be deleted.

   * **IOMAP_F_DIRTY**: The inode will have uncommitted metadata needed
     to access any data written.
     fdatasync is required to commit these changes to persistent
     storage.
     This needs to take into account metadata changes that *may* be made
     at I/O completion, such as file size updates from direct I/O.

   * **IOMAP_F_SHARED**: The space under the mapping is shared.
     Copy on write is necessary to avoid corrupting other file data.

   * **IOMAP_F_BUFFER_HEAD**: This mapping requires the use of buffer
     heads for pagecache operations.
     Do not add more uses of this.

   * **IOMAP_F_MERGED**: Multiple contiguous block mappings were
     coalesced into this single mapping.
     This is only useful for FIEMAP.

   * **IOMAP_F_XATTR**: The mapping is for extended attribute data, not
     regular file data.
     This is only useful for FIEMAP.

   * **IOMAP_F_PRIVATE**: Starting with this value, the upper bits can
     be set by the filesystem for its own purposes.

   * **IOMAP_F_ANON_WRITE**: Indicates that (write) I/O does not have a target
     block assigned to it yet and the file system will do that in the bio
     submission handler, splitting the I/O as needed.

   These flags can be set by iomap itself during file operations.
   The filesystem should supply an ``->iomap_end`` function if it needs
   to observe these flags:

   * **IOMAP_F_SIZE_CHANGED**: The file size has changed as a result of
     using this mapping.

   * **IOMAP_F_STALE**: The mapping was found to be stale.
     iomap will call ``->iomap_end`` on this mapping and then
     ``->iomap_begin`` to obtain a new mapping.

   Currently, these flags are only set by pagecache operations.

 * ``addr`` describes the device address, in bytes.

 * ``bdev`` describes the block device for this mapping.
   This only needs to be set for mapped or unwritten operations.

 * ``dax_dev`` describes the DAX device for this mapping.
   This only needs to be set for mapped or unwritten operations, and
   only for a fsdax operation.

 * ``inline_data`` points to a memory buffer for I/O involving
   ``IOMAP_INLINE`` mappings.
   This value is ignored for all other mapping types.

 * ``private`` is a pointer to filesystem-private information.
   This value will be passed unchanged to ``->iomap_end``.

 * ``folio_ops`` will be covered in the section on pagecache operations.

 * ``validity_cookie`` is a magic freshness value set by the filesystem
   that should be used to detect stale mappings.
   For pagecache operations this is critical for correct operation
   because page faults can occur, which implies that filesystem locks
   should not be held between ``->iomap_begin`` and ``->iomap_end``.
   Filesystems with completely static mappings need not set this value.
   Only pagecache operations revalidate mappings; see the section about
   ``iomap_valid`` for details.
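
The per-type rules for ``addr`` and ``inline_data`` listed above can be written down as a small userspace checker.
This is only a sketch: ``struct toy_iomap``, the ``TOY_IOMAP_*`` names, and ``toy_iomap_ok`` are hypothetical stand-ins whose values are local to the example, and the two checks marked as assumptions are one reading of the text rather than something the kernel enforces in this form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel definitions; the values are local to this sketch. */
enum toy_iomap_type {
	TOY_IOMAP_HOLE,
	TOY_IOMAP_DELALLOC,
	TOY_IOMAP_MAPPED,
	TOY_IOMAP_UNWRITTEN,
	TOY_IOMAP_INLINE,
};
#define TOY_IOMAP_NULL_ADDR	UINT64_MAX

/* Cut-down mapping holding only the fields the checks below look at. */
struct toy_iomap {
	uint64_t addr;
	int64_t offset;
	uint64_t length;
	enum toy_iomap_type type;
	void *inline_data;
};

/* Return true if the mapping obeys the per-type rules listed above. */
static bool toy_iomap_ok(const struct toy_iomap *m, bool for_write)
{
	assert(m != NULL);

	/* Assumption: "always set" means a non-negative offset, non-zero length. */
	if (m->offset < 0 || m->length == 0)
		return false;

	switch (m->type) {
	case TOY_IOMAP_HOLE:
		/* A hole must never be handed back for a write. */
		return !for_write && m->addr == TOY_IOMAP_NULL_ADDR;
	case TOY_IOMAP_DELALLOC:
		return m->addr == TOY_IOMAP_NULL_ADDR;
	case TOY_IOMAP_MAPPED:
	case TOY_IOMAP_UNWRITTEN:
		/* Assumption: a real device address is never the null address. */
		return m->addr != TOY_IOMAP_NULL_ADDR;
	case TOY_IOMAP_INLINE:
		return m->addr == TOY_IOMAP_NULL_ADDR && m->inline_data != NULL;
	}
	return false;
}
```

A check of this shape could sit in a filesystem's debug path right after it fills in a mapping, so that a hole returned for a write is caught before iomap acts on it.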
h]h)}(hhh](h)}(h``offset`` and ``length`` describe the range of file offsets, in
bytes, covered by this mapping.
These fields must always be set by the filesystem.
h]j )}(h``offset`` and ``length`` describe the range of file offsets, in
bytes, covered by this mapping.
These fields must always be set by the filesystem.h](j )}(h
``offset``h]hoffset}(hjC hhhNhNubah}(h]h ]h"]h$]h&]uh1j hj? ubh and }(hj? hhhNhNubj )}(h
``length``h]hlength}(hjU hhhNhNubah}(h]h ]h"]h$]h&]uh1j hj? ubhz describe the range of file offsets, in
bytes, covered by this mapping.
These fields must always be set by the filesystem.}(hj? hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKhj; ubah}(h]h ]h"]h$]h&]uh1hhj8 ubh)}(hX ``type`` describes the type of the space mapping:
* **IOMAP_HOLE**: No storage has been allocated.
This type must never be returned in response to an ``IOMAP_WRITE``
operation because writes must allocate and map space, and return
the mapping.
The ``addr`` field must be set to ``IOMAP_NULL_ADDR``.
iomap does not support writing (whether via pagecache or direct
I/O) to a hole.
* **IOMAP_DELALLOC**: A promise to allocate space at a later time
("delayed allocation").
If the filesystem returns IOMAP_F_NEW here and the write fails, the
``->iomap_end`` function must delete the reservation.
The ``addr`` field must be set to ``IOMAP_NULL_ADDR``.
* **IOMAP_MAPPED**: The file range maps to specific space on the
storage device.
The device is returned in ``bdev`` or ``dax_dev``.
The device address, in bytes, is returned via ``addr``.
* **IOMAP_UNWRITTEN**: The file range maps to specific space on the
storage device, but the space has not yet been initialized.
The device is returned in ``bdev`` or ``dax_dev``.
The device address, in bytes, is returned via ``addr``.
Reads from this type of mapping will return zeroes to the caller.
For a write or writeback operation, the ioend should update the
mapping to MAPPED.
Refer to the sections about ioends for more details.
* **IOMAP_INLINE**: The file range maps to the memory buffer
specified by ``inline_data``.
For a write operation, the ``->iomap_end`` function presumably
handles persisting the data.
The ``addr`` field must be set to ``IOMAP_NULL_ADDR``.
h](j )}(h1``type`` describes the type of the space mapping:h](j )}(h``type``h]htype}(hj{ hhhNhNubah}(h]h ]h"]h$]h&]uh1j hjw ubh) describes the type of the space mapping:}(hjw hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKhjs ubh)}(hhh](h)}(hXG **IOMAP_HOLE**: No storage has been allocated.
This type must never be returned in response to an ``IOMAP_WRITE``
operation because writes must allocate and map space, and return
the mapping.
The ``addr`` field must be set to ``IOMAP_NULL_ADDR``.
iomap does not support writing (whether via pagecache or direct
I/O) to a hole.
h]j )}(hXF **IOMAP_HOLE**: No storage has been allocated.
This type must never be returned in response to an ``IOMAP_WRITE``
operation because writes must allocate and map space, and return
the mapping.
The ``addr`` field must be set to ``IOMAP_NULL_ADDR``.
iomap does not support writing (whether via pagecache or direct
I/O) to a hole.h](jf )}(h**IOMAP_HOLE**h]h
IOMAP_HOLE}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1je hj ubhT: No storage has been allocated.
This type must never be returned in response to an }(hj hhhNhNubj )}(h``IOMAP_WRITE``h]hIOMAP_WRITE}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j hj ubhS
operation because writes must allocate and map space, and return
the mapping.
The }(hj hhhNhNubj )}(h``addr``h]haddr}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j hj ubh field must be set to }(hj hhhNhNubj )}(h``IOMAP_NULL_ADDR``h]hIOMAP_NULL_ADDR}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j hj ubhQ.
iomap does not support writing (whether via pagecache or direct
I/O) to a hole.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKhj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hX **IOMAP_DELALLOC**: A promise to allocate space at a later time
("delayed allocation").
If the filesystem returns IOMAP_F_NEW here and the write fails, the
``->iomap_end`` function must delete the reservation.
The ``addr`` field must be set to ``IOMAP_NULL_ADDR``.
h]j )}(hX **IOMAP_DELALLOC**: A promise to allocate space at a later time
("delayed allocation").
If the filesystem returns IOMAP_F_NEW here and the write fails, the
``->iomap_end`` function must delete the reservation.
The ``addr`` field must be set to ``IOMAP_NULL_ADDR``.h](jf )}(h**IOMAP_DELALLOC**h]hIOMAP_DELALLOC}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1je hj ubh: A promise to allocate space at a later time
(“delayed allocation”).
If the filesystem returns IOMAP_F_NEW here and the write fails, the
}(hj hhhNhNubj )}(h``->iomap_end``h]h->iomap_end}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j hj ubh+ function must delete the reservation.
The }(hj hhhNhNubj )}(h``addr``h]haddr}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j hj ubh field must be set to }(hj hhhNhNubj )}(h``IOMAP_NULL_ADDR``h]hIOMAP_NULL_ADDR}(hj0 hhhNhNubah}(h]h ]h"]h$]h&]uh1j hj ubh.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKhj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(h**IOMAP_MAPPED**: The file range maps to specific space on the
storage device.
The device is returned in ``bdev`` or ``dax_dev``.
The device address, in bytes, is returned via ``addr``.
h]j )}(h**IOMAP_MAPPED**: The file range maps to specific space on the
storage device.
The device is returned in ``bdev`` or ``dax_dev``.
The device address, in bytes, is returned via ``addr``.h](jf )}(h**IOMAP_MAPPED**h]hIOMAP_MAPPED}(hjV hhhNhNubah}(h]h ]h"]h$]h&]uh1je hjR ubhY: The file range maps to specific space on the
storage device.
The device is returned in }(hjR hhhNhNubj )}(h``bdev``h]hbdev}(hjh hhhNhNubah}(h]h ]h"]h$]h&]uh1j hjR ubh or }(hjR hhhNhNubj )}(h``dax_dev``h]hdax_dev}(hjz hhhNhNubah}(h]h ]h"]h$]h&]uh1j hjR ubh0.
The device address, in bytes, is returned via }(hjR hhhNhNubj )}(h``addr``h]haddr}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j hjR ubh.}(hjR hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKhjN ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hX **IOMAP_UNWRITTEN**: The file range maps to specific space on the
storage device, but the space has not yet been initialized.
The device is returned in ``bdev`` or ``dax_dev``.
The device address, in bytes, is returned via ``addr``.
Reads from this type of mapping will return zeroes to the caller.
For a write or writeback operation, the ioend should update the
mapping to MAPPED.
Refer to the sections about ioends for more details.
h]j )}(hX **IOMAP_UNWRITTEN**: The file range maps to specific space on the
storage device, but the space has not yet been initialized.
The device is returned in ``bdev`` or ``dax_dev``.
The device address, in bytes, is returned via ``addr``.
Reads from this type of mapping will return zeroes to the caller.
For a write or writeback operation, the ioend should update the
mapping to MAPPED.
Refer to the sections about ioends for more details.h](jf )}(h**IOMAP_UNWRITTEN**h]hIOMAP_UNWRITTEN}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1je hj ubh: The file range maps to specific space on the
storage device, but the space has not yet been initialized.
The device is returned in }(hj hhhNhNubj )}(h``bdev``h]hbdev}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j hj ubh or }(hj hhhNhNubj )}(h``dax_dev``h]hdax_dev}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j hj ubh0.
The device address, in bytes, is returned via }(hj hhhNhNubj )}(h``addr``h]haddr}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j hj ubh.
Reads from this type of mapping will return zeroes to the caller.
For a write or writeback operation, the ioend should update the
mapping to MAPPED.
Refer to the sections about ioends for more details.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKhj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(h**IOMAP_INLINE**: The file range maps to the memory buffer
specified by ``inline_data``.
For a write operation, the ``->iomap_end`` function presumably
handles persisting the data.
The ``addr`` field must be set to ``IOMAP_NULL_ADDR``.
h]j )}(h**IOMAP_INLINE**: The file range maps to the memory buffer
specified by ``inline_data``.
For a write operation, the ``->iomap_end`` function presumably
handles persisting the data.
The ``addr`` field must be set to ``IOMAP_NULL_ADDR``.h](jf )}(h**IOMAP_INLINE**h]hIOMAP_INLINE}(hj
hhhNhNubah}(h]h ]h"]h$]h&]uh1je hj
ubh8: The file range maps to the memory buffer
specified by }(hj
hhhNhNubj )}(h``inline_data``h]hinline_data}(hj
hhhNhNubah}(h]h ]h"]h$]h&]uh1j hj
ubh.
For a write operation, the }(hj
hhhNhNubj )}(h``->iomap_end``h]h->iomap_end}(hj2
hhhNhNubah}(h]h ]h"]h$]h&]uh1j hj
ubh6 function presumably
handles persisting the data.
The }(hj
hhhNhNubj )}(h``addr``h]haddr}(hjD
hhhNhNubah}(h]h ]h"]h$]h&]uh1j hj
ubh field must be set to }(hj
hhhNhNubj )}(h``IOMAP_NULL_ADDR``h]hIOMAP_NULL_ADDR}(hjV
hhhNhNubah}(h]h ]h"]h$]h&]uh1j hj
ubh.}(hj
hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKhj
ubah}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]jH jI uh1hhhhKhjs ubeh}(h]h ]h"]h$]h&]uh1hhj8 ubh)}(hX ``flags`` describe the status of the space mapping.
These flags should be set by the filesystem in ``->iomap_begin``:
* **IOMAP_F_NEW**: The space under the mapping is newly allocated.
Areas that will not be written to must be zeroed.
If a write fails and the mapping is a space reservation, the
reservation must be deleted.
* **IOMAP_F_DIRTY**: The inode will have uncommitted metadata needed
to access any data written.
fdatasync is required to commit these changes to persistent
storage.
This needs to take into account metadata changes that *may* be made
at I/O completion, such as file size updates from direct I/O.
* **IOMAP_F_SHARED**: The space under the mapping is shared.
Copy on write is necessary to avoid corrupting other file data.
* **IOMAP_F_BUFFER_HEAD**: This mapping requires the use of buffer
heads for pagecache operations.
Do not add more uses of this.
* **IOMAP_F_MERGED**: Multiple contiguous block mappings were
coalesced into this single mapping.
This is only useful for FIEMAP.
* **IOMAP_F_XATTR**: The mapping is for extended attribute data, not
regular file data.
This is only useful for FIEMAP.
* **IOMAP_F_PRIVATE**: Starting with this value, the upper bits can
be set by the filesystem for its own purposes.
* **IOMAP_F_ANON_WRITE**: Indicates that (write) I/O does not have a target
block assigned to it yet and the file system will do that in the bio
submission handler, splitting the I/O as needed.
These flags can be set by iomap itself during file operations.
The filesystem should supply an ``->iomap_end`` function if it needs
to observe these flags:
* **IOMAP_F_SIZE_CHANGED**: The file size has changed as a result of
using this mapping.
* **IOMAP_F_STALE**: The mapping was found to be stale.
iomap will call ``->iomap_end`` on this mapping and then
``->iomap_begin`` to obtain a new mapping.
Currently, these flags are only set by pagecache operations.
h](j )}(hu``flags`` describe the status of the space mapping.
These flags should be set by the filesystem in ``->iomap_begin``:h](j )}(h ``flags``h]hflags}(hj
hhhNhNubah}(h]h ]h"]h$]h&]uh1j hj
ubhZ describe the status of the space mapping.
These flags should be set by the filesystem in }(hj
hhhNhNubj )}(h``->iomap_begin``h]h
->iomap_begin}(hj
hhhNhNubah}(h]h ]h"]h$]h&]uh1j hj
ubh:}(hj
hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKhj
ubh)}(hhh](h)}(h**IOMAP_F_NEW**: The space under the mapping is newly allocated.
Areas that will not be written to must be zeroed.
If a write fails and the mapping is a space reservation, the
reservation must be deleted.
h]j )}(h**IOMAP_F_NEW**: The space under the mapping is newly allocated.
Areas that will not be written to must be zeroed.
If a write fails and the mapping is a space reservation, the
reservation must be deleted.h](jf )}(h**IOMAP_F_NEW**h]hIOMAP_F_NEW}(hj
hhhNhNubah}(h]h ]h"]h$]h&]uh1je hj
ubh: The space under the mapping is newly allocated.
Areas that will not be written to must be zeroed.
If a write fails and the mapping is a space reservation, the
reservation must be deleted.}(hj
hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKhj
ubah}(h]h ]h"]h$]h&]uh1hhj
ubh)}(hX& **IOMAP_F_DIRTY**: The inode will have uncommitted metadata needed
to access any data written.
fdatasync is required to commit these changes to persistent
storage.
This needs to take into account metadata changes that *may* be made
at I/O completion, such as file size updates from direct I/O.
h]j )}(hX% **IOMAP_F_DIRTY**: The inode will have uncommitted metadata needed
to access any data written.
fdatasync is required to commit these changes to persistent
storage.
This needs to take into account metadata changes that *may* be made
at I/O completion, such as file size updates from direct I/O.h](jf )}(h**IOMAP_F_DIRTY**h]h
IOMAP_F_DIRTY}(hj
hhhNhNubah}(h]h ]h"]h$]h&]uh1je hj
ubh: The inode will have uncommitted metadata needed
to access any data written.
fdatasync is required to commit these changes to persistent
storage.
This needs to take into account metadata changes that }(hj
hhhNhNubhemphasis)}(h*may*h]hmay}(hj
hhhNhNubah}(h]h ]h"]h$]h&]uh1j
hj
ubhF be made
at I/O completion, such as file size updates from direct I/O.}(hj
hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKhj
ubah}(h]h ]h"]h$]h&]uh1hhj
ubh)}(h{**IOMAP_F_SHARED**: The space under the mapping is shared.
Copy on write is necessary to avoid corrupting other file data.
h]j )}(hz**IOMAP_F_SHARED**: The space under the mapping is shared.
Copy on write is necessary to avoid corrupting other file data.h](jf )}(h**IOMAP_F_SHARED**h]hIOMAP_F_SHARED}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1je hj ubhh: The space under the mapping is shared.
Copy on write is necessary to avoid corrupting other file data.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKhj ubah}(h]h ]h"]h$]h&]uh1hhj
ubh)}(h**IOMAP_F_BUFFER_HEAD**: This mapping requires the use of buffer
heads for pagecache operations.
Do not add more uses of this.
h]j )}(h~**IOMAP_F_BUFFER_HEAD**: This mapping requires the use of buffer
heads for pagecache operations.
Do not add more uses of this.h](jf )}(h**IOMAP_F_BUFFER_HEAD**h]hIOMAP_F_BUFFER_HEAD}(hjC hhhNhNubah}(h]h ]h"]h$]h&]uh1je hj? ubhg: This mapping requires the use of buffer
heads for pagecache operations.
Do not add more uses of this.}(hj? hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKhj; ubah}(h]h ]h"]h$]h&]uh1hhj
ubh)}(h**IOMAP_F_MERGED**: Multiple contiguous block mappings were
coalesced into this single mapping.
This is only useful for FIEMAP.
h]j )}(h**IOMAP_F_MERGED**: Multiple contiguous block mappings were
coalesced into this single mapping.
This is only useful for FIEMAP.h](jf )}(h**IOMAP_F_MERGED**h]hIOMAP_F_MERGED}(hji hhhNhNubah}(h]h ]h"]h$]h&]uh1je hje ubhm: Multiple contiguous block mappings were
coalesced into this single mapping.
This is only useful for FIEMAP.}(hje hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKhja ubah}(h]h ]h"]h$]h&]uh1hhj
ubh)}(hv**IOMAP_F_XATTR**: The mapping is for extended attribute data, not
regular file data.
This is only useful for FIEMAP.
h]j )}(hu**IOMAP_F_XATTR**: The mapping is for extended attribute data, not
regular file data.
This is only useful for FIEMAP.h](jf )}(h**IOMAP_F_XATTR**h]h
IOMAP_F_XATTR}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1je hj ubhd: The mapping is for extended attribute data, not
regular file data.
This is only useful for FIEMAP.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKhj ubah}(h]h ]h"]h$]h&]uh1hhj
ubh)}(hq**IOMAP_F_PRIVATE**: Starting with this value, the upper bits can
be set by the filesystem for its own purposes.
h]j )}(hp**IOMAP_F_PRIVATE**: Starting with this value, the upper bits can
be set by the filesystem for its own purposes.h](jf )}(h**IOMAP_F_PRIVATE**h]hIOMAP_F_PRIVATE}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1je hj ubh]: Starting with this value, the upper bits can
be set by the filesystem for its own purposes.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKhj ubah}(h]h ]h"]h$]h&]uh1hhj
ubh)}(h**IOMAP_F_ANON_WRITE**: Indicates that (write) I/O does not have a target
block assigned to it yet and the file system will do that in the bio
submission handler, splitting the I/O as needed.
h]j )}(h**IOMAP_F_ANON_WRITE**: Indicates that (write) I/O does not have a target
block assigned to it yet and the file system will do that in the bio
submission handler, splitting the I/O as needed.h](jf )}(h**IOMAP_F_ANON_WRITE**h]hIOMAP_F_ANON_WRITE}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1je hj ubh: Indicates that (write) I/O does not have a target
block assigned to it yet and the file system will do that in the bio
submission handler, splitting the I/O as needed.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKhj ubah}(h]h ]h"]h$]h&]uh1hhj
ubeh}(h]h ]h"]h$]h&]jH jI uh1hhhhKhj
ubj )}(hThese flags can be set by iomap itself during file operations.
The filesystem should supply an ``->iomap_end`` function if it needs
to observe these flags:h](h_These flags can be set by iomap itself during file operations.
The filesystem should supply an }(hj hhhNhNubj )}(h``->iomap_end``h]h->iomap_end}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j hj ubh- function if it needs
to observe these flags:}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKhj
ubh)}(hhh](h)}(hW**IOMAP_F_SIZE_CHANGED**: The file size has changed as a result of
using this mapping.
h]j )}(hV**IOMAP_F_SIZE_CHANGED**: The file size has changed as a result of
using this mapping.h](jf )}(h**IOMAP_F_SIZE_CHANGED**h]hIOMAP_F_SIZE_CHANGED}(hj* hhhNhNubah}(h]h ]h"]h$]h&]uh1je hj& ubh>: The file size has changed as a result of
using this mapping.}(hj& hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhMhj" ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(h**IOMAP_F_STALE**: The mapping was found to be stale.
iomap will call ``->iomap_end`` on this mapping and then
``->iomap_begin`` to obtain a new mapping.
h]j )}(h**IOMAP_F_STALE**: The mapping was found to be stale.
iomap will call ``->iomap_end`` on this mapping and then
``->iomap_begin`` to obtain a new mapping.h](jf )}(h**IOMAP_F_STALE**h]h
IOMAP_F_STALE}(hjP hhhNhNubah}(h]h ]h"]h$]h&]uh1je hjL ubh5: The mapping was found to be stale.
iomap will call }(hjL hhhNhNubj )}(h``->iomap_end``h]h->iomap_end}(hjb hhhNhNubah}(h]h ]h"]h$]h&]uh1j hjL ubh on this mapping and then
}(hjL hhhNhNubj )}(h``->iomap_begin``h]h
->iomap_begin}(hjt hhhNhNubah}(h]h ]h"]h$]h&]uh1j hjL ubh to obtain a new mapping.}(hjL hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhMhjH ubah}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]jH jI uh1hhhhMhj
ubj )}(h