sphinx.addnodesdocument)}( rawsource children](translations
LanguagesNode)}(hhh](h pending_xref)}(hhh]docutils.nodesTextChinese (Simplified)}parenthsba
attributes}(ids]classes]names]dupnames]backrefs] refdomainstdreftypedoc reftarget0/translations/zh_CN/filesystems/iomap/operationsmodnameN classnameNrefexplicitutagnamehhhubh)}(hhh]hChinese (Traditional)}hh2sbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget0/translations/zh_TW/filesystems/iomap/operationsmodnameN classnameNrefexplicituh1hhhubh)}(hhh]hItalian}hhFsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget0/translations/it_IT/filesystems/iomap/operationsmodnameN classnameNrefexplicituh1hhhubh)}(hhh]hJapanese}hhZsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget0/translations/ja_JP/filesystems/iomap/operationsmodnameN classnameNrefexplicituh1hhhubh)}(hhh]hKorean}hhnsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget0/translations/ko_KR/filesystems/iomap/operationsmodnameN classnameNrefexplicituh1hhhubh)}(hhh]hSpanish}hhsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget0/translations/sp_SP/filesystems/iomap/operationsmodnameN classnameNrefexplicituh1hhhubeh}(h]h ]h"]h$]h&]current_languageEnglishuh1h
hh _documenthsourceNlineNubhcomment)}(h SPDX-License-Identifier: GPL-2.0h]h SPDX-License-Identifier: GPL-2.0}hhsbah}(h]h ]h"]h$]h&] xml:spacepreserveuh1hhhhhhJ/var/lib/git/docbuild/linux/Documentation/filesystems/iomap/operations.rsthKubhtarget)}(h.. _iomap_operations:h]h}(h]iomap-operationsah ]h"]iomap_operationsah$]h&]uh1hhKhhhhhhubh)}(hDumb style notes to maintain the author's sanity:
Please try to start sentences on separate lines so that
sentence changes don't bleed colors in diff.
Heading decorations are documented in sphinx.rst.h]hDumb style notes to maintain the author's sanity:
Please try to start sentences on separate lines so that
sentence changes don't bleed colors in diff.
Heading decorations are documented in sphinx.rst.}hhsbah}(h]h ]h"]h$]h&]hhuh1hhhhhhhhK ubhsection)}(hhh](htitle)}(hSupported File Operationsh]hSupported File Operations}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhhhhhKubhtopic)}(hTable of Contents
h](h)}(hTable of Contentsh]hTable of Contents}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhhhKubhbullet_list)}(hhh](h list_item)}(hhh](h paragraph)}(hhh]h reference)}(hhh]hBuffered I/O}(hj
hhhNhNubah}(h]id1ah ]h"]h$]h&]refidbuffered-i-ouh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubh)}(hhh](h)}(hhh]j )}(hhh]j )}(hhh]hliteral)}(h#``struct address_space_operations``h]hstruct address_space_operations}(hj. hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hNhNhj) ubah}(h]id2ah ]h"]h$]h&]refidstruct-address-space-operationsuh1j hj& ubah}(h]h ]h"]h$]h&]uh1j hj# ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]j- )}(h``struct iomap_folio_ops``h]hstruct iomap_folio_ops}(hjZ hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hNhNhjW ubah}(h]id3ah ]h"]h$]h&]refidstruct-iomap-folio-opsuh1j hjT ubah}(h]h ]h"]h$]h&]uh1j hjQ ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]hInternal per-Folio State}(hj hhhNhNubah}(h]id4ah ]h"]h$]h&]refidinternal-per-folio-stateuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj} ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]hBuffered Readahead and Reads}(hj hhhNhNubah}(h]id5ah ]h"]h$]h&]refidbuffered-readahead-and-readsuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh](j )}(hhh]j )}(hhh]hBuffered Writes}(hj hhhNhNubah}(h]id6ah ]h"]h$]h&]refidbuffered-writesuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubh)}(hhh](h)}(hhh]j )}(hhh]j )}(hhh]hmmap Write Faults}(hj hhhNhNubah}(h]id7ah ]h"]h$]h&]refidmmap-write-faultsuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]hBuffered Write Failures}(hj hhhNhNubah}(h]id8ah ]h"]h$]h&]refidbuffered-write-failuresuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]hZeroing for File Operations}(hj* hhhNhNubah}(h]id9ah ]h"]h$]h&]refidzeroing-for-file-operationsuh1j hj' ubah}(h]h ]h"]h$]h&]uh1j hj$ ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]hUnsharing Reflinked File Data}(hjL hhhNhNubah}(h]id10ah ]h"]h$]h&]refidunsharing-reflinked-file-datauh1j hjI ubah}(h]h ]h"]h$]h&]uh1j hjF ubah}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]h
Truncation}(hjz hhhNhNubah}(h]id11ah ]h"]h$]h&]refid
truncationuh1j hjw ubah}(h]h ]h"]h$]h&]uh1j hjt ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh](j )}(hhh]j )}(hhh]hPagecache Writeback}(hj hhhNhNubah}(h]id12ah ]h"]h$]h&]refidpagecache-writebackuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubh)}(hhh](h)}(hhh]j )}(hhh]j )}(hhh]j- )}(h``struct iomap_writeback_ops``h]hstruct iomap_writeback_ops}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hNhNhj ubah}(h]id13ah ]h"]h$]h&]refidstruct-iomap-writeback-opsuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]hPagecache Writeback Completion}(hj hhhNhNubah}(h]id14ah ]h"]h$]h&]refidpagecache-writeback-completionuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhhubh)}(hhh](j )}(hhh]j )}(hhh]h
Direct I/O}(hj! hhhNhNubah}(h]id15ah ]h"]h$]h&]refid
direct-i-ouh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubh)}(hhh](h)}(hhh]j )}(hhh]j )}(hhh]h
Return Values}(hj@ hhhNhNubah}(h]id16ah ]h"]h$]h&]refid
return-valuesuh1j hj= ubah}(h]h ]h"]h$]h&]uh1j hj: ubah}(h]h ]h"]h$]h&]uh1hhj7 ubh)}(hhh]j )}(hhh]j )}(hhh]hDirect Reads}(hjb hhhNhNubah}(h]id17ah ]h"]h$]h&]refiddirect-readsuh1j hj_ ubah}(h]h ]h"]h$]h&]uh1j hj\ ubah}(h]h ]h"]h$]h&]uh1hhj7 ubh)}(hhh]j )}(hhh]j )}(hhh]h
Direct Writes}(hj hhhNhNubah}(h]id18ah ]h"]h$]h&]refid
direct-writesuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj~ ubah}(h]h ]h"]h$]h&]uh1hhj7 ubh)}(hhh]j )}(hhh]j )}(hhh]j- )}(h``struct iomap_dio_ops:``h]hstruct iomap_dio_ops:}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hNhNhj ubah}(h]id19ah ]h"]h$]h&]refidstruct-iomap-dio-opsuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj7 ubeh}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhhubh)}(hhh](j )}(hhh]j )}(hhh]hDAX I/O}(hj hhhNhNubah}(h]id20ah ]h"]h$]h&]refiddax-i-ouh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubh)}(hhh](h)}(hhh]j )}(hhh]j )}(hhh]hfsdax Reads}(hj hhhNhNubah}(h]id21ah ]h"]h$]h&]refidfsdax-readsuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh](j )}(hhh]j )}(hhh]hfsdax Writes}(hj hhhNhNubah}(h]id22ah ]h"]h$]h&]refidfsdax-writesuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubh)}(hhh]h)}(hhh]j )}(hhh]j )}(hhh]hfsdax mmap Faults}(hj> hhhNhNubah}(h]id23ah ]h"]h$]h&]refidfsdax-mmap-faultsuh1j hj; ubah}(h]h ]h"]h$]h&]uh1j hj8 ubah}(h]h ]h"]h$]h&]uh1hhj5 ubah}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]h*fsdax Truncation, fallocate, and Unsharing}(hjl hhhNhNubah}(h]id24ah ]h"]h$]h&]refid(fsdax-truncation-fallocate-and-unsharinguh1j hji ubah}(h]h ]h"]h$]h&]uh1j hjf ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]hfsdax Deduplication}(hj hhhNhNubah}(h]id25ah ]h"]h$]h&]refidfsdax-deduplicationuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhhubh)}(hhh](j )}(hhh]j )}(hhh]h
Seeking Files}(hj hhhNhNubah}(h]id26ah ]h"]h$]h&]refid
seeking-filesuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubh)}(hhh](h)}(hhh]j )}(hhh]j )}(hhh]h SEEK_DATA}(hj hhhNhNubah}(h]id27ah ]h"]h$]h&]refid seek-datauh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]h SEEK_HOLE}(hj hhhNhNubah}(h]id28ah ]h"]h$]h&]refid seek-holeuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhhubh)}(hhh]j )}(hhh]j )}(hhh]hSwap File Activation}(hj+ hhhNhNubah}(h]id29ah ]h"]h$]h&]refidswap-file-activationuh1j hj( ubah}(h]h ]h"]h$]h&]uh1j hj% ubah}(h]h ]h"]h$]h&]uh1hhhubh)}(hhh](j )}(hhh]j )}(hhh]hFile Space Mapping Reporting}(hjM hhhNhNubah}(h]id30ah ]h"]h$]h&]refidfile-space-mapping-reportinguh1j hjJ ubah}(h]h ]h"]h$]h&]uh1j hjG ubh)}(hhh](h)}(hhh]j )}(hhh]j )}(hhh]h
FS_IOC_FIEMAP}(hjl hhhNhNubah}(h]id31ah ]h"]h$]h&]refid
fs-ioc-fiemapuh1j hji ubah}(h]h ]h"]h$]h&]uh1j hjf ubah}(h]h ]h"]h$]h&]uh1hhjc ubh)}(hhh]j )}(hhh]j )}(hhh]hFIBMAP (deprecated)}(hj hhhNhNubah}(h]id32ah ]h"]h$]h&]refidfibmap-deprecateduh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhjc ubeh}(h]h ]h"]h$]h&]uh1hhjG ubeh}(h]h ]h"]h$]h&]uh1hhhubeh}(h]h ]h"]h$]h&]uh1hhhhhhNhNubeh}(h]table-of-contentsah ](contentslocaleh"]table of contentsah$]h&]uh1hhhhKhhhhubj )}(hOBelow are a discussion of the high level file operations that iomap
implements.h]hOBelow is a discussion of the high-level file operations that iomap
implements.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j hhhKhhhhubh)}(hhh](h)}(hBuffered I/Oh]hBuffered I/O}(hj hhhNhNubah}(h]h ]h"]h$]h&]refidj uh1hhj hhhhhKubj )}(hBuffered I/O is the default file I/O path in Linux.
File contents are cached in memory ("pagecache") to satisfy reads and
writes.
Dirty cache will be written back to disk at some point that can be
forced via ``fsync`` and variants.h](hBuffered I/O is the default file I/O path in Linux.
File contents are cached in memory (“pagecache”) to satisfy reads and
writes.
Dirty cache will be written back to disk at some point that can be
forced via }(hj hhhNhNubj- )}(h ``fsync``h]hfsync}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubh and variants.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKhj hhubj )}(hX\ iomap implements nearly all the folio and pagecache management that
filesystems have to implement themselves under the legacy I/O model.
This means that the filesystem need not know the details of allocating,
mapping, managing uptodate and dirty state, or writeback of pagecache
folios.
Under the legacy I/O model, this was managed very inefficiently with
linked lists of buffer heads instead of the per-folio bitmaps that iomap
uses.
Unless the filesystem explicitly opts in to buffer heads, they will not
be used, which makes buffered I/O much more efficient, and the pagecache
maintainer much happier.h]hX\ iomap implements nearly all the folio and pagecache management that
filesystems have to implement themselves under the legacy I/O model.
This means that the filesystem need not know the details of allocating,
mapping, managing uptodate and dirty state, or writeback of pagecache
folios.
Under the legacy I/O model, this was managed very inefficiently with
linked lists of buffer heads instead of the per-folio bitmaps that iomap
uses.
Unless the filesystem explicitly opts in to buffer heads, they will not
be used, which makes buffered I/O much more efficient, and the pagecache
maintainer much happier.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j hhhKhj hhubh)}(hhh](h)}(hj0 h]j- )}(hj0 h]hstruct address_space_operations}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubah}(h]h ]h"]h$]h&]j j> uh1hhj hhhhhK*ubj )}(heThe following iomap functions can be referenced directly from the
address space operations structure:h]heThe following iomap functions can be referenced directly from the
address space operations structure:}(hj- hhhNhNubah}(h]h ]h"]h$]h&]uh1j hhhK,hj hhubhblock_quote)}(hq* ``iomap_dirty_folio``
* ``iomap_release_folio``
* ``iomap_invalidate_folio``
* ``iomap_is_partially_uptodate``
h]h)}(hhh](h)}(h``iomap_dirty_folio``h]j )}(hjF h]j- )}(hjF h]hiomap_dirty_folio}(hjK hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjH ubah}(h]h ]h"]h$]h&]uh1j hhhK/hjD ubah}(h]h ]h"]h$]h&]uh1hhjA ubh)}(h``iomap_release_folio``h]j )}(hjf h]j- )}(hjf h]hiomap_release_folio}(hjk hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjh ubah}(h]h ]h"]h$]h&]uh1j hhhK0hjd ubah}(h]h ]h"]h$]h&]uh1hhjA ubh)}(h``iomap_invalidate_folio``h]j )}(hj h]j- )}(hj h]hiomap_invalidate_folio}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubah}(h]h ]h"]h$]h&]uh1j hhhK1hj ubah}(h]h ]h"]h$]h&]uh1hhjA ubh)}(h ``iomap_is_partially_uptodate``
h]j )}(h``iomap_is_partially_uptodate``h]j- )}(hj h]hiomap_is_partially_uptodate}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubah}(h]h ]h"]h$]h&]uh1j hhhK2hj ubah}(h]h ]h"]h$]h&]uh1hhjA ubeh}(h]h ]h"]h$]h&]bullet*uh1hhhhK/hj= ubah}(h]h ]h"]h$]h&]uh1j; hhhK/hj hhubj )}(h=The following address space operations can be wrapped easily:h]h=The following address space operations can be wrapped easily:}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j hhhK4hj hhubj< )}(hQ* ``read_folio``
* ``readahead``
* ``writepages``
* ``bmap``
* ``swap_activate``
h]h)}(hhh](h)}(h``read_folio``h]j )}(hj h]j- )}(hj h]h
read_folio}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubah}(h]h ]h"]h$]h&]uh1j hhhK6hj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(h
``readahead``h]j )}(hj
h]j- )}(hj
h]h readahead}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubah}(h]h ]h"]h$]h&]uh1j hhhK7hj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(h``writepages``h]j )}(hj* h]j- )}(hj* h]h
writepages}(hj/ hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj, ubah}(h]h ]h"]h$]h&]uh1j hhhK8hj( ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(h``bmap``h]j )}(hjJ h]j- )}(hjJ h]hbmap}(hjO hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjL ubah}(h]h ]h"]h$]h&]uh1j hhhK9hjH ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(h``swap_activate``
h]j )}(h``swap_activate``h]j- )}(hjn h]h
swap_activate}(hjp hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjl ubah}(h]h ]h"]h$]h&]uh1j hhhK:hjh ubah}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]j j uh1hhhhK6hj ubah}(h]h ]h"]h$]h&]uh1j; hhhK6hj hhubeh}(h]jD ah ]h"]struct address_space_operationsah$]h&]uh1hhj hhhhhK*ubh)}(hhh](h)}(hj\ h]j- )}(hj\ h]hstruct iomap_folio_ops}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubah}(h]h ]h"]h$]h&]j jj uh1hhj hhhhhK=ubj )}(hThe ``->iomap_begin`` function for pagecache operations may set the
``struct iomap::folio_ops`` field to an ops structure to override
default behaviors of iomap:h](hThe }(hj hhhNhNubj- )}(h``->iomap_begin``h]h
->iomap_begin}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubh/ function for pagecache operations may set the
}(hj hhhNhNubj- )}(h``struct iomap::folio_ops``h]hstruct iomap::folio_ops}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubhB field to an ops structure to override
default behaviors of iomap:}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhK?hj hhubh
literal_block)}(hXJ struct iomap_folio_ops {
struct folio *(*get_folio)(struct iomap_iter *iter, loff_t pos,
unsigned len);
void (*put_folio)(struct inode *inode, loff_t pos, unsigned copied,
struct folio *folio);
bool (*iomap_valid)(struct inode *inode, const struct iomap *iomap);
};h]hXJ struct iomap_folio_ops {
struct folio *(*get_folio)(struct iomap_iter *iter, loff_t pos,
unsigned len);
void (*put_folio)(struct inode *inode, loff_t pos, unsigned copied,
struct folio *folio);
bool (*iomap_valid)(struct inode *inode, const struct iomap *iomap);
};}hj sbah}(h]h ]h"]h$]h&]hhforcelanguagechighlight_args}uh1j hhhKChj hhubj )}(hiomap calls these functions:h]hiomap calls these functions:}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j hhhKMhj hhubj< )}(hX - ``get_folio``: Called to allocate and return an active reference to
a locked folio prior to starting a write.
If this function is not provided, iomap will call
``iomap_get_folio``.
This could be used to `set up per-folio filesystem state
<https://lore.kernel.org/all/20190429220934.10415-5-agruenba@redhat.com/>`_
for a write.
- ``put_folio``: Called to unlock and put a folio after a pagecache
operation completes.
If this function is not provided, iomap will ``folio_unlock`` and
``folio_put`` on its own.
This could be used to `commit per-folio filesystem state
<https://lore.kernel.org/all/20180619164137.13720-6-hch@lst.de/>`_
that was set up by ``->get_folio``.
- ``iomap_valid``: The filesystem may not hold locks between
``->iomap_begin`` and ``->iomap_end`` because pagecache operations
can take folio locks, fault on userspace pages, initiate writeback
for memory reclamation, or engage in other time-consuming actions.
If a file's space mapping data are mutable, it is possible that the
mapping for a particular pagecache folio can `change in the time it
takes
<https://lore.kernel.org/all/20221123055812.747923-8-david@fromorbit.com/>`_
to allocate, install, and lock that folio.
For the pagecache, races can happen if writeback doesn't take
``i_rwsem`` or ``invalidate_lock`` and updates mapping information.
Races can also happen if the filesystem allows concurrent writes.
For such files, the mapping *must* be revalidated after the folio
lock has been taken so that iomap can manage the folio correctly.
fsdax does not need this revalidation because there's no writeback
and no support for unwritten extents.
Filesystems subject to this kind of race must provide a
``->iomap_valid`` function to decide if the mapping is still valid.
If the mapping is not valid, the mapping will be sampled again.
To support making the validity decision, the filesystem's
``->iomap_begin`` function may set ``struct iomap::validity_cookie``
at the same time that it populates the other iomap fields.
A simple validation cookie implementation is a sequence counter.
If the filesystem bumps the sequence counter every time it modifies
the inode's extent map, it can be placed in the ``struct
iomap::validity_cookie`` during ``->iomap_begin``.
If the value in the cookie is found to be different to the value
the filesystem holds when the mapping is passed back to
``->iomap_valid``, then the iomap should be considered stale and the
validation failed.
h]h)}(hhh](h)}(hXG ``get_folio``: Called to allocate and return an active reference to
a locked folio prior to starting a write.
If this function is not provided, iomap will call
``iomap_get_folio``.
This could be used to `set up per-folio filesystem state
<https://lore.kernel.org/all/20190429220934.10415-5-agruenba@redhat.com/>`_
for a write.
h]j )}(hXF ``get_folio``: Called to allocate and return an active reference to
a locked folio prior to starting a write.
If this function is not provided, iomap will call
``iomap_get_folio``.
This could be used to `set up per-folio filesystem state
<https://lore.kernel.org/all/20190429220934.10415-5-agruenba@redhat.com/>`_
for a write.h](j- )}(h
``get_folio``h]h get_folio}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubh: Called to allocate and return an active reference to
a locked folio prior to starting a write.
If this function is not provided, iomap will call
}(hj hhhNhNubj- )}(h``iomap_get_folio``h]hiomap_get_folio}(hj+ hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubh.
This could be used to }(hj hhhNhNubj )}(hn`set up per-folio filesystem state
`_h]h!set up per-folio filesystem state}(hj= hhhNhNubah}(h]h ]h"]h$]h&]name!set up per-folio filesystem staterefuriGhttps://lore.kernel.org/all/20190429220934.10415-5-agruenba@redhat.com/uh1j hj ubh)}(hJ
h]h}(h]!set-up-per-folio-filesystem-stateah ]h"]!set up per-folio filesystem stateah$]h&]refurijN uh1h
referencedKhj ubh
for a write.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKOhj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hXS ``put_folio``: Called to unlock and put a folio after a pagecache
operation completes.
If this function is not provided, iomap will ``folio_unlock`` and
``folio_put`` on its own.
This could be used to `commit per-folio filesystem state
<https://lore.kernel.org/all/20180619164137.13720-6-hch@lst.de/>`_
that was set up by ``->get_folio``.
h]j )}(hXR ``put_folio``: Called to unlock and put a folio after a pagecache
operation completes.
If this function is not provided, iomap will ``folio_unlock`` and
``folio_put`` on its own.
This could be used to `commit per-folio filesystem state
<https://lore.kernel.org/all/20180619164137.13720-6-hch@lst.de/>`_
that was set up by ``->get_folio``.h](j- )}(h
``put_folio``h]h put_folio}(hju hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjq ubhw: Called to unlock and put a folio after a pagecache
operation completes.
If this function is not provided, iomap will }(hjq hhhNhNubj- )}(h``folio_unlock``h]hfolio_unlock}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjq ubh and
}(hjq hhhNhNubj- )}(h
``folio_put``h]h folio_put}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjq ubh# on its own.
This could be used to }(hjq hhhNhNubj )}(he`commit per-folio filesystem state
`_h]h!commit per-folio filesystem state}(hj hhhNhNubah}(h]h ]h"]h$]h&]name!commit per-folio filesystem statejM >https://lore.kernel.org/all/20180619164137.13720-6-hch@lst.de/uh1j hjq ubh)}(hA
h]h}(h]!commit-per-folio-filesystem-stateah ]h"]!commit per-folio filesystem stateah$]h&]refurij uh1hj\ Khjq ubh
that was set up by }(hjq hhhNhNubj- )}(h``->get_folio``h]h->get_folio}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjq ubh.}(hjq hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKWhjm ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hX ``iomap_valid``: The filesystem may not hold locks between
``->iomap_begin`` and ``->iomap_end`` because pagecache operations
can take folio locks, fault on userspace pages, initiate writeback
for memory reclamation, or engage in other time-consuming actions.
If a file's space mapping data are mutable, it is possible that the
mapping for a particular pagecache folio can `change in the time it
takes
<https://lore.kernel.org/all/20221123055812.747923-8-david@fromorbit.com/>`_
to allocate, install, and lock that folio.
For the pagecache, races can happen if writeback doesn't take
``i_rwsem`` or ``invalidate_lock`` and updates mapping information.
Races can also happen if the filesystem allows concurrent writes.
For such files, the mapping *must* be revalidated after the folio
lock has been taken so that iomap can manage the folio correctly.
fsdax does not need this revalidation because there's no writeback
and no support for unwritten extents.
Filesystems subject to this kind of race must provide a
``->iomap_valid`` function to decide if the mapping is still valid.
If the mapping is not valid, the mapping will be sampled again.
To support making the validity decision, the filesystem's
``->iomap_begin`` function may set ``struct iomap::validity_cookie``
at the same time that it populates the other iomap fields.
A simple validation cookie implementation is a sequence counter.
If the filesystem bumps the sequence counter every time it modifies
the inode's extent map, it can be placed in the ``struct
iomap::validity_cookie`` during ``->iomap_begin``.
If the value in the cookie is found to be different to the value
the filesystem holds when the mapping is passed back to
``->iomap_valid``, then the iomap should be considered stale and the
validation failed.
h](j )}(hX ``iomap_valid``: The filesystem may not hold locks between
``->iomap_begin`` and ``->iomap_end`` because pagecache operations
can take folio locks, fault on userspace pages, initiate writeback
for memory reclamation, or engage in other time-consuming actions.
If a file's space mapping data are mutable, it is possible that the
mapping for a particular pagecache folio can `change in the time it
takes
<https://lore.kernel.org/all/20221123055812.747923-8-david@fromorbit.com/>`_
to allocate, install, and lock that folio.h](j- )}(h``iomap_valid``h]hiomap_valid}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubh,: The filesystem may not hold locks between
}(hj hhhNhNubj- )}(h``->iomap_begin``h]h
->iomap_begin}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubh and }(hj hhhNhNubj- )}(h``->iomap_end``h]h->iomap_end}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubhX because pagecache operations
can take folio locks, fault on userspace pages, initiate writeback
for memory reclamation, or engage in other time-consuming actions.
If a file’s space mapping data are mutable, it is possible that the
mapping for a particular pagecache folio can }(hj hhhNhNubj )}(hi`change in the time it
takes
`_h]hchange in the time it
takes}(hj) hhhNhNubah}(h]h ]h"]h$]h&]namechange in the time it takesjM Hhttps://lore.kernel.org/all/20221123055812.747923-8-david@fromorbit.com/uh1j hj ubh)}(hK
h]h}(h]change-in-the-time-it-takesah ]h"]change in the time it takesah$]h&]refurij9 uh1hj\ Khj ubh+
to allocate, install, and lock that folio.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhK_hj ubj )}(hXG For the pagecache, races can happen if writeback doesn't take
``i_rwsem`` or ``invalidate_lock`` and updates mapping information.
Races can also happen if the filesystem allows concurrent writes.
For such files, the mapping *must* be revalidated after the folio
lock has been taken so that iomap can manage the folio correctly.h](h@For the pagecache, races can happen if writeback doesn’t take
}(hjQ hhhNhNubj- )}(h``i_rwsem``h]hi_rwsem}(hjY hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjQ ubh or }(hjQ hhhNhNubj- )}(h``invalidate_lock``h]hinvalidate_lock}(hjk hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjQ ubh and updates mapping information.
Races can also happen if the filesystem allows concurrent writes.
For such files, the mapping }(hjQ hhhNhNubhemphasis)}(h*must*h]hmust}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j} hjQ ubha be revalidated after the folio
lock has been taken so that iomap can manage the folio correctly.}(hjQ hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKihj ubj )}(hhfsdax does not need this revalidation because there's no writeback
and no support for unwritten extents.h]hjfsdax does not need this revalidation because there’s no writeback
and no support for unwritten extents.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j hhhKohj ubj )}(hFilesystems subject to this kind of race must provide a
``->iomap_valid`` function to decide if the mapping is still valid.
If the mapping is not valid, the mapping will be sampled again.h](h8Filesystems subject to this kind of race must provide a
}(hj hhhNhNubj- )}(h``->iomap_valid``h]h
->iomap_valid}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubhr function to decide if the mapping is still valid.
If the mapping is not valid, the mapping will be sampled again.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKrhj ubj )}(hXx To support making the validity decision, the filesystem's
``->iomap_begin`` function may set ``struct iomap::validity_cookie``
at the same time that it populates the other iomap fields.
A simple validation cookie implementation is a sequence counter.
If the filesystem bumps the sequence counter every time it modifies
the inode's extent map, it can be placed in the ``struct
iomap::validity_cookie`` during ``->iomap_begin``.
If the value in the cookie is found to be different to the value
the filesystem holds when the mapping is passed back to
``->iomap_valid``, then the iomap should be considered stale and the
validation failed.h](h