sphinx.addnodesdocument)}( rawsource children](translations
LanguagesNode)}(hhh](h pending_xref)}(hhh]docutils.nodesTextChinese (Simplified)}parenthsba
attributes}(ids]classes]names]dupnames]backrefs] refdomainstdreftypedoc reftarget0/translations/zh_CN/filesystems/iomap/operationsmodnameN classnameNrefexplicitutagnamehhhubh)}(hhh]hChinese (Traditional)}hh2sbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget0/translations/zh_TW/filesystems/iomap/operationsmodnameN classnameNrefexplicituh1hhhubh)}(hhh]hItalian}hhFsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget0/translations/it_IT/filesystems/iomap/operationsmodnameN classnameNrefexplicituh1hhhubh)}(hhh]hJapanese}hhZsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget0/translations/ja_JP/filesystems/iomap/operationsmodnameN classnameNrefexplicituh1hhhubh)}(hhh]hKorean}hhnsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget0/translations/ko_KR/filesystems/iomap/operationsmodnameN classnameNrefexplicituh1hhhubh)}(hhh]hSpanish}hhsbah}(h]h ]h"]h$]h&] refdomainh)reftypeh+ reftarget0/translations/sp_SP/filesystems/iomap/operationsmodnameN classnameNrefexplicituh1hhhubeh}(h]h ]h"]h$]h&]current_languageEnglishuh1h
hh _documenthsourceNlineNubhcomment)}(h SPDX-License-Identifier: GPL-2.0h]h SPDX-License-Identifier: GPL-2.0}hhsbah}(h]h ]h"]h$]h&] xml:spacepreserveuh1hhhhhhJ/var/lib/git/docbuild/linux/Documentation/filesystems/iomap/operations.rsthKubhtarget)}(h.. _iomap_operations:h]h}(h]iomap-operationsah ]h"]iomap_operationsah$]h&]uh1hhKhhhhhhubh)}(hDumb style notes to maintain the author's sanity:
Please try to start sentences on separate lines so that
sentence changes don't bleed colors in diff.
Heading decorations are documented in sphinx.rst.h]hDumb style notes to maintain the author's sanity:
Please try to start sentences on separate lines so that
sentence changes don't bleed colors in diff.
Heading decorations are documented in sphinx.rst.}hhsbah}(h]h ]h"]h$]h&]hhuh1hhhhhhhhK ubhsection)}(hhh](htitle)}(hSupported File Operationsh]hSupported File Operations}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhhhhhKubhtopic)}(hTable of Contents
h](h)}(hTable of Contentsh]hTable of Contents}(hhhhhNhNubah}(h]h ]h"]h$]h&]uh1hhhhhhKubhbullet_list)}(hhh](h list_item)}(hhh](h paragraph)}(hhh]h reference)}(hhh]hBuffered I/O}(hj
hhhNhNubah}(h]id1ah ]h"]h$]h&]refidbuffered-i-ouh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubh)}(hhh](h)}(hhh]j )}(hhh]j )}(hhh]hliteral)}(h#``struct address_space_operations``h]hstruct address_space_operations}(hj. hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hNhNhj) ubah}(h]id2ah ]h"]h$]h&]refidstruct-address-space-operationsuh1j hj& ubah}(h]h ]h"]h$]h&]uh1j hj# ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]j- )}(h``struct iomap_write_ops``h]hstruct iomap_write_ops}(hjZ hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hNhNhjW ubah}(h]id3ah ]h"]h$]h&]refidstruct-iomap-write-opsuh1j hjT ubah}(h]h ]h"]h$]h&]uh1j hjQ ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]j- )}(h``struct iomap_read_ops``h]hstruct iomap_read_ops}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hNhNhj ubah}(h]id4ah ]h"]h$]h&]refidstruct-iomap-read-opsuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj} ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]hInternal per-Folio State}(hj hhhNhNubah}(h]id5ah ]h"]h$]h&]refidinternal-per-folio-stateuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]hBuffered Readahead and Reads}(hj hhhNhNubah}(h]id6ah ]h"]h$]h&]refidbuffered-readahead-and-readsuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh](j )}(hhh]j )}(hhh]hBuffered Writes}(hj hhhNhNubah}(h]id7ah ]h"]h$]h&]refidbuffered-writesuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubh)}(hhh](h)}(hhh]j )}(hhh]j )}(hhh]hmmap Write Faults}(hj hhhNhNubah}(h]id8ah ]h"]h$]h&]refidmmap-write-faultsuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]hBuffered Write Failures}(hj4 hhhNhNubah}(h]id9ah ]h"]h$]h&]refidbuffered-write-failuresuh1j hj1 ubah}(h]h ]h"]h$]h&]uh1j hj. 
ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]hZeroing for File Operations}(hjV hhhNhNubah}(h]id10ah ]h"]h$]h&]refidzeroing-for-file-operationsuh1j hjS ubah}(h]h ]h"]h$]h&]uh1j hjP ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]hUnsharing Reflinked File Data}(hjx hhhNhNubah}(h]id11ah ]h"]h$]h&]refidunsharing-reflinked-file-datauh1j hju ubah}(h]h ]h"]h$]h&]uh1j hjr ubah}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]h
Truncation}(hj hhhNhNubah}(h]id12ah ]h"]h$]h&]refid
truncationuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh](j )}(hhh]j )}(hhh]hPagecache Writeback}(hj hhhNhNubah}(h]id13ah ]h"]h$]h&]refidpagecache-writebackuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubh)}(hhh](h)}(hhh]j )}(hhh]j )}(hhh]j- )}(h``struct iomap_writeback_ops``h]hstruct iomap_writeback_ops}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hNhNhj ubah}(h]id14ah ]h"]h$]h&]refidstruct-iomap-writeback-opsuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]hPagecache Writeback Completion}(hj hhhNhNubah}(h]id15ah ]h"]h$]h&]refidpagecache-writeback-completionuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj
ubah}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhhubh)}(hhh](j )}(hhh]j )}(hhh]h
Direct I/O}(hjM hhhNhNubah}(h]id16ah ]h"]h$]h&]refid
direct-i-ouh1j hjJ ubah}(h]h ]h"]h$]h&]uh1j hjG ubh)}(hhh](h)}(hhh]j )}(hhh]j )}(hhh]h
Return Values}(hjl hhhNhNubah}(h]id17ah ]h"]h$]h&]refid
return-valuesuh1j hji ubah}(h]h ]h"]h$]h&]uh1j hjf ubah}(h]h ]h"]h$]h&]uh1hhjc ubh)}(hhh]j )}(hhh]j )}(hhh]hDirect Reads}(hj hhhNhNubah}(h]id18ah ]h"]h$]h&]refiddirect-readsuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhjc ubh)}(hhh]j )}(hhh]j )}(hhh]h
Direct Writes}(hj hhhNhNubah}(h]id19ah ]h"]h$]h&]refid
direct-writesuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhjc ubh)}(hhh]j )}(hhh]j )}(hhh]j- )}(h``struct iomap_dio_ops:``h]hstruct iomap_dio_ops:}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hNhNhj ubah}(h]id20ah ]h"]h$]h&]refidstruct-iomap-dio-opsuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhjc ubeh}(h]h ]h"]h$]h&]uh1hhjG ubeh}(h]h ]h"]h$]h&]uh1hhhubh)}(hhh](j )}(hhh]j )}(hhh]hDAX I/O}(hj
hhhNhNubah}(h]id21ah ]h"]h$]h&]refiddax-i-ouh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubh)}(hhh](h)}(hhh]j )}(hhh]j )}(hhh]hfsdax Reads}(hj) hhhNhNubah}(h]id22ah ]h"]h$]h&]refidfsdax-readsuh1j hj& ubah}(h]h ]h"]h$]h&]uh1j hj# ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh](j )}(hhh]j )}(hhh]hfsdax Writes}(hjK hhhNhNubah}(h]id23ah ]h"]h$]h&]refidfsdax-writesuh1j hjH ubah}(h]h ]h"]h$]h&]uh1j hjE ubh)}(hhh]h)}(hhh]j )}(hhh]j )}(hhh]hfsdax mmap Faults}(hjj hhhNhNubah}(h]id24ah ]h"]h$]h&]refidfsdax-mmap-faultsuh1j hjg ubah}(h]h ]h"]h$]h&]uh1j hjd ubah}(h]h ]h"]h$]h&]uh1hhja ubah}(h]h ]h"]h$]h&]uh1hhjE ubeh}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]h*fsdax Truncation, fallocate, and Unsharing}(hj hhhNhNubah}(h]id25ah ]h"]h$]h&]refid(fsdax-truncation-fallocate-and-unsharinguh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]hfsdax Deduplication}(hj hhhNhNubah}(h]id26ah ]h"]h$]h&]refidfsdax-deduplicationuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhhubh)}(hhh](j )}(hhh]j )}(hhh]h
Seeking Files}(hj hhhNhNubah}(h]id27ah ]h"]h$]h&]refid
seeking-filesuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubh)}(hhh](h)}(hhh]j )}(hhh]j )}(hhh]h SEEK_DATA}(hj hhhNhNubah}(h]id28ah ]h"]h$]h&]refid seek-datauh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]h SEEK_HOLE}(hj) hhhNhNubah}(h]id29ah ]h"]h$]h&]refid seek-holeuh1j hj& ubah}(h]h ]h"]h$]h&]uh1j hj# ubah}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhhubh)}(hhh]j )}(hhh]j )}(hhh]hSwap File Activation}(hjW hhhNhNubah}(h]id30ah ]h"]h$]h&]refidswap-file-activationuh1j hjT ubah}(h]h ]h"]h$]h&]uh1j hjQ ubah}(h]h ]h"]h$]h&]uh1hhhubh)}(hhh](j )}(hhh]j )}(hhh]hFile Space Mapping Reporting}(hjy hhhNhNubah}(h]id31ah ]h"]h$]h&]refidfile-space-mapping-reportinguh1j hjv ubah}(h]h ]h"]h$]h&]uh1j hjs ubh)}(hhh](h)}(hhh]j )}(hhh]j )}(hhh]h
FS_IOC_FIEMAP}(hj hhhNhNubah}(h]id32ah ]h"]h$]h&]refid
fs-ioc-fiemapuh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hhh]j )}(hhh]j )}(hhh]hFIBMAP (deprecated)}(hj hhhNhNubah}(h]id33ah ]h"]h$]h&]refidfibmap-deprecateduh1j hj ubah}(h]h ]h"]h$]h&]uh1j hj ubah}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]uh1hhjs ubeh}(h]h ]h"]h$]h&]uh1hhhubeh}(h]h ]h"]h$]h&]uh1hhhhhhNhNubeh}(h]table-of-contentsah ](contentslocaleh"]table of contentsah$]h&]uh1hhhhKhhhhubj )}(hOBelow are a discussion of the high level file operations that iomap
implements.h]hOBelow is a discussion of the high-level file operations that iomap
implements.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j hhhKhhhhubh)}(hhh](h)}(hBuffered I/Oh]hBuffered I/O}(hj hhhNhNubah}(h]h ]h"]h$]h&]refidj uh1hhj hhhhhKubj )}(hBuffered I/O is the default file I/O path in Linux.
File contents are cached in memory ("pagecache") to satisfy reads and
writes.
Dirty cache will be written back to disk at some point that can be
forced via ``fsync`` and variants.h](hBuffered I/O is the default file I/O path in Linux.
File contents are cached in memory (“pagecache”) to satisfy reads and
writes.
Dirty cache will be written back to disk at some point that can be
forced via }(hj hhhNhNubj- )}(h ``fsync``h]hfsync}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubh and variants.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKhj hhubj )}(hX\ iomap implements nearly all the folio and pagecache management that
filesystems have to implement themselves under the legacy I/O model.
This means that the filesystem need not know the details of allocating,
mapping, managing uptodate and dirty state, or writeback of pagecache
folios.
Under the legacy I/O model, this was managed very inefficiently with
linked lists of buffer heads instead of the per-folio bitmaps that iomap
uses.
Unless the filesystem explicitly opts in to buffer heads, they will not
be used, which makes buffered I/O much more efficient, and the pagecache
maintainer much happier.h]hX\ iomap implements nearly all the folio and pagecache management that
filesystems have to implement themselves under the legacy I/O model.
This means that the filesystem need not know the details of allocating,
mapping, managing uptodate and dirty state, or writeback of pagecache
folios.
Under the legacy I/O model, this was managed very inefficiently with
linked lists of buffer heads instead of the per-folio bitmaps that iomap
uses.
Unless the filesystem explicitly opts in to buffer heads, they will not
be used, which makes buffered I/O much more efficient, and the pagecache
maintainer much happier.}(hj2 hhhNhNubah}(h]h ]h"]h$]h&]uh1j hhhKhj hhubh)}(hhh](h)}(hj0 h]j- )}(hj0 h]hstruct address_space_operations}(hjF hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjC ubah}(h]h ]h"]h$]h&]j j> uh1hhj@ hhhhhK*ubj )}(heThe following iomap functions can be referenced directly from the
address space operations structure:h]heThe following iomap functions can be referenced directly from the
address space operations structure:}(hjY hhhNhNubah}(h]h ]h"]h$]h&]uh1j hhhK,hj@ hhubhblock_quote)}(hq* ``iomap_dirty_folio``
* ``iomap_release_folio``
* ``iomap_invalidate_folio``
* ``iomap_is_partially_uptodate``
h]h)}(hhh](h)}(h``iomap_dirty_folio``h]j )}(hjr h]j- )}(hjr h]hiomap_dirty_folio}(hjw hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjt ubah}(h]h ]h"]h$]h&]uh1j hhhK/hjp ubah}(h]h ]h"]h$]h&]uh1hhjm ubh)}(h``iomap_release_folio``h]j )}(hj h]j- )}(hj h]hiomap_release_folio}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubah}(h]h ]h"]h$]h&]uh1j hhhK0hj ubah}(h]h ]h"]h$]h&]uh1hhjm ubh)}(h``iomap_invalidate_folio``h]j )}(hj h]j- )}(hj h]hiomap_invalidate_folio}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubah}(h]h ]h"]h$]h&]uh1j hhhK1hj ubah}(h]h ]h"]h$]h&]uh1hhjm ubh)}(h ``iomap_is_partially_uptodate``
h]j )}(h``iomap_is_partially_uptodate``h]j- )}(hj h]hiomap_is_partially_uptodate}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubah}(h]h ]h"]h$]h&]uh1j hhhK2hj ubah}(h]h ]h"]h$]h&]uh1hhjm ubeh}(h]h ]h"]h$]h&]bullet*uh1hhhhK/hji ubah}(h]h ]h"]h$]h&]uh1jg hhhK/hj@ hhubj )}(h=The following address space operations can be wrapped easily:h]h=The following address space operations can be wrapped easily:}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j hhhK4hj@ hhubjh )}(hQ* ``read_folio``
* ``readahead``
* ``writepages``
* ``bmap``
* ``swap_activate``
h]h)}(hhh](h)}(h``read_folio``h]j )}(hj h]j- )}(hj h]h
read_folio}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubah}(h]h ]h"]h$]h&]uh1j hhhK6hj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(h
``readahead``h]j )}(hj6 h]j- )}(hj6 h]h readahead}(hj; hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj8 ubah}(h]h ]h"]h$]h&]uh1j hhhK7hj4 ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(h``writepages``h]j )}(hjV h]j- )}(hjV h]h
writepages}(hj[ hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjX ubah}(h]h ]h"]h$]h&]uh1j hhhK8hjT ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(h``bmap``h]j )}(hjv h]j- )}(hjv h]hbmap}(hj{ hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjx ubah}(h]h ]h"]h$]h&]uh1j hhhK9hjt ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(h``swap_activate``
h]j )}(h``swap_activate``h]j- )}(hj h]h
swap_activate}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubah}(h]h ]h"]h$]h&]uh1j hhhK:hj ubah}(h]h ]h"]h$]h&]uh1hhj ubeh}(h]h ]h"]h$]h&]j j uh1hhhhK6hj
ubah}(h]h ]h"]h$]h&]uh1jg hhhK6hj@ hhubeh}(h]jD ah ]h"]struct address_space_operationsah$]h&]uh1hhj hhhhhK*ubh)}(hhh](h)}(hj\ h]j- )}(hj\ h]hstruct iomap_write_ops}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubah}(h]h ]h"]h$]h&]j jj uh1hhj hhhhhK=ubh
literal_block)}(hX struct iomap_write_ops {
struct folio *(*get_folio)(struct iomap_iter *iter, loff_t pos,
unsigned len);
void (*put_folio)(struct inode *inode, loff_t pos, unsigned copied,
struct folio *folio);
bool (*iomap_valid)(struct inode *inode, const struct iomap *iomap);
int (*read_folio_range)(const struct iomap_iter *iter,
struct folio *folio, loff_t pos, size_t len);
};h]hX struct iomap_write_ops {
struct folio *(*get_folio)(struct iomap_iter *iter, loff_t pos,
unsigned len);
void (*put_folio)(struct inode *inode, loff_t pos, unsigned copied,
struct folio *folio);
bool (*iomap_valid)(struct inode *inode, const struct iomap *iomap);
int (*read_folio_range)(const struct iomap_iter *iter,
struct folio *folio, loff_t pos, size_t len);
};}hj sbah}(h]h ]h"]h$]h&]hhforcelanguagechighlight_args}uh1j hhhK?hj hhubj )}(hiomap calls these functions:h]hiomap calls these functions:}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j hhhKKhj hhubjh )}(hX
- ``get_folio``: Called to allocate and return an active reference to
a locked folio prior to starting a write.
If this function is not provided, iomap will call
``iomap_get_folio``.
This could be used to `set up per-folio filesystem state
`_
for a write.
- ``put_folio``: Called to unlock and put a folio after a pagecache
operation completes.
If this function is not provided, iomap will ``folio_unlock`` and
``folio_put`` on its own.
This could be used to `commit per-folio filesystem state
`_
that was set up by ``->get_folio``.
- ``iomap_valid``: The filesystem may not hold locks between
``->iomap_begin`` and ``->iomap_end`` because pagecache operations
can take folio locks, fault on userspace pages, initiate writeback
for memory reclamation, or engage in other time-consuming actions.
If a file's space mapping data are mutable, it is possible that the
mapping for a particular pagecache folio can `change in the time it
takes
`_
to allocate, install, and lock that folio.
For the pagecache, races can happen if writeback doesn't take
``i_rwsem`` or ``invalidate_lock`` and updates mapping information.
Races can also happen if the filesystem allows concurrent writes.
For such files, the mapping *must* be revalidated after the folio
lock has been taken so that iomap can manage the folio correctly.
fsdax does not need this revalidation because there's no writeback
and no support for unwritten extents.
Filesystems subject to this kind of race must provide a
``->iomap_valid`` function to decide if the mapping is still valid.
If the mapping is not valid, the mapping will be sampled again.
To support making the validity decision, the filesystem's
``->iomap_begin`` function may set ``struct iomap::validity_cookie``
at the same time that it populates the other iomap fields.
A simple validation cookie implementation is a sequence counter.
If the filesystem bumps the sequence counter every time it modifies
the inode's extent map, it can be placed in the ``struct
iomap::validity_cookie`` during ``->iomap_begin``.
If the value in the cookie is found to be different to the value
the filesystem holds when the mapping is passed back to
``->iomap_valid``, then the iomap should be considered stale and the
validation failed.
- ``read_folio_range``: Called to synchronously read in the range that will
be written to. If this function is not provided, iomap will default to
submitting a bio read request.
h]h)}(hhh](h)}(hXG ``get_folio``: Called to allocate and return an active reference to
a locked folio prior to starting a write.
If this function is not provided, iomap will call
``iomap_get_folio``.
This could be used to `set up per-folio filesystem state
`_
for a write.
h]j )}(hXF ``get_folio``: Called to allocate and return an active reference to
a locked folio prior to starting a write.
If this function is not provided, iomap will call
``iomap_get_folio``.
This could be used to `set up per-folio filesystem state
`_
for a write.h](j- )}(h
``get_folio``h]h get_folio}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubh: Called to allocate and return an active reference to
a locked folio prior to starting a write.
If this function is not provided, iomap will call
}(hj hhhNhNubj- )}(h``iomap_get_folio``h]hiomap_get_folio}(hj% hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubh.
This could be used to }(hj hhhNhNubj )}(hn`set up per-folio filesystem state
`_h]h!set up per-folio filesystem state}(hj7 hhhNhNubah}(h]h ]h"]h$]h&]name!set up per-folio filesystem staterefuriGhttps://lore.kernel.org/all/20190429220934.10415-5-agruenba@redhat.com/uh1j hj ubh)}(hJ
h]h}(h]!set-up-per-folio-filesystem-stateah ]h"]!set up per-folio filesystem stateah$]h&]refurijH uh1h
referencedKhj ubh
for a write.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKMhj ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hXS ``put_folio``: Called to unlock and put a folio after a pagecache
operation completes.
If this function is not provided, iomap will ``folio_unlock`` and
``folio_put`` on its own.
This could be used to `commit per-folio filesystem state
`_
that was set up by ``->get_folio``.
h]j )}(hXR ``put_folio``: Called to unlock and put a folio after a pagecache
operation completes.
If this function is not provided, iomap will ``folio_unlock`` and
``folio_put`` on its own.
This could be used to `commit per-folio filesystem state
`_
that was set up by ``->get_folio``.h](j- )}(h
``put_folio``h]h put_folio}(hjo hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjk ubhw: Called to unlock and put a folio after a pagecache
operation completes.
If this function is not provided, iomap will }(hjk hhhNhNubj- )}(h``folio_unlock``h]hfolio_unlock}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjk ubh and
}(hjk hhhNhNubj- )}(h
``folio_put``h]h folio_put}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjk ubh# on its own.
This could be used to }(hjk hhhNhNubj )}(he`commit per-folio filesystem state
`_h]h!commit per-folio filesystem state}(hj hhhNhNubah}(h]h ]h"]h$]h&]name!commit per-folio filesystem statejG >https://lore.kernel.org/all/20180619164137.13720-6-hch@lst.de/uh1j hjk ubh)}(hA
h]h}(h]!commit-per-folio-filesystem-stateah ]h"]!commit per-folio filesystem stateah$]h&]refurij uh1hjV Khjk ubh
that was set up by }(hjk hhhNhNubj- )}(h``->get_folio``h]h->get_folio}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjk ubh.}(hjk hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKUhjg ubah}(h]h ]h"]h$]h&]uh1hhj ubh)}(hX ``iomap_valid``: The filesystem may not hold locks between
``->iomap_begin`` and ``->iomap_end`` because pagecache operations
can take folio locks, fault on userspace pages, initiate writeback
for memory reclamation, or engage in other time-consuming actions.
If a file's space mapping data are mutable, it is possible that the
mapping for a particular pagecache folio can `change in the time it
takes
`_
to allocate, install, and lock that folio.
For the pagecache, races can happen if writeback doesn't take
``i_rwsem`` or ``invalidate_lock`` and updates mapping information.
Races can also happen if the filesystem allows concurrent writes.
For such files, the mapping *must* be revalidated after the folio
lock has been taken so that iomap can manage the folio correctly.
fsdax does not need this revalidation because there's no writeback
and no support for unwritten extents.
Filesystems subject to this kind of race must provide a
``->iomap_valid`` function to decide if the mapping is still valid.
If the mapping is not valid, the mapping will be sampled again.
To support making the validity decision, the filesystem's
``->iomap_begin`` function may set ``struct iomap::validity_cookie``
at the same time that it populates the other iomap fields.
A simple validation cookie implementation is a sequence counter.
If the filesystem bumps the sequence counter every time it modifies
the inode's extent map, it can be placed in the ``struct
iomap::validity_cookie`` during ``->iomap_begin``.
If the value in the cookie is found to be different to the value
the filesystem holds when the mapping is passed back to
``->iomap_valid``, then the iomap should be considered stale and the
validation failed.
h](j )}(hX ``iomap_valid``: The filesystem may not hold locks between
``->iomap_begin`` and ``->iomap_end`` because pagecache operations
can take folio locks, fault on userspace pages, initiate writeback
for memory reclamation, or engage in other time-consuming actions.
If a file's space mapping data are mutable, it is possible that the
mapping for a particular pagecache folio can `change in the time it
takes
`_
to allocate, install, and lock that folio.h](j- )}(h``iomap_valid``h]hiomap_valid}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubh,: The filesystem may not hold locks between
}(hj hhhNhNubj- )}(h``->iomap_begin``h]h
->iomap_begin}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubh and }(hj hhhNhNubj- )}(h``->iomap_end``h]h->iomap_end}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubhX because pagecache operations
can take folio locks, fault on userspace pages, initiate writeback
for memory reclamation, or engage in other time-consuming actions.
If a file’s space mapping data are mutable, it is possible that the
mapping for a particular pagecache folio can }(hj hhhNhNubj )}(hi`change in the time it
takes
`_h]hchange in the time it
takes}(hj# hhhNhNubah}(h]h ]h"]h$]h&]namechange in the time it takesjG Hhttps://lore.kernel.org/all/20221123055812.747923-8-david@fromorbit.com/uh1j hj ubh)}(hK
h]h}(h]change-in-the-time-it-takesah ]h"]change in the time it takesah$]h&]refurij3 uh1hjV Khj ubh+
to allocate, install, and lock that folio.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhK]hj ubj )}(hXG For the pagecache, races can happen if writeback doesn't take
``i_rwsem`` or ``invalidate_lock`` and updates mapping information.
Races can also happen if the filesystem allows concurrent writes.
For such files, the mapping *must* be revalidated after the folio
lock has been taken so that iomap can manage the folio correctly.h](h@For the pagecache, races can happen if writeback doesn’t take
}(hjK hhhNhNubj- )}(h``i_rwsem``h]hi_rwsem}(hjS hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjK ubh or }(hjK hhhNhNubj- )}(h``invalidate_lock``h]hinvalidate_lock}(hje hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hjK ubh and updates mapping information.
Races can also happen if the filesystem allows concurrent writes.
For such files, the mapping }(hjK hhhNhNubhemphasis)}(h*must*h]hmust}(hjy hhhNhNubah}(h]h ]h"]h$]h&]uh1jw hjK ubha be revalidated after the folio
lock has been taken so that iomap can manage the folio correctly.}(hjK hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKghj ubj )}(hhfsdax does not need this revalidation because there's no writeback
and no support for unwritten extents.h]hjfsdax does not need this revalidation because there’s no writeback
and no support for unwritten extents.}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j hhhKmhj ubj )}(hFilesystems subject to this kind of race must provide a
``->iomap_valid`` function to decide if the mapping is still valid.
If the mapping is not valid, the mapping will be sampled again.h](h8Filesystems subject to this kind of race must provide a
}(hj hhhNhNubj- )}(h``->iomap_valid``h]h
->iomap_valid}(hj hhhNhNubah}(h]h ]h"]h$]h&]uh1j, hj ubhr function to decide if the mapping is still valid.
If the mapping is not valid, the mapping will be sampled again.}(hj hhhNhNubeh}(h]h ]h"]h$]h&]uh1j hhhKphj ubj )}(hXx To support making the validity decision, the filesystem's
``->iomap_begin`` function may set ``struct iomap::validity_cookie``
at the same time that it populates the other iomap fields.
A simple validation cookie implementation is a sequence counter.
If the filesystem bumps the sequence counter every time it modifies
the inode's extent map, it can be placed in the ``struct
iomap::validity_cookie`` during ``->iomap_begin``.
If the value in the cookie is found to be different to the value
the filesystem holds when the mapping is passed back to
``->iomap_valid``, then the iomap should be considered stale and the
validation failed.h](h