€•X8Œsphinx.addnodes”Œdocument”“”)”}”(Œ rawsource”Œ”Œchildren”]”(Œ translations”Œ LanguagesNode”“”)”}”(hhh]”(hŒ pending_xref”“”)”}”(hhh]”Œdocutils.nodes”ŒText”“”ŒChinese (Simplified)”…””}”Œparent”hsbaŒ attributes”}”(Œids”]”Œclasses”]”Œnames”]”Œdupnames”]”Œbackrefs”]”Œ refdomain”Œstd”Œreftype”Œdoc”Œ reftarget”Œ#/translations/zh_CN/mm/mmu_notifier”Œmodname”NŒ classname”NŒ refexplicit”ˆuŒtagname”hhh ubh)”}”(hhh]”hŒChinese (Traditional)”…””}”hh2sbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ#/translations/zh_TW/mm/mmu_notifier”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒItalian”…””}”hhFsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ#/translations/it_IT/mm/mmu_notifier”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒJapanese”…””}”hhZsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ#/translations/ja_JP/mm/mmu_notifier”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒKorean”…””}”hhnsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ#/translations/ko_KR/mm/mmu_notifier”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒSpanish”…””}”hh‚sbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ#/translations/sp_SP/mm/mmu_notifier”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubeh}”(h]”h ]”h"]”h$]”h&]”Œcurrent_language”ŒEnglish”uh1h hhŒ _document”hŒsource”NŒline”NubhŒsection”“”)”}”(hhh]”(hŒtitle”“”)”}”(hŒ3When do you need to notify inside page table lock ?”h]”hŒ3When do you need to notify inside page table lock ?”…””}”(hh¨hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h¦hh£hžhhŸŒ=/var/lib/git/docbuild/linux/Documentation/mm/mmu_notifier.rst”h KubhŒ paragraph”“”)”}”(hŒßWhen clearing a pte/pmd we are given a choice to notify the event through (notify version of \*_clear_flush call mmu_notifier_invalidate_range) under the page table lock. But that notification is not necessary in all cases.”h]”hŒßWhen clearing a pte/pmd we are given a choice to notify the event through (notify version of *_clear_flush call mmu_notifier_invalidate_range) under the page table lock. But that notification is not necessary in all cases.”…””}”(hh¹hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khh£hžhubh¸)”}”(hX3For secondary TLB (non CPU TLB) like IOMMU TLB or device TLB (when device use thing like ATS/PASID to get the IOMMU to walk the CPU page table to access a process virtual address space). There is only 2 cases when you need to notify those secondary TLB while holding page table lock when clearing a pte/pmd:”h]”hX3For secondary TLB (non CPU TLB) like IOMMU TLB or device TLB (when device use thing like ATS/PASID to get the IOMMU to walk the CPU page table to access a process virtual address space). There is only 2 cases when you need to notify those secondary TLB while holding page table lock when clearing a pte/pmd:”…””}”(hhÇhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khh£hžhubhŒ block_quote”“”)”}”(hŒ½A) page backing address is free before mmu_notifier_invalidate_range_end() B) a page table entry is updated to point to a new page (COW, write fault on zero page, __replace_page(), ...) ”h]”hŒenumerated_list”“”)”}”(hhh]”(hŒ list_item”“”)”}”(hŒGpage backing address is free before mmu_notifier_invalidate_range_end()”h]”h¸)”}”(hhäh]”hŒGpage backing address is free before mmu_notifier_invalidate_range_end()”…””}”(hhæhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K hhâubah}”(h]”h ]”h"]”h$]”h&]”uh1hàhhÝubhá)”}”(hŒla page table entry is updated to point to a new page (COW, write fault on zero page, __replace_page(), ...) ”h]”h¸)”}”(hŒka page table entry is updated to point to a new page (COW, write fault on zero page, __replace_page(), ...)”h]”hŒka page table entry is updated to point to a new page (COW, write fault on zero page, __replace_page(), ...)”…””}”(hhýhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khhùubah}”(h]”h ]”h"]”h$]”h&]”uh1hàhhÝubeh}”(h]”h ]”h"]”h$]”h&]”Œenumtype”Œ upperalpha”Œprefix”hŒsuffix”Œ)”uh1hÛhh×ubah}”(h]”h ]”h"]”h$]”h&]”uh1hÕhŸh¶h K hh£hžhubh¸)”}”(hŒŽCase A is obvious you do not want to take the risk for the device to write to a page that might now be used by some completely different task.”h]”hŒŽCase A is obvious you do not want to take the risk for the device to write to a page that might now be used by some completely different task.”…””}”(hj"hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khh£hžhubh¸)”}”(hŒTCase B is more subtle. For correctness it requires the following sequence to happen:”h]”hŒTCase B is more subtle. For correctness it requires the following sequence to happen:”…””}”(hj0hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khh£hžhubhÖ)”}”(hŒŽ- take page table lock - clear page table entry and notify ([pmd/pte]p_huge_clear_flush_notify()) - set page table entry to point to new page ”h]”hŒ bullet_list”“”)”}”(hhh]”(há)”}”(hŒtake page table lock”h]”h¸)”}”(hjIh]”hŒtake page table lock”…””}”(hjKhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h KhjGubah}”(h]”h ]”h"]”h$]”h&]”uh1hàhjDubhá)”}”(hŒHclear page table entry and notify ([pmd/pte]p_huge_clear_flush_notify())”h]”h¸)”}”(hj`h]”hŒHclear page table entry and notify ([pmd/pte]p_huge_clear_flush_notify())”…””}”(hjbhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khj^ubah}”(h]”h ]”h"]”h$]”h&]”uh1hàhjDubhá)”}”(hŒ*set page table entry to point to new page ”h]”h¸)”}”(hŒ)set page table entry to point to new page”h]”hŒ)set page table entry to point to new page”…””}”(hjyhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khjuubah}”(h]”h ]”h"]”h$]”h&]”uh1hàhjDubeh}”(h]”h ]”h"]”h$]”h&]”Œbullet”Œ-”uh1jBhŸh¶h Khj>ubah}”(h]”h ]”h"]”h$]”h&]”uh1hÕhŸh¶h Khh£hžhubh¸)”}”(hŒ£If clearing the page table entry is not followed by a notify before setting the new pte/pmd value then you can break memory model like C11 or C++11 for the device.”h]”hŒ£If clearing the page table entry is not followed by a notify before setting the new pte/pmd value then you can break memory model like C11 or C++11 for the device.”…””}”(hj›hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khh£hžhubh¸)”}”(hŒLConsider the following scenario (device use a feature similar to ATS/PASID):”h]”hŒLConsider the following scenario (device use a feature similar to ATS/PASID):”…””}”(hj©hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khh£hžhubh¸)”}”(hŒŒTwo address addrA and addrB such that \|addrA - addrB\| >= PAGE_SIZE we assume they are write protected for COW (other case of B apply too).”h]”hŒŒTwo address addrA and addrB such that |addrA - addrB| >= PAGE_SIZE we assume they are write protected for COW (other case of B apply too).”…””}”(hj·hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K!hh£hžhubhŒ literal_block”“”)”}”(hX2[Time N] -------------------------------------------------------------------- CPU-thread-0 {try to write to addrA} CPU-thread-1 {try to write to addrB} CPU-thread-2 {} CPU-thread-3 {} DEV-thread-0 {read addrA and populate device TLB} DEV-thread-2 {read addrB and populate device TLB} [Time N+1] ------------------------------------------------------------------ CPU-thread-0 {COW_step0: {mmu_notifier_invalidate_range_start(addrA)}} CPU-thread-1 {COW_step0: {mmu_notifier_invalidate_range_start(addrB)}} CPU-thread-2 {} CPU-thread-3 {} DEV-thread-0 {} DEV-thread-2 {} [Time N+2] ------------------------------------------------------------------ CPU-thread-0 {COW_step1: {update page table to point to new page for addrA}} CPU-thread-1 {COW_step1: {update page table to point to new page for addrB}} CPU-thread-2 {} CPU-thread-3 {} DEV-thread-0 {} DEV-thread-2 {} [Time N+3] ------------------------------------------------------------------ CPU-thread-0 {preempted} CPU-thread-1 {preempted} CPU-thread-2 {write to addrA which is a write to new page} CPU-thread-3 {} DEV-thread-0 {} DEV-thread-2 {} [Time N+3] ------------------------------------------------------------------ CPU-thread-0 {preempted} CPU-thread-1 {preempted} CPU-thread-2 {} CPU-thread-3 {write to addrB which is a write to new page} DEV-thread-0 {} DEV-thread-2 {} [Time N+4] ------------------------------------------------------------------ CPU-thread-0 {preempted} CPU-thread-1 {COW_step3: {mmu_notifier_invalidate_range_end(addrB)}} CPU-thread-2 {} CPU-thread-3 {} DEV-thread-0 {} DEV-thread-2 {} [Time N+5] ------------------------------------------------------------------ CPU-thread-0 {preempted} CPU-thread-1 {} CPU-thread-2 {} CPU-thread-3 {} DEV-thread-0 {read addrA from old page} DEV-thread-2 {read addrB from new page}”h]”hX2[Time N] -------------------------------------------------------------------- CPU-thread-0 {try to write to addrA} CPU-thread-1 {try to write to addrB} CPU-thread-2 {} CPU-thread-3 {} DEV-thread-0 {read addrA and populate device TLB} DEV-thread-2 {read addrB and populate device TLB} [Time N+1] ------------------------------------------------------------------ CPU-thread-0 {COW_step0: {mmu_notifier_invalidate_range_start(addrA)}} CPU-thread-1 {COW_step0: {mmu_notifier_invalidate_range_start(addrB)}} CPU-thread-2 {} CPU-thread-3 {} DEV-thread-0 {} DEV-thread-2 {} [Time N+2] ------------------------------------------------------------------ CPU-thread-0 {COW_step1: {update page table to point to new page for addrA}} CPU-thread-1 {COW_step1: {update page table to point to new page for addrB}} CPU-thread-2 {} CPU-thread-3 {} DEV-thread-0 {} DEV-thread-2 {} [Time N+3] ------------------------------------------------------------------ CPU-thread-0 {preempted} CPU-thread-1 {preempted} CPU-thread-2 {write to addrA which is a write to new page} CPU-thread-3 {} DEV-thread-0 {} DEV-thread-2 {} [Time N+3] ------------------------------------------------------------------ CPU-thread-0 {preempted} CPU-thread-1 {preempted} CPU-thread-2 {} CPU-thread-3 {write to addrB which is a write to new page} DEV-thread-0 {} DEV-thread-2 {} [Time N+4] ------------------------------------------------------------------ CPU-thread-0 {preempted} CPU-thread-1 {COW_step3: {mmu_notifier_invalidate_range_end(addrB)}} CPU-thread-2 {} CPU-thread-3 {} DEV-thread-0 {} DEV-thread-2 {} [Time N+5] ------------------------------------------------------------------ CPU-thread-0 {preempted} CPU-thread-1 {} CPU-thread-2 {} CPU-thread-3 {} DEV-thread-0 {read addrA from old page} DEV-thread-2 {read addrB from new page}”…””}”hjÇsbah}”(h]”h ]”h"]”h$]”h&]”Œ xml:space”Œpreserve”uh1jÅhŸh¶h K&hh£hžhubh¸)”}”(hŒ÷So here because at time N+2 the clear page table entry was not pair with a notification to invalidate the secondary TLB, the device see the new value for addrB before seeing the new value for addrA. This break total memory ordering for the device.”h]”hŒ÷So here because at time N+2 the clear page table entry was not pair with a notification to invalidate the secondary TLB, the device see the new value for addrB before seeing the new value for addrA. This break total memory ordering for the device.”…””}”(hj×hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h KXhh£hžhubh¸)”}”(hX‰When changing a pte to write protect or to point to a new write protected page with same content (KSM) it is fine to delay the mmu_notifier_invalidate_range call to mmu_notifier_invalidate_range_end() outside the page table lock. This is true even if the thread doing the page table update is preempted right after releasing page table lock but before call mmu_notifier_invalidate_range_end().”h]”hX‰When changing a pte to write protect or to point to a new write protected page with same content (KSM) it is fine to delay the mmu_notifier_invalidate_range call to mmu_notifier_invalidate_range_end() outside the page table lock. This is true even if the thread doing the page table update is preempted right after releasing page table lock but before call mmu_notifier_invalidate_range_end().”…””}”(hjåhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K]hh£hžhubeh}”(h]”Œ1when-do-you-need-to-notify-inside-page-table-lock”ah ]”h"]”Œ3when do you need to notify inside page table lock ?”ah$]”h&]”uh1h¡hhhžhhŸh¶h Kubeh}”(h]”h ]”h"]”h$]”h&]”Œsource”h¶uh1hŒcurrent_source”NŒ current_line”NŒsettings”Œdocutils.frontend”ŒValues”“”)”}”(h¦NŒ generator”NŒ datestamp”NŒ source_link”NŒ source_url”NŒ toc_backlinks”Œentry”Œfootnote_backlinks”KŒ sectnum_xform”KŒstrip_comments”NŒstrip_elements_with_classes”NŒ strip_classes”NŒ report_level”KŒ halt_level”KŒexit_status_level”KŒdebug”NŒwarning_stream”NŒ traceback”ˆŒinput_encoding”Œ utf-8-sig”Œinput_encoding_error_handler”Œstrict”Œoutput_encoding”Œutf-8”Œoutput_encoding_error_handler”jŒerror_encoding”Œutf-8”Œerror_encoding_error_handler”Œbackslashreplace”Œ language_code”Œen”Œrecord_dependencies”NŒconfig”NŒ id_prefix”hŒauto_id_prefix”Œid”Œ dump_settings”NŒdump_internals”NŒdump_transforms”NŒdump_pseudo_xml”NŒexpose_internals”NŒstrict_visitor”NŒ_disable_config”NŒ_source”h¶Œ _destination”NŒ _config_files”]”Œ7/var/lib/git/docbuild/linux/Documentation/docutils.conf”aŒfile_insertion_enabled”ˆŒ raw_enabled”KŒline_length_limit”M'Œpep_references”NŒ pep_base_url”Œhttps://peps.python.org/”Œpep_file_url_template”Œpep-%04d”Œrfc_references”NŒ rfc_base_url”Œ&https://datatracker.ietf.org/doc/html/”Œ tab_width”KŒtrim_footnote_reference_space”‰Œsyntax_highlight”Œlong”Œ smart_quotes”ˆŒsmartquotes_locales”]”Œcharacter_level_inline_markup”‰Œdoctitle_xform”‰Œ docinfo_xform”KŒsectsubtitle_xform”‰Œ image_loading”Œlink”Œembed_stylesheet”‰Œcloak_email_addresses”ˆŒsection_self_link”‰Œenv”NubŒreporter”NŒindirect_targets”]”Œsubstitution_defs”}”Œsubstitution_names”}”Œrefnames”}”Œrefids”}”Œnameids”}”jøjõsŒ nametypes”}”jø‰sh}”jõh£sŒ footnote_refs”}”Œ citation_refs”}”Œ autofootnotes”]”Œautofootnote_refs”]”Œsymbol_footnotes”]”Œsymbol_footnote_refs”]”Œ footnotes”]”Œ citations”]”Œautofootnote_start”KŒsymbol_footnote_start”KŒ id_counter”Œ collections”ŒCounter”“”}”…”R”Œparse_messages”]”Œtransform_messages”]”Œ transformer”NŒ include_log”]”Œ decoration”Nhžhub.