€•ûAŒsphinx.addnodes”Œdocument”“”)”}”(Œ rawsource”Œ”Œchildren”]”(Œ translations”Œ LanguagesNode”“”)”}”(hhh]”(hŒ pending_xref”“”)”}”(hhh]”Œdocutils.nodes”ŒText”“”ŒChinese (Simplified)”…””}”Œparent”hsbaŒ attributes”}”(Œids”]”Œclasses”]”Œnames”]”Œdupnames”]”Œbackrefs”]”Œ refdomain”Œstd”Œreftype”Œdoc”Œ reftarget”Œ /translations/zh_CN/arch/x86/tlb”Œmodname”NŒ classname”NŒ refexplicit”ˆuŒtagname”hhh ubh)”}”(hhh]”hŒChinese (Traditional)”…””}”hh2sbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ /translations/zh_TW/arch/x86/tlb”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒItalian”…””}”hhFsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ /translations/it_IT/arch/x86/tlb”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒJapanese”…””}”hhZsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ /translations/ja_JP/arch/x86/tlb”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒKorean”…””}”hhnsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ /translations/ko_KR/arch/x86/tlb”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒSpanish”…””}”hh‚sbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ /translations/sp_SP/arch/x86/tlb”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubeh}”(h]”h ]”h"]”h$]”h&]”Œcurrent_language”ŒEnglish”uh1h hhŒ _document”hŒsource”NŒline”NubhŒcomment”“”)”}”(hŒ SPDX-License-Identifier: GPL-2.0”h]”hŒ SPDX-License-Identifier: GPL-2.0”…””}”hh£sbah}”(h]”h ]”h"]”h$]”h&]”Œ xml:space”Œpreserve”uh1h¡hhhžhhŸŒ:/var/lib/git/docbuild/linux/Documentation/arch/x86/tlb.rst”h KubhŒsection”“”)”}”(hhh]”(hŒtitle”“”)”}”(hŒThe TLB”h]”hŒThe TLB”…””}”(hh»hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h¹hh¶hžhhŸh³h KubhŒ paragraph”“”)”}”(hŒ[When the kernel unmaps or modified the attributes of a range of memory, it has two choices:”h]”hŒ[When the kernel unmaps or modified the attributes of a range of memory, it has two choices:”…””}”(hhËhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h Khh¶hžhubhŒ block_quote”“”)”}”(hXÛ1. Flush the entire TLB with a two-instruction sequence. This is a quick operation, but it causes collateral damage: TLB entries from areas other than the one we are trying to flush will be destroyed and must be refilled later, at some cost. 2. Use the invlpg instruction to invalidate a single page at a time. This could potentially cost many more instructions, but it is a much more precise operation, causing no collateral damage to other TLB entries. ”h]”hŒenumerated_list”“”)”}”(hhh]”(hŒ list_item”“”)”}”(hŒïFlush the entire TLB with a two-instruction sequence. This is a quick operation, but it causes collateral damage: TLB entries from areas other than the one we are trying to flush will be destroyed and must be refilled later, at some cost.”h]”hÊ)”}”(hŒïFlush the entire TLB with a two-instruction sequence. This is a quick operation, but it causes collateral damage: TLB entries from areas other than the one we are trying to flush will be destroyed and must be refilled later, at some cost.”h]”hŒïFlush the entire TLB with a two-instruction sequence. This is a quick operation, but it causes collateral damage: TLB entries from areas other than the one we are trying to flush will be destroyed and must be refilled later, at some cost.”…””}”(hhêhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h K hhæubah}”(h]”h ]”h"]”h$]”h&]”uh1hähháubhå)”}”(hŒÓUse the invlpg instruction to invalidate a single page at a time. This could potentially cost many more instructions, but it is a much more precise operation, causing no collateral damage to other TLB entries. ”h]”hÊ)”}”(hŒÒUse the invlpg instruction to invalidate a single page at a time. This could potentially cost many more instructions, but it is a much more precise operation, causing no collateral damage to other TLB entries.”h]”hŒÒUse the invlpg instruction to invalidate a single page at a time. This could potentially cost many more instructions, but it is a much more precise operation, causing no collateral damage to other TLB entries.”…””}”(hjhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h Khhþubah}”(h]”h ]”h"]”h$]”h&]”uh1hähháubeh}”(h]”h ]”h"]”h$]”h&]”Œenumtype”Œarabic”Œprefix”hŒsuffix”Œ.”uh1hßhhÛubah}”(h]”h ]”h"]”h$]”h&]”uh1hÙhŸh³h K hh¶hžhubhÊ)”}”(hŒ+Which method to do depends on a few things:”h]”hŒ+Which method to do depends on a few things:”…””}”(hj'hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h Khh¶hžhubhÚ)”}”(hX+1. The size of the flush being performed. A flush of the entire address space is obviously better performed by flushing the entire TLB than doing 2^48/PAGE_SIZE individual flushes. 2. The contents of the TLB. If the TLB is empty, then there will be no collateral damage caused by doing the global flush, and all of the individual flush will have ended up being wasted work. 3. The size of the TLB. The larger the TLB, the more collateral damage we do with a full flush. So, the larger the TLB, the more attractive an individual flush looks. Data and instructions have separate TLBs, as do different page sizes. 4. The microarchitecture. The TLB has become a multi-level cache on modern CPUs, and the global flushes have become more expensive relative to single-page flushes. ”h]”hà)”}”(hhh]”(hå)”}”(hŒ²The size of the flush being performed. A flush of the entire address space is obviously better performed by flushing the entire TLB than doing 2^48/PAGE_SIZE individual flushes.”h]”hÊ)”}”(hŒ²The size of the flush being performed. A flush of the entire address space is obviously better performed by flushing the entire TLB than doing 2^48/PAGE_SIZE individual flushes.”h]”hŒ²The size of the flush being performed. A flush of the entire address space is obviously better performed by flushing the entire TLB than doing 2^48/PAGE_SIZE individual flushes.”…””}”(hj@hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h Khj<ubah}”(h]”h ]”h"]”h$]”h&]”uh1hähj9ubhå)”}”(hŒ¾The contents of the TLB. If the TLB is empty, then there will be no collateral damage caused by doing the global flush, and all of the individual flush will have ended up being wasted work.”h]”hÊ)”}”(hŒ¾The contents of the TLB. If the TLB is empty, then there will be no collateral damage caused by doing the global flush, and all of the individual flush will have ended up being wasted work.”h]”hŒ¾The contents of the TLB. If the TLB is empty, then there will be no collateral damage caused by doing the global flush, and all of the individual flush will have ended up being wasted work.”…””}”(hjXhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h KhjTubah}”(h]”h ]”h"]”h$]”h&]”uh1hähj9ubhå)”}”(hŒìThe size of the TLB. The larger the TLB, the more collateral damage we do with a full flush. So, the larger the TLB, the more attractive an individual flush looks. Data and instructions have separate TLBs, as do different page sizes.”h]”hÊ)”}”(hŒìThe size of the TLB. The larger the TLB, the more collateral damage we do with a full flush. So, the larger the TLB, the more attractive an individual flush looks. Data and instructions have separate TLBs, as do different page sizes.”h]”hŒìThe size of the TLB. The larger the TLB, the more collateral damage we do with a full flush. So, the larger the TLB, the more attractive an individual flush looks. Data and instructions have separate TLBs, as do different page sizes.”…””}”(hjphžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h Khjlubah}”(h]”h ]”h"]”h$]”h&]”uh1hähj9ubhå)”}”(hŒ¢The microarchitecture. The TLB has become a multi-level cache on modern CPUs, and the global flushes have become more expensive relative to single-page flushes. ”h]”hÊ)”}”(hŒ¡The microarchitecture. The TLB has become a multi-level cache on modern CPUs, and the global flushes have become more expensive relative to single-page flushes.”h]”hŒ¡The microarchitecture. The TLB has become a multi-level cache on modern CPUs, and the global flushes have become more expensive relative to single-page flushes.”…””}”(hjˆhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h K hj„ubah}”(h]”h ]”h"]”h$]”h&]”uh1hähj9ubeh}”(h]”h ]”h"]”h$]”h&]”jjjhjj uh1hßhj5ubah}”(h]”h ]”h"]”h$]”h&]”uh1hÙhŸh³h Khh¶hžhubhÊ)”}”(hŒ÷There is obviously no way the kernel can know all these things, especially the contents of the TLB during a given flush. The sizes of the flush will vary greatly depending on the workload as well. There is essentially no "right" point to choose.”h]”hŒûThere is obviously no way the kernel can know all these things, especially the contents of the TLB during a given flush. The sizes of the flush will vary greatly depending on the workload as well. There is essentially no “right†point to choose.”…””}”(hj¨hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h K$hh¶hžhubhÊ)”}”(hŒìYou may be doing too many individual invalidations if you see the invlpg instruction (or instructions _near_ it) show up high in profiles. If you believe that individual invalidations being called too often, you can lower the tunable::”h]”hŒëYou may be doing too many individual invalidations if you see the invlpg instruction (or instructions _near_ it) show up high in profiles. If you believe that individual invalidations being called too often, you can lower the tunable:”…””}”(hj¶hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h K)hh¶hžhubhŒ literal_block”“”)”}”(hŒ3/sys/kernel/debug/x86/tlb_single_page_flush_ceiling”h]”hŒ3/sys/kernel/debug/x86/tlb_single_page_flush_ceiling”…””}”hjÆsbah}”(h]”h ]”h"]”h$]”h&]”h±h²uh1jÄhŸh³h K.hh¶hžhubhÊ)”}”(hŒæThis will cause us to do the global flush for more cases. Lowering it to 0 will disable the use of the individual flushes. Setting it to 1 is a very conservative setting and it should never need to be 0 under normal circumstances.”h]”hŒæThis will cause us to do the global flush for more cases. Lowering it to 0 will disable the use of the individual flushes. Setting it to 1 is a very conservative setting and it should never need to be 0 under normal circumstances.”…””}”(hjÔhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h K0hh¶hžhubhÊ)”}”(hŒ¹Despite the fact that a single individual flush on x86 is guaranteed to flush a full 2MB [1]_, hugetlbfs always uses the full flushes. THP is treated exactly the same as normal memory.”h]”(hŒYDespite the fact that a single individual flush on x86 is guaranteed to flush a full 2MB ”…””}”(hjâhžhhŸNh NubhŒfootnote_reference”“”)”}”(hŒ[1]_”h]”hŒ1”…””}”(hjìhžhhŸNh Nubah}”(h]”Œid1”ah ]”h"]”h$]”h&]”Œrefid”Œid2”Œdocname”Œ arch/x86/tlb”uh1jêhjâŒresolved”KubhŒ\, hugetlbfs always uses the full flushes. THP is treated exactly the same as normal memory.”…””}”(hjâhžhhŸNh Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h K5hh¶hžhubhÊ)”}”(hŒ±You might see invlpg inside of flush_tlb_mm_range() show up in profiles, or you can use the trace_tlb_flush() tracepoints. to determine how long the flush operations are taking.”h]”hŒ±You might see invlpg inside of flush_tlb_mm_range() show up in profiles, or you can use the trace_tlb_flush() tracepoints. to determine how long the flush operations are taking.”…””}”(hj hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h K9hh¶hžhubhÊ)”}”(hŒxEssentially, you are balancing the cycles you spend doing invlpg with the cycles that you spend refilling the TLB later.”h]”hŒxEssentially, you are balancing the cycles you spend doing invlpg with the cycles that you spend refilling the TLB later.”…””}”(hjhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h K=hh¶hžhubhÊ)”}”(hŒhYou can measure how expensive TLB refills are by using performance counters and 'perf stat', like this::”h]”hŒkYou can measure how expensive TLB refills are by using performance counters and ‘perf stat’, like this:”…””}”(hj&hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h K@hh¶hžhubjÅ)”}”(hXŒperf stat -e cpu/event=0x8,umask=0x84,name=dtlb_load_misses_walk_duration/, cpu/event=0x8,umask=0x82,name=dtlb_load_misses_walk_completed/, cpu/event=0x49,umask=0x4,name=dtlb_store_misses_walk_duration/, cpu/event=0x49,umask=0x2,name=dtlb_store_misses_walk_completed/, cpu/event=0x85,umask=0x4,name=itlb_misses_walk_duration/, cpu/event=0x85,umask=0x2,name=itlb_misses_walk_completed/”h]”hXŒperf stat -e cpu/event=0x8,umask=0x84,name=dtlb_load_misses_walk_duration/, cpu/event=0x8,umask=0x82,name=dtlb_load_misses_walk_completed/, cpu/event=0x49,umask=0x4,name=dtlb_store_misses_walk_duration/, cpu/event=0x49,umask=0x2,name=dtlb_store_misses_walk_completed/, cpu/event=0x85,umask=0x4,name=itlb_misses_walk_duration/, cpu/event=0x85,umask=0x2,name=itlb_misses_walk_completed/”…””}”hj4sbah}”(h]”h ]”h"]”h$]”h&]”h±h²uh1jÄhŸh³h KChh¶hžhubhÊ)”}”(hX That works on an IvyBridge-era CPU (i5-3320M). Different CPUs may have differently-named counters, but they should at least be there in some form. You can use pmu-tools 'ocperf list' (https://github.com/andikleen/pmu-tools) to find the right counters for a given CPU.”h]”(hŒ¾That works on an IvyBridge-era CPU (i5-3320M). Different CPUs may have differently-named counters, but they should at least be there in some form. You can use pmu-tools ‘ocperf list’ (”…””}”(hjBhžhhŸNh NubhŒ reference”“”)”}”(hŒ&https://github.com/andikleen/pmu-tools”h]”hŒ&https://github.com/andikleen/pmu-tools”…””}”(hjLhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”Œrefuri”jNuh1jJhjBubhŒ-) to find the right counters for a given CPU.”…””}”(hjBhžhhŸNh Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h KKhh¶hžhubhŒfootnote”“”)”}”(hŒœA footnote in Intel's SDM "4.10.4.2 Recommended Invalidation" says: "One execution of INVLPG is sufficient even for a page with size greater than 4 KBytes."”h]”(hŒlabel”“”)”}”(hŒ1”h]”hŒ1”…””}”(hjmhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1jkhjgubhÊ)”}”(hŒœA footnote in Intel's SDM "4.10.4.2 Recommended Invalidation" says: "One execution of INVLPG is sufficient even for a page with size greater than 4 KBytes."”h]”hŒ¦A footnote in Intel’s SDM “4.10.4.2 Recommended Invalidation†says: “One execution of INVLPG is sufficient even for a page with size greater than 4 KBytes.—…””}”(hj{hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h KQhjgubeh}”(h]”jüah ]”h"]”Œ1”ah$]”h&]”jöajýjþuh1jehŸh³h KQhh¶hžhjÿKubeh}”(h]”Œthe-tlb”ah ]”h"]”Œthe tlb”ah$]”h&]”uh1h´hhhžhhŸh³h Kubeh}”(h]”h ]”h"]”h$]”h&]”Œsource”h³uh1hŒcurrent_source”NŒ current_line”NŒsettings”Œdocutils.frontend”ŒValues”“”)”}”(h¹NŒ generator”NŒ datestamp”NŒ source_link”NŒ source_url”NŒ toc_backlinks”Œentry”Œfootnote_backlinks”KŒ sectnum_xform”KŒstrip_comments”NŒstrip_elements_with_classes”NŒ strip_classes”NŒ report_level”KŒ halt_level”KŒexit_status_level”KŒdebug”NŒwarning_stream”NŒ traceback”ˆŒinput_encoding”Œ utf-8-sig”Œinput_encoding_error_handler”Œstrict”Œoutput_encoding”Œutf-8”Œoutput_encoding_error_handler”j»Œerror_encoding”Œutf-8”Œerror_encoding_error_handler”Œbackslashreplace”Œ language_code”Œen”Œrecord_dependencies”NŒconfig”NŒ id_prefix”hŒauto_id_prefix”Œid”Œ dump_settings”NŒdump_internals”NŒdump_transforms”NŒdump_pseudo_xml”NŒexpose_internals”NŒstrict_visitor”NŒ_disable_config”NŒ_source”h³Œ _destination”NŒ _config_files”]”Œ7/var/lib/git/docbuild/linux/Documentation/docutils.conf”aŒfile_insertion_enabled”ˆŒ raw_enabled”KŒline_length_limit”M'Œpep_references”NŒ pep_base_url”Œhttps://peps.python.org/”Œpep_file_url_template”Œpep-%04d”Œrfc_references”NŒ rfc_base_url”Œ&https://datatracker.ietf.org/doc/html/”Œ tab_width”KŒtrim_footnote_reference_space”‰Œsyntax_highlight”Œlong”Œ smart_quotes”ˆŒsmartquotes_locales”]”Œcharacter_level_inline_markup”‰Œdoctitle_xform”‰Œ docinfo_xform”KŒsectsubtitle_xform”‰Œ image_loading”Œlink”Œembed_stylesheet”‰Œcloak_email_addresses”ˆŒsection_self_link”‰Œenv”NubŒreporter”NŒindirect_targets”]”Œsubstitution_defs”}”Œsubstitution_names”}”Œrefnames”}”Œ1”]”jìasŒrefids”}”Œnameids”}”(j•j’jjüuŒ nametypes”}”(j•‰jˆuh}”(j’h¶jöjìjüjguŒ footnote_refs”}”jû]”jìasŒ citation_refs”}”Œ autofootnotes”]”Œautofootnote_refs”]”Œsymbol_footnotes”]”Œsymbol_footnote_refs”]”Œ footnotes”]”jgaŒ citations”]”Œautofootnote_start”KŒsymbol_footnote_start”KŒ id_counter”Œ collections”ŒCounter”“”}”jÉKs…”R”Œparse_messages”]”Œtransform_messages”]”Œ transformer”NŒ include_log”]”Œ decoration”Nhžhub.