€•¶DŒsphinx.addnodes”Œdocument”“”)”}”(Œ rawsource”Œ”Œchildren”]”(Œ translations”Œ LanguagesNode”“”)”}”(hhh]”(hŒ pending_xref”“”)”}”(hhh]”Œdocutils.nodes”ŒText”“”ŒChinese (Simplified)”…””}”Œparent”hsbaŒ attributes”}”(Œids”]”Œclasses”]”Œnames”]”Œdupnames”]”Œbackrefs”]”Œ refdomain”Œstd”Œreftype”Œdoc”Œ reftarget”Œ%/translations/zh_CN/arch/x86/entry_64”Œmodname”NŒ classname”NŒ refexplicit”ˆuŒtagname”hhh ubh)”}”(hhh]”hŒChinese (Traditional)”…””}”hh2sbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ%/translations/zh_TW/arch/x86/entry_64”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒItalian”…””}”hhFsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ%/translations/it_IT/arch/x86/entry_64”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒJapanese”…””}”hhZsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ%/translations/ja_JP/arch/x86/entry_64”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒKorean”…””}”hhnsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ%/translations/ko_KR/arch/x86/entry_64”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒSpanish”…””}”hh‚sbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ%/translations/sp_SP/arch/x86/entry_64”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubeh}”(h]”h ]”h"]”h$]”h&]”Œcurrent_language”ŒEnglish”uh1h hhŒ _document”hŒsource”NŒline”NubhŒcomment”“”)”}”(hŒ SPDX-License-Identifier: GPL-2.0”h]”hŒ SPDX-License-Identifier: GPL-2.0”…””}”hh£sbah}”(h]”h ]”h"]”h$]”h&]”Œ xml:space”Œpreserve”uh1h¡hhhžhhŸŒ?/var/lib/git/docbuild/linux/Documentation/arch/x86/entry_64.rst”h KubhŒsection”“”)”}”(hhh]”(hŒtitle”“”)”}”(hŒKernel Entries”h]”hŒKernel Entries”…””}”(hh»hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h¹hh¶hžhhŸh³h KubhŒ paragraph”“”)”}”(hŒ’This file documents some of the kernel entries in arch/x86/entry/entry_64.S. A lot of this explanation is adapted from an email from Ingo Molnar:”h]”hŒ’This file documents some of the kernel entries in arch/x86/entry/entry_64.S. A lot of this explanation is adapted from an email from Ingo Molnar:”…””}”(hhËhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h Khh¶hžhubhÊ)”}”(hŒ9https://lore.kernel.org/r/20110529191055.GC9835%40elte.hu”h]”hŒ reference”“”)”}”(hhÛh]”hŒ9https://lore.kernel.org/r/20110529191055.GC9835%40elte.hu”…””}”(hhßhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”Œrefuri”hÛuh1hÝhhÙubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h K hh¶hžhubhÊ)”}”(hXÄThe x86 architecture has quite a few different ways to jump into kernel code. Most of these entry points are registered in arch/x86/kernel/traps.c and implemented in arch/x86/entry/entry_64.S for 64-bit, arch/x86/entry/entry_32.S for 32-bit and finally arch/x86/entry/entry_64_compat.S which implements the 32-bit compatibility syscall entry points and thus provides for 32-bit processes the ability to execute syscalls when running on 64-bit kernels.”h]”hXÄThe x86 architecture has quite a few different ways to jump into kernel code. Most of these entry points are registered in arch/x86/kernel/traps.c and implemented in arch/x86/entry/entry_64.S for 64-bit, arch/x86/entry/entry_32.S for 32-bit and finally arch/x86/entry/entry_64_compat.S which implements the 32-bit compatibility syscall entry points and thus provides for 32-bit processes the ability to execute syscalls when running on 64-bit kernels.”…””}”(hhóhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h K hh¶hžhubhÊ)”}”(hŒLThe IDT vector assignments are listed in arch/x86/include/asm/irq_vectors.h.”h]”hŒLThe IDT vector assignments are listed in arch/x86/include/asm/irq_vectors.h.”…””}”(hjhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h Khh¶hžhubhÊ)”}”(hŒSome of these entries are:”h]”hŒSome of these entries are:”…””}”(hjhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h Khh¶hžhubhŒ block_quote”“”)”}”(hX- system_call: syscall instruction from 64-bit code. - entry_INT80_compat: int 0x80 from 32-bit or 64-bit code; compat syscall either way. - entry_INT80_compat, ia32_sysenter: syscall and sysenter from 32-bit code - interrupt: An array of entries. Every IDT vector that doesn't explicitly point somewhere else gets set to the corresponding value in interrupts. These point to a whole array of magically-generated functions that make their way to common_interrupt() with the interrupt number as a parameter. - APIC interrupts: Various special-purpose interrupts for things like TLB shootdown. - Architecturally-defined exceptions like divide_error. ”h]”hŒ bullet_list”“”)”}”(hhh]”(hŒ list_item”“”)”}”(hŒ3system_call: syscall instruction from 64-bit code. ”h]”hÊ)”}”(hŒ2system_call: syscall instruction from 64-bit code.”h]”hŒ2system_call: syscall instruction from 64-bit code.”…””}”(hj.hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h Khj*ubah}”(h]”h ]”h"]”h$]”h&]”uh1j(hj%ubj))”}”(hŒTentry_INT80_compat: int 0x80 from 32-bit or 64-bit code; compat syscall either way. ”h]”hÊ)”}”(hŒSentry_INT80_compat: int 0x80 from 32-bit or 64-bit code; compat syscall either way.”h]”hŒSentry_INT80_compat: int 0x80 from 32-bit or 64-bit code; compat syscall either way.”…””}”(hjFhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h KhjBubah}”(h]”h ]”h"]”h$]”h&]”uh1j(hj%ubj))”}”(hŒIentry_INT80_compat, ia32_sysenter: syscall and sysenter from 32-bit code ”h]”hÊ)”}”(hŒHentry_INT80_compat, ia32_sysenter: syscall and sysenter from 32-bit code”h]”hŒHentry_INT80_compat, ia32_sysenter: syscall and sysenter from 32-bit code”…””}”(hj^hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h KhjZubah}”(h]”h ]”h"]”h$]”h&]”uh1j(hj%ubj))”}”(hX%interrupt: An array of entries. Every IDT vector that doesn't explicitly point somewhere else gets set to the corresponding value in interrupts. These point to a whole array of magically-generated functions that make their way to common_interrupt() with the interrupt number as a parameter. ”h]”hÊ)”}”(hX$interrupt: An array of entries. Every IDT vector that doesn't explicitly point somewhere else gets set to the corresponding value in interrupts. These point to a whole array of magically-generated functions that make their way to common_interrupt() with the interrupt number as a parameter.”h]”hX&interrupt: An array of entries. Every IDT vector that doesn’t explicitly point somewhere else gets set to the corresponding value in interrupts. These point to a whole array of magically-generated functions that make their way to common_interrupt() with the interrupt number as a parameter.”…””}”(hjvhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h K!hjrubah}”(h]”h ]”h"]”h$]”h&]”uh1j(hj%ubj))”}”(hŒSAPIC interrupts: Various special-purpose interrupts for things like TLB shootdown. ”h]”hÊ)”}”(hŒRAPIC interrupts: Various special-purpose interrupts for things like TLB shootdown.”h]”hŒRAPIC interrupts: Various special-purpose interrupts for things like TLB shootdown.”…””}”(hjŽhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h K'hjŠubah}”(h]”h ]”h"]”h$]”h&]”uh1j(hj%ubj))”}”(hŒ6Architecturally-defined exceptions like divide_error. ”h]”hÊ)”}”(hŒ5Architecturally-defined exceptions like divide_error.”h]”hŒ5Architecturally-defined exceptions like divide_error.”…””}”(hj¦hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h K*hj¢ubah}”(h]”h ]”h"]”h$]”h&]”uh1j(hj%ubeh}”(h]”h ]”h"]”h$]”h&]”Œbullet”Œ-”uh1j#hŸh³h Khjubah}”(h]”h ]”h"]”h$]”h&]”uh1jhŸh³h Khh¶hžhubhÊ)”}”(hXÛThere are a few complexities here. The different x86-64 entries have different calling conventions. The syscall and sysenter instructions have their own peculiar calling conventions. Some of the IDT entries push an error code onto the stack; others don't. IDT entries using the IST alternative stack mechanism need their own magic to get the stack frames right. (You can find some documentation in the AMD APM, Volume 2, Chapter 8 and the Intel SDM, Volume 3, Chapter 6.)”h]”hXÝThere are a few complexities here. The different x86-64 entries have different calling conventions. The syscall and sysenter instructions have their own peculiar calling conventions. Some of the IDT entries push an error code onto the stack; others don’t. IDT entries using the IST alternative stack mechanism need their own magic to get the stack frames right. (You can find some documentation in the AMD APM, Volume 2, Chapter 8 and the Intel SDM, Volume 3, Chapter 6.)”…””}”(hjÈhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h K,hh¶hžhubhÊ)”}”(hXwDealing with the swapgs instruction is especially tricky. Swapgs toggles whether gs is the kernel gs or the user gs. The swapgs instruction is rather fragile: it must nest perfectly and only in single depth, it should only be used if entering from user mode to kernel mode and then when returning to user-space, and precisely so. If we mess that up even slightly, we crash.”h]”hXwDealing with the swapgs instruction is especially tricky. Swapgs toggles whether gs is the kernel gs or the user gs. The swapgs instruction is rather fragile: it must nest perfectly and only in single depth, it should only be used if entering from user mode to kernel mode and then when returning to user-space, and precisely so. If we mess that up even slightly, we crash.”…””}”(hjÖhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h K5hh¶hžhubhÊ)”}”(hŒ£So when we have a secondary entry, already in kernel mode, we *must not* use SWAPGS blindly - nor must we forget doing a SWAPGS when it's not switched/swapped yet.”h]”(hŒ>So when we have a secondary entry, already in kernel mode, we ”…””}”(hjähžhhŸNh NubhŒemphasis”“”)”}”(hŒ *must not*”h]”hŒmust not”…””}”(hjîhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1jìhjäubhŒ] use SWAPGS blindly - nor must we forget doing a SWAPGS when it’s not switched/swapped yet.”…””}”(hjähžhhŸNh Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h Kxorl %ebx,%ebx testl $3,CS+8(%rsp) je error_kernelspace SWAPGS”h]”hŒ>xorl %ebx,%ebx testl $3,CS+8(%rsp) je error_kernelspace SWAPGS”…””}”hj$sbah}”(h]”h ]”h"]”h$]”h&]”h±h²uh1j"hŸh³h KFhh¶hžhubhÊ)”}”(hŒdThe expensive (paranoid) way is to read back the MSR_GS_BASE value (which is what SWAPGS modifies)::”h]”hŒcThe expensive (paranoid) way is to read back the MSR_GS_BASE value (which is what SWAPGS modifies):”…””}”(hj2hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h KKhh¶hžhubj#)”}”(hŒ§ movl $1,%ebx movl $MSR_GS_BASE,%ecx rdmsr testl %edx,%edx js 1f /* negative -> in kernel */ SWAPGS xorl %ebx,%ebx 1: ret”h]”hŒ§ movl $1,%ebx movl $MSR_GS_BASE,%ecx rdmsr testl %edx,%edx js 1f /* negative -> in kernel */ SWAPGS xorl %ebx,%ebx 1: ret”…””}”hj@sbah}”(h]”h ]”h"]”h$]”h&]”h±h²uh1j"hŸh³h KNhh¶hžhubhÊ)”}”(hX|If we are at an interrupt or user-trap/gate-alike boundary then we can use the faster check: the stack will be a reliable indicator of whether SWAPGS was already done: if we see that we are a secondary entry interrupting kernel mode execution, then we know that the GS base has already been switched. If it says that we interrupted user-space execution then we must do the SWAPGS.”h]”hX|If we are at an interrupt or user-trap/gate-alike boundary then we can use the faster check: the stack will be a reliable indicator of whether SWAPGS was already done: if we see that we are a secondary entry interrupting kernel mode execution, then we know that the GS base has already been switched. If it says that we interrupted user-space execution then we must do the SWAPGS.”…””}”(hjNhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h KWhh¶hžhubhÊ)”}”(hŒùBut if we are in an NMI/MCE/DEBUG/whatever super-atomic entry context, which might have triggered right after a normal entry wrote CS to the stack but before we executed SWAPGS, then the only safe way to check for GS is the slower method: the RDMSR.”h]”hŒùBut if we are in an NMI/MCE/DEBUG/whatever super-atomic entry context, which might have triggered right after a normal entry wrote CS to the stack but before we executed SWAPGS, then the only safe way to check for GS is the slower method: the RDMSR.”…””}”(hj\hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h K^hh¶hžhubhÊ)”}”(hŒ³Therefore, super-atomic entries (except NMI, which is handled separately) must use idtentry with paranoid=1 to handle gsbase correctly. This triggers three main behavior changes:”h]”hŒ³Therefore, super-atomic entries (except NMI, which is handled separately) must use idtentry with paranoid=1 to handle gsbase correctly. This triggers three main behavior changes:”…””}”(hjjhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h Kchh¶hžhubj)”}”(hŒ´- Interrupt entry will use the slower gsbase check. - Interrupt entry from user mode will switch off the IST stack. - Interrupt exit to kernel mode will not attempt to reschedule. ”h]”j$)”}”(hhh]”(j))”}”(hŒ1Interrupt entry will use the slower gsbase check.”h]”hÊ)”}”(hjh]”hŒ1Interrupt entry will use the slower gsbase check.”…””}”(hjƒhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h Kghjubah}”(h]”h ]”h"]”h$]”h&]”uh1j(hj|ubj))”}”(hŒ=Interrupt entry from user mode will switch off the IST stack.”h]”hÊ)”}”(hj˜h]”hŒ=Interrupt entry from user mode will switch off the IST stack.”…””}”(hjšhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h Khhj–ubah}”(h]”h ]”h"]”h$]”h&]”uh1j(hj|ubj))”}”(hŒ>Interrupt exit to kernel mode will not attempt to reschedule. ”h]”hÊ)”}”(hŒ=Interrupt exit to kernel mode will not attempt to reschedule.”h]”hŒ=Interrupt exit to kernel mode will not attempt to reschedule.”…””}”(hj±hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h Kihj­ubah}”(h]”h ]”h"]”h$]”h&]”uh1j(hj|ubeh}”(h]”h ]”h"]”h$]”h&]”jÀjÁuh1j#hŸh³h Kghjxubah}”(h]”h ]”h"]”h$]”h&]”uh1jhŸh³h Kghh¶hžhubhÊ)”}”(hŒÞWe try to only use IST entries and the paranoid entry code for vectors that absolutely need the more expensive check for the GS base - and we generate all 'normal' entry points with the regular (faster) paranoid=0 variant.”h]”hŒâWe try to only use IST entries and the paranoid entry code for vectors that absolutely need the more expensive check for the GS base - and we generate all ‘normal’ entry points with the regular (faster) paranoid=0 variant.”…””}”(hjÑhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÉhŸh³h Kkhh¶hžhubeh}”(h]”Œkernel-entries”ah ]”h"]”Œkernel entries”ah$]”h&]”uh1h´hhhžhhŸh³h Kubeh}”(h]”h ]”h"]”h$]”h&]”Œsource”h³uh1hŒcurrent_source”NŒ current_line”NŒsettings”Œdocutils.frontend”ŒValues”“”)”}”(h¹NŒ generator”NŒ datestamp”NŒ source_link”NŒ source_url”NŒ toc_backlinks”Œentry”Œfootnote_backlinks”KŒ sectnum_xform”KŒstrip_comments”NŒstrip_elements_with_classes”NŒ strip_classes”NŒ report_level”KŒ halt_level”KŒexit_status_level”KŒdebug”NŒwarning_stream”NŒ traceback”ˆŒinput_encoding”Œ utf-8-sig”Œinput_encoding_error_handler”Œstrict”Œoutput_encoding”Œutf-8”Œoutput_encoding_error_handler”j Œerror_encoding”Œutf-8”Œerror_encoding_error_handler”Œbackslashreplace”Œ language_code”Œen”Œrecord_dependencies”NŒconfig”NŒ id_prefix”hŒauto_id_prefix”Œid”Œ dump_settings”NŒdump_internals”NŒdump_transforms”NŒdump_pseudo_xml”NŒexpose_internals”NŒstrict_visitor”NŒ_disable_config”NŒ_source”h³Œ _destination”NŒ _config_files”]”Œ7/var/lib/git/docbuild/linux/Documentation/docutils.conf”aŒfile_insertion_enabled”ˆŒ raw_enabled”KŒline_length_limit”M'Œpep_references”NŒ pep_base_url”Œhttps://peps.python.org/”Œpep_file_url_template”Œpep-%04d”Œrfc_references”NŒ rfc_base_url”Œ&https://datatracker.ietf.org/doc/html/”Œ tab_width”KŒtrim_footnote_reference_space”‰Œsyntax_highlight”Œlong”Œ smart_quotes”ˆŒsmartquotes_locales”]”Œcharacter_level_inline_markup”‰Œdoctitle_xform”‰Œ docinfo_xform”KŒsectsubtitle_xform”‰Œ image_loading”Œlink”Œembed_stylesheet”‰Œcloak_email_addresses”ˆŒsection_self_link”‰Œenv”NubŒreporter”NŒindirect_targets”]”Œsubstitution_defs”}”Œsubstitution_names”}”Œrefnames”}”Œrefids”}”Œnameids”}”jäjásŒ nametypes”}”jä‰sh}”jáh¶sŒ footnote_refs”}”Œ citation_refs”}”Œ autofootnotes”]”Œautofootnote_refs”]”Œsymbol_footnotes”]”Œsymbol_footnote_refs”]”Œ footnotes”]”Œ citations”]”Œautofootnote_start”KŒsymbol_footnote_start”KŒ id_counter”Œ collections”ŒCounter”“”}”…”R”Œparse_messages”]”Œtransform_messages”]”Œ transformer”NŒ include_log”]”Œ decoration”Nhžhub.