€•fUŒsphinx.addnodes”Œdocument”“”)”}”(Œ rawsource”Œ”Œchildren”]”(Œ translations”Œ LanguagesNode”“”)”}”(hhh]”(hŒ pending_xref”“”)”}”(hhh]”Œdocutils.nodes”ŒText”“”ŒChinese (Simplified)”…””}”Œparent”hsbaŒ attributes”}”(Œids”]”Œclasses”]”Œnames”]”Œdupnames”]”Œbackrefs”]”Œ refdomain”Œstd”Œreftype”Œdoc”Œ reftarget”Œ&/translations/zh_CN/accel/introduction”Œmodname”NŒ classname”NŒ refexplicit”ˆuŒtagname”hhh ubh)”}”(hhh]”hŒChinese (Traditional)”…””}”hh2sbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ&/translations/zh_TW/accel/introduction”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒItalian”…””}”hhFsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ&/translations/it_IT/accel/introduction”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒJapanese”…””}”hhZsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ&/translations/ja_JP/accel/introduction”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒKorean”…””}”hhnsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ&/translations/ko_KR/accel/introduction”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒPortuguese (Brazilian)”…””}”hh‚sbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ&/translations/pt_BR/accel/introduction”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒSpanish”…””}”hh–sbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ&/translations/sp_SP/accel/introduction”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubeh}”(h]”h ]”h"]”h$]”h&]”Œcurrent_language”ŒEnglish”uh1h hhŒ _document”hŒsource”NŒline”NubhŒcomment”“”)”}”(hŒ SPDX-License-Identifier: GPL-2.0”h]”hŒ SPDX-License-Identifier: GPL-2.0”…””}”hh·sbah}”(h]”h ]”h"]”h$]”h&]”Œ xml:space”Œpreserve”uh1hµhhh²hh³Œ@/var/lib/git/docbuild/linux/Documentation/accel/introduction.rst”h´KubhŒsection”“”)”}”(hhh]”(hŒtitle”“”)”}”(hŒ Introduction”h]”hŒ Introduction”…””}”(hhÏh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÍhhÊh²hh³hÇh´KubhŒ paragraph”“”)”}”(hŒœThe Linux compute accelerators subsystem is designed to expose compute accelerators in a common way to user-space and provide a common set of functionality.”h]”hŒœThe Linux compute accelerators subsystem is designed to expose compute accelerators in a common way to user-space and provide a common set of functionality.”…””}”(hhßh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÝh³hÇh´KhhÊh²hubhÞ)”}”(hXThese devices can be either stand-alone ASICs or IP blocks inside an SoC/GPU. Although these devices are typically designed to accelerate Machine-Learning (ML) and/or Deep-Learning (DL) computations, the accel layer is not limited to handling these types of accelerators.”h]”hXThese devices can be either stand-alone ASICs or IP blocks inside an SoC/GPU. Although these devices are typically designed to accelerate Machine-Learning (ML) and/or Deep-Learning (DL) computations, the accel layer is not limited to handling these types of accelerators.”…””}”(hhíh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÝh³hÇh´K hhÊh²hubhÞ)”}”(hŒPTypically, a compute accelerator will belong to one of the following categories:”h]”hŒPTypically, a compute accelerator will belong to one of the following categories:”…””}”(hhûh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÝh³hÇh´KhhÊh²hubhŒ bullet_list”“”)”}”(hhh]”(hŒ list_item”“”)”}”(hŒ×Edge AI - doing inference at an edge device. It can be an embedded ASIC/FPGA, or an IP inside a SoC (e.g. laptop web camera). These devices are typically configured using registers and can work with or without DMA. ”h]”hÞ)”}”(hŒÖEdge AI - doing inference at an edge device. It can be an embedded ASIC/FPGA, or an IP inside a SoC (e.g. laptop web camera). These devices are typically configured using registers and can work with or without DMA.”h]”hŒÖEdge AI - doing inference at an edge device. It can be an embedded ASIC/FPGA, or an IP inside a SoC (e.g. laptop web camera). These devices are typically configured using registers and can work with or without DMA.”…””}”(hjh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÝh³hÇh´Khjubah}”(h]”h ]”h"]”h$]”h&]”uh1jhj h²hh³hÇh´Nubj)”}”(hX÷Inference data-center - single/multi user devices in a large server. This type of device can be stand-alone or an IP inside a SoC or a GPU. It will have on-board DRAM (to hold the DL topology), DMA engines and command submission queues (either kernel or user-space queues). It might also have an MMU to manage multiple users and might also enable virtualization (SR-IOV) to support multiple VMs on the same device. In addition, these devices will usually have some tools, such as profiler and debugger. ”h]”hÞ)”}”(hXöInference data-center - single/multi user devices in a large server. This type of device can be stand-alone or an IP inside a SoC or a GPU. It will have on-board DRAM (to hold the DL topology), DMA engines and command submission queues (either kernel or user-space queues). It might also have an MMU to manage multiple users and might also enable virtualization (SR-IOV) to support multiple VMs on the same device. In addition, these devices will usually have some tools, such as profiler and debugger.”h]”hXöInference data-center - single/multi user devices in a large server. This type of device can be stand-alone or an IP inside a SoC or a GPU. It will have on-board DRAM (to hold the DL topology), DMA engines and command submission queues (either kernel or user-space queues). It might also have an MMU to manage multiple users and might also enable virtualization (SR-IOV) to support multiple VMs on the same device. In addition, these devices will usually have some tools, such as profiler and debugger.”…””}”(hj,h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÝh³hÇh´Khj(ubah}”(h]”h ]”h"]”h$]”h&]”uh1jhj h²hh³hÇh´Nubj)”}”(hXTraining data-center - Similar to Inference data-center cards, but typically have more computational power and memory b/w (e.g. HBM) and will likely have a method of scaling-up/out, i.e. connecting to other training cards inside the server or in other servers, respectively. ”h]”hÞ)”}”(hXTraining data-center - Similar to Inference data-center cards, but typically have more computational power and memory b/w (e.g. HBM) and will likely have a method of scaling-up/out, i.e. connecting to other training cards inside the server or in other servers, respectively.”h]”hXTraining data-center - Similar to Inference data-center cards, but typically have more computational power and memory b/w (e.g. HBM) and will likely have a method of scaling-up/out, i.e. connecting to other training cards inside the server or in other servers, respectively.”…””}”(hjDh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÝh³hÇh´K hj@ubah}”(h]”h ]”h"]”h$]”h&]”uh1jhj h²hh³hÇh´Nubeh}”(h]”h ]”h"]”h$]”h&]”Œbullet”Œ-”uh1j h³hÇh´KhhÊh²hubhÞ)”}”(hXQAll these devices typically have different runtime user-space software stacks, that are tailored-made to their h/w. In addition, they will also probably include a compiler to generate programs to their custom-made computational engines. Typically, the common layer in user-space will be the DL frameworks, such as PyTorch and TensorFlow.”h]”hXQAll these devices typically have different runtime user-space software stacks, that are tailored-made to their h/w. In addition, they will also probably include a compiler to generate programs to their custom-made computational engines. Typically, the common layer in user-space will be the DL frameworks, such as PyTorch and TensorFlow.”…””}”(hj`h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÝh³hÇh´K%hhÊh²hubhÉ)”}”(hhh]”(hÎ)”}”(hŒSharing code with DRM”h]”hŒSharing code with DRM”…””}”(hjqh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÍhjnh²hh³hÇh´K,ubhÞ)”}”(hX!Because this type of devices can be an IP inside GPUs or have similar characteristics as those of GPUs, the accel subsystem will use the DRM subsystem's code and functionality. i.e. the accel core code will be part of the DRM subsystem and an accel device will be a new type of DRM device.”h]”hX#Because this type of devices can be an IP inside GPUs or have similar characteristics as those of GPUs, the accel subsystem will use the DRM subsystem’s code and functionality. i.e. the accel core code will be part of the DRM subsystem and an accel device will be a new type of DRM device.”…””}”(hjh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÝh³hÇh´K.hjnh²hubhÞ)”}”(hŒýThis will allow us to leverage the extensive DRM code-base and collaborate with DRM developers that have experience with this type of devices. In addition, new features that will be added for the accelerator drivers can be of use to GPU drivers as well.”h]”hŒýThis will allow us to leverage the extensive DRM code-base and collaborate with DRM developers that have experience with this type of devices. In addition, new features that will be added for the accelerator drivers can be of use to GPU drivers as well.”…””}”(hjh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÝh³hÇh´K4hjnh²hubeh}”(h]”Œsharing-code-with-drm”ah ]”h"]”Œsharing code with drm”ah$]”h&]”uh1hÈhhÊh²hh³hÇh´K,ubhÉ)”}”(hhh]”(hÎ)”}”(hŒDifferentiation from GPUs”h]”hŒDifferentiation from GPUs”…””}”(hj¦h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÍhj£h²hh³hÇh´K:ubhÞ)”}”(hŒçBecause we want to prevent the extensive user-space graphic software stack from trying to use an accelerator as a GPU, the compute accelerators will be differentiated from GPUs by using a new major number and new device char files.”h]”hŒçBecause we want to prevent the extensive user-space graphic software stack from trying to use an accelerator as a GPU, the compute accelerators will be differentiated from GPUs by using a new major number and new device char files.”…””}”(hj´h²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÝh³hÇh´K`_ - Oded Gabbay (2022)”h]”hÞ)”}”(hjØh]”(hŒ reference”“”)”}”(hŒ¦`Initial discussion on the New subsystem for acceleration devices `_”h]”hŒ@Initial discussion on the New subsystem for acceleration devices”…””}”(hjßh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”Œname”Œ@Initial discussion on the New subsystem for acceleration devices”Œrefuri”Œ`https://lore.kernel.org/lkml/CAFCwf11=9qpNAepL7NL+YAV_QO=Wv6pnWPhKHKAepK3fNn+2Dg@mail.gmail.com/”uh1jÝhjÚubhŒtarget”“”)”}”(hŒc ”h]”h}”(h]”Œ@initial-discussion-on-the-new-subsystem-for-acceleration-devices”ah ]”h"]”Œ@initial discussion on the new subsystem for acceleration devices”ah$]”h&]”Œrefuri”jðuh1jñŒ referenced”KhjÚubhŒ - Oded Gabbay (2022)”…””}”(hjÚh²hh³Nh´Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1hÝh³hÇh´KhhjÖubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjÓh²hh³hÇh´Nubj)”}”(hŒ…`patch-set to add the new subsystem `_ - Oded Gabbay (2022) ”h]”hÞ)”}”(hŒ„`patch-set to add the new subsystem `_ - Oded Gabbay (2022)”h]”(jÞ)”}”(hŒo`patch-set to add the new subsystem `_”h]”hŒ"patch-set to add the new subsystem”…””}”(hjh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”Œname”Œ"patch-set to add the new subsystem”jïŒGhttps://lore.kernel.org/lkml/20221022214622.18042-1-ogabbay@kernel.org/”uh1jÝhjubjò)”}”(hŒJ ”h]”h}”(h]”Œ"patch-set-to-add-the-new-subsystem”ah ]”h"]”Œ"patch-set to add the new subsystem”ah$]”h&]”Œrefuri”j)uh1jñjKhjubhŒ - Oded Gabbay (2022)”…””}”(hjh²hh³Nh´Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1hÝh³hÇh´Kihjubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjÓh²hh³hÇh´Nubeh}”(h]”h ]”h"]”h$]”h&]”j^Œ*”uh1j h³hÇh´KhhjÂh²hubeh}”(h]”Œ email-threads”ah ]”h"]”Œ email threads”ah$]”h&]”uh1hÈhj±h²hh³hÇh´KfubhÉ)”}”(hhh]”(hÎ)”}”(hŒConference talks”h]”hŒConference talks”…””}”(hjYh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÍhjVh²hh³hÇh´Klubj )”}”(hhh]”j)”}”(hŒ`LPC 2022 Accelerators BOF outcomes summary `_ - Dave Airlie (2022)”h]”hÞ)”}”(hjlh]”(jÞ)”}”(hŒ{`LPC 2022 Accelerators BOF outcomes summary `_”h]”hŒ*LPC 2022 Accelerators BOF outcomes summary”…””}”(hjqh²hh³Nh´Nubah}”(h]”h ]”h"]”h$]”h&]”Œname”Œ*LPC 2022 Accelerators BOF outcomes summary”jïŒKhttps://airlied.blogspot.com/2022/09/accelerators-bof-outcomes-summary.html”uh1jÝhjnubjò)”}”(hŒN ”h]”h}”(h]”Œ*lpc-2022-accelerators-bof-outcomes-summary”ah ]”h"]”Œ*lpc 2022 accelerators bof outcomes summary”ah$]”h&]”Œrefuri”juh1jñjKhjnubhŒ - Dave Airlie (2022)”…””}”(hjnh²hh³Nh´Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1hÝh³hÇh´Knhjjubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjgh²hh³hÇh´Nubah}”(h]”h ]”h"]”h$]”h&]”j^jMuh1j h³hÇh´KnhjVh²hubeh}”(h]”Œconference-talks”ah ]”h"]”Œconference talks”ah$]”h&]”uh1hÈhj±h²hh³hÇh´Klubeh}”(h]”Œexternal-references”ah ]”h"]”Œexternal references”ah$]”h&]”uh1hÈhhÊh²hh³hÇh´Kcubeh}”(h]”Œ introduction”ah ]”h"]”Œ introduction”ah$]”h&]”uh1hÈhhh²hh³hÇh´Kubeh}”(h]”h ]”h"]”h$]”h&]”Œsource”hÇuh1hŒcurrent_source”NŒ current_line”NŒsettings”Œdocutils.frontend”ŒValues”“”)”}”(hÍNŒ generator”NŒ datestamp”NŒ source_link”NŒ source_url”NŒ toc_backlinks”Œentry”Œfootnote_backlinks”KŒ sectnum_xform”KŒstrip_comments”NŒstrip_elements_with_classes”NŒ strip_classes”NŒ report_level”KŒ halt_level”KŒexit_status_level”KŒdebug”NŒwarning_stream”NŒ traceback”ˆŒinput_encoding”Œ utf-8-sig”Œinput_encoding_error_handler”Œstrict”Œoutput_encoding”Œutf-8”Œoutput_encoding_error_handler”jàŒerror_encoding”Œutf-8”Œerror_encoding_error_handler”Œbackslashreplace”Œ language_code”Œen”Œrecord_dependencies”NŒconfig”NŒ id_prefix”hŒauto_id_prefix”Œid”Œ dump_settings”NŒdump_internals”NŒdump_transforms”NŒdump_pseudo_xml”NŒexpose_internals”NŒstrict_visitor”NŒ_disable_config”NŒ_source”hÇŒ _destination”NŒ _config_files”]”Œ7/var/lib/git/docbuild/linux/Documentation/docutils.conf”aŒfile_insertion_enabled”ˆŒ raw_enabled”KŒline_length_limit”M'Œpep_references”NŒ pep_base_url”Œhttps://peps.python.org/”Œpep_file_url_template”Œpep-%04d”Œrfc_references”NŒ rfc_base_url”Œ&https://datatracker.ietf.org/doc/html/”Œ tab_width”KŒtrim_footnote_reference_space”‰Œsyntax_highlight”Œlong”Œ smart_quotes”ˆŒsmartquotes_locales”]”Œcharacter_level_inline_markup”‰Œdoctitle_xform”‰Œ docinfo_xform”KŒsectsubtitle_xform”‰Œ image_loading”Œlink”Œembed_stylesheet”‰Œcloak_email_addresses”ˆŒsection_self_link”‰Œenv”NubŒreporter”NŒindirect_targets”]”Œsubstitution_defs”}”Œsubstitution_names”}”Œrefnames”}”Œrefids”}”Œnameids”}”(jºj·j jj2j/j®j«j²j¯jSjPjüjùj3j0jªj§j‹jˆuŒ nametypes”}”(jº‰j ‰j2‰j®‰j²‰jS‰jüˆj3ˆjª‰j‹ˆuh}”(j·hÊjjnj/j£j«j5j¯j±jPjÂjùjój0j*j§jVjˆj‚uŒ footnote_refs”}”Œ citation_refs”}”Œ autofootnotes”]”Œautofootnote_refs”]”Œsymbol_footnotes”]”Œsymbol_footnote_refs”]”Œ footnotes”]”Œ citations”]”Œautofootnote_start”KŒsymbol_footnote_start”KŒ id_counter”Œ collections”ŒCounter”“”}”…”R”Œparse_messages”]”Œtransform_messages”]”Œ transformer”NŒ include_log”]”Œ decoration”Nh²hub.