€•z<Œsphinx.addnodes”Œdocument”“”)”}”(Œ rawsource”Œ”Œchildren”]”(Œ translations”Œ LanguagesNode”“”)”}”(hhh]”(hŒ pending_xref”“”)”}”(hhh]”Œdocutils.nodes”ŒText”“”ŒChinese (Simplified)”…””}”Œparent”hsbaŒ attributes”}”(Œids”]”Œclasses”]”Œnames”]”Œdupnames”]”Œbackrefs”]”Œ refdomain”Œstd”Œreftype”Œdoc”Œ reftarget”Œ0/translations/zh_CN/admin-guide/perf/alibaba_pmu”Œmodname”NŒ classname”NŒ refexplicit”ˆuŒtagname”hhh ubh)”}”(hhh]”hŒChinese (Traditional)”…””}”hh2sbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ0/translations/zh_TW/admin-guide/perf/alibaba_pmu”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒItalian”…””}”hhFsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ0/translations/it_IT/admin-guide/perf/alibaba_pmu”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒJapanese”…””}”hhZsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ0/translations/ja_JP/admin-guide/perf/alibaba_pmu”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒKorean”…””}”hhnsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ0/translations/ko_KR/admin-guide/perf/alibaba_pmu”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒSpanish”…””}”hh‚sbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ0/translations/sp_SP/admin-guide/perf/alibaba_pmu”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubeh}”(h]”h ]”h"]”h$]”h&]”Œcurrent_language”ŒEnglish”uh1h hhŒ _document”hŒsource”NŒline”NubhŒsection”“”)”}”(hhh]”(hŒtitle”“”)”}”(hŒ=Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU)”h]”hŒ?Alibaba’s T-Head SoC Uncore Performance Monitoring Unit (PMU)”…””}”(hh¨hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h¦hh£hžhhŸŒJ/var/lib/git/docbuild/linux/Documentation/admin-guide/perf/alibaba_pmu.rst”h KubhŒ paragraph”“”)”}”(hŒ³The Yitian 710, custom-built by Alibaba Group's chip development business, T-Head, implements uncore PMU for performance and functional debugging to facilitate system maintenance.”h]”hŒµThe Yitian 710, custom-built by Alibaba Group’s chip development business, T-Head, implements uncore PMU for performance and functional debugging to facilitate system maintenance.”…””}”(hh¹hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khh£hžhubh¢)”}”(hhh]”(h§)”}”(hŒ(DDR Sub-System Driveway (DRW) PMU Driver”h]”hŒ(DDR Sub-System Driveway (DRW) PMU Driver”…””}”(hhÊhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h¦hhÇhžhhŸh¶h K ubh¸)”}”(hX<Yitian 710 employs eight DDR5/4 channels, four on each die. Each DDR5 channel is independent of others to service system memory requests. And one DDR5 channel is split into two independent sub-channels. The DDR Sub-System Driveway implements separate PMUs for each sub-channel to monitor various performance metrics.”h]”hX<Yitian 710 employs eight DDR5/4 channels, four on each die. Each DDR5 channel is independent of others to service system memory requests. And one DDR5 channel is split into two independent sub-channels. The DDR Sub-System Driveway implements separate PMUs for each sub-channel to monitor various performance metrics.”…””}”(hhØhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K hhÇhžhubh¸)”}”(hXThe Driveway PMU devices are named as ali_drw_ with perf. For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two sub-channels of the same channel in die 0. And the PMU device of die 1 is prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.”h]”hXThe Driveway PMU devices are named as ali_drw_ with perf. For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two sub-channels of the same channel in die 0. And the PMU device of die 1 is prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000.”…””}”(hhæhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h KhhÇhžhubh¸)”}”(hŒTEach sub-channel has 36 PMU counters in total, which is classified into four groups:”h]”hŒTEach sub-channel has 36 PMU counters in total, which is classified into four groups:”…””}”(hhôhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h KhhÇhžhubhŒ bullet_list”“”)”}”(hhh]”(hŒ list_item”“”)”}”(hŒ¤Group 0: PMU Cycle Counter. This group has one pair of counters pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count based on DDRC core clock. ”h]”h¸)”}”(hŒ£Group 0: PMU Cycle Counter. This group has one pair of counters pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count based on DDRC core clock.”h]”hŒ£Group 0: PMU Cycle Counter. This group has one pair of counters pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count based on DDRC core clock.”…””}”(hj hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khj ubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjhžhhŸh¶h Nubj)”}”(hŒïGroup 1: PMU Bandwidth Counters. This group has 8 counters that are used to count the total access number of either the eight bank groups in a selected rank, or four ranks separately in the first 4 counters. The base transfer unit is 64B. ”h]”h¸)”}”(hŒîGroup 1: PMU Bandwidth Counters. This group has 8 counters that are used to count the total access number of either the eight bank groups in a selected rank, or four ranks separately in the first 4 counters. The base transfer unit is 64B.”h]”hŒîGroup 1: PMU Bandwidth Counters. This group has 8 counters that are used to count the total access number of either the eight bank groups in a selected rank, or four ranks separately in the first 4 counters. The base transfer unit is 64B.”…””}”(hj%hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khj!ubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjhžhhŸh¶h Nubj)”}”(hŒŠGroup 2: PMU Retry Counters. This group has 10 counters, that intend to count the total retry number of each type of uncorrectable error. ”h]”h¸)”}”(hŒ‰Group 2: PMU Retry Counters. This group has 10 counters, that intend to count the total retry number of each type of uncorrectable error.”h]”hŒ‰Group 2: PMU Retry Counters. This group has 10 counters, that intend to count the total retry number of each type of uncorrectable error.”…””}”(hj=hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K#hj9ubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjhžhhŸh¶h Nubj)”}”(hŒdGroup 3: PMU Common Counters. This group has 16 counters, that are used to count the common events. ”h]”h¸)”}”(hŒcGroup 3: PMU Common Counters. This group has 16 counters, that are used to count the common events.”h]”hŒcGroup 3: PMU Common Counters. This group has 16 counters, that are used to count the common events.”…””}”(hjUhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K&hjQubah}”(h]”h ]”h"]”h$]”h&]”uh1jhjhžhhŸh¶h Nubeh}”(h]”h ]”h"]”h$]”h&]”Œbullet”Œ-”uh1jhŸh¶h KhhÇhžhubh¸)”}”(hŒKFor now, the Driveway PMU driver only uses counters in group 0 and group 3.”h]”hŒKFor now, the Driveway PMU driver only uses counters in group 0 and group 3.”…””}”(hjqhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K)hhÇhžhubh¸)”}”(hXThe DDR Controller (DDRCTL) and DDR PHY combine to create a complete solution for connecting an SoC application bus to DDR memory devices. The DDRCTL receives transactions Host Interface (HIF) which is custom-defined by Synopsys. These transactions are queued internally and scheduled for access while satisfying the SDRAM protocol timing requirements, transaction priorities, and dependencies between the transactions. The DDRCTL in turn issues commands on the DDR PHY Interface (DFI) to the PHY module, which launches and captures data to and from the SDRAM. The driveway PMUs have hardware logic to gather statistics and performance logging signals on HIF, DFI, etc.”h]”hXThe DDR Controller (DDRCTL) and DDR PHY combine to create a complete solution for connecting an SoC application bus to DDR memory devices. The DDRCTL receives transactions Host Interface (HIF) which is custom-defined by Synopsys. These transactions are queued internally and scheduled for access while satisfying the SDRAM protocol timing requirements, transaction priorities, and dependencies between the transactions. The DDRCTL in turn issues commands on the DDR PHY Interface (DFI) to the PHY module, which launches and captures data to and from the SDRAM. The driveway PMUs have hardware logic to gather statistics and performance logging signals on HIF, DFI, etc.”…””}”(hjhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K+hhÇhžhubh¸)”}”(hŒ¬By counting the READ, WRITE and RMW commands sent to the DDRC through the HIF interface, we could calculate the bandwidth. Example usage of counting memory data bandwidth::”h]”hŒ«By counting the READ, WRITE and RMW commands sent to the DDRC through the HIF interface, we could calculate the bandwidth. Example usage of counting memory data bandwidth:”…””}”(hjhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K5hhÇhžhubhŒ literal_block”“”)”}”(hXµperf stat \ -e ali_drw_21000/hif_wr/ \ -e ali_drw_21000/hif_rd/ \ -e ali_drw_21000/hif_rmw/ \ -e ali_drw_21000/cycle/ \ -e ali_drw_21080/hif_wr/ \ -e ali_drw_21080/hif_rd/ \ -e ali_drw_21080/hif_rmw/ \ -e ali_drw_21080/cycle/ \ -e ali_drw_23000/hif_wr/ \ -e ali_drw_23000/hif_rd/ \ -e ali_drw_23000/hif_rmw/ \ -e ali_drw_23000/cycle/ \ -e ali_drw_23080/hif_wr/ \ -e ali_drw_23080/hif_rd/ \ -e ali_drw_23080/hif_rmw/ \ -e ali_drw_23080/cycle/ \ -e ali_drw_25000/hif_wr/ \ -e ali_drw_25000/hif_rd/ \ -e ali_drw_25000/hif_rmw/ \ -e ali_drw_25000/cycle/ \ -e ali_drw_25080/hif_wr/ \ -e ali_drw_25080/hif_rd/ \ -e ali_drw_25080/hif_rmw/ \ -e ali_drw_25080/cycle/ \ -e ali_drw_27000/hif_wr/ \ -e ali_drw_27000/hif_rd/ \ -e ali_drw_27000/hif_rmw/ \ -e ali_drw_27000/cycle/ \ -e ali_drw_27080/hif_wr/ \ -e ali_drw_27080/hif_rd/ \ -e ali_drw_27080/hif_rmw/ \ -e ali_drw_27080/cycle/ -- sleep 10”h]”hXµperf stat \ -e ali_drw_21000/hif_wr/ \ -e ali_drw_21000/hif_rd/ \ -e ali_drw_21000/hif_rmw/ \ -e ali_drw_21000/cycle/ \ -e ali_drw_21080/hif_wr/ \ -e ali_drw_21080/hif_rd/ \ -e ali_drw_21080/hif_rmw/ \ -e ali_drw_21080/cycle/ \ -e ali_drw_23000/hif_wr/ \ -e ali_drw_23000/hif_rd/ \ -e ali_drw_23000/hif_rmw/ \ -e ali_drw_23000/cycle/ \ -e ali_drw_23080/hif_wr/ \ -e ali_drw_23080/hif_rd/ \ -e ali_drw_23080/hif_rmw/ \ -e ali_drw_23080/cycle/ \ -e ali_drw_25000/hif_wr/ \ -e ali_drw_25000/hif_rd/ \ -e ali_drw_25000/hif_rmw/ \ -e ali_drw_25000/cycle/ \ -e ali_drw_25080/hif_wr/ \ -e ali_drw_25080/hif_rd/ \ -e ali_drw_25080/hif_rmw/ \ -e ali_drw_25080/cycle/ \ -e ali_drw_27000/hif_wr/ \ -e ali_drw_27000/hif_rd/ \ -e ali_drw_27000/hif_rmw/ \ -e ali_drw_27000/cycle/ \ -e ali_drw_27080/hif_wr/ \ -e ali_drw_27080/hif_rd/ \ -e ali_drw_27080/hif_rmw/ \ -e ali_drw_27080/cycle/ -- sleep 10”…””}”hjsbah}”(h]”h ]”h"]”h$]”h&]”Œ xml:space”Œpreserve”uh1j›hŸh¶h K9hhÇhžhubh¸)”}”(hŒEExample usage of counting all memory read/write bandwidth by metric::”h]”hŒDExample usage of counting all memory read/write bandwidth by metric:”…””}”(hj­hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K[hhÇhžhubjœ)”}”(hŒ`perf stat -M ddr_read_bandwidth.all -- sleep 10 perf stat -M ddr_write_bandwidth.all -- sleep 10”h]”hŒ`perf stat -M ddr_read_bandwidth.all -- sleep 10 perf stat -M ddr_write_bandwidth.all -- sleep 10”…””}”hj»sbah}”(h]”h ]”h"]”h$]”h&]”j«j¬uh1j›hŸh¶h K]hhÇhžhubh¸)”}”(hŒ8The average DRAM bandwidth can be calculated as follows:”h]”hŒ8The average DRAM bandwidth can be calculated as follows:”…””}”(hjÉhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K`hhÇhžhubj)”}”(hhh]”(j)”}”(hŒCRead Bandwidth = perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle”h]”h¸)”}”(hjÜh]”hŒCRead Bandwidth = perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle”…””}”(hjÞhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h KbhjÚubah}”(h]”h ]”h"]”h$]”h&]”uh1jhj×hžhhŸh¶h Nubj)”}”(hŒUWrite Bandwidth = (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle ”h]”h¸)”}”(hŒTWrite Bandwidth = (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle”h]”hŒTWrite Bandwidth = (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle”…””}”(hjõhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Kchjñubah}”(h]”h ]”h"]”h$]”h&]”uh1jhj×hžhhŸh¶h Nubeh}”(h]”h ]”h"]”h$]”h&]”jojpuh1jhŸh¶h KbhhÇhžhubh¸)”}”(hŒHere, DDRC_WIDTH = 64 bytes.”h]”hŒHere, DDRC_WIDTH = 64 bytes.”…””}”(hjhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h KehhÇhžhubh¸)”}”(hŒ’The current driver does not support sampling. So "perf record" is unsupported. Also attach to a task is unsupported as the events are all uncore.”h]”hŒ–The current driver does not support sampling. So “perf record†is unsupported. Also attach to a task is unsupported as the events are all uncore.”…””}”(hjhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h KghhÇhžhubeh}”(h]”Œ&ddr-sub-system-driveway-drw-pmu-driver”ah ]”h"]”Œ(ddr sub-system driveway (drw) pmu driver”ah$]”h&]”uh1h¡hh£hžhhŸh¶h K ubeh}”(h]”Œ;alibaba-s-t-head-soc-uncore-performance-monitoring-unit-pmu”ah ]”h"]”Œ=alibaba's t-head soc uncore performance monitoring unit (pmu)”ah$]”h&]”uh1h¡hhhžhhŸh¶h Kubeh}”(h]”h ]”h"]”h$]”h&]”Œsource”h¶uh1hŒcurrent_source”NŒ current_line”NŒsettings”Œdocutils.frontend”ŒValues”“”)”}”(h¦NŒ generator”NŒ datestamp”NŒ source_link”NŒ source_url”NŒ toc_backlinks”Œentry”Œfootnote_backlinks”KŒ sectnum_xform”KŒstrip_comments”NŒstrip_elements_with_classes”NŒ strip_classes”NŒ report_level”KŒ halt_level”KŒexit_status_level”KŒdebug”NŒwarning_stream”NŒ traceback”ˆŒinput_encoding”Œ utf-8-sig”Œinput_encoding_error_handler”Œstrict”Œoutput_encoding”Œutf-8”Œoutput_encoding_error_handler”j^Œerror_encoding”Œutf-8”Œerror_encoding_error_handler”Œbackslashreplace”Œ language_code”Œen”Œrecord_dependencies”NŒconfig”NŒ id_prefix”hŒauto_id_prefix”Œid”Œ dump_settings”NŒdump_internals”NŒdump_transforms”NŒdump_pseudo_xml”NŒexpose_internals”NŒstrict_visitor”NŒ_disable_config”NŒ_source”h¶Œ _destination”NŒ _config_files”]”Œ7/var/lib/git/docbuild/linux/Documentation/docutils.conf”aŒfile_insertion_enabled”ˆŒ raw_enabled”KŒline_length_limit”M'Œpep_references”NŒ pep_base_url”Œhttps://peps.python.org/”Œpep_file_url_template”Œpep-%04d”Œrfc_references”NŒ rfc_base_url”Œ&https://datatracker.ietf.org/doc/html/”Œ tab_width”KŒtrim_footnote_reference_space”‰Œsyntax_highlight”Œlong”Œ smart_quotes”ˆŒsmartquotes_locales”]”Œcharacter_level_inline_markup”‰Œdoctitle_xform”‰Œ docinfo_xform”KŒsectsubtitle_xform”‰Œ image_loading”Œlink”Œembed_stylesheet”‰Œcloak_email_addresses”ˆŒsection_self_link”‰Œenv”NubŒreporter”NŒindirect_targets”]”Œsubstitution_defs”}”Œsubstitution_names”}”Œrefnames”}”Œrefids”}”Œnameids”}”(j8j5j0j-uŒ nametypes”}”(j8‰j0‰uh}”(j5h£j-hÇuŒ footnote_refs”}”Œ citation_refs”}”Œ autofootnotes”]”Œautofootnote_refs”]”Œsymbol_footnotes”]”Œsymbol_footnote_refs”]”Œ footnotes”]”Œ citations”]”Œautofootnote_start”KŒsymbol_footnote_start”KŒ id_counter”Œ collections”ŒCounter”“”}”…”R”Œparse_messages”]”Œtransform_messages”]”Œ transformer”NŒ include_log”]”Œ decoration”Nhžhub.