€•Ü£Œsphinx.addnodes”Œdocument”“”)”}”(Œ rawsource”Œ”Œchildren”]”(Œ translations”Œ LanguagesNode”“”)”}”(hhh]”(hŒ pending_xref”“”)”}”(hhh]”Œdocutils.nodes”ŒText”“”ŒChinese (Simplified)”…””}”Œparent”hsbaŒ attributes”}”(Œids”]”Œclasses”]”Œnames”]”Œdupnames”]”Œbackrefs”]”Œ refdomain”Œstd”Œreftype”Œdoc”Œ reftarget”Œ7/translations/zh_CN/arch/powerpc/eeh-pci-error-recovery”Œmodname”NŒ classname”NŒ refexplicit”ˆuŒtagname”hhh ubh)”}”(hhh]”hŒChinese (Traditional)”…””}”hh2sbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ7/translations/zh_TW/arch/powerpc/eeh-pci-error-recovery”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒItalian”…””}”hhFsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ7/translations/it_IT/arch/powerpc/eeh-pci-error-recovery”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒJapanese”…””}”hhZsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ7/translations/ja_JP/arch/powerpc/eeh-pci-error-recovery”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒKorean”…””}”hhnsbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ7/translations/ko_KR/arch/powerpc/eeh-pci-error-recovery”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubh)”}”(hhh]”hŒSpanish”…””}”hh‚sbah}”(h]”h ]”h"]”h$]”h&]”Œ refdomain”h)Œreftype”h+Œ reftarget”Œ7/translations/sp_SP/arch/powerpc/eeh-pci-error-recovery”Œmodname”NŒ classname”NŒ refexplicit”ˆuh1hhh ubeh}”(h]”h ]”h"]”h$]”h&]”Œcurrent_language”ŒEnglish”uh1h hhŒ _document”hŒsource”NŒline”NubhŒsection”“”)”}”(hhh]”(hŒtitle”“”)”}”(hŒPCI Bus EEH Error Recovery”h]”hŒPCI Bus EEH Error Recovery”…””}”(hh¨hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h¦hh£hžhhŸŒQ/var/lib/git/docbuild/linux/Documentation/arch/powerpc/eeh-pci-error-recovery.rst”h KubhŒ paragraph”“”)”}”(hŒ$Linas Vepstas ”h]”(hŒLinas Vepstas <”…””}”(hh¹hžhhŸNh NubhŒ reference”“”)”}”(hŒlinas@austin.ibm.com”h]”hŒlinas@austin.ibm.com”…””}”(hhÃhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”Œrefuri”Œmailto:linas@austin.ibm.com”uh1hÁhh¹ubhŒ>”…””}”(hh¹hžhhŸNh Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khh£hžhubh¸)”}”(hŒ12 January 2005”h]”hŒ12 January 2005”…””}”(hhÝhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khh£hžhubh¢)”}”(hhh]”(h§)”}”(hŒ Overview:”h]”hŒ Overview:”…””}”(hhîhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h¦hhëhžhhŸh¶h K ubh¸)”}”(hX™The IBM POWER-based pSeries and iSeries computers include PCI bus controller chips that have extended capabilities for detecting and reporting a large variety of PCI bus error conditions. These features go under the name of "EEH", for "Enhanced Error Handling". The EEH hardware features allow PCI bus errors to be cleared and a PCI card to be "rebooted", without also having to reboot the operating system.”h]”hX¥The IBM POWER-based pSeries and iSeries computers include PCI bus controller chips that have extended capabilities for detecting and reporting a large variety of PCI bus error conditions. These features go under the name of “EEHâ€, for “Enhanced Error Handlingâ€. The EEH hardware features allow PCI bus errors to be cleared and a PCI card to be “rebootedâ€, without also having to reboot the operating system.”…””}”(hhühžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K hhëhžhubh¸)”}”(hXEThis is in contrast to traditional PCI error handling, where the PCI chip is wired directly to the CPU, and an error would cause a CPU machine-check/check-stop condition, halting the CPU entirely. Another "traditional" technique is to ignore such errors, which can lead to data corruption, both of user data or of kernel data, hung/unresponsive adapters, or system crashes/lockups. Thus, the idea behind EEH is that the operating system can become more reliable and robust by protecting it from PCI errors, and giving the OS the ability to "reboot"/recover individual PCI devices.”h]”hXMThis is in contrast to traditional PCI error handling, where the PCI chip is wired directly to the CPU, and an error would cause a CPU machine-check/check-stop condition, halting the CPU entirely. Another “traditional†technique is to ignore such errors, which can lead to data corruption, both of user data or of kernel data, hung/unresponsive adapters, or system crashes/lockups. Thus, the idea behind EEH is that the operating system can become more reliable and robust by protecting it from PCI errors, and giving the OS the ability to “rebootâ€/recover individual PCI devices.”…””}”(hj hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khhëhžhubh¸)”}”(hŒbFuture systems from other vendors, based on the PCI-E specification, may contain similar features.”h]”hŒbFuture systems from other vendors, based on the PCI-E specification, may contain similar features.”…””}”(hjhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khhëhžhubeh}”(h]”Œoverview”ah ]”h"]”Œ overview:”ah$]”h&]”uh1h¡hh£hžhhŸh¶h K ubh¢)”}”(hhh]”(h§)”}”(hŒCauses of EEH Errors”h]”hŒCauses of EEH Errors”…””}”(hj1hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h¦hj.hžhhŸh¶h K#ubh¸)”}”(hXsEEH was originally designed to guard against hardware failure, such as PCI cards dying from heat, humidity, dust, vibration and bad electrical connections. The vast majority of EEH errors seen in "real life" are due to either poorly seated PCI cards, or, unfortunately quite commonly, due to device driver bugs, device firmware bugs, and sometimes PCI card hardware bugs.”h]”hXwEEH was originally designed to guard against hardware failure, such as PCI cards dying from heat, humidity, dust, vibration and bad electrical connections. The vast majority of EEH errors seen in “real life†are due to either poorly seated PCI cards, or, unfortunately quite commonly, due to device driver bugs, device firmware bugs, and sometimes PCI card hardware bugs.”…””}”(hj?hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K$hj.hžhubh¸)”}”(hXóThe most common software bug, is one that causes the device to attempt to DMA to a location in system memory that has not been reserved for DMA access for that card. This is a powerful feature, as it prevents what; otherwise, would have been silent memory corruption caused by the bad DMA. A number of device driver bugs have been found and fixed in this way over the past few years. Other possible causes of EEH errors include data or address line parity errors (for example, due to poor electrical connectivity due to a poorly seated card), and PCI-X split-completion errors (due to software, device firmware, or device PCI hardware bugs). The vast majority of "true hardware failures" can be cured by physically removing and re-seating the PCI card.”h]”hX÷The most common software bug, is one that causes the device to attempt to DMA to a location in system memory that has not been reserved for DMA access for that card. This is a powerful feature, as it prevents what; otherwise, would have been silent memory corruption caused by the bad DMA. A number of device driver bugs have been found and fixed in this way over the past few years. Other possible causes of EEH errors include data or address line parity errors (for example, due to poor electrical connectivity due to a poorly seated card), and PCI-X split-completion errors (due to software, device firmware, or device PCI hardware bugs). The vast majority of “true hardware failures†can be cured by physically removing and re-seating the PCI card.”…””}”(hjMhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K+hj.hžhubeh}”(h]”Œcauses-of-eeh-errors”ah ]”h"]”Œcauses of eeh errors”ah$]”h&]”uh1h¡hh£hžhhŸh¶h K#ubh¢)”}”(hhh]”(h§)”}”(hŒDetection and Recovery”h]”hŒDetection and Recovery”…””}”(hjfhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h¦hjchžhhŸh¶h K:ubh¸)”}”(hX’In the following discussion, a generic overview of how to detect and recover from EEH errors will be presented. This is followed by an overview of how the current implementation in the Linux kernel does it. The actual implementation is subject to change, and some of the finer points are still being debated. These may in turn be swayed if or when other architectures implement similar functionality.”h]”hX’In the following discussion, a generic overview of how to detect and recover from EEH errors will be presented. This is followed by an overview of how the current implementation in the Linux kernel does it. The actual implementation is subject to change, and some of the finer points are still being debated. These may in turn be swayed if or when other architectures implement similar functionality.”…””}”(hjthžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K;hjchžhubh¸)”}”(hXnWhen a PCI Host Bridge (PHB, the bus controller connecting the PCI bus to the system CPU electronics complex) detects a PCI error condition, it will "isolate" the affected PCI card. Isolation will block all writes (either to the card from the system, or from the card to the system), and it will cause all reads to return all-ff's (0xff, 0xffff, 0xffffffff for 8/16/32-bit reads). This value was chosen because it is the same value you would get if the device was physically unplugged from the slot. This includes access to PCI memory, I/O space, and PCI config space. Interrupts; however, will continue to be delivered.”h]”hXtWhen a PCI Host Bridge (PHB, the bus controller connecting the PCI bus to the system CPU electronics complex) detects a PCI error condition, it will “isolate†the affected PCI card. Isolation will block all writes (either to the card from the system, or from the card to the system), and it will cause all reads to return all-ff’s (0xff, 0xffff, 0xffffffff for 8/16/32-bit reads). This value was chosen because it is the same value you would get if the device was physically unplugged from the slot. This includes access to PCI memory, I/O space, and PCI config space. Interrupts; however, will continue to be delivered.”…””}”(hj‚hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h KChjchžhubh¸)”}”(hXDetection and recovery are performed with the aid of ppc64 firmware. The programming interfaces in the Linux kernel into the firmware are referred to as RTAS (Run-Time Abstraction Services). The Linux kernel does not (should not) access the EEH function in the PCI chipsets directly, primarily because there are a number of different chipsets out there, each with different interfaces and quirks. The firmware provides a uniform abstraction layer that will work with all pSeries and iSeries hardware (and be forwards-compatible).”h]”hXDetection and recovery are performed with the aid of ppc64 firmware. The programming interfaces in the Linux kernel into the firmware are referred to as RTAS (Run-Time Abstraction Services). The Linux kernel does not (should not) access the EEH function in the PCI chipsets directly, primarily because there are a number of different chipsets out there, each with different interfaces and quirks. The firmware provides a uniform abstraction layer that will work with all pSeries and iSeries hardware (and be forwards-compatible).”…””}”(hjhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h KNhjchžhubh¸)”}”(hX¹If the OS or device driver suspects that a PCI slot has been EEH-isolated, there is a firmware call it can make to determine if this is the case. If so, then the device driver should put itself into a consistent state (given that it won't be able to complete any pending work) and start recovery of the card. Recovery normally would consist of resetting the PCI device (holding the PCI #RST line high for two seconds), followed by setting up the device config space (the base address registers (BAR's), latency timer, cache line size, interrupt line, and so on). This is followed by a reinitialization of the device driver. In a worst-case scenario, the power to the card can be toggled, at least on hot-plug-capable slots. In principle, layers far above the device driver probably do not need to know that the PCI card has been "rebooted" in this way; ideally, there should be at most a pause in Ethernet/disk/USB I/O while the card is being reset.”h]”hXÁIf the OS or device driver suspects that a PCI slot has been EEH-isolated, there is a firmware call it can make to determine if this is the case. If so, then the device driver should put itself into a consistent state (given that it won’t be able to complete any pending work) and start recovery of the card. Recovery normally would consist of resetting the PCI device (holding the PCI #RST line high for two seconds), followed by setting up the device config space (the base address registers (BAR’s), latency timer, cache line size, interrupt line, and so on). This is followed by a reinitialization of the device driver. In a worst-case scenario, the power to the card can be toggled, at least on hot-plug-capable slots. In principle, layers far above the device driver probably do not need to know that the PCI card has been “rebooted†in this way; ideally, there should be at most a pause in Ethernet/disk/USB I/O while the card is being reset.”…””}”(hjžhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h KXhjchžhubh¸)”}”(hXÈIf the card cannot be recovered after three or four resets, the kernel/device driver should assume the worst-case scenario, that the card has died completely, and report this error to the sysadmin. In addition, error messages are reported through RTAS and also through syslogd (/var/log/messages) to alert the sysadmin of PCI resets. The correct way to deal with failed adapters is to use the standard PCI hotplug tools to remove and replace the dead card.”h]”hXÈIf the card cannot be recovered after three or four resets, the kernel/device driver should assume the worst-case scenario, that the card has died completely, and report this error to the sysadmin. In addition, error messages are reported through RTAS and also through syslogd (/var/log/messages) to alert the sysadmin of PCI resets. The correct way to deal with failed adapters is to use the standard PCI hotplug tools to remove and replace the dead card.”…””}”(hj¬hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khhjchžhubeh}”(h]”Œdetection-and-recovery”ah ]”h"]”Œdetection and recovery”ah$]”h&]”uh1h¡hh£hžhhŸh¶h K:ubh¢)”}”(hhh]”(h§)”}”(hŒ&Current PPC64 Linux EEH Implementation”h]”hŒ&Current PPC64 Linux EEH Implementation”…””}”(hjÅhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h¦hjÂhžhhŸh¶h Krubh¸)”}”(hXhAt this time, a generic EEH recovery mechanism has been implemented, so that individual device drivers do not need to be modified to support EEH recovery. This generic mechanism piggy-backs on the PCI hotplug infrastructure, and percolates events up through the userspace/udev infrastructure. Following is a detailed description of how this is accomplished.”h]”hXhAt this time, a generic EEH recovery mechanism has been implemented, so that individual device drivers do not need to be modified to support EEH recovery. This generic mechanism piggy-backs on the PCI hotplug infrastructure, and percolates events up through the userspace/udev infrastructure. Following is a detailed description of how this is accomplished.”…””}”(hjÓhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h KshjÂhžhubh¸)”}”(hXEEH must be enabled in the PHB's very early during the boot process, and if a PCI slot is hot-plugged. The former is performed by eeh_init() in arch/powerpc/platforms/pseries/eeh.c, and the later by drivers/pci/hotplug/pSeries_pci.c calling in to the eeh.c code. EEH must be enabled before a PCI scan of the device can proceed. Current Power5 hardware will not work unless EEH is enabled; although older Power4 can run with it disabled. Effectively, EEH can no longer be turned off. PCI devices *must* be registered with the EEH code; the EEH code needs to know about the I/O address ranges of the PCI device in order to detect an error. Given an arbitrary address, the routine pci_get_device_by_addr() will find the pci device associated with that address (if any).”h]”(hXóEEH must be enabled in the PHB’s very early during the boot process, and if a PCI slot is hot-plugged. The former is performed by eeh_init() in arch/powerpc/platforms/pseries/eeh.c, and the later by drivers/pci/hotplug/pSeries_pci.c calling in to the eeh.c code. EEH must be enabled before a PCI scan of the device can proceed. Current Power5 hardware will not work unless EEH is enabled; although older Power4 can run with it disabled. Effectively, EEH can no longer be turned off. PCI devices ”…””}”(hjáhžhhŸNh NubhŒemphasis”“”)”}”(hŒ*must*”h]”hŒmust”…””}”(hjëhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1jéhjáubhX  be registered with the EEH code; the EEH code needs to know about the I/O address ranges of the PCI device in order to detect an error. Given an arbitrary address, the routine pci_get_device_by_addr() will find the pci device associated with that address (if any).”…””}”(hjáhžhhŸNh Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h KzhjÂhžhubh¸)”}”(hXPThe default arch/powerpc/include/asm/io.h macros readb(), inb(), insb(), etc. include a check to see if the i/o read returned all-0xff's. If so, these make a call to eeh_dn_check_failure(), which in turn asks the firmware if the all-ff's value is the sign of a true EEH error. If it is not, processing continues as normal. The grand total number of these false alarms or "false positives" can be seen in /proc/ppc64/eeh (subject to change). Normally, almost all of these occur during boot, when the PCI bus is scanned, where a large number of 0xff reads are part of the bus scan procedure.”h]”hXXThe default arch/powerpc/include/asm/io.h macros readb(), inb(), insb(), etc. include a check to see if the i/o read returned all-0xff’s. If so, these make a call to eeh_dn_check_failure(), which in turn asks the firmware if the all-ff’s value is the sign of a true EEH error. If it is not, processing continues as normal. The grand total number of these false alarms or “false positives†can be seen in /proc/ppc64/eeh (subject to change). Normally, almost all of these occur during boot, when the PCI bus is scanned, where a large number of 0xff reads are part of the bus scan procedure.”…””}”(hjhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h KˆhjÂhžhubh¸)”}”(hX<If a frozen slot is detected, code in arch/powerpc/platforms/pseries/eeh.c will print a stack trace to syslog (/var/log/messages). This stack trace has proven to be very useful to device-driver authors for finding out at what point the EEH error was detected, as the error itself usually occurs slightly beforehand.”h]”hX<If a frozen slot is detected, code in arch/powerpc/platforms/pseries/eeh.c will print a stack trace to syslog (/var/log/messages). This stack trace has proven to be very useful to device-driver authors for finding out at what point the EEH error was detected, as the error itself usually occurs slightly beforehand.”…””}”(hjhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K’hjÂhžhubh¸)”}”(hXÏNext, it uses the Linux kernel notifier chain/work queue mechanism to allow any interested parties to find out about the failure. Device drivers, or other parts of the kernel, can use `eeh_register_notifier(struct notifier_block *)` to find out about EEH events. The event will include a pointer to the pci device, the device node and some state info. Receivers of the event can "do as they wish"; the default handler will be described further in this section.”h]”(hŒ¹Next, it uses the Linux kernel notifier chain/work queue mechanism to allow any interested parties to find out about the failure. Device drivers, or other parts of the kernel, can use ”…””}”(hjhžhhŸNh NubhŒtitle_reference”“”)”}”(hŒ0`eeh_register_notifier(struct notifier_block *)`”h]”hŒ.eeh_register_notifier(struct notifier_block *)”…””}”(hj)hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1j'hjubhŒê to find out about EEH events. The event will include a pointer to the pci device, the device node and some state info. Receivers of the event can “do as they wishâ€; the default handler will be described further in this section.”…””}”(hjhžhhŸNh Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K™hjÂhžhubh¸)”}”(hŒOTo assist in the recovery of the device, eeh.c exports the following functions:”h]”hŒOTo assist in the recovery of the device, eeh.c exports the following functions:”…””}”(hjAhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K¢hjÂhžhubhŒdefinition_list”“”)”}”(hhh]”(hŒdefinition_list_item”“”)”}”(hŒErtas_set_slot_reset() assert the PCI #RST line for 1/8th of a second”h]”(hŒterm”“”)”}”(hŒrtas_set_slot_reset()”h]”hŒrtas_set_slot_reset()”…””}”(hj\hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1jZhŸh¶h K¥hjVubhŒ definition”“”)”}”(hhh]”h¸)”}”(hŒ/assert the PCI #RST line for 1/8th of a second”h]”hŒ/assert the PCI #RST line for 1/8th of a second”…””}”(hjohžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K¦hjlubah}”(h]”h ]”h"]”h$]”h&]”uh1jjhjVubeh}”(h]”h ]”h"]”h$]”h&]”uh1jThŸh¶h K¥hjQubjU)”}”(hŒkrtas_configure_bridge() ask firmware to configure any PCI bridges located topologically under the pci slot.”h]”(j[)”}”(hŒrtas_configure_bridge()”h]”hŒrtas_configure_bridge()”…””}”(hjhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1jZhŸh¶h K¨hj‰ubjk)”}”(hhh]”h¸)”}”(hŒSask firmware to configure any PCI bridges located topologically under the pci slot.”h]”hŒSask firmware to configure any PCI bridges located topologically under the pci slot.”…””}”(hjžhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K¨hj›ubah}”(h]”h ]”h"]”h$]”h&]”uh1jjhj‰ubeh}”(h]”h ]”h"]”h$]”h&]”uh1jThŸh¶h K¨hjQhžhubjU)”}”(hŒ{eeh_save_bars() and eeh_restore_bars(): save and restore the PCI config-space info for a device and any devices under it. ”h]”(j[)”}”(hŒ'eeh_save_bars() and eeh_restore_bars():”h]”hŒ'eeh_save_bars() and eeh_restore_bars():”…””}”(hj¼hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1jZhŸh¶h K­hj¸ubjk)”}”(hhh]”h¸)”}”(hŒQsave and restore the PCI config-space info for a device and any devices under it.”h]”hŒQsave and restore the PCI config-space info for a device and any devices under it.”…””}”(hjÍhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K«hjÊubah}”(h]”h ]”h"]”h$]”h&]”uh1jjhj¸ubeh}”(h]”h ]”h"]”h$]”h&]”uh1jThŸh¶h K­hjQhžhubeh}”(h]”h ]”h"]”h$]”h&]”uh1jOhjÂhžhhŸh¶h Nubh¸)”}”(hX A handler for the EEH notifier_block events is implemented in drivers/pci/hotplug/pSeries_pci.c, called handle_eeh_events(). It saves the device BAR's and then calls rpaphp_unconfig_pci_adapter(). This last call causes the device driver for the card to be stopped, which causes uevents to go out to user space. This triggers user-space scripts that might issue commands such as "ifdown eth0" for ethernet cards, and so on. This handler then sleeps for 5 seconds, hoping to give the user-space scripts enough time to complete. It then resets the PCI card, reconfigures the device BAR's, and any bridges underneath. It then calls rpaphp_enable_pci_slot(), which restarts the device driver and triggers more user-space events (for example, calling "ifup eth0" for ethernet cards).”h]”hXA handler for the EEH notifier_block events is implemented in drivers/pci/hotplug/pSeries_pci.c, called handle_eeh_events(). It saves the device BAR’s and then calls rpaphp_unconfig_pci_adapter(). This last call causes the device driver for the card to be stopped, which causes uevents to go out to user space. This triggers user-space scripts that might issue commands such as “ifdown eth0†for ethernet cards, and so on. This handler then sleeps for 5 seconds, hoping to give the user-space scripts enough time to complete. It then resets the PCI card, reconfigures the device BAR’s, and any bridges underneath. It then calls rpaphp_enable_pci_slot(), which restarts the device driver and triggers more user-space events (for example, calling “ifup eth0†for ethernet cards).”…””}”(hjíhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K¯hjÂhžhubeh}”(h]”Œ¤t-ppc64-linux-eeh-implementation”ah ]”h"]”Œ¤t ppc64 linux eeh implementation”ah$]”h&]”uh1h¡hh£hžhhŸh¶h Krubh¢)”}”(hhh]”(h§)”}”(hŒ%Device Shutdown and User-Space Events”h]”hŒ%Device Shutdown and User-Space Events”…””}”(hjhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h¦hjhžhhŸh¶h K¾ubh¸)”}”(hŒ±This section documents what happens when a pci slot is unconfigured, focusing on how the device driver gets shut down, and on how the events get delivered to user-space scripts.”h]”hŒ±This section documents what happens when a pci slot is unconfigured, focusing on how the device driver gets shut down, and on how the events get delivered to user-space scripts.”…””}”(hjhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K¿hjhžhubh¸)”}”(hŒÍFollowing is an example sequence of events that cause a device driver close function to be called during the first phase of an EEH reset. The following sequence is an example of the pcnet32 device driver::”h]”hŒÌFollowing is an example sequence of events that cause a device driver close function to be called during the first phase of an EEH reset. The following sequence is an example of the pcnet32 device driver:”…””}”(hj"hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h KÃhjhžhubhŒ literal_block”“”)”}”(hXrpa_php_unconfig_pci_adapter (struct slot *) // in rpaphp_pci.c { calls pci_remove_bus_device (struct pci_dev *) // in /drivers/pci/remove.c { calls pci_destroy_dev (struct pci_dev *) { calls device_unregister (&dev->dev) // in /drivers/base/core.c { calls device_del (struct device *) { calls bus_remove_device() // in /drivers/base/bus.c { calls device_release_driver() { calls struct device_driver->remove() which is just pci_device_remove() // in /drivers/pci/pci_driver.c { calls struct pci_driver->remove() which is just pcnet32_remove_one() // in /drivers/net/pcnet32.c { calls unregister_netdev() // in /net/core/dev.c { calls dev_close() // in /net/core/dev.c { calls dev->stop(); which is just pcnet32_close() // in pcnet32.c { which does what you wanted to stop the device } } } which frees pcnet32 device driver memory } }}}}}}”h]”hXrpa_php_unconfig_pci_adapter (struct slot *) // in rpaphp_pci.c { calls pci_remove_bus_device (struct pci_dev *) // in /drivers/pci/remove.c { calls pci_destroy_dev (struct pci_dev *) { calls device_unregister (&dev->dev) // in /drivers/base/core.c { calls device_del (struct device *) { calls bus_remove_device() // in /drivers/base/bus.c { calls device_release_driver() { calls struct device_driver->remove() which is just pci_device_remove() // in /drivers/pci/pci_driver.c { calls struct pci_driver->remove() which is just pcnet32_remove_one() // in /drivers/net/pcnet32.c { calls unregister_netdev() // in /net/core/dev.c { calls dev_close() // in /net/core/dev.c { calls dev->stop(); which is just pcnet32_close() // in pcnet32.c { which does what you wanted to stop the device } } } which frees pcnet32 device driver memory } }}}}}}”…””}”hj2sbah}”(h]”h ]”h"]”h$]”h&]”Œ xml:space”Œpreserve”uh1j0hŸh¶h KÇhjhžhubh¸)”}”(hXZin drivers/pci/pci_driver.c, struct device_driver->remove() is just pci_device_remove() which calls struct pci_driver->remove() which is pcnet32_remove_one() which calls unregister_netdev() (in net/core/dev.c) which calls dev_close() (in net/core/dev.c) which calls dev->stop() which is pcnet32_close() which then does the appropriate shutdown.”h]”hXZin drivers/pci/pci_driver.c, struct device_driver->remove() is just pci_device_remove() which calls struct pci_driver->remove() which is pcnet32_remove_one() which calls unregister_netdev() (in net/core/dev.c) which calls dev_close() (in net/core/dev.c) which calls dev->stop() which is pcnet32_close() which then does the appropriate shutdown.”…””}”(hjBhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K÷hjhžhubh¸)”}”(hŒ---”h]”hŒ---”…””}”(hjPhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Kÿhjhžhubh¸)”}”(hŒjFollowing is the analogous stack trace for events sent to user-space when the pci device is unconfigured::”h]”hŒiFollowing is the analogous stack trace for events sent to user-space when the pci device is unconfigured:”…””}”(hj^hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Mhjhžhubj1)”}”(hXrpa_php_unconfig_pci_adapter() { // in rpaphp_pci.c calls pci_remove_bus_device (struct pci_dev *) { // in /drivers/pci/remove.c calls pci_destroy_dev (struct pci_dev *) { calls device_unregister (&dev->dev) { // in /drivers/base/core.c calls device_del(struct device * dev) { // in /drivers/base/core.c calls kobject_del() { //in /libs/kobject.c calls kobject_uevent() { // in /libs/kobject.c calls kset_uevent() { // in /lib/kobject.c calls kset->uevent_ops->uevent() // which is really just a call to dev_uevent() { // in /drivers/base/core.c calls dev->bus->uevent() which is really just a call to pci_uevent () { // in drivers/pci/hotplug.c which prints device name, etc.... } } then kobject_uevent() sends a netlink uevent to userspace --> userspace uevent (during early boot, nobody listens to netlink events and kobject_uevent() executes uevent_helper[], which runs the event process /sbin/hotplug) } } kobject_del() then calls sysfs_remove_dir(), which would trigger any user-space daemon that was watching /sysfs, and notice the delete event.”h]”hXrpa_php_unconfig_pci_adapter() { // in rpaphp_pci.c calls pci_remove_bus_device (struct pci_dev *) { // in /drivers/pci/remove.c calls pci_destroy_dev (struct pci_dev *) { calls device_unregister (&dev->dev) { // in /drivers/base/core.c calls device_del(struct device * dev) { // in /drivers/base/core.c calls kobject_del() { //in /libs/kobject.c calls kobject_uevent() { // in /libs/kobject.c calls kset_uevent() { // in /lib/kobject.c calls kset->uevent_ops->uevent() // which is really just a call to dev_uevent() { // in /drivers/base/core.c calls dev->bus->uevent() which is really just a call to pci_uevent () { // in drivers/pci/hotplug.c which prints device name, etc.... } } then kobject_uevent() sends a netlink uevent to userspace --> userspace uevent (during early boot, nobody listens to netlink events and kobject_uevent() executes uevent_helper[], which runs the event process /sbin/hotplug) } } kobject_del() then calls sysfs_remove_dir(), which would trigger any user-space daemon that was watching /sysfs, and notice the delete event.”…””}”hjlsbah}”(h]”h ]”h"]”h$]”h&]”j@jAuh1j0hŸh¶h Mhjhžhubeh}”(h]”Œ%device-shutdown-and-user-space-events”ah ]”h"]”Œ%device shutdown and user-space events”ah$]”h&]”uh1h¡hh£hžhhŸh¶h K¾ubh¢)”}”(hhh]”(h§)”}”(hŒ%Pro's and Con's of the Current Design”h]”hŒ)Pro’s and Con’s of the Current Design”…””}”(hj…hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h¦hj‚hžhhŸh¶h M*ubh¸)”}”(hX¡There are several issues with the current EEH software recovery design, which may be addressed in future revisions. But first, note that the big plus of the current design is that no changes need to be made to individual device drivers, so that the current design throws a wide net. The biggest negative of the design is that it potentially disturbs network daemons and file systems that didn't need to be disturbed.”h]”hX£There are several issues with the current EEH software recovery design, which may be addressed in future revisions. But first, note that the big plus of the current design is that no changes need to be made to individual device drivers, so that the current design throws a wide net. The biggest negative of the design is that it potentially disturbs network daemons and file systems that didn’t need to be disturbed.”…””}”(hj“hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h M+hj‚hžhubhŒ bullet_list”“”)”}”(hhh]”(hŒ list_item”“”)”}”(hŒÔA minor complaint is that resetting the network card causes user-space back-to-back ifdown/ifup burps that potentially disturb network daemons, that didn't need to even know that the pci card was being rebooted. ”h]”h¸)”}”(hŒÓA minor complaint is that resetting the network card causes user-space back-to-back ifdown/ifup burps that potentially disturb network daemons, that didn't need to even know that the pci card was being rebooted.”h]”hŒÕA minor complaint is that resetting the network card causes user-space back-to-back ifdown/ifup burps that potentially disturb network daemons, that didn’t need to even know that the pci card was being rebooted.”…””}”(hj¬hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h M2hj¨ubah}”(h]”h ]”h"]”h$]”h&]”uh1j¦hj£hžhhŸh¶h Nubj§)”}”(hX†A more serious concern is that the same reset, for SCSI devices, causes havoc to mounted file systems. Scripts cannot post-facto unmount a file system without flushing pending buffers, but this is impossible, because I/O has already been stopped. Thus, ideally, the reset should happen at or below the block layer, so that the file systems are not disturbed. Reiserfs does not tolerate errors returned from the block device. Ext3fs seems to be tolerant, retrying reads/writes until it does succeed. Both have been only lightly tested in this scenario. The SCSI-generic subsystem already has built-in code for performing SCSI device resets, SCSI bus resets, and SCSI host-bus-adapter (HBA) resets. These are cascaded into a chain of attempted resets if a SCSI command fails. These are completely hidden from the block layer. It would be very natural to add an EEH reset into this chain of events. ”h]”(h¸)”}”(hXhA more serious concern is that the same reset, for SCSI devices, causes havoc to mounted file systems. Scripts cannot post-facto unmount a file system without flushing pending buffers, but this is impossible, because I/O has already been stopped. Thus, ideally, the reset should happen at or below the block layer, so that the file systems are not disturbed.”h]”hXhA more serious concern is that the same reset, for SCSI devices, causes havoc to mounted file systems. Scripts cannot post-facto unmount a file system without flushing pending buffers, but this is impossible, because I/O has already been stopped. Thus, ideally, the reset should happen at or below the block layer, so that the file systems are not disturbed.”…””}”(hjÄhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h M7hjÀubh¸)”}”(hŒÀReiserfs does not tolerate errors returned from the block device. Ext3fs seems to be tolerant, retrying reads/writes until it does succeed. Both have been only lightly tested in this scenario.”h]”hŒÀReiserfs does not tolerate errors returned from the block device. Ext3fs seems to be tolerant, retrying reads/writes until it does succeed. Both have been only lightly tested in this scenario.”…””}”(hjÒhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h M>hjÀubh¸)”}”(hXYThe SCSI-generic subsystem already has built-in code for performing SCSI device resets, SCSI bus resets, and SCSI host-bus-adapter (HBA) resets. These are cascaded into a chain of attempted resets if a SCSI command fails. These are completely hidden from the block layer. It would be very natural to add an EEH reset into this chain of events.”h]”hXYThe SCSI-generic subsystem already has built-in code for performing SCSI device resets, SCSI bus resets, and SCSI host-bus-adapter (HBA) resets. These are cascaded into a chain of attempted resets if a SCSI command fails. These are completely hidden from the block layer. It would be very natural to add an EEH reset into this chain of events.”…””}”(hjàhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h MBhjÀubeh}”(h]”h ]”h"]”h$]”h&]”uh1j¦hj£hžhhŸh¶h Nubj§)”}”(hŒŸIf a SCSI error occurs for the root device, all is lost unless the sysadmin had the foresight to run /bin, /sbin, /etc, /var and so on, out of ramdisk/tmpfs. ”h]”h¸)”}”(hŒIf a SCSI error occurs for the root device, all is lost unless the sysadmin had the foresight to run /bin, /sbin, /etc, /var and so on, out of ramdisk/tmpfs.”h]”hŒIf a SCSI error occurs for the root device, all is lost unless the sysadmin had the foresight to run /bin, /sbin, /etc, /var and so on, out of ramdisk/tmpfs.”…””}”(hjøhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h MIhjôubah}”(h]”h ]”h"]”h$]”h&]”uh1j¦hj£hžhhŸh¶h Nubeh}”(h]”h ]”h"]”h$]”h&]”Œbullet”Œ-”uh1j¡hŸh¶h M2hj‚hžhubeh}”(h]”Œ%pro-s-and-con-s-of-the-current-design”ah ]”h"]”Œ%pro's and con's of the current design”ah$]”h&]”uh1h¡hh£hžhhŸh¶h M*ubh¢)”}”(hhh]”(h§)”}”(hŒ Conclusions”h]”hŒ Conclusions”…””}”(hjhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h¦hjhžhhŸh¶h MOubh¸)”}”(hŒThere's forward progress ...”h]”hŒThere’s forward progress ...”…””}”(hj-hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h MPhjhžhubeh}”(h]”Œ conclusions”ah ]”h"]”Œ conclusions”ah$]”h&]”uh1h¡hh£hžhhŸh¶h MOubeh}”(h]”Œpci-bus-eeh-error-recovery”ah ]”h"]”Œpci bus eeh error recovery”ah$]”h&]”uh1h¡hhhžhhŸh¶h Kubeh}”(h]”h ]”h"]”h$]”h&]”Œsource”h¶uh1hŒcurrent_source”NŒ current_line”NŒsettings”Œdocutils.frontend”ŒValues”“”)”}”(h¦NŒ generator”NŒ datestamp”NŒ source_link”NŒ source_url”NŒ toc_backlinks”Œentry”Œfootnote_backlinks”KŒ sectnum_xform”KŒstrip_comments”NŒstrip_elements_with_classes”NŒ strip_classes”NŒ report_level”KŒ halt_level”KŒexit_status_level”KŒdebug”NŒwarning_stream”NŒ traceback”ˆŒinput_encoding”Œ utf-8-sig”Œinput_encoding_error_handler”Œstrict”Œoutput_encoding”Œutf-8”Œoutput_encoding_error_handler”jnŒerror_encoding”Œutf-8”Œerror_encoding_error_handler”Œbackslashreplace”Œ language_code”Œen”Œrecord_dependencies”NŒconfig”NŒ id_prefix”hŒauto_id_prefix”Œid”Œ dump_settings”NŒdump_internals”NŒdump_transforms”NŒdump_pseudo_xml”NŒexpose_internals”NŒstrict_visitor”NŒ_disable_config”NŒ_source”h¶Œ _destination”NŒ _config_files”]”Œ7/var/lib/git/docbuild/linux/Documentation/docutils.conf”aŒfile_insertion_enabled”ˆŒ raw_enabled”KŒline_length_limit”M'Œpep_references”NŒ pep_base_url”Œhttps://peps.python.org/”Œpep_file_url_template”Œpep-%04d”Œrfc_references”NŒ rfc_base_url”Œ&https://datatracker.ietf.org/doc/html/”Œ tab_width”KŒtrim_footnote_reference_space”‰Œsyntax_highlight”Œlong”Œ smart_quotes”ˆŒsmartquotes_locales”]”Œcharacter_level_inline_markup”‰Œdoctitle_xform”‰Œ docinfo_xform”KŒsectsubtitle_xform”‰Œ image_loading”Œlink”Œembed_stylesheet”‰Œcloak_email_addresses”ˆŒsection_self_link”‰Œenv”NubŒreporter”NŒindirect_targets”]”Œsubstitution_defs”}”Œsubstitution_names”}”Œrefnames”}”Œrefids”}”Œnameids”}”(jHjEj+j(j`j]j¿j¼jjýjj|jjj@j=uŒ nametypes”}”(jH‰j+‰j`‰j¿‰j‰j‰j‰j@‰uh}”(jEh£j(hëj]j.j¼jcjýjÂj|jjj‚j=juŒ footnote_refs”}”Œ citation_refs”}”Œ autofootnotes”]”Œautofootnote_refs”]”Œsymbol_footnotes”]”Œsymbol_footnote_refs”]”Œ footnotes”]”Œ citations”]”Œautofootnote_start”KŒsymbol_footnote_start”KŒ id_counter”Œ collections”ŒCounter”“”}”…”R”Œparse_messages”]”Œtransform_messages”]”Œ transformer”NŒ include_log”]”Œ decoration”Nhžhub.