Method for diagnosing fault of server in real time
A real-time server fault diagnosis technology, applied in the fields of instrumentation, error response, and electrical digital data processing, that reduces fault-location time and post-maintenance costs without increasing hardware costs
Examples
Embodiment 1
[0019] A method for real-time server fault diagnosis. In this method, the BMC (the server's baseboard management controller) and the BIOS are interconnected by the LPC bus, the BMC and the CPU are interconnected by the PECI bus, and the BIOS is connected to the memory and the PCIE devices by the SMBus and the PCIE bus. The diagnosis process of the method is as follows:
[0020] First, the BMC reads the fault status of the CPU, memory, and PCIE devices in real time through the LPC bus;
[0021] Second, when the BMC detects a device failure, it triggers an interrupt in real time. The interrupt handler reads specific CPU fault status registers through the PECI bus and records them in the BMC storage space.
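The two steps above can be sketched as a small simulation. This is only an illustration under stated assumptions: the class `Bmc`, the callbacks `read_fault_status` and `read_cpu_registers`, and the register name used in the usage note are hypothetical stand-ins for the LPC and PECI accesses, which the text does not specify in detail.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical device names; the text lists CPU, memory, and PCIE devices.
DEVICES = ("cpu", "memory", "pcie")


@dataclass
class FaultRecord:
    device: str                # device whose fault status was set
    registers: Dict[str, int]  # CPU fault status registers captured at fault time


@dataclass
class Bmc:
    # Stand-in for reading a device's fault status over the LPC bus (assumption).
    read_fault_status: Callable[[str], bool]
    # Stand-in for reading CPU fault status registers over the PECI bus (assumption).
    read_cpu_registers: Callable[[], Dict[str, int]]
    # Stand-in for the BMC storage space.
    storage: List[FaultRecord] = field(default_factory=list)

    def on_fault(self, device: str) -> None:
        """Interrupt handler: snapshot the CPU registers and store them."""
        self.storage.append(FaultRecord(device, self.read_cpu_registers()))

    def poll_once(self) -> int:
        """Check every device once and return the number of faults recorded."""
        faults = 0
        for device in DEVICES:
            if self.read_fault_status(device):
                self.on_fault(device)
                faults += 1
        return faults
```

For example, constructing `Bmc(read_fault_status=lambda d: d == "memory", read_cpu_registers=lambda: {"REG_A": 1})` and calling `poll_once()` stores one record for the memory fault. A real BMC would run such a check continuously or react to a hardware interrupt rather than being called by hand.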
Embodiment 2
[0023] On the basis of Embodiment 1, the BMC described in this embodiment provides a standard network interface with a download function. If the fault scene is not preserved after a fault occurs, maintenance personnel can still use the network interface to download the CPU status registers that were recorded in the BMC storage space at the time of the fault, and analyze them to quickly locate the cause of the fault.
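As a rough illustration of the download function, the stored records could be serialized into a payload that a network service hands to maintenance personnel. The text does not name a format or protocol; JSON and the function names `export_records` and `load_records` are assumptions made only for this sketch.

```python
import json
from typing import Dict, List


def export_records(records: List[Dict[str, object]]) -> bytes:
    """Serialize stored fault records into a JSON payload for download."""
    document = {"count": len(records), "records": records}
    return json.dumps(document, sort_keys=True).encode("utf-8")


def load_records(payload: bytes) -> List[Dict[str, object]]:
    """Parse a downloaded payload back into a list of fault records."""
    document = json.loads(payload.decode("utf-8"))
    return document["records"]
```

Because the payload is a copy of what the BMC recorded when the fault occurred, it can be analyzed offline even if the server has since been restarted.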
Embodiment 3
[0025] On the basis of Embodiment 2, the memory ECC fault diagnosis process of the method described in this embodiment is as follows:
[0026] 1) Interconnect the BMC and the BIOS through the LPC bus, and the BMC and the CPU through the PECI bus; connect the BIOS to the memory and the PCIE devices through the SMBus and the PCIE bus;
[0027] 2) The BIOS detects through the SMBus that an ECC fault has occurred in a memory module, and sends the memory ECC fault information to the BMC through the LPC bus;
[0028] 3) After the BMC reads the memory ECC fault information sent by the BIOS, it triggers interrupt handling, in which the BMC reads the pre-agreed CPU fault status registers through the PECI bus and records them in the BMC storage space;
[0029] 4) Maintenance personnel download the register status information stored in the BMC through the standard network interface provided by the BMC. This register information clearly indicates which type of ECC failure (correctable ECC or uncorrectable ECC) has o...
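The analysis in step 4 can be illustrated with a small decoder that classifies a downloaded status value as a correctable or uncorrectable ECC fault. The bit layout below is entirely hypothetical; real CPU fault status registers are vendor specific and are not described in the text.

```python
from enum import Enum

# Hypothetical bit positions, chosen only for this sketch.
CORRECTABLE_BIT = 0x1
UNCORRECTABLE_BIT = 0x2


class EccFault(Enum):
    NONE = "none"
    CORRECTABLE = "correctable"
    UNCORRECTABLE = "uncorrectable"


def classify_ecc(status: int) -> EccFault:
    """Classify a status register value by ECC fault type."""
    # The more severe fault takes precedence if both bits are set.
    if status & UNCORRECTABLE_BIT:
        return EccFault.UNCORRECTABLE
    if status & CORRECTABLE_BIT:
        return EccFault.CORRECTABLE
    return EccFault.NONE
```

Checking the uncorrectable bit first is a design choice of this sketch, so that a register reporting both conditions is treated as the more severe case.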