
Method for diagnosing fault of server in real time

A real-time server fault diagnosis technology, applied in the fields of instrumentation, response error generation, and electrical digital data processing, to reduce fault-location time and post-maintenance costs.

Status: Inactive; Publication Date: 2016-06-15
LANGCHAO ELECTRONIC INFORMATION IND CO LTD
Cites: 4; Cited by: 30

AI Technical Summary

Problems solved by technology

[0003] The technical problem to be solved by the present invention: to address the above problems, the invention proposes a real-time server fault diagnosis method. Through real-time fault status monitoring, it automatically triggers an interrupt that reads and saves specific CPU fault status registers, so that the server is diagnosed while the fault scene still exists. This avoids cases where a problem cannot be diagnosed because the fault scene no longer exists, improves the hit rate of fault diagnosis, and reduces maintenance costs and the impact on customer business.


Examples


Embodiment 1

[0019] A method for real-time server fault diagnosis. In this method, the BMC (the server's baseboard management controller) and the BIOS are interconnected through the LPC bus, the BMC and the CPU are interconnected through the PECI bus, and the BIOS is connected to the memory and PCIE devices through the SMBus and the PCIE bus. The diagnosis process is as follows:

[0020] First, the BMC reads the fault status of the CPU, memory, and PCIE devices in real time through the LPC bus;

[0021] Second, when the BMC detects a device fault, it triggers an interrupt in real time; the interrupt handler reads certain specific CPU fault status registers through the PECI bus and records them in the BMC storage space.
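The two steps above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the patented implementation: the names `BmcMonitor`, `FaultRecord`, and `CPU_FAULT_REGISTERS` are hypothetical, the register names are placeholders (the source does not list which registers are read), and the LPC and PECI accesses are replaced by plain callables.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Placeholder register names (assumption): the source only says
# "some specific fault status registers" and does not name them.
CPU_FAULT_REGISTERS = ["REG_A", "REG_B"]


@dataclass
class FaultRecord:
    """One saved snapshot: which device faulted and the CPU register values."""
    device: str
    registers: Dict[str, int]


@dataclass
class BmcMonitor:
    # Stand-in for the fault status read over the LPC bus (hypothetical).
    read_fault_status: Callable[[], Dict[str, bool]]
    # Stand-in for a single register read over the PECI bus (hypothetical).
    read_cpu_register: Callable[[str], int]
    # Stand-in for the BMC storage space.
    storage: List[FaultRecord] = field(default_factory=list)

    def poll_once(self) -> None:
        """Step 1: read the fault status of every monitored device."""
        for device, faulty in self.read_fault_status().items():
            if faulty:
                self.on_fault(device)

    def on_fault(self, device: str) -> None:
        """Step 2: snapshot the CPU fault registers and store them."""
        snapshot = {name: self.read_cpu_register(name)
                    for name in CPU_FAULT_REGISTERS}
        self.storage.append(FaultRecord(device, snapshot))


# Usage with fake hardware: only the memory reports a fault.
fake_registers = {"REG_A": 0x1, "REG_B": 0x80}
monitor = BmcMonitor(
    read_fault_status=lambda: {"cpu": False, "memory": True, "pcie": False},
    read_cpu_register=fake_registers.__getitem__,
)
monitor.poll_once()
```

In this sketch the "interrupt" is modeled as a direct call to `on_fault`; on real hardware the trigger mechanism and the register set would be platform specific.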

Embodiment 2

[0023] On the basis of Embodiment 1, the BMC in this embodiment provides a standard network interface with a download function. If the fault scene is not retained after a fault occurs, maintenance personnel can still download, through the network interface, the CPU status registers that were saved in the BMC storage space at the time of the fault, and analyze them to quickly locate the cause of the fault.
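As a rough illustration of the download function, the sketch below exposes saved records over HTTP using Python's standard `http.server` module. The source only says "standard network interface", so the choice of HTTP, the `/faults` path, the JSON format, and the names `FAULT_LOG`, `render_fault_log`, `FaultLogHandler`, and `serve` are all assumptions made for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical saved records standing in for the BMC storage space.
FAULT_LOG = [
    {"device": "memory", "registers": {"REG_A": 1, "REG_B": 128}},
]


def render_fault_log(records) -> bytes:
    """Serialize saved fault records to JSON bytes for download."""
    return json.dumps(records, indent=2, sort_keys=True).encode("utf-8")


class FaultLogHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the assumed /faults path is served; anything else is a 404.
        if self.path != "/faults":
            self.send_error(404)
            return
        body = render_fault_log(FAULT_LOG)
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Block forever, serving the fault log on the given port."""
    HTTPServer(("", port), FaultLogHandler).serve_forever()
```

Calling `serve()` and fetching `/faults` from another machine would return the saved register snapshots, which matches the idea that the records remain available even after the fault scene is gone.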

Embodiment 3

[0025] On the basis of Embodiment 2, the memory ECC fault diagnosis process in this embodiment is as follows:

[0026] 1) The BMC and the BIOS are interconnected through the LPC bus, the BMC and the CPU through the PECI bus, and the BIOS, memory, and PCIE devices through the SMBus and the PCIE bus;

[0027] 2) The BIOS detects through the SMBus that an ECC fault has occurred in a memory, and sends the memory ECC fault information to the BMC through the LPC bus;

[0028] 3) After the BMC reads the memory ECC fault information sent by the BIOS, it triggers the interrupt handler, which reads the pre-agreed CPU fault status registers through the PECI bus and records them in the BMC storage space;

[0029] 4) Maintenance personnel download the register status information stored in the BMC through the standard network interface provided by the BMC. This register information can clearly indicate which type of ECC failure (correctable ECC or uncorrectable ECC) has o...
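The analysis in step 4) amounts to decoding a saved register value into an ECC fault type. A minimal sketch follows, assuming a made-up bit layout: the source does not specify which register or which bits carry this information, so `CORRECTABLE_BIT`, `UNCORRECTABLE_BIT`, and `classify_ecc` are hypothetical.

```python
from enum import Enum


class EccKind(Enum):
    NONE = "none"
    CORRECTABLE = "correctable"
    UNCORRECTABLE = "uncorrectable"


# Hypothetical bit layout (assumption): real CPUs define their own
# error status registers, and the source does not describe them.
CORRECTABLE_BIT = 1 << 0
UNCORRECTABLE_BIT = 1 << 1


def classify_ecc(status: int) -> EccKind:
    """Map a saved status word to an ECC fault type.

    The uncorrectable flag is checked first so that it takes priority
    when both flags are set.
    """
    if status & UNCORRECTABLE_BIT:
        return EccKind.UNCORRECTABLE
    if status & CORRECTABLE_BIT:
        return EccKind.CORRECTABLE
    return EccKind.NONE
```

Checking the uncorrectable flag first is a design choice for this sketch: it reports the more severe condition when a snapshot shows both.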


Abstract

The invention discloses a real-time server fault diagnosis method. In this method, the BMC and the BIOS are interconnected through the LPC bus, the BMC and the CPU are interconnected through the PECI bus, and the BIOS is connected to the memory and PCIE devices through the SMBus and the PCIE bus. The BMC reads the fault status of the CPU, memory, and PCIE devices in real time through the LPC bus. When the BMC detects a device fault, it triggers an interrupt in real time; the interrupt handler reads certain specific CPU fault status registers through the PECI bus and records them in the BMC storage space. The invention thereby diagnoses server faults in real time at the fault scene, improves the hit rate of fault diagnosis, reduces the time needed to locate faults, and effectively reduces the impact on customer services.

Description

Technical field

[0001] The invention relates to the technical field of server fault diagnosis, in particular to a method for real-time server fault diagnosis.

Background

[0002] With the development of computer technology, big data, and related technologies, the requirements for server stability and reliability keep rising. Although servers are designed from the outset with considerable fault tolerance and reliability measures, server systems are becoming ever more complex, and server failures are inevitable, especially CPU, memory, and PCIE device failures. To minimize the impact on business, maintenance personnel are required to diagnose faults quickly and locate their cause. Since the fault scene generally cannot be retained, this makes it difficult for maintenance personnel to quickly ...


Application Information

IPC(8): G06F11/07, G06F11/10
CPC: G06F11/0751, G06F11/10
Inventors: 刘宝阳, 刘冰
Owner: LANGCHAO ELECTRONIC INFORMATION IND CO LTD