Equipment exception detection method and device

A technique for detecting equipment abnormalities, applied in error detection/correction, responding to the occurrence of errors, instruments, etc. It addresses the problem that a slow response to an equipment abnormality leaves too little time to handle and repair the resulting risk of host machine downtime; the effect is a faster response and avoidance of that downtime risk.

Active Publication Date: 2020-04-07
ALIBABA GRP HLDG LTD

AI Technical Summary

Problems solved by technology

[0004] Embodiments of the present invention provide a device abnormality detection method and device to solve at least the following technical problem in the related art: because the AER (Advanced Error Reporting) driver responds slowly, the host machine downtime risk caused by hardware cannot be handled and repaired in time.

Examples

Embodiment 1

[0034] According to an embodiment of the present invention, an embodiment of a device abnormality detection method is provided. It should be noted that the steps shown in the flowcharts of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.

[0035] The method embodiment provided in Embodiment 1 of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Figure 1 shows a block diagram of the hardware structure of a computer terminal (or mobile device) for implementing the device abnormality detection method. As shown in Figure 1, the computer terminal 10 (or mobile device 10) may include one or more processors 102 (shown as 102a, 102b, ..., 102n in the figure) (the processor 1...

Embodiment 2

[0063] According to an embodiment of the present invention, a device for implementing the above device abnormality detection method is also provided. As shown in Figure 3, the device includes a monitoring unit 301, a control unit 302, and a detection unit 303.

[0064] Specifically, the monitoring unit 301 is configured to monitor, through the flow control feature of PCIe (the high-speed serial computer expansion bus), the capacity of the PCIe terminal device's PCIe link to store data packets.

[0065] It should be noted that the score of the PCIe terminal device can be monitored through the flow control feature of PCIe, where the score indicates the capacity of the PCIe link on the PCIe terminal device side to store data packets. For example, the score of the above PCIe terminal device is its credits, and the above data packet is the basic...
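The credit idea in [0064] and [0065] can be illustrated with a toy model. This is only a sketch under stated assumptions, not the patent's implementation: the class `CreditPool` and its fields are hypothetical names, real PCIe flow control tracks separate credit types per virtual channel, and the text does not say how the monitored capacity is computed. Here each credit is assumed to stand for one buffer slot on the terminal device side.

```python
from dataclasses import dataclass


@dataclass
class CreditPool:
    """Hypothetical, simplified model of one flow-control credit pool."""

    advertised: int    # buffer slots the receiver advertised as credits
    consumed: int = 0  # credits held by packets still waiting in the buffer

    def send(self, credits: int = 1) -> bool:
        """Sender side: consume credits, or refuse if the pool would overflow."""
        if self.consumed + credits > self.advertised:
            return False
        self.consumed += credits
        return True

    def drain(self, credits: int = 1) -> None:
        """Receiver side: packets were processed, so credits are returned."""
        self.consumed = max(0, self.consumed - credits)

    @property
    def available(self) -> int:
        """Credits still free, i.e. remaining capacity to store packets."""
        return self.advertised - self.consumed

    @property
    def occupancy(self) -> float:
        """Fraction of the advertised buffer currently in use."""
        return self.consumed / self.advertised
```

In this model, a terminal device that stops processing packets never calls `drain`, so `available` falls toward zero while `occupancy` rises; that trend is the kind of signal a monitoring unit could watch.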

Embodiment 3

[0086] Embodiments of the present invention may provide a computer terminal, and the computer terminal may be any computer terminal device in a group of computer terminals. Optionally, in this embodiment, the foregoing computer terminal may also be replaced with a terminal device such as a mobile terminal.

[0087] Optionally, in this embodiment, the foregoing computer terminal may be located in at least one network device among multiple network devices of the computer network.

[0088] In this embodiment, the above computer terminal can execute program code for the following steps of the device abnormality detection method: monitoring, through the flow control feature of PCIe (the high-speed serial computer expansion bus), the capacity of the PCIe terminal device's PCIe link to store data packets; when the capacity of the data packets reaches the preset threshold, the PCIe link is controlled to close and an error report message is triggered...
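The sequence of steps described in this embodiment can be sketched as a small simulation. Everything here is an illustrative assumption rather than the patented implementation: `LinkMonitor`, `DeviceState`, and the `probe_device` callback are hypothetical names, the threshold is assumed to be a count of buffered packets, and the error report message is modeled as a plain dictionary instead of a real PCIe error message.

```python
from enum import Enum
from typing import Callable, Dict, List, Optional


class DeviceState(Enum):
    NORMAL = "normal"
    ABNORMAL = "abnormal"


class LinkMonitor:
    """Hypothetical sketch: monitor, close the link, report, then probe."""

    def __init__(self, threshold: int,
                 probe_device: Callable[[Dict], DeviceState]) -> None:
        self.threshold = threshold        # preset threshold (assumed unit: packets)
        self.probe_device = probe_device  # stands in for the driver's state check
        self.link_open = True
        self.error_messages: List[Dict] = []

    def on_sample(self, buffered_packets: int) -> Optional[DeviceState]:
        """Handle one monitoring sample of the link's stored-packet capacity."""
        # Below the threshold, or link already closed: nothing to do.
        if not self.link_open or buffered_packets < self.threshold:
            return None
        # Threshold reached: close the link.
        self.link_open = False
        # Trigger an error report message (a dict stands in for it here).
        message = {"kind": "error-report", "buffered": buffered_packets}
        self.error_messages.append(message)
        # The message triggers the driver to check the device state.
        return self.probe_device(message)
```

A usage sketch: `LinkMonitor(threshold=4, probe_device=lambda msg: DeviceState.ABNORMAL)` returns `None` from `on_sample(2)`, and on `on_sample(4)` it closes the link, records one message, and returns whatever the probe reports. Closing the link before probing reflects the order given in the text; how the real driver inspects the device is not specified in this excerpt.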


Abstract

The invention discloses a device abnormality detection method and device. The method comprises the following steps: monitoring, through the flow control feature of PCIe (the high-speed serial computer expansion bus), the capacity of the PCIe link of a PCIe terminal device to store data packets; when the capacity of the data packets reaches a preset threshold, controlling the PCIe link to close and triggering an error report message, the error report message being one triggered by the error reporting mechanism of PCIe; and using the error report message to trigger the driver to detect the state of the PCIe terminal device, so as to determine whether the PCIe terminal device is abnormal. The method and device solve the technical problem in the related art that, because the AER driver responds slowly, the host machine downtime risk caused by hardware cannot be handled and repaired in time.

Description

Technical field

[0001] The invention relates to the field of equipment detection, and in particular to a method and device for detecting equipment abnormality.

Background

[0002] When heterogeneous computing products provide computing services, GPU/FPGA resources are offered to virtual machines by pass-through. However, a fault in such hardware itself, or a hardware error triggered by improper handling of the hardware inside the virtual machine, can make the PCIe interface unavailable. The stability, reliability, and security isolation of heterogeneous computing products have therefore always been the top priority. In some specific cases, however, a GPU computing service or FPGA service may fail to respond to accesses to the GPU/FPGA pass-through device hardware because of hardware instability or other unpredictable causes, which then leads to serious system errors. Due to the slow response of the AER driver, it is too late to handle and repair the downtime risk of the host ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F11/07
CPC: G06F11/0754, G06F11/0745, G06F11/0766
Inventor: 郑晓龙欣谢峰
Owner: ALIBABA GRP HLDG LTD