
GPU monitoring alarm system with monitoring function customized by cloud platform

An alarm system with user-defined monitoring, applied in hardware monitoring, instrumentation, electrical digital data processing, and similar fields, that addresses problems such as the inability to customize the monitoring configuration.

Pending Publication Date: 2020-01-10
SHANDONG LANGCHAO YUNTOU INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

The existing technical solution can observe GPU performance in real time, which makes it easy for administrators to track the usage of GPU resources. When a GPU is overloaded, they receive a notification in time, can reallocate resources, and can respond appropriately. This greatly reduces the management and maintenance cost of the entire workstation and improves the efficiency of administrator maintenance. However, it does not let users customize the monitoring configuration according to their needs or flexibly generate monitoring data that meets those needs.



Examples


Embodiment

[0048] As shown in Figure 1, the cloud platform GPU monitoring and alarm system with customizable monitoring of the present invention comprises the following:

[0049] The data collection module collects the performance indicators of the GPU at a 1-minute interval; these indicators include, but are not limited to, GPU utilization, GPU memory utilization, GPU memory occupancy, GPU power, and GPU temperature.
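A minimal Python sketch of the collection step in [0049] follows. It is an illustration under stated assumptions, not the patent's implementation: the names `GpuSample`, `Reader`, `collect_once`, `fake_reader`, and `next_collection_time` are hypothetical, the units are assumed because the source does not state them, and the hardware query is injected as a function so the sketch runs without a GPU.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

# The 1-minute collection cycle stated in [0049].
COLLECTION_PERIOD_SECONDS = 60


@dataclass(frozen=True)
class GpuSample:
    """One collected data point. Field units are assumptions, not from the source."""
    gpu_id: str
    timestamp: int               # Unix seconds
    gpu_utilization: float       # percent (assumed unit)
    memory_utilization: float    # percent (assumed unit)
    memory_used_mib: float       # MiB (assumed unit)
    power_watts: float           # watts (assumed unit)
    temperature_celsius: float   # degrees Celsius (assumed unit)


# Hypothetical reader type: maps a GPU ID to its raw indicator values.
Reader = Callable[[str], Dict[str, float]]


def collect_once(reader: Reader, gpu_ids: Iterable[str], now: int) -> List[GpuSample]:
    """Query every GPU once and stamp all samples with the same collection time."""
    return [GpuSample(gpu_id=g, timestamp=now, **reader(g)) for g in gpu_ids]


def next_collection_time(now: int) -> int:
    """Return the start of the next 1-minute cycle after `now`."""
    return now - now % COLLECTION_PERIOD_SECONDS + COLLECTION_PERIOD_SECONDS


def fake_reader(gpu_id: str) -> Dict[str, float]:
    """Stand-in for a real hardware query; returns fixed illustrative values."""
    return {
        "gpu_utilization": 42.0,
        "memory_utilization": 30.0,
        "memory_used_mib": 2048.0,
        "power_watts": 150.0,
        "temperature_celsius": 65.0,
    }
```

In a deployment, `fake_reader` would be replaced by a function that queries the GPU driver, and a scheduler would call `collect_once` at each time returned by `next_collection_time`.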

[0050] The monitoring configuration management module configures the GPU monitoring dimension, GPU monitoring indicator, GPU monitoring cycle, and GPU monitoring statistical method. The GPU monitoring dimension includes the ID of the cloud server on which the GPU is mounted, the ID of the GPU, and the user name or user ID;

[0051] GPU monitoring indicators include GPU utilization, GPU memory utilization, GPU memory occupancy, GPU power, and GPU temperature;

[0052] The minimum granularity of the GPU monitoring cycle...
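The configuration described in [0050] and [0051] can be sketched in Python as a small record plus an aggregation function. This is a hedged illustration: `MonitoringConfig`, `STATISTICS`, and `generate_monitoring_data` are hypothetical names, the statistical methods shown (average, maximum, minimum) are assumed because the available text does not list them, and the 300-second cycle in the usage note is an arbitrary example value rather than the minimum granularity the source mentions.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

# Assumed statistical methods; the source text does not enumerate them.
STATISTICS: Dict[str, Callable[[List[float]], float]] = {
    "avg": mean,
    "max": max,
    "min": min,
}


@dataclass(frozen=True)
class MonitoringConfig:
    """One user-defined monitoring configuration (hypothetical structure)."""
    dimension: str       # "server_id", "gpu_id", or "user_id"
    indicator: str       # e.g. "gpu_utilization" or "gpu_temperature"
    cycle_seconds: int   # length of one monitoring cycle
    statistic: str       # key into STATISTICS


def generate_monitoring_data(
    config: MonitoringConfig, samples: Sequence[dict]
) -> Dict[Tuple[str, int], float]:
    """Group raw samples by (dimension value, cycle start) and apply the statistic."""
    buckets: Dict[Tuple[str, int], List[float]] = {}
    for sample in samples:
        # Align each timestamp to the start of its monitoring cycle.
        cycle_start = sample["timestamp"] - sample["timestamp"] % config.cycle_seconds
        key = (sample[config.dimension], cycle_start)
        buckets.setdefault(key, []).append(sample[config.indicator])
    aggregate = STATISTICS[config.statistic]
    return {key: aggregate(values) for key, values in buckets.items()}
```

For example, with `MonitoringConfig("gpu_id", "gpu_utilization", 300, "avg")`, samples for one GPU at timestamps 0 and 60 fall into the cycle starting at 0 and are averaged together, while a sample at timestamp 300 starts a new cycle.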


Abstract

The invention discloses a cloud platform GPU monitoring and alarm system with customizable monitoring, and belongs to the technical field of monitoring and alarming for cloud platforms. The technical problem to be solved is how to let users define the monitoring configuration according to their requirements and flexibly generate monitoring data that satisfies those requirements. According to the technical scheme, the system comprises a data acquisition module, a monitoring configuration management module, an alarm rule management module, and a data processing module. The data acquisition module periodically acquires the performance indicators of a GPU. The monitoring configuration management module configures a GPU monitoring dimension, a GPU monitoring indicator, a GPU monitoring period, and a GPU monitoring statistical method. The alarm rule management module configures the alarm rules. The data processing module stores the acquired data and generates the monitoring data according to the monitoring configuration and the acquired data; it also traverses the alarm rules at regular intervals, generates or clears alarm data according to the acquired data, and forwards the alarm data according to a configured notification mode.
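The alarm handling in the abstract (traverse the rules, generate or clear alarm data, forward through a configured notification mode) can be sketched in Python as below. This is an illustration under assumptions, not the patented implementation: `AlarmRule`, `COMPARATORS`, `Notifier`, and `evaluate_rules` are hypothetical names, the rule fields are an assumed minimal shape, and the choice to forward only newly raised alarms (not clearances or repeats) is a design assumption the source does not confirm.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Set, Tuple

COMPARATORS: Dict[str, Callable[[float, float], bool]] = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


@dataclass(frozen=True)
class AlarmRule:
    """One configured alarm rule (hypothetical minimal shape)."""
    rule_id: str
    indicator: str     # e.g. "gpu_temperature"
    comparator: str    # key into COMPARATORS
    threshold: float
    notify_via: str    # label of the configured notification mode


# Hypothetical notifier type: receives the alarm message to forward.
Notifier = Callable[[str], None]


def evaluate_rules(
    rules: Iterable[AlarmRule],
    latest: Dict[str, Dict[str, float]],
    active: Set[Tuple[str, str]],
    notifiers: Dict[str, Notifier],
) -> None:
    """Traverse all rules once; raise new alarms and clear recovered ones.

    `latest` maps GPU ID to its most recent indicator values, and `active`
    holds (rule_id, gpu_id) pairs that are currently in the alarm state.
    """
    for rule in rules:
        is_breached = COMPARATORS[rule.comparator]
        for gpu_id, indicators in latest.items():
            key = (rule.rule_id, gpu_id)
            breached = is_breached(indicators[rule.indicator], rule.threshold)
            if breached and key not in active:
                # Generate alarm data and forward it (assumed: only on first breach).
                active.add(key)
                notifiers[rule.notify_via](
                    f"ALARM {rule.rule_id}: {rule.indicator} on {gpu_id} "
                    f"is {indicators[rule.indicator]} "
                    f"({rule.comparator} {rule.threshold})"
                )
            elif not breached and key in active:
                # Clear alarm data once the indicator is back within bounds.
                active.discard(key)
```

Calling `evaluate_rules` on a timer gives the regular traversal; tracking `active` keeps a sustained breach from being forwarded on every pass.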

Description

Technical field

[0001] The invention relates to the technical field of monitoring and alarming for cloud platforms, and in particular to a cloud platform GPU monitoring and alarm system with customizable monitoring.

Background

[0002] For three decades, changes in CPU performance never deviated from Moore's Law, but improvements in CPU performance have now slowed. GPU computing defines a new, supercharged law. It starts with highly specialized parallel processors and continues to evolve through system design, system software, algorithms, and optimized applications. It is especially suited to the growing demand for computing power in application scenarios such as artificial intelligence, HPC, and graphics and image processing.

[0003] GPU cloud physical hosts in bare-metal form can provide the computing power of "one machine with multiple cards" or "multiple machines with multiple cards". However, for some users, multiple GPU cards exceed the user's com...


Application Information

IPC(8): G06F11/30
CPC: G06F11/3006, G06F11/3058
Inventors: 屈傲, 高传集, 于昊, 张晓玉
Owner: SHANDONG LANGCHAO YUNTOU INFORMATION TECH CO LTD