Online identification method and system for high-frequency continuous failure tasks in cloud computing system

An online identification method applied in the field of cloud computing. It addresses the problems that high-frequency continuous failure tasks cannot be quickly recovered, waste system resources, and increase the load on the cluster scheduler, thereby avoiding resource waste and scheduling load and improving reliability and availability.

Active Publication Date: 2017-12-01
PEKING UNIV

AI Technical Summary

Problems solved by technology

However, the existing technology has no effective method for identifying high-frequency continuous failure tasks.
Although such tasks are rescheduled by the system immediately after each failure, restarting does not recover them quickly; they fail again after each rescheduling.
Repeated failures not only waste a large amount of system resources but also increase the load on the cluster scheduler, which poses potential harm to the cloud computing system and makes it difficult to meet its high availability requirements.

Examples


Embodiment Construction

[0069] The present invention is further described below through embodiments, in conjunction with the accompanying drawings, but the scope of the present invention is not limited in any way.

[0070] Figure 1 is a flowchart of the method for online identification of high-frequency continuous failure tasks in a cloud computing system provided by the embodiment of the present invention, and Figure 2 is the structure and system data processing flowchart of the online identification system for high-frequency continuous failure tasks in the cloud computing system provided by the embodiment of the present invention. The following describes the implementation of the method provided by the present invention through specific examples:

[0071] 1) First, the ETL module reads the offline monitoring data from the offline data source, and converts the data into a specific data structure;

[0072] Data can be read through the API provided by the system. For file systems, co...
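As a rough illustration of the ETL step in [0071], the sketch below reads monitoring records and converts them into a per-task list of failure timestamps. It is only a minimal sketch under stated assumptions: the CSV layout, the column names `task_id`, `timestamp`, and `event`, the `FAIL` event label, and the function name `load_failure_events` are all hypothetical and do not come from the patent text.

```python
import csv
import io
from collections import defaultdict


def load_failure_events(source):
    """Group failure timestamps by task.

    `source` is any iterable of CSV lines with the (assumed) columns
    task_id, timestamp, event. Only rows whose event is "FAIL" are kept.
    """
    events = defaultdict(list)
    for row in csv.DictReader(source):
        if row["event"] == "FAIL":
            events[row["task_id"]].append(float(row["timestamp"]))
    # Sort each task's failures so they form a time series.
    for times in events.values():
        times.sort()
    return dict(events)


# Hypothetical offline monitoring data, inlined so the sketch is self-contained.
SAMPLE = """task_id,timestamp,event
t1,10,FAIL
t1,5,FAIL
t2,7,FINISH
t2,3,FAIL
"""

failure_events = load_failure_events(io.StringIO(SAMPLE))
```

A mapping from task to sorted failure times is one simple choice of "specific data structure"; a real ETL module would read from whatever API or file system the offline data source exposes.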

Abstract

The invention discloses an online identification method and system for high-frequency continuous failure tasks in a cloud computing system. Time-series-based offline analysis and learning are performed on offline monitoring data to obtain a failure frequency threshold that, at a given confidence level, characterizes the failure frequency of all non-high-frequency continuous failure tasks; this threshold is then used to identify high-frequency continuous failure tasks in online data. The invention analyzes tasks in the cloud computing system from the perspectives of events and resources, obtaining the frequency of failure events within a time period and the system resources consumed by each task. By analyzing the failure frequency characteristics of tasks and the time series pattern of their resource usage, it identifies in real time the high-frequency continuous failure tasks that fail repeatedly and are difficult to repair, and notifies the cloud computing system in advance so that it can take proactive failure recovery measures. This saves system resources and improves the reliability and availability of the cloud computing system.
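The threshold idea in the abstract can be sketched as follows. This is a simplified illustration, not the patented procedure: it assumes the threshold is an empirical quantile of failure counts observed for non-high-frequency tasks, it counts failures in a fixed sliding window, and it ignores the resource-usage time series entirely. The function names, the window length, and the sample numbers are hypothetical.

```python
import math


def learn_threshold(normal_frequencies, confidence=0.95):
    """Return the smallest observed frequency that covers at least
    `confidence` of the offline, non-high-frequency observations."""
    if not normal_frequencies:
        raise ValueError("need at least one offline observation")
    ordered = sorted(normal_frequencies)
    rank = math.ceil(confidence * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]


def failure_frequency(failure_times, now, window):
    """Count failures in the half-open window (now - window, now]."""
    return sum(1 for t in failure_times if now - window < t <= now)


def is_high_frequency(failure_times, now, window, threshold):
    """Flag a task whose recent failure count exceeds the learned threshold."""
    return failure_frequency(failure_times, now, window) > threshold


# Hypothetical offline failure counts per window for normal tasks.
threshold = learn_threshold([1, 0, 2, 8], confidence=0.75)

# Hypothetical online tasks: one failing every time unit, one failing rarely.
suspect = is_high_frequency([100, 101, 102, 103], now=103, window=10, threshold=threshold)
healthy = is_high_frequency([50, 103], now=103, window=10, threshold=threshold)
```

With these sample values the learned threshold is 2, so the first task (four failures in the window) is flagged and the second (one failure in the window) is not.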

Description

Technical field

[0001] The invention belongs to the technical field of cloud computing, and in particular relates to an online identification method and system for high-frequency continuous failure tasks in a cloud computing system.

Background art

[0002] Cloud computing has been widely used in fields such as finance and business because of its on-demand consumption model. High availability of the system in a cloud computing environment has increasingly become the key to the maturity of cloud computing technology. However, as cloud computing systems grow in scale and become more heterogeneous, failures of various types occur frequently, and this has become one of the key factors threatening the availability and reliability of cloud computing systems. In a cloud computing system, a task, as the smallest scheduling unit running on a single node, is the basic guarantee for the normal execution of user applications, an...

Application Information

Patent Type & Authority Patents(China)
IPC IPC(8): G06F11/00
Inventor 李影, 唐红艳, 贾统, 吴中海, 张齐勋
Owner PEKING UNIV