Dynamic grouping system for layered rollback recovery protocols based on MPI (Message Passing Interface) high-performance computing

A high-performance computing and rollback recovery technology, applied in the fields of high-performance computing and system fault tolerance. It addresses problems such as low efficiency and a grouping mechanism that cannot adapt to changes in the application's communication pattern, with the stated effects of size reduction, strong versatility and portability, and reduced overhead.

Active Publication Date: 2016-08-10
HUAZHONG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0005] In view of the above defects of the prior art and the need for improvement, the purpose of the present invention is to provide a dynamic grouping system for layered rollback recovery protocols in MPI-based high-performance computing. It aims to solve the inefficiency that arises because the grouping mechanism of existing layered rollback recovery protocols cannot adapt to changes in application communication patterns.


Examples

Embodiment Construction

[0026] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0027] As shown in Figure 1, the present invention provides a dynamic grouping system based on the MPI high-performance computing layered rollback recovery protocol, comprising a message monitoring module, a message analysis module, and a process migration module. The message monitoring module monitors the message delivery records among the processes of the application program, saves the records in a defined format, and submits them to the message analysis module; the message analysis module is used to collect the information collected by the message mon...
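The monitoring step described above can be illustrated with a minimal sketch. It is not the patent's implementation: the class name `MessageMonitor`, the `on_send` hook, and the `submit` method are hypothetical, and the record layout follows the (source process, target process, message size) triple given in the abstract. In a real MPI program the hook would have to be driven by some interception of the send calls, which is assumed here rather than shown.

```python
from collections import namedtuple

# Record layout taken from the abstract: (source process, target process, message size).
MessageRecord = namedtuple("MessageRecord", ["source", "target", "size"])


class MessageMonitor:
    """Hypothetical monitor: collects message records and hands them over in batches."""

    def __init__(self):
        self._records = []

    def on_send(self, source, target, size):
        # Called once per observed message. How this hook is wired into the
        # MPI layer is an assumption and is not modeled in this sketch.
        self._records.append(MessageRecord(source, target, size))

    def submit(self):
        # Hand the collected records to the analysis stage and start a new window.
        batch, self._records = self._records, []
        return batch


monitor = MessageMonitor()
monitor.on_send(0, 1, 4096)
monitor.on_send(1, 0, 128)
batch = monitor.submit()
```

Clearing the buffer on `submit` keeps each batch limited to one observation window, which is one simple way to let a later analysis step compare consecutive windows.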

Abstract

The invention discloses a dynamic grouping system for layered rollback recovery protocols based on MPI (Message Passing Interface) high-performance computing. The system comprises a message monitoring module, a message analysis module, and a process migration module, and belongs to the fields of high-performance computing and system fault tolerance. The message monitoring module monitors the message passing records between processes in an MPI high-performance computing application, stores each record as a triple (source process, target process, message size), and submits the records to the message analysis module. The message analysis module analyzes the records collected by the message monitoring module to obtain the message passing pattern of the current application, which serves as the basis for deciding on process migration the next time; at the same time, it uses the previous message passing pattern to judge whether to run the process migration module this time. The process migration module migrates the processes that have changed when the application's message passing pattern changes, thereby optimizing the performance of the layered rollback recovery protocols.
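The analysis and migration decision described in the abstract can be sketched as follows. The abstract does not say how a "message passing pattern" is represented or compared, so this sketch makes an explicit assumption: the pattern maps each process to the peer it exchanged the most bytes with, and a process counts as "changed" when that peer differs from the previous window. The function names `communication_pattern` and `changed_processes` are hypothetical.

```python
from collections import defaultdict


def communication_pattern(records):
    """Map each process to the peer it exchanged the most bytes with (assumed pattern)."""
    volume = defaultdict(lambda: defaultdict(int))
    for source, target, size in records:
        # Count traffic in both directions so the pattern is symmetric per pair.
        volume[source][target] += size
        volume[target][source] += size
    # Ties are broken by the lower peer rank so the result is deterministic.
    return {
        proc: min(peers, key=lambda p: (-peers[p], p))
        for proc, peers in volume.items()
    }


def changed_processes(previous, current):
    """Processes whose heaviest peer differs from the previous window."""
    return sorted(
        p for p in current if p in previous and previous[p] != current[p]
    )


# Two observation windows of (source, target, size) triples.
window1 = [(0, 1, 4096), (2, 3, 4096), (1, 0, 512)]
window2 = [(0, 1, 4096), (2, 3, 100), (2, 0, 9000)]

prev = communication_pattern(window1)
curr = communication_pattern(window2)
to_migrate = changed_processes(prev, curr)  # candidates for regrouping
```

Under this assumption, an empty `to_migrate` list means the pattern is unchanged and migration is skipped for the current window; the current pattern is then kept as the reference for the next comparison.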

Description

Technical field

[0001] The invention belongs to the field of high-performance computing and system fault tolerance, and more specifically relates to a dynamic grouping system based on the MPI high-performance computing layered rollback recovery protocol.

Background

[0002] With the development of high-performance computing, high-performance computers have grown to a scale of one million nodes, and may grow further in the future. At the same time, the Mean Time Between Failures (MTBF) of high-performance computer systems has dropped significantly, in some cases to the order of several hours. However, the data size, computational complexity, and running time of distributed commercial applications and large-scale scientific computing applications remain high; running times can last several months, which is much greater than the MTBF. This will cause the system to spend too much time dealing with system erro...

Application Information

IPC(8): G06F11/14, G06F9/48, G06F9/54
CPC: G06F9/4843, G06F9/546, G06F11/1458, G06F11/1479
Inventors: 廖小飞, 金海, 张斌圣
Owner: HUAZHONG UNIV OF SCI & TECH