Graph-based pan-genome data organization method and system

A data organization and genome technology, applied in the medical field, that addresses chaotic data organization, poor readability, and poor information integrity, and achieves a clear data structure.

Pending Publication Date: 2022-08-09
INST OF LAB ANIMAL SCI CHINESE ACAD OF MEDICAL SCI

AI Technical Summary

Problems solved by technology

[0004] Genome maps are widely used as an effective way to organize large amounts of genomic data, but follow-up research requires a data structure that is valid and simple while preserving the integrity of the sequence information. There are many related studies, but in most of them the data organization is chaotic and the readability and information integrity are relatively poor.
[0005] To solve the problems of chaotic data organization and poor sequence readability, validity, and integrity when handling large amounts of genomic data, a graph-based pan-genome data organization method and system are provided.

Examples

Embodiment Construction

[0047] In order for those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.

[0048] Some of the processes described in the description and claims of the present invention and in the above-mentioned drawings include multiple operations that appear in a specific order. It should be clearly understood, however, that these operations may be executed out of the order in which they appear herein or executed in parallel. The sequence numbers of the operations, such as 101 and 102, are only used to distinguish different operations and do not themselves represent any execution order. Additionally, these flows may include more or fewer operations, and these operations may be performed sequentially or in parallel. It should b...

Abstract

The invention discloses a graph-based pan-genome data organization method, system, and equipment, and a computer-readable storage medium. The method comprises the following steps: acquiring a set of pan-genome sequence data; constructing a graph from the pan-genome sequence data to obtain a pan-genome colored graph; marking and acquiring the access-state characteristics of individual nodes of the colored graph, and traversing the colored graph to obtain the cSupB data models into which the colored graph is decomposed, together with the data information of those models; and determining the inclusion relations between the cSupB data models based on that data information, and constructing a cSupB structure tree model according to the inclusion relations. The method addresses the disordered data organization and the poor sequence readability, validity, and integrity that currently arise when handling large amounts of genomic data.
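Two of the steps named in the abstract, building a colored graph and arranging decomposed pieces into a tree by inclusion, can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not confirm: the colored graph is taken to be a colored de Bruijn graph over k-mers, and each decomposed model is represented simply as the set of graph nodes it covers. The function names `build_colored_graph` and `build_inclusion_tree`, the parameter `k`, and the region ids are hypothetical, and the node-state traversal that actually produces the cSupB models is not reproduced here.

```python
from collections import defaultdict


def build_colored_graph(genomes, k=4):
    """Build a colored de Bruijn graph from named sequences.

    Assumption: the graph type is not stated in the abstract.
    Nodes are (k-1)-mers; each edge (u, v) carries the set of genome
    names ("colors") whose sequence contains the corresponding k-mer.
    """
    edge_colors = defaultdict(set)
    for name, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            edge_colors[(kmer[:-1], kmer[1:])].add(name)
    return dict(edge_colors)


def build_inclusion_tree(regions):
    """Arrange node sets into a tree by strict set inclusion.

    Assumption: `regions` maps a hypothetical region id to the set of
    graph nodes that region covers. The parent of a region is the
    smallest region that strictly contains it; a region contained in
    no other region gets parent None.
    """
    parents = {}
    for rid, nodes in regions.items():
        supersets = [
            (len(other_nodes), other_id)
            for other_id, other_nodes in regions.items()
            if other_id != rid and nodes < other_nodes
        ]
        parents[rid] = min(supersets)[1] if supersets else None
    return parents
```

Calling `build_colored_graph({"g1": "ACGTAC", "g2": "ACGAAC"}, k=3)` gives a dictionary in which edges shared by both sequences carry both colors, while `build_inclusion_tree` returns a child-to-parent mapping that can be read as a structure tree.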

Description

Technical field

[0001] The invention belongs to the field of medical technology, and in particular relates to a graph-based pan-genome data organization method and system.

Background

[0002] The development of life sciences, medicine, and other fields is closely tied to the application of sequencing technology. However, because of limitations in sequencing technology, sequencing cost, and even computing cost, many genome studies have problems, such as over-reliance on reference genomes. At present, the reference genome occupies a very important position in many fields. In almost all studies involving genomes, the first step is to construct a reference genome for the species under study and then carry out follow-up studies based on it, for example comparing newly sequenced data from other individuals of the species against the reference genome to reveal differences, an approach that underlies the search for the genetic origins...

Application Information

IPC(8): G16B45/00, G16B50/30, G16B20/00, G16B30/00
CPC: G16B45/00, G16B50/30, G16B20/00, G16B30/00, Y02A90/10
Inventor: 郭金旦, 陈禹保, 刘江宁, 秦川
Owner: INST OF LAB ANIMAL SCI CHINESE ACAD OF MEDICAL SCI