A method for decompressing DNA self-indexed intervals

A self-indexed decompression technique, applied in the fields of code conversion and electrical components, that addresses the long decompression time and large storage space required by existing methods. It reduces both decompression time and storage space and offers broader applicability.

Active Publication Date: 2022-04-12
HARBIN INST OF TECH

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to solve the problem that existing decompression algorithms require a long decompression time and that the decompressed data requires a large storage space. To this end, it proposes a DNA self-index interval decompression method.



Examples


Specific Embodiment 1

[0015] Specific Embodiment 1: This embodiment is described with reference to Figure 1. The DNA self-index interval decompression method of this embodiment is implemented through the following steps:

[0016] Step 1. Input the sequence data file to be decompressed, and configure the index interval parameter ([start, end]) and the decompression output mode parameter (mode);

[0017] Step 2. According to the index interval parameter, determine the interval range that needs to be decompressed in the sequence data file to be decompressed;

[0018] Step 3. According to the header information of the sequence data file to be decompressed, determine, within the range that needs to be decompressed, the base bit information of the sequenced short reads (short-read "columns" for short) that can be aligned to bases on the reference genome, the alignment quality score bit information (alignment quality score for short), sequencing quality score b...
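
Steps 1 and 2 above can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `DecompressConfig` and `resolve_interval`, the file name `sample.lyz`, the 0-based inclusive coordinates, and the idea that the total length comes from the file header are all assumptions made for illustration. Only the parameters `[start, end]` and `mode` come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecompressConfig:
    """Hypothetical container for the Step 1 parameters."""
    path: str      # sequence data file to be decompressed
    start: int     # first position of the index interval (inclusive)
    end: int       # last position of the index interval (inclusive)
    mode: int = 0  # decompression output mode parameter


def resolve_interval(cfg: DecompressConfig, total_length: int) -> range:
    """Step 2 sketch: turn [start, end] into the positions to decompress.

    Assumes 0-based inclusive coordinates and that total_length would be
    read from the file header; both are illustrative assumptions.
    """
    if cfg.start > cfg.end:
        raise ValueError("start must not exceed end")
    if cfg.start < 0 or cfg.start >= total_length:
        raise ValueError("start lies outside the file")
    # Clamp the upper bound so an oversized end does not run past the data.
    last = min(cfg.end, total_length - 1)
    return range(cfg.start, last + 1)


# Hypothetical file name and lengths, chosen only to exercise the clamp.
cfg = DecompressConfig(path="sample.lyz", start=100, end=250, mode=2)
selected = resolve_interval(cfg, total_length=200)
print(selected.start, selected.stop, len(selected))
```

Validating and clamping the interval before any decoding keeps later steps working only on the requested range, which is the point of interval decompression as opposed to global decompression.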

Specific Embodiment 2

[0030] Embodiment 2: This embodiment is a further detailed description of Embodiment 1. The decompression output mode parameter determines the type of data to be decompressed and output.

Specific Embodiment 3

[0031] Specific Embodiment 3: This embodiment is a further detailed description of Embodiment 2. When the decompression output mode parameter is set to 1, the decompressed output data type is a gene sequence. When it is set to 2, the decompressed output data type is a short-read sequence. When it is set to 3, the decompressed output data type is a whole genome sequence.

[0032] By default, the decompression output mode parameter is 0, in which case data is decompressed as in the short-read sequence case.
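
The mode values described in this embodiment can be summarized as a small lookup. The sketch below is illustrative only: the names `OutputMode` and `output_type` are hypothetical, and treating mode 0 as equivalent to mode 2 is one reading of paragraph [0032], not something the source spells out in detail.

```python
from enum import IntEnum


class OutputMode(IntEnum):
    """Hypothetical names for the decompression output mode values."""
    DEFAULT = 0        # per [0032]: handled as the short-read sequence case
    GENE_SEQUENCE = 1
    SHORT_READ = 2
    WHOLE_GENOME = 3


_DESCRIPTIONS = {
    OutputMode.GENE_SEQUENCE: "gene sequence",
    OutputMode.SHORT_READ: "short-read sequence",
    OutputMode.WHOLE_GENOME: "whole genome sequence",
}


def output_type(mode: int = 0) -> str:
    """Return the output data type selected by the mode parameter."""
    selected = OutputMode(mode)  # raises ValueError for unknown values
    if selected is OutputMode.DEFAULT:
        # Assumption: the default falls back to short-read output.
        selected = OutputMode.SHORT_READ
    return _DESCRIPTIONS[selected]


for value in (0, 1, 2, 3):
    print(value, output_type(value))
```

Using an enumeration makes an unsupported mode value fail early with a `ValueError` instead of silently producing the wrong kind of output.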


Abstract

A DNA self-index interval decompression method belongs to the technical field of decompression of DNA compressed data. The invention solves the problems that the existing decompression algorithm requires a long decompression time and the decompressed data requires a large storage space. The self-index interval decompression algorithm of the present invention can select the decompression range according to the requirement, and compared with the global static TPBWT decompression algorithm, it greatly reduces the decompression time and also reduces the storage space of the decompressed data. Compared with the traditional decompression algorithm, this algorithm is more flexible and can decompress data with different meanings according to different needs, and has stronger applicability. The present invention can be applied to the decompression of DNA compressed data.

Description

Technical field

[0001] The invention relates to the technical field of decompression of DNA compressed data, in particular to a method for decompressing DNA self-index intervals.

Background art

[0002] With the development of DNA sequencing technology, biomedical research faces the problem of how to store and transmit DNA data. Compressing DNA data and then decompressing it has become one of the important ways to solve this problem.

[0003] After the LYZip tool compresses short-read sequencing data based on the TPBWT algorithm, the existing decompression algorithm can only perform global, static decompression. Although it can decompress the DNA data, decompression takes a long time and the decompressed data requires a large storage space. A method that reduces decompression time and storage space is therefore very much needed.

Contents of th...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): H03M7/30
CPC: H03M7/30
Inventor: 李杨, 刘博, 王亚东
Owner: HARBIN INST OF TECH