Insertion mutation detection method and system based on new-generation sequencing data

A technology for sequencing-data analysis and variant detection, applied in genomics, instrumentation, proteomics, etc. It addresses wrong detection results, large alignment deviations in repeated sequences, and assembly errors, achieving good detection performance and resolving inaccurate judgment of insertion sites.

Active Publication Date: 2019-10-01
XIDIAN UNIV

AI Technical Summary

Problems solved by technology

[0014] (1) Most existing techniques detect only a single type of insertion mutation, which does not match the diversity of insertion mutation types in cancer samples and greatly limits cancer diagnosis and targeted drug selection.
[0015] (2) Most existing techniques focus on small insertions and deletions and have insufficient ability to detect large insertion mutations, so they cannot comprehensively extract the DNA mutation information of cancer samples.
[0016] (3) Some prior-art methods use local de novo assembly algorithms to detect large-fragment insertion mutations, but they are susceptible to assembly errors caused by repetitive sequence regions, which lead to wrong mutation detection results.
[0018] (1) Because of the physical constraints of obtaining DNA fragments with next-generation sequencing, the reads are usually 100-250 bp long. Large-fragment insertion mutations (50-1000 bp) therefore cannot be detected through simple read alignment, which makes a detection model particularly difficult to establish. Detecting large-fragment insertion mutations is a major challenge.
[0019] (2) Repeated sequences are relatively common in genes. Because next-generation sequencing produces short reads, alignments in repeated sequences may deviate substantially, which can cause errors in subsequent detection results. Removing the impact of repeated regions on insertion mutation detection is another major challenge.
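The read-length limit in [0018] can be illustrated with a short sketch. It is not part of the patent: the function name and the `min_anchor` parameter (the number of reference-matching bases assumed necessary on each side of an insertion for a read to align) are hypothetical, and the value 20 is only an illustrative choice.

```python
def read_can_span_insertion(read_len, insertion_len, min_anchor=20):
    """Return True if one read could contain the whole insertion plus
    min_anchor reference-matching bases on each side (assumed requirement)."""
    return read_len >= insertion_len + 2 * min_anchor


# Compare typical read lengths (100 and 250 bp) with several insertion sizes.
for insertion_len in (30, 50, 200, 1000):
    results = [read_can_span_insertion(r, insertion_len) for r in (100, 250)]
    print(insertion_len, results)
```

Under these assumptions, insertions toward the upper end of the 50-1000 bp range cannot be contained in a single read, which is why additional signals are needed.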

Examples

Embodiment Construction

[0052] In order to make the object, technical solution, and advantages of the present invention clearer, the present invention is described in further detail below in conjunction with the examples. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit it.

[0053] Existing techniques do not handle the variety of insertion mutation types found in cancer samples, which greatly limits cancer diagnosis and targeted drug selection; they have insufficient ability to detect large insertion mutations; and they can produce wrong mutation detection results. To address these problems, the present invention uses the split-read and insert-size information of paired-end reads to accurately locate the site and determine the type of insertion variation. The present invention uses an iterative insertion-sequence splicing method to detect and extract insertio...
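The two signals named in [0053], split reads and insert size, can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's procedure: it assumes alignments are described by SAM-style CIGAR strings and template lengths, and the function names and the thresholds `min_clip` and `k` are hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation letter (SAM format).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Return a list of (length, op) tuples from a SAM CIGAR string."""
    return [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]


def is_split_read(cigar, min_clip=10):
    """Treat a read as split if either end is soft-clipped by at least
    min_clip bases (assumed threshold)."""
    ops = parse_cigar(cigar)
    if not ops:
        return False
    first_len, first_op = ops[0]
    last_len, last_op = ops[-1]
    return (first_op == "S" and first_len >= min_clip) or (
        last_op == "S" and last_len >= min_clip
    )


def is_short_insert(template_len, mean, sd, k=3.0):
    """Flag a read pair whose mapped span is much shorter than the library
    mean. Sequence inserted in the sample is absent from the reference, so
    the two ends of a fragment covering it map closer together."""
    return abs(template_len) < mean - k * sd


print(is_split_read("60M40S"))        # soft clip of 40 bases at the 3' end
print(is_split_read("100M"))          # fully matched read
print(is_short_insert(150, 400, 30))  # pair mapped far closer than expected
```

Positions where many split reads share the same clip boundary, supported by nearby short-insert pairs, would be candidate insertion sites in a scheme of this kind.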

Abstract

The invention belongs to the technical field of genome sequencing and discloses an insertion mutation detection method based on new-generation sequencing data. To determine the mutation site, the method relies on two observations: a region where an insertion mutation occurs necessarily produces split reads, and insertion variation types such as novel sequence insertion, tandem duplication, and dispersed duplication differ from deletion and inversion mutations in the distribution of their split reads. After the type and site of the insertion mutation are determined, a virtual reference sequence is constructed from partially matched, fully matched, and unmatched read information, and the virtual reference sequence is compared with the original reference sequence to obtain information about the inserted sequence. The mutant genotype is then obtained from copy number state information. The invention can solve the problem of inaccurate judgment of insertion mutation sites, the problem of omissions when the SR method is used to detect insertion mutations, and the problem in the prior art that repeated sequences may cause detection errors.
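The comparison between a virtual reference and the original reference can be sketched as below. This is a simplified illustration, not the patented algorithm: it assumes the virtual reference differs from the original by exactly one insertion, and the function names and the three-way labelling rule are hypothetical.

```python
def extract_insertion(original, virtual):
    """Compare a virtual reference with the original reference and return
    (position, inserted_sequence), or None if the two do not differ by
    exactly one insertion."""
    ins_len = len(virtual) - len(original)
    if ins_len <= 0:
        return None
    # Walk along the longest common prefix of the two sequences.
    pos = 0
    while pos < len(original) and original[pos] == virtual[pos]:
        pos += 1
    # Everything after the inserted block must match the rest of the original.
    if virtual[pos + ins_len:] != original[pos:]:
        return None
    return pos, virtual[pos:pos + ins_len]


def classify_insertion(original, pos, inserted):
    """Label the insertion by where, if anywhere, its sequence already
    occurs in the original reference."""
    n = len(inserted)
    if original[max(0, pos - n):pos] == inserted or original[pos:pos + n] == inserted:
        return "tandem duplication"
    if inserted in original:
        return "dispersed duplication"
    return "novel sequence insertion"


reference = "ACGTTGCTTA"
virtual_reference = reference[:7] + "ACG" + reference[7:]  # copy of bases 0-2
pos, seq = extract_insertion(reference, virtual_reference)
print(pos, seq, classify_insertion(reference, pos, seq))
```

Because the prefix scan runs as far right as it can, a breakpoint inside repetitive sequence is reported at its rightmost equivalent position and the extracted sequence may be a rotation of the true insert, so a real implementation would need to normalize breakpoints before classifying.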

Description

Technical field

[0001] The invention belongs to the technical field of genome sequencing, and in particular relates to an insertion variation detection method based on next-generation sequencing data.

Background technique

[0002] Currently, the closest existing technology is the split read analysis method based on next-generation sequencing technology. Next-generation sequencing is a DNA sequencing technology. During sequencing, the complete sample DNA sequence is broken up, and fragments of a specific length (usually hundreds of bp) are screened out. Sequences of tens to hundreds of bp are then read from these fragments. The length of a read is usually much smaller than the length of the DNA sequence of the tested sample, but next-generation sequencing can read a large number of such short sequences at the same time, so that the total length of all short sequences reaches several times to tens of times the length of the sample DNA, making it possible ...
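The ratio described in [0002], the total length of all reads relative to the length of the sample DNA, is commonly called sequencing depth. A minimal sketch of the arithmetic follows; the function name and the example numbers are illustrative assumptions, not values from the patent.

```python
def sequencing_depth(num_reads, read_length, genome_length):
    """Average depth: total sequenced bases divided by the genome length."""
    return num_reads * read_length / genome_length


# Hypothetical run: 600 million reads of 150 bp over a 3 Gbp genome.
depth = sequencing_depth(600_000_000, 150, 3_000_000_000)
print(depth)
```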

Application Information

IPC (8): G16B20/20
CPC: G16B20/20, Y02A90/10
Inventor: 袁细国, 谢文路, 李杰, 习佳宁, 杨利英, 张军英, 许向彦
Owner XIDIAN UNIV