Method for determining chromosome structure variation signal intensity and insert fragment length distribution characteristics of sample, and application thereof

A technique concerning insert fragments and structural variation, applied in the field of bioinformatics, which addresses problems such as chromosomal rearrangement and miscarriage.

Active Publication Date: 2020-07-03
深圳思勤医疗科技有限公司

AI Technical Summary

Problems solved by technology

[0005] 3. Inversion: a segment of a chromosome is reversed by 180 degrees, producing a rearrangement within the chromosome. For example, inversion of the long arm of chromosome 9 is associated with habitual abortion in women.


Examples

Embodiment 1

[0077] Example 1: Sample plasma separation, library preparation, and on-machine sequencing.

[0078] 1. Plasma Separation

[0079] a) Prepare the instruments, reagents, and consumables required for the experiment, and pre-cool the high-speed refrigerated centrifuge to 4°C in advance.

[0080] b) If the peripheral blood sample is collected in an EDTA anticoagulant tube, place it in a 4°C refrigerator immediately after the blood draw and perform plasma separation within 2 hours. If the peripheral blood sample is collected in a cell-free nucleic acid storage tube such as a Streck tube, it can be kept at room temperature, and plasma separation can be performed within the time specified in the blood collection tube instructions.

[0081] c) Record the sample information, balance the blood collection tubes, fit the high-speed refrigerated centrifuge with a horizontal rotor, and set the parameters: temperature 4°C, centrifugal force 1600 × g, time 10 min. Place the blood col...

Embodiment 2

[0190] 1. Following the method of Example 1, complete library sequencing of the samples and obtain the raw sequencing data, filter out low-quality reads, and use alignment software (bwa) to align the sequencing reads to the human reference genome (hg19).

[0191] 2. Compute statistics on the aligned BAM files:

[0192] 1. Filter out reads with only one end aligned and duplicate reads (duplicates are marked with samtools or picard).
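The filtering step can be sketched as a check on the SAM FLAG field of each alignment record. This is a minimal illustration, not the patent's implementation: the function names `keep_read` and `filter_flags` are hypothetical, and it assumes duplicates have already been marked (for example by samtools or picard), so that the duplicate bit is set.

```python
# SAM FLAG bits as defined in the SAM format specification.
FLAG_UNMAPPED = 0x4        # this read is unmapped
FLAG_MATE_UNMAPPED = 0x8   # the mate of this read is unmapped
FLAG_DUPLICATE = 0x400     # read is marked as a PCR or optical duplicate


def keep_read(flag: int) -> bool:
    """Return True if both ends are aligned and the read is not a marked duplicate.

    Hypothetical helper: `flag` is the integer FLAG field of one SAM/BAM record.
    """
    if flag & (FLAG_UNMAPPED | FLAG_MATE_UNMAPPED):
        return False  # only one end (or neither end) is aligned
    if flag & FLAG_DUPLICATE:
        return False  # duplicate read
    return True


def filter_flags(flags):
    """Keep only the FLAG values that pass `keep_read`, preserving order."""
    return [f for f in flags if keep_read(f)]
```

In practice the same test would be applied while iterating over the records of a BAM file; only the flag logic is shown here.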

[0193] 2. Calculate the low-quality alignment rate: a low-quality alignment is an aligned read pair in which the read at either end has an alignment quality value below 30. The number of such reads divided by the total number of reads is the low-quality alignment rate.

[0194] 3. For abnormal alignments with high alignment quality values (greater than 30), count the signal strength of reads supporting structural variation, including: 1) as shown in Figure 4, the statistics of the number of reads in the thre...
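As a worked illustration of the low-quality alignment rate, the sketch below counts read pairs in which either end has a mapping quality below 30 and divides by the total number of pairs. It assumes the input has already been reduced to one `(mapq_read1, mapq_read2)` tuple per pair; that input shape and the function name are assumptions made for the example. Counting pairs rather than individual reads gives the same ratio when both reads of a low-quality pair are counted.

```python
MAPQ_THRESHOLD = 30  # alignment quality cutoff stated in the text


def low_quality_alignment_rate(pairs, threshold=MAPQ_THRESHOLD):
    """Fraction of read pairs in which either end has MAPQ below `threshold`.

    `pairs` is an iterable of (mapq_read1, mapq_read2) tuples (assumed input shape).
    """
    total = 0
    low = 0
    for mapq1, mapq2 in pairs:
        total += 1
        if mapq1 < threshold or mapq2 < threshold:
            low += 1
    if total == 0:
        raise ValueError("no read pairs supplied")
    return low / total


# Two of the four pairs have an end with MAPQ < 30.
example_rate = low_quality_alignment_rate([(60, 60), (60, 10), (0, 0), (30, 45)])
```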

Embodiment 3

[0243] Based on Example 2, the following are obtained: (1) the signal intensity supporting chromosome structural variation; (2) the proportion of mitochondrial content in the sample; (3) the proportion of inserts in the entire sample with lengths of 180-220 bp and 250-300 bp, and, for inserts shorter than 150 bp, the "peak-to-valley spacing" between the peaks and troughs; (4) the ratio of the number of short fragments to long fragments in each 5 Mb interval, reduced by principal component analysis to the values of the first 10 principal components.

[0244] These statistics are input as the feature vector of each sample to machine learning methods (such as SVM, Lasso, and GBM), and, based on the nearly 400 cancer and normal samples described above, 10-fold cross-validation is used to test the tumor prediction performance. The samples are divided into 10 equal parts, and 9 of them are used in turn as the training set to build a tumor prediction model. The remaining one is used as a test set to me...
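The insert-length features can be sketched as follows. This is an illustrative sketch only: the function names, the inclusive range endpoints, the `(chrom, start, insert_size)` input shape, and the 150 bp short/long cutoff are assumptions, since the text does not define them. The principal component analysis step is not shown.

```python
from collections import defaultdict

BIN_SIZE = 5_000_000  # 5 Mb interval, as in the text


def fraction_in_range(insert_sizes, low, high):
    """Fraction of inserts whose length lies in [low, high] (inclusive, assumed)."""
    sizes = list(insert_sizes)
    if not sizes:
        raise ValueError("no insert sizes supplied")
    return sum(1 for s in sizes if low <= s <= high) / len(sizes)


def short_long_ratio_per_bin(fragments, cutoff=150, bin_size=BIN_SIZE):
    """Short/long fragment count ratio for each (chrom, bin index).

    `fragments` is an iterable of (chrom, start, insert_size) tuples.
    `cutoff` is a hypothetical threshold: inserts shorter than it count as short.
    Bins with no long fragments are skipped to avoid division by zero.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [short count, long count]
    for chrom, start, size in fragments:
        key = (chrom, start // bin_size)
        if size < cutoff:
            counts[key][0] += 1
        else:
            counts[key][1] += 1
    return {key: s / l for key, (s, l) in counts.items() if l > 0}


sizes = [100, 190, 200, 260, 400]
frac_180_220 = fraction_in_range(sizes, 180, 220)
frac_250_300 = fraction_in_range(sizes, 250, 300)
```

The per-bin ratios would then form one row per sample in a matrix that is passed to principal component analysis.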
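The 10-fold partitioning can be sketched in plain Python as below. This shows only how samples are split into folds; the classifier itself (SVM, Lasso, GBM) is omitted. The function name `k_fold_splits`, the shuffling, and the fixed seed are assumptions made for the example, not details from the patent.

```python
import random


def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds.

    Sample indices are shuffled once with a fixed seed (an assumption, for
    reproducibility), then dealt into k folds of near-equal size. Each fold
    serves as the held-out set exactly once.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


# With 400 samples, each round trains on 360 samples and holds out 40.
splits = list(k_fold_splits(400, k=10))
```

A model would be fitted on `train_idx` and scored on `test_idx` in each round, with the ten scores pooled to estimate prediction performance.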

Abstract

The invention provides a method for determining the chromosome structural variation signal intensity and insert fragment length distribution characteristics of a sample, and applications thereof. Specifically, the invention relates to a method for determining the source of a sample. The method comprises the following steps: (1) aligning the sequencing reads of the chromosomes in a sample to a reference genome, and determining the low-quality alignment rate and the high-quality alignment rate of those sequencing reads; (2) determining, from the reads counted in the high-quality alignment rate, the proportion of chromosome structural variation in the sample, the mitochondrial DNA content, and the proportion of insert fragments of a predetermined length; (3) determining the probability of the sample source based on a predetermined tumor prediction model and the structural variation proportion, mitochondrial DNA content, and insert fragment proportion obtained in step (2); and (4) determining the source of the sample based on that probability.

Description

Technical field

[0001] The present invention relates to the field of bioinformatics. Specifically, the present invention relates to a method, and its application, for determining the signal intensity of chromosome structural variation in a sample and for computing the length distribution of insert fragments in a sample. More specifically, the present invention relates to a method for determining the signal intensity of chromosome structural variation, the insert fragment length distribution, and the mitochondrial copy number of a sample, and a method for determining the source of a sample.

Background technique

[0002] According to reports, about 10% of cancers exhibit a large number of chromosome structural variations (chromothripsis). There are four main types of structural changes in chromosomes:

[0003] 1. Deletion: the loss of a certain segment of a chromosome. For example, cri-du-chat syndrome is a genetic disease caused by the partial deletion of chr...

Application Information

IPC(8): G16B20/30, G16B30/10
CPC: G16B20/30, G16B30/10
Inventor: 李世勇, 茅矛, 张锋, 钟果林, 陈彦
Owner: 深圳思勤医疗科技有限公司