
Method for distinguishing somatic mutation and germline mutation

A method for distinguishing somatic mutations from germline mutations, applied in the field of bioinformatics. It addresses the unsatisfactory accuracy of existing approaches and their heavy demands on funding, sample integrity, and computing and storage resources.

Active Publication Date: 2021-08-20
GUANGZHOU BURNING ROCK DX CO LTD

AI Technical Summary

Problems solved by technology

[0004] However, current methods for identifying somatic mutations rely mainly on testing paired samples. Sequencing paired samples in parallel can accurately determine the source of a mutation, but paired material was often not collected with the initial sample, and recollecting it is frequently very difficult.
In addition, high-throughput sequencing of the paired sample at the same depth as the tumor sample consumes substantial funds and computing resources.
This approach also places high demands on the completeness of sample collection and on computing and storage resources, and significantly increases the cost of mutation detection.
Finally, filtering by mutation frequency and comparing against mutation annotation databases still do not meet accuracy requirements.



Examples


Embodiment 1

[0230] Example 1 Obtaining the mutation site described in this application

[0231] 1. Data preparation

[0232] a) Sequence alignment: use the mem module of bwa 0.7.10 to map the sequencing reads to the human reference genome GRCh37/hg19, producing a .bam file of the alignment results.

[0233] 2. Variant identification

[0234] Use vardict 1.5.1 to perform variant calling on SNVs, with the following calling parameters:

[0235] a) Remove bases with base quality < 30;

[0236] b) Remove reads with low mapping quality, for example mapping quality < 60;

[0237] c) Remove reads with too many mismatches, for example more than 12, 10, 8 or 6 mismatches;

[0238] d) Require a minimum mutation frequency, for example >= 0.002, 0.001, 0.0005, 0.0002 or 0.0001;

[0239] e) Require a minimum number of reads supporting the mutation, for example >= 3, 2 or 1.
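The thresholds in a) to e) can be read as a two-stage filter: first on individual reads, then on the candidate site. The sketch below is only an illustration of that logic, not VarDict's implementation. The `Read` class, its field names, and the specific thresholds picked from the listed alternatives (12 mismatches, frequency 0.001, 2 supporting reads) are assumptions made for the example.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Read:
    """Hypothetical per-read summary at one candidate SNV site."""
    base_quality: int     # Phred quality of the base covering the site
    mapping_quality: int  # mapping quality (MAPQ) of the read
    mismatches: int       # number of mismatches in the read
    is_mutant: bool       # True if the read carries the mutant base


# Assumed thresholds, each chosen from the alternatives listed in the text.
MIN_BASE_QUALITY = 30
MIN_MAPPING_QUALITY = 60
MAX_MISMATCHES = 12
MIN_FREQUENCY = 0.001
MIN_SUPPORTING_READS = 2


def passes_read_filters(read: Read) -> bool:
    """Steps a) to c): drop low-quality bases and unreliable reads."""
    return (read.base_quality >= MIN_BASE_QUALITY
            and read.mapping_quality >= MIN_MAPPING_QUALITY
            and read.mismatches <= MAX_MISMATCHES)


def keep_candidate(reads: list[Read]) -> bool:
    """Steps d) and e): require enough frequency and supporting reads."""
    usable = [r for r in reads if passes_read_filters(r)]
    if not usable:
        return False
    supporting = sum(r.is_mutant for r in usable)
    frequency = supporting / len(usable)
    return supporting >= MIN_SUPPORTING_READS and frequency >= MIN_FREQUENCY
```

In practice these limits would be passed to the variant caller as options rather than applied afterwards; the sketch only makes the order and direction of each comparison explicit.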

[0240] 3. Variant annotation

[0241] These include database annotations, hot spot mutation (hot) site annotations, mutati...

Embodiment 2

[0258] Example 2: Obtaining the difference value described in this application

[0259] 2.1

[0260] For each SNV mutation site obtained in Example 1, the difference value described in this application is calculated according to the following steps:

[0261] a) Acquisition of wild-type and mutant-type supporting fragments: the wild-type supporting fragments are cfDNA fragments containing the wild-type base sequence, and the mutant-type supporting fragments are cfDNA fragments containing the mutant-type base sequence. The wild-type base sequence is identical to the nucleotide sequence of the reference genome at the position of the mutation site, whereas the mutant-type base sequence differs from the reference genome at that position. The reference genome is the human reference genome in the ...
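As a rough illustration of step a), the sketch below splits the fragments covering one mutation site into wild-type and mutant-type groups and tallies their lengths. The input format, a list of `(fragment_length, base_at_site)` pairs, and the function name are assumptions made for this example. How fragments are extracted from the alignment is not shown.

```python
from collections import Counter


def split_fragment_lengths(fragments, ref_base):
    """Tally fragment lengths separately for wild-type and mutant-type support.

    fragments: iterable of (fragment_length, base_at_site) pairs (assumed format).
    ref_base:  reference-genome base at the mutation site.
    """
    wild_type = Counter()
    mutant = Counter()
    for length, base in fragments:
        if base == ref_base:
            wild_type[length] += 1  # matches the reference: wild-type support
        else:
            mutant[length] += 1     # differs from the reference: mutant support
    return wild_type, mutant
```

The two counters give, for each fragment length, the number of supporting fragments of each type.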

Embodiment 3

[0294] Example 3: Performing the machine learning described in this application

[0295] (1) Input the indicators listed in Table 1 into the machine learning model described in this application for training.

[0296] These indicators fall into 7 types according to the kind of feature they capture, and all of them relate to the mutation site.

[0297] Table 1

[0298]

[0299] a) Location information: the chromosomal position of the SNV, for example position 68771372 on chromosome 16.

[0300] b) Base substitution pattern: for a single SNV locus, the change from the wild-type base to the newly introduced mutant base. For example, for chr3, 178935093 C>A, the base substitution pattern is "CA". This feature is one-hot encoded over the 12 theoretically possible substitution patterns: AT, AC, AG, TA, TC, TG, CA, CT, CG, GA, GT, GC.
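The one-hot encoding of the substitution pattern can be sketched as follows. The ordering of the 12 patterns follows the list in [0300], but the function name and the choice to return a plain list of 0/1 values are assumptions made for illustration.

```python
# The 12 substitution patterns, in the order given in the text.
SUBSTITUTIONS = ["AT", "AC", "AG", "TA", "TC", "TG",
                 "CA", "CT", "CG", "GA", "GT", "GC"]


def one_hot_substitution(ref: str, alt: str) -> list:
    """Return a 12-element 0/1 vector marking the ref>alt substitution."""
    pattern = ref.upper() + alt.upper()
    if pattern not in SUBSTITUTIONS:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    return [1 if pattern == s else 0 for s in SUBSTITUTIONS]


# Example from the text: chr3, 178935093 C>A has pattern "CA".
vector = one_hot_substitution("C", "A")
```

Here `vector` has a single 1 at the position of "CA" in `SUBSTITUTIONS` and 0 elsewhere.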

[0301] c) Dev value obtained in Example 2 (that is,...


Abstract

The invention relates to a method for distinguishing somatic mutations from germline mutations. The method comprises the following steps: obtaining at least one mutation site from a subject sample; obtaining wild-type supporting fragments and mutant-type supporting fragments, wherein a wild-type supporting fragment is a cfDNA fragment containing the wild-type base sequence, a mutant-type supporting fragment is a cfDNA fragment containing the mutant-type base sequence, the wild-type base sequence is identical to the nucleotide sequence of the human reference genome at the position of the mutation site, and the mutant-type base sequence differs from it; obtaining the number of wild-type supporting fragments of at least one length and the number of mutant-type supporting fragments of the same length, and calculating the difference between the ratio of wild-type supporting fragments of that length to the total number of wild-type supporting fragments and the ratio of mutant-type supporting fragments of that length to the total number of mutant-type supporting fragments; and taking this difference as the distinguishing index. A method and device for identifying ctDNA from cfDNA are also provided. The method can be used for tumor family management and TMB detection.
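The difference value in the abstract can be illustrated with a short sketch. It assumes that "the total number of the corresponding support fragments" means the total for the same fragment type, and that the difference is taken as wild-type ratio minus mutant-type ratio. Both readings, along with the function name and the dict-of-counts input, are assumptions made for this example.

```python
def length_ratio_difference(wild_type_counts, mutant_counts, length):
    """Difference between wild-type and mutant-type length ratios at one length.

    wild_type_counts, mutant_counts: dicts mapping fragment length -> count.
    Assumed sign convention: wild-type ratio minus mutant-type ratio.
    """
    wt_total = sum(wild_type_counts.values())
    mut_total = sum(mutant_counts.values())
    if wt_total == 0 or mut_total == 0:
        raise ValueError("both fragment groups must be non-empty")
    wt_ratio = wild_type_counts.get(length, 0) / wt_total
    mut_ratio = mutant_counts.get(length, 0) / mut_total
    return wt_ratio - mut_ratio


# Made-up counts, for illustration only.
wild_type = {166: 60, 145: 40}
mutant = {166: 2, 145: 8}
difference_at_166 = length_ratio_difference(wild_type, mutant, 166)
```

With these made-up counts, 60% of wild-type fragments but only 20% of mutant-type fragments have length 166, so the difference at that length is about 0.4.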

Description

Technical field

[0001] The present application relates to the field of bioinformatics, in particular to a method for distinguishing somatic mutations from germline mutations.

Background

[0002] cfDNA is widely present in the plasma of tumor patients and includes a small amount of tumor-specific ctDNA. ctDNA differs from other, normal cfDNA in how it is cleaved during cell senescence and apoptosis. In other words, within the cell-free DNA in plasma, ctDNA and other cfDNA have different fragmentation patterns. These differences in fragment distribution can therefore serve as markers for identifying ctDNA.

[0003] Somatic mutations are non-heritable variations, distinct from germline mutations, that accumulate gradually over a person's lifetime. Somatic mutation is an important marker of tumor formation because it is closely related to the molecular signaling pathways of tumorigenesis. Germline mutations...


Application Information

IPC(8): C12Q1/6886, G16B20/50, G16B30/00, G16B40/00
CPC: C12Q1/6886, C12Q2600/156, G16B20/50, G16B30/00, G16B40/00
Inventor: 刘成林, 王俊, 张周, 揣少坤, 汉雨生
Owner GUANGZHOU BURNING ROCK DX CO LTD