Method and system for analyzing gene sequencing data

A gene sequencing data analysis technology, applied in the field of biological information analysis. It addresses problems such as inaccurate results and heavy consumption of computing resources and time, and it reduces analysis complexity and cost while making the analysis process and results simple and intuitive.

Pending Publication Date: 2021-03-02
元码基因科技(苏州)有限公司

AI Technical Summary

Problems solved by technology

[0004] However, current machine-learning-based analysis of sequencing data generally takes the variant results (SNV / Indel / SV / CNV, etc.) produced by analysis software or devices, applies certain filter conditions, and passes the filtered results downstream for modeling analysis. Because modeling complexity grows exponentially with the number of sites, modeling is usually limited to a small number of sites; otherwise it consumes substantial computing resources and time. In addition, the filter conditions are generally set from the analysts' experience, which introduces subjective factors. Filter conditions that are too loose or too strict introduce more false positives or false negatives, respectively, leading to inaccurate results.

Examples

Embodiment

[0059] As an example, 45 kidney cancer patients (kidney chromophobe) and 256 prostate adenocarcinoma patients in the TCGA database were selected. The method of the present invention was used to find the cancer-specific differential genes and mutations (shown in Figure 2). All samples in this example have public result data sets, which facilitates consistency comparison between methods.

[0060] The TCGA raw sequencing data of the 301 samples were aligned to the human reference genome using BWA to obtain alignment files in SAM format. The SAM files were then sorted and deduplicated using Samtools to obtain BAM files. Finally, VarScan was used to obtain the somatic mutation results of each sample. The set of all somatic mutations of the kidney cancer patients is recorded as A, and the set of all somatic mutations of the prostate adenocarcinoma patients as B.
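
The last step of paragraph [0060], pooling per-sample somatic calls into the cohort sets A and B, can be sketched as follows. This is a minimal illustration and not the patent's implementation: the variant key `(chromosome, position, ref, alt)`, the helper `pool_variants`, the sample IDs, and the toy calls are all hypothetical stand-ins for parsed VarScan output.

```python
from typing import Dict, Iterable, Set, Tuple

# Assumed variant key: (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def pool_variants(calls_by_sample: Dict[str, Iterable[Variant]]) -> Set[Variant]:
    """Union the per-sample somatic calls of one cohort into a single set."""
    pooled: Set[Variant] = set()
    for calls in calls_by_sample.values():
        pooled.update(calls)
    return pooled


# Toy, made-up calls standing in for per-sample somatic mutation results.
kidney_calls = {
    "K01": [("chr1", 100, "A", "T"), ("chr2", 250, "G", "C")],
    "K02": [("chr1", 100, "A", "T")],
}
prostate_calls = {
    "P01": [("chr1", 100, "A", "T"), ("chr3", 75, "C", "G")],
}

A = pool_variants(kidney_calls)    # all somatic mutations seen in kidney samples
B = pool_variants(prostate_calls)  # all somatic mutations seen in prostate samples
only_in_A = A - B                  # mutations observed only in the kidney cohort
```

Using sets means a mutation shared by several patients is stored once per cohort; if per-mutation recurrence matters, a `collections.Counter` could be used instead.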

[0061] Merge all the mutation positions of set A ...

Abstract

The invention discloses a method and system for analyzing gene sequencing data. The method makes use of all detected variation information. It also incorporates a visual concept: the unprocessed variation information is rendered as an image in which the distribution and intensity of variation can be seen directly, and image comparison or image recognition technology is applied to the image to search for differences. This greatly reduces the analysis complexity and lowers the analysis time cost, shortening the analysis time several-fold compared with conventional analysis methods, and makes the analysis process simpler and more visual.
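
The idea of rendering variants as an image and comparing images can be sketched as below. The text available here does not specify the image layout, so everything in this sketch is an assumption: one row per chromosome, a fixed number of position bins per row, pixel intensity equal to the variant count in the bin, a single shared `chrom_length` for simplicity, and a plain per-pixel absolute difference as the comparison. The function names and toy data are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple

Image = List[List[int]]


def variants_to_image(variants: Iterable[Tuple[str, int]],
                      chrom_order: Sequence[str],
                      chrom_length: int,
                      n_bins: int) -> Image:
    """Rasterize (chromosome, 1-based position) pairs into a count image."""
    row_of = {chrom: i for i, chrom in enumerate(chrom_order)}
    image = [[0] * n_bins for _ in chrom_order]
    for chrom, pos in variants:
        # Map the position to a column; clamp so the last base stays in range.
        col = min((pos - 1) * n_bins // chrom_length, n_bins - 1)
        image[row_of[chrom]][col] += 1
    return image


def diff_image(a: Image, b: Image) -> Image:
    """Per-pixel absolute difference between two images of equal shape."""
    return [[abs(x - y) for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def hot_pixels(diff: Image, threshold: int) -> List[Tuple[int, int]]:
    """Return (row, column) of pixels whose difference reaches the threshold."""
    return [(r, c) for r, row in enumerate(diff)
            for c, value in enumerate(row) if value >= threshold]


# Toy, made-up cohorts on two chromosomes of assumed length 1000.
chroms = ["chr1", "chr2"]
cohort_a = [("chr1", 10), ("chr1", 20), ("chr2", 900)]
cohort_b = [("chr1", 15), ("chr2", 300)]

img_a = variants_to_image(cohort_a, chroms, chrom_length=1000, n_bins=4)
img_b = variants_to_image(cohort_b, chroms, chrom_length=1000, n_bins=4)
diff = diff_image(img_a, img_b)
candidates = hot_pixels(diff, threshold=1)
```

Each candidate pixel maps back to a chromosome and a position range, so differing regions can be inspected without first filtering individual sites.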

Description

Technical field

[0001] The invention relates to the field of biological information analysis, in particular to a method and system for analyzing gene sequencing data.

Background

[0002] With the advancement of technology, the cost of gene sequencing has dropped rapidly, producing a large amount of gene sequencing data, and the requirements for analyzing these data are becoming higher and more refined. As a result, the application of sequencing technology to detect biomarkers in cancer has become increasingly routine and personalized. At present, the most widely used solutions focus on next generation sequencing technology, such as whole genome sequencing, whole exome sequencing, high-depth target region sequencing, transcriptome sequencing, methylation sequencing and other technologies, which are used for real-time monitoring and targeted drug treatment of cancer patients and can also be applied to large cohort data to discover ne...

Application Information

IPC(8): G16B30/10, G16B40/00, G06T7/00
CPC: G06T7/0012, G16B30/10, G16B40/00
Inventor: 郎继东, 田埂, 梁乐彬, 杨家亮
Owner: 元码基因科技(苏州)有限公司