An Annotation Method for Variant Sequences

A variant sequence technology, applied to the annotation of variant sequences

Active Publication Date: 2022-07-29
中国人民解放军海军军医大学第三附属医院 +1

AI Technical Summary

Problems solved by technology

[0008] HGVS (Human Genome Variation Society) has formulated the variant naming rules recognized by the academic community (http://varnomen.hgvs.org/), but ANNOVAR does not follow the HGVS naming specification by default.

Examples

Embodiment

[0070] (1) Determine the variant sequence information (call variants)

[0071] (1.1) Obtaining variant sequences

[0072] Using probe capture technology, next-generation sequencing of the human whole exome is performed to obtain the sequences to be analyzed. Variant analysis software (GATK, https://gatk.broadinstitute.org/hc/en-us) is then used to compare the sequences to be analyzed with the reference genome to obtain the variant calls.

[0073] (1.2) Integrate reference sequence information

[0074] Obtain the hg19 reference genome sequence and the hg19 reference genome annotation file, which includes gene name, transcript name, physical location, strand (positive or negative), and information about each element (elements include UTR, Intron, CDS), etc. The download address of the hg19 reference genome sequence is ftp://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz; the access address of the hg19 reference genome annotation file is: ftp://hgdownload.soe.ucsc.edu/go...

Example 1

[0130] Variant site: A to G at position 69511 of chromosome 1 (variant information as standardized in step 1.3: 1:69511:69511:A:G)
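The standardized variant string can be illustrated with a minimal Python sketch. The field order (chromosome, start, end, reference allele, alternate allele) and the use of "-" for an empty allele are inferred from the two example strings in this document; the names StandardVariant and parse_standard_variant, and the handling of insertions, are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandardVariant:
    """One variant in the colon-separated form chrom:start:end:ref:alt (assumed order)."""
    chrom: str
    start: int
    end: int
    ref: str
    alt: str

    @property
    def kind(self) -> str:
        """Coarse class inferred from the alleles; '-' is assumed to mean an empty allele."""
        if self.ref == "-":
            return "insertion"  # assumption: no insertion example appears in the source
        if self.alt == "-":
            return "deletion"
        if len(self.ref) == 1 and len(self.alt) == 1:
            return "SNV"
        return "substitution"


def parse_standard_variant(text: str) -> StandardVariant:
    """Split a standardized variant string into typed fields."""
    parts = text.strip().split(":")
    if len(parts) != 5:
        raise ValueError(f"expected 5 colon-separated fields, got {len(parts)}: {text!r}")
    chrom, start, end, ref, alt = parts
    return StandardVariant(chrom, int(start), int(end), ref.upper(), alt.upper())


snv = parse_standard_variant("1:69511:69511:A:G")
deletion = parse_standard_variant("9:70176769:70176769:G:-")
print(snv.kind, deletion.kind)  # prints: SNV deletion
```

Keeping the coordinates as integers and the alleles as separate fields makes the later annotation steps (functional region, variant type) simple lookups against the reference annotation.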

[0131] Amino acid sequence variant annotation results:

[0132] Comparative example: OR4F5:NM_001005484:exon1:c.421A>G:p.T141A

[0133] Example: OR4F5:NM_001005484:exon1:c.421A>G:p.Thr141Ala

[0134] The annotation results are consistent, but the one-letter amino acid abbreviations used by ANNOVAR in the comparative example do not conform to the HGVS specification.
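The difference between the two notations can be shown with a short Python sketch that rewrites a one-letter protein substitution in three-letter form. It covers only simple substitutions such as the one in this example; the function name to_three_letter is hypothetical, and this is an illustration of the notation change rather than the patent's implementation.

```python
import re

# Standard one-letter to three-letter amino acid codes.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# Matches a simple substitution such as p.T141A: residue, position, residue.
_SUBSTITUTION = re.compile(r"^p\.([A-Z])(\d+)([A-Z])$")


def to_three_letter(protein_change: str) -> str:
    """Rewrite a one-letter substitution (p.T141A) in three-letter form (p.Thr141Ala)."""
    match = _SUBSTITUTION.match(protein_change)
    if match is None:
        raise ValueError(f"not a simple one-letter substitution: {protein_change!r}")
    ref, position, alt = match.groups()
    if ref not in ONE_TO_THREE or alt not in ONE_TO_THREE:
        raise ValueError(f"unknown amino acid code in {protein_change!r}")
    return f"p.{ONE_TO_THREE[ref]}{position}{ONE_TO_THREE[alt]}"


print(to_three_letter("p.T141A"))  # prints: p.Thr141Ala
```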

[0135] Nucleic acid sequence variant annotation results:

[0136] Comparative example: Symbol: OR4F5

[0137] Example: Symbol: OR4F5, EntrezID: 79501

[0138] The annotation results are consistent, but ANNOVAR provides no EntrezID information in the comparative example.

[0139] Functional region annotation: both are exonic; the results are consistent.

[0140] Variant type annotation: both are nonsynonymous SNVs; the results are consistent.

Example 2

[0142] Variant site: G deletion at position 70176769 of chromosome 9 (variant information as standardized in step 1.3: 9:70176769:70176769:G:-)

[0143] Nucleic acid and amino acid sequence variation annotation results:

[0144] Comparative example: FOXD4L5:NM_001126334:exon1:c.1215delC:p.W406Gfs*21

[0145] Example: FOXD4L5:NM_001126334:exon1:c.1215_1215del:p.Trp406Glyfs

[0146] The annotation results are consistent, but the example gives the start and end positions of the deletion and uses three-letter amino acid codes.
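The change in the coding-sequence notation can be sketched in Python as follows. The sketch only reproduces the output form printed in this example (explicit start and end, deleted bases omitted); the function name deletion_with_explicit_range and the accepted input pattern are assumptions and are not described in the patent.

```python
import re

# Matches deletions such as c.1215delC or c.100_102delATG; the deleted bases are optional.
_CDNA_DELETION = re.compile(r"^c\.(\d+)(?:_(\d+))?del[ACGT]*$")


def deletion_with_explicit_range(cdna_change: str) -> str:
    """Rewrite a coding-sequence deletion with explicit start and end positions."""
    match = _CDNA_DELETION.match(cdna_change)
    if match is None:
        raise ValueError(f"not a recognized deletion: {cdna_change!r}")
    start = match.group(1)
    end = match.group(2) or start  # a single-base deletion starts and ends at the same position
    return f"c.{start}_{end}del"


print(deletion_with_explicit_range("c.1215delC"))  # prints: c.1215_1215del
```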

[0147] Functional region annotation: both are exonic; the results are consistent.

[0148] Variant type annotation:

[0149] Comparative example: frameshift deletion

[0150] Example: del_frameshift_stoploss

[0151] In the example, the stoploss information is successfully annotated.

[0152] Genetic information:

[0153] Comparative example: Symbol: FOXD4L5

[0154] Example: Symbol: FOXD4L5, EntrezID: 653427

[0155] ANNOVAR has no EntrezID information in the comp...

Abstract

The invention belongs to the technical field of bioinformatics and relates in particular to a method for annotating variant sequences. The method includes: (1) determining variant sequence information: obtaining the variant sequences, integrating reference sequence information, and standardizing the variant information; and (2) variant annotation, whose results cover functional regions, variant types, nucleic acid sequences, and amino acid sequences. The method not only reproduces the existing functions of ANNOVAR, the industry gold standard, but also overcomes its shortcomings: it adds a standardized representation and the Entrez gene ID, giving it better application value.

Description

Technical field

[0001] The invention belongs to the technical field of bioinformatics and relates in particular to a method for annotating variant sequences.

Background

[0002] With the development of sequencing technology, sequencing throughput continues to rise and sequencing cost continues to fall, and genome and transcriptome information has been obtained for more and more species. In more specialized fields, a growing body of research focuses on variation among different varieties or populations of the same species, and even among individuals, in order to find the phenotypic differences caused by variation in individual genetic information against a large genetic background. This poses challenges for the search and annotation of variant sequences.

[0003] Taking humans as an example, ANNOVAR is the mainstream software for annotating variants, and is considered the gold standard in the industry. However, in actual use, the inventor ...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G16B30/00, G16B50/10
CPC: G16B30/00, G16B50/10
Inventor: 文文, 王红阳, 朱赢, 陈淑桢, 何慧斯, 高勇, 汪德鹏
Owner: 中国人民解放军海军军医大学第三附属医院