Tag sequence library mixing method and device for improving sequencing platform library resolution rate

A tag sequence and sequencing platform technology, applied in the field of sequencing, that can solve the problem of the low splitting rate in tag sequence sequencing.

Active Publication Date: 2018-05-11
海南华大基因科技有限公司

AI Technical Summary

Problems solved by technology

[0004] To address the low splitting rate of existing tag sequence sequencing, the present invention provides a tag sequence mixing method and device for improving the library splitting rate of a sequencing platform.

Examples

Embodiment 1

[0044] The two DNA libraries are numbered WHBRAootMAAFDEAAPEI-30 and HUMggzEAAADAAA-129, and the two libraries were mixed on the machine. (Note: the library name in the example is just a string of symbols used to distinguish different libraries and has no specific technical meaning; the numbers after the library name, such as 30 and 129, indicate the number of the tag sequence.)

[0045] The specific nucleotide sequences of tag sequences No. 30 and No. 129:

[0046] No. 30: GCTTAATG;

[0047] No. 129: ACAGAGTG.

[0048] Replace the A and C bases with the symbol A, and replace the G and T bases with the symbol B. After the replacement, the information of each tag sequence is as follows:

[0049] No. 30: BABBAABB;

[0050] No. 129: AAABABBB.
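
The substitution described in [0048] can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `to_two_symbol` and the translation table are assumptions introduced here, not part of the patent.

```python
# Hypothetical helper: map A/C -> "A" and G/T -> "B", as described in [0048].
TWO_SYMBOL = str.maketrans({"A": "A", "C": "A", "G": "B", "T": "B"})


def to_two_symbol(tag: str) -> str:
    """Return the two-symbol form of a tag sequence made of A/C/G/T."""
    tag = tag.upper()
    if set(tag) - set("ACGT"):
        raise ValueError(f"unexpected base in tag: {tag!r}")
    return tag.translate(TWO_SYMBOL)


print(to_two_symbol("GCTTAATG"))  # BABBAABB (tag No. 30)
print(to_two_symbol("ACAGAGTG"))  # AAABABBB (tag No. 129)
```

Rejecting characters other than A, C, G and T is a design choice of this sketch; the source does not say how ambiguous bases should be handled.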

[0051] From the above sequence information, it can be seen that tag No. 30 and tag No. 129 differ at 3 positions after substitution. Figure 3 The splitting rate of the actual sequencing tag sequences of t...

Embodiment 2

[0053] The 4 pepper DNA libraries are numbered CAPgsdG1AAD96FAAPEI-14, CAPgsdG1ABD96FABPEI-39, CAPgsdG2ADD96FAAPEI-45 and CAPgsdG2ACD96FAAPEI-40. According to the data volume requirements, two libraries need to be mixed on the machine. (Note: the library name in the example is just a string of symbols used to distinguish different libraries and has no specific technical meaning; the numbers after the library name, such as 14, 39, 45 and 40, indicate the number of the tag sequence.)

[0054] The specific nucleotide sequences of tag sequences No. 14, 39, 45 and 40 are as follows:

[0055] No. 14: AGAGATCT;

[0056] No. 39: TCCAGTAG;

[0057] No. 45: ACTACAAG;

[0058] No. 40: TTGTCTAG.

[0059] A and C bases are replaced with the symbol A, and G and T bases are replaced with the symbol B. After the replacement, the information of each tag sequence is as follows:

[0060] No. 14: ABABABAB;

[0061] No. 39: BAAABBAB;

[0062] No. 45: AABAAAAB; ...
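
With four candidate tags, every pair has to be compared before deciding which libraries can share a run. The sketch below enumerates the pairs from [0055] to [0058] and counts the differing positions after substitution, using the "more than 2 positions" threshold stated in the abstract. All names (`TAGS`, `pair_differences`, and so on) are illustrative assumptions, and the pairing actually chosen in this embodiment is not shown in this excerpt.

```python
from itertools import combinations

# Tag sequences from [0055]-[0058], keyed by tag number.
TAGS = {14: "AGAGATCT", 39: "TCCAGTAG", 45: "ACTACAAG", 40: "TTGTCTAG"}


def to_two_symbol(tag):
    # A/C -> "A", G/T -> "B"; assumes the tag contains only A, C, G, T.
    return "".join("A" if base in "AC" else "B" for base in tag)


def differences(x, y):
    # Number of positions at which the two converted tags differ.
    return sum(a != b for a, b in zip(to_two_symbol(x), to_two_symbol(y)))


def pair_differences(tags):
    # Map each unordered pair of tag numbers to its difference count.
    return {(i, j): differences(tags[i], tags[j]) for i, j in combinations(tags, 2)}


for pair, d in pair_differences(TAGS).items():
    verdict = "meets the rule" if d > 2 else "too similar"
    print(pair, d, verdict)
```

Under these assumptions, the only pair that does not exceed 2 differences is No. 14 with No. 40; every other pair differs at 4 positions.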

Embodiment 3

[0072] The numbers of the two DNA libraries are: WHHUMuwoRAAHDEAAPEI-75, WHHUMuwoRAABDEAAPEI-79.

[0073] The specific nucleotide sequences of tag sequences No. 75 and No. 79:

[0074] No. 75: TACTATGA;

[0075] No. 79: CTTATAGA.

[0076] A and C bases are replaced with the symbol A, and G and T bases are replaced with the symbol B. After the replacement, the information of each tag sequence is as follows:

[0077] No. 75: BAABABBA;

[0078] No. 79: ABBABABA.

[0079] From the above sequence information, it can be seen that tag No. 75 and tag No. 79 differ at 6 positions after substitution. The two DNA libraries were mixed and sequenced on the machine. Figure 6 shows the splitting rate of the actual sequencing tag sequences of the two mixed libraries; the splitting rate reaches 98.31%.
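
The position count in [0079] is a position-by-position comparison of the two converted sequences. A minimal sketch, assuming both sequences have already been converted and have equal length (the function name `count_differences` is hypothetical):

```python
def count_differences(a: str, b: str) -> int:
    """Count positions at which two equal-length two-symbol sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


print(count_differences("BAABABBA", "ABBABABA"))  # 6 (tags No. 75 and No. 79)
```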

Abstract

The invention discloses a tag sequence library mixing method and device for improving the library resolution rate of a sequencing platform. The method includes: substituting the A and C bases at each position in a plurality of tag sequences with one symbol, and substituting the G and T bases with another symbol, so as to convert each tag sequence into a sequence represented by two symbols; and, after conversion, selecting for library mixing those tag sequences that differ pairwise at more than 2 positions. The method applies this conversion to the bases of the tag sequences and sets a difference standard for pairwise library mixing; according to the set tag sequence library mixing rule, it ensures that the tag sequence sequencing success rate reaches 100% and the tag sequence resolution rate reaches 90% or more.

Description

Technical field

[0001] The invention relates to the field of sequencing technology, in particular to a tag sequence library mixing method and device for improving the library splitting rate of a sequencing platform.

Background technique

[0002] Sequencing platforms, especially the Illumina sequencing platform, require that the bases sequenced in the same cycle be relatively balanced; ideally, each base accounts for about 25% of each cycle. If this requirement cannot be met, then because the A and C bases share the red laser and the G and T bases share the green laser, at least one base for each of the two excitation lasers must be present in each cycle so that the machine can focus and run normally; otherwise the corresponding cycle will have poor sequencing quality or be read as N.

[0003] Tag sequences (index), for example, the numbers of different tag sequences developed by BGI have different sequences. According to the ex...
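
The laser-channel requirement in [0002] can be checked cycle by cycle for a pool of tags. The sketch below is an illustration of that background requirement only, not of the patent's mixing rule; the function name `unbalanced_cycles` and the 4-base example tags are hypothetical.

```python
def unbalanced_cycles(tags):
    """Return the 1-based cycles in which the pooled tags use only one laser channel."""
    bad = []
    # zip(*tags) yields, for each cycle, the base every tag contributes at that cycle.
    for cycle, bases in enumerate(zip(*tags), start=1):
        has_red = any(base in "AC" for base in bases)    # A and C share the red laser
        has_green = any(base in "GT" for base in bases)  # G and T share the green laser
        if not (has_red and has_green):
            bad.append(cycle)
    return bad


# Hypothetical 4-base tags, used only to exercise the check.
print(unbalanced_cycles(["ACGT", "GTCA"]))  # []
print(unbalanced_cycles(["ACGT", "GAGA"]))  # [2, 3]
```

In the second pool, cycle 2 contains only C and A (red channel) and cycle 3 contains only G (green channel), so both cycles are reported.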

Application Information

Patent Type & Authority: Applications (China)
IPC(8): C40B20/04, C12Q1/6869
CPC: C12Q1/6869, C40B20/04, C12Q2563/107
Inventor: 刘舒, 伍梓靖
Owner: 海南华大基因科技有限公司