High-throughput sequencing quality control analysis method capable of quickly and automatically feeding back results through mails in batches on basis of snakemake language

An analysis method applied in the field of high-throughput sequencing quality control analysis. It addresses three problems of existing approaches: the absence of a process monitoring mechanism, overly simple analysis results, and processing times of several days or even a month. It also makes error queries convenient.

Active Publication Date: 2021-06-15
SHANGHAI OE BIOTECH CO LTD
7 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

The current general analysis method first uses Trimmomatic to filter out low-quality sequences and sequencing adapters, and then uses fastqc to visualize the quality of the data. It can process only one sample at a time, so quality control of high-throughput sequencing data with many samples may take several days or even a month. The analysis results cannot be fed back quickly, and there is no process monitoring mechanism, which makes data analysis a major bottleneck in related research.

[0005] The existing high-throughput sequencing quality control analysis process has the following defects:
(1) Slow single-sample analysis: it takes a long time for a single sample to go from raw data to filtered quality control results;
(2) No batch processing: quality control runs on one sample at a time, and multiple samples cannot be processed in parallel;
(3) Delayed feedback of analysis results: manual verification is required after the process completes, and results cannot be fed back by email in time;
(4) No error detection mechanism: there is no check of whether each single sample ran successfully;
(5) No visualization of the analysis process: the analysis process has no intuitive visual display;
(6) Incomplete result display: the analysis results are too simple and lack visual displays corresponding to the data.

Examples

Embodiment

[0059] Taking three samples, A1, A2, and A3, as examples, the workflow of the present invention is described as follows:

[0060] 1. Receive the raw high-throughput sequencing data of samples A1, A2, and A3 from the user;

[0061] 2. Use the fastp software to perform quality control filtering on the raw data of each of the samples A1, A2, and A3; see Figures 2, 3, and 4;
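As a rough illustration of step 2, the sketch below builds (but does not run) one paired-end fastp command per sample. The file naming (`<sample>_R1.fastq.gz`), the `raw` and `qc` directories, the thread count, and the helper name `fastp_command` are all assumptions made for illustration; the source does not specify them, and the patented method drives fastp from snakemake rather than from a plain Python script.

```python
from pathlib import Path


def fastp_command(sample, raw_dir="raw", out_dir="qc", threads=4):
    """Build (not run) a paired-end fastp command line for one sample.

    The <sample>_R1/_R2.fastq.gz naming and the directory layout are
    hypothetical; the source does not state them.
    """
    raw, out = Path(raw_dir), Path(out_dir)
    return [
        "fastp",
        "-i", str(raw / f"{sample}_R1.fastq.gz"),        # read 1 input
        "-I", str(raw / f"{sample}_R2.fastq.gz"),        # read 2 input
        "-o", str(out / f"{sample}_R1.clean.fastq.gz"),  # read 1 output
        "-O", str(out / f"{sample}_R2.clean.fastq.gz"),  # read 2 output
        "--json", str(out / f"{sample}.fastp.json"),     # machine-readable report
        "--html", str(out / f"{sample}.fastp.html"),     # visual report
        "--thread", str(threads),
    ]


# One command per sample from the embodiment.
commands = {sample: fastp_command(sample) for sample in ["A1", "A2", "A3"]}
```

Keeping each command as a list of arguments means it can be handed to a process runner without shell quoting issues.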

[0062] Figure 2 is the distribution of the average sequence error rate: the abscissa is the base position along reads R1 and R2, and the ordinate is the average error rate at each base position;

[0063] Figure 3 is a pie chart of sequence composition: the legend gives the number and percentage of high-quality sequences, of low-quality sequences, of sequences containing too many N bases, and of sequences that are too short;
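The counts and percentages behind such a pie chart can be derived from the `filtering_result` section of a fastp JSON report. The sketch below is a minimal version of that arithmetic. The key names follow fastp's JSON output as far as I know and should be checked against the fastp version in use; the read counts are invented for illustration and are not data from the source.

```python
# Mapping from assumed fastp "filtering_result" keys to legend labels.
CATEGORIES = {
    "passed_filter_reads": "high quality",
    "low_quality_reads": "low quality",
    "too_many_N_reads": "too many N",
    "too_short_reads": "too short",
}


def composition(filtering_result):
    """Return {label: (count, percentage)} for the four read categories."""
    counts = {
        label: int(filtering_result.get(key, 0))
        for key, label in CATEGORIES.items()
    }
    total = sum(counts.values())
    return {
        label: (n, 100.0 * n / total if total else 0.0)
        for label, n in counts.items()
    }


# Illustrative numbers only (not taken from the patent).
example = {
    "passed_filter_reads": 900,
    "low_quality_reads": 80,
    "too_many_N_reads": 5,
    "too_short_reads": 15,
}
for label, (n, pct) in composition(example).items():
    print(f"{label}: {n} reads ({pct:.1f}%)")
```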

[0064] Figure 4 is the base content distribution: the abscissa is the base ...

Abstract

The invention discloses a high-throughput sequencing quality control analysis method, based on the snakemake language, that can quickly and automatically feed back results by email in batches. The method comprises the following steps: preparing a file; performing fastp quality control filtering on multiple samples in parallel; monitoring the fastp run of each single sample; summarizing the fastp quality control results of all samples; summarizing the quality control results and feeding them back by email; performing fastqc detection on multiple samples in parallel; integrating all sample results; and drawing an analysis method graph. The method processes samples in batches and produces comprehensive results: all analysis results are automatically organized, statistically summarized, and visualized. All operation steps are traceable, which makes error queries convenient.
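The parallel run, per-sample monitoring, and email feedback steps can be sketched in plain Python as below. This is not the patented snakemake workflow: a thread pool stands in for snakemake's job scheduling, and harmless Python subprocesses stand in for fastp so the example runs without fastp installed. The function names, the email addresses, and the subject line format are hypothetical. The message is only composed here; it could be sent with `smtplib.SMTP(...).send_message(msg)` against a mail server of your choice.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from email.message import EmailMessage


def run_sample(sample, command):
    """Run one sample's command and record whether it exited cleanly."""
    proc = subprocess.run(command, capture_output=True, text=True)
    return sample, proc.returncode == 0, proc.stderr.strip()


def run_batch(commands, max_workers=3):
    """Run all sample commands in parallel; return {sample: (ok, stderr)}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda item: run_sample(*item), commands.items())
        return {sample: (ok, err) for sample, ok, err in results}


def build_report(status, sender, recipient):
    """Compose (not send) a summary email listing each sample's status."""
    failed = [s for s, (ok, _) in status.items() if not ok]
    msg = EmailMessage()
    msg["Subject"] = (
        f"QC finished: {len(status) - len(failed)}/{len(status)} samples succeeded"
    )
    msg["From"] = sender
    msg["To"] = recipient
    lines = [
        f"{s}: {'OK' if ok else 'FAILED'}" for s, (ok, _) in sorted(status.items())
    ]
    msg.set_content("\n".join(lines))
    return msg


# Stand-in commands: A3 is made to fail on purpose to show the monitoring.
demo = {
    "A1": [sys.executable, "-c", "pass"],
    "A2": [sys.executable, "-c", "pass"],
    "A3": [sys.executable, "-c", "import sys; sys.exit(1)"],
}
status = run_batch(demo)
report = build_report(status, "pipeline@example.org", "user@example.org")
print(report["Subject"])
print(report.get_content())
```

Recording the exit status per sample is what allows a failed sample to be reported individually instead of failing the whole batch silently.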

Description

Technical field

[0001] The invention belongs to the technical field of high-throughput microbial sequencing and relates to a high-throughput sequencing quality control analysis method, based on the snakemake language, that can quickly and automatically feed back results by email in batches.

Background technique

[0002] High-throughput sequencing, also known as "next-generation sequencing", is a revolutionary change from traditional sequencing. Compared with traditional Sanger sequencing, next-generation sequencing technology has increased throughput by one to two orders of magnitude, and it can economically perform high-throughput genome sequencing with greater sequence coverage. As the performance of high-throughput sequencing instruments gradually stabilizes and prices continue to fall, their applications are becoming more and more extensive. Therefore, research based on high-throughput sequencing data will show a rapid development trend in terms of quantity and...

Application Information

IPC(8): G16B30/00, G16B40/00, G16B45/00, G16B50/00, G06Q10/10
CPC: G16B30/00, G16B40/00, G16B45/00, G16B50/00, G06Q10/107
Inventor: 张建明, 顾胤聪, 肖云平, 史贤俊, 刘钰钏, 林博
Owner: SHANGHAI OE BIOTECH CO LTD