Distributed gene sequence alignment method based on Basic Local Alignment Search Tool (BLAST)

The method applies distributed technology to gene sequence alignment in the fields of computer science and bioinformatics. It addresses the high network resource overhead, the resulting loss of overall software execution efficiency, and the high hardware cost of existing approaches. Its effects are to reduce the IO bottleneck, increase the cumulative IO bandwidth, and reduce hardware cost.

Status: Inactive. Publication Date: 2012-06-27
BEIJING COMPUTING CENT
6 Cites, 20 Cited by

AI Technical Summary

Problems solved by technology

Although these parallel programs improve the scalability of the analysis algorithm and can readily be extended to run on hundreds or even thousands of processors simultaneously, they share some common disadvantages: 1) Not all parallel versions produce the same results as stand-alone NCBI BLAST [4], because they use different methods for splitting the database or merging results; 2) Traditional high-performance computing usually relies on a shared storage system, in which the databases, BLAST binaries, sequence files, and intermediate results are all kept on the same physical storage. This is convenient for system maintenance, but at high degrees of parallelism the aggregated IO of all nodes places a very large load on network resources [5] and seriously degrades the overall execution efficiency of the software, so IO bandwidth often becomes the bottleneck of multiple sequence alignment analysis; 3) These programs all require tightly coupled high-performance computer clusters and high-performance storage systems, so the hardware cost is high.

Examples

Embodiment Construction

[0027] Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings and embodiments. The following examples illustrate the present invention but are not intended to limit its scope.

[0028] The flow chart of the method of the present invention is shown in Figure 1, and the method includes the following steps:

[0029] S1: The program first parses the user parameters and determines the number of MPI (Message Passing Interface) threads. It reads the query sequence file (FASTA format) and divides it according to the number of tasks (that is, the number of query sequence file segments; the number of tasks is greater than or equal to the number of MPI threads) to obtain the query sequence file segments. Each MPI thread then reads its own MPI thread serial number. The user parameters mainly refer to BLAST parameters, and BLAST is an open ...
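The splitting part of step S1 can be illustrated with a minimal Python sketch. It assumes the query file is divided by whole FASTA records into contiguous segments of near-equal record count; the patent text shown here does not state how the split is balanced, and the function names `parse_fasta` and `split_records` are hypothetical.

```python
from typing import List


def parse_fasta(text: str) -> List[str]:
    """Return each FASTA record (header line plus sequence lines) as one string."""
    records: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        # A new header closes the record collected so far.
        if line.startswith(">") and current:
            records.append("\n".join(current))
            current = []
        if line.strip():
            current.append(line)
    if current:
        records.append("\n".join(current))
    return records


def split_records(records: List[str], num_tasks: int) -> List[List[str]]:
    """Cut the record list into num_tasks contiguous, near-equal segments.

    Assumption: balancing is by record count, not by sequence length.
    """
    if num_tasks < 1:
        raise ValueError("num_tasks must be at least 1")
    base, extra = divmod(len(records), num_tasks)
    segments: List[List[str]] = []
    start = 0
    for i in range(num_tasks):
        size = base + (1 if i < extra else 0)  # first `extra` segments get one more
        segments.append(records[start:start + size])
        start += size
    return segments


if __name__ == "__main__":
    demo = ">q1\nACGT\n>q2\nGGCC\nTTAA\n>q3\nAAAA\n"
    for task_id, segment in enumerate(split_records(parse_fasta(demo), 2)):
        print(task_id, [rec.splitlines()[0] for rec in segment])
```

Choosing more tasks than MPI threads, as the step requires, gives the head node spare segments to hand to whichever thread finishes first.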

Abstract

The invention relates to the technical fields of computers and bioinformatics and discloses a distributed gene sequence alignment method based on the Basic Local Alignment Search Tool (BLAST). The method comprises the following steps: S1, the program parses the user parameters, determines the number of MPI threads, and reads the query sequence file; the query sequences are divided according to the number of tasks, and every MPI thread reads its own MPI thread serial number; S2, according to the MPI thread serial number, the program judges whether the present MPI thread is the head node; if it is the head node, it waits for communication requests from the other MPI threads, and when a request arrives it responds and allocates the current task to the requesting thread, continuing to allocate tasks in this way; if it is not the head node, it requests a task serial number from the head node, reads the query sequence file segment corresponding to that task serial number, and performs BLAST to obtain a BLAST alignment result; after performing BLAST, the task serial number is decremented by 1 and the thread requests a task serial number again; and S3, the program combines all BLAST alignment results. The method can reduce the hardware cost of bioinformatics research.
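The head-node and worker roles in S2 and the merge in S3 can be sketched as follows. This is a simplified stand-in, not the patented implementation: Python threads and queues replace MPI messages, the `align` callable is a placeholder for a real BLAST run, the countdown of task serial numbers is one reading of "decremented by 1", and merging in task order is an assumption. All names (`run_dispatch`, `NO_MORE_TASKS`, and so on) are hypothetical.

```python
import queue
import threading
from typing import Callable, Dict, List

NO_MORE_TASKS = -1  # hypothetical sentinel telling a worker to stop


def head_node(requests: queue.Queue, replies: Dict[int, queue.Queue],
              num_tasks: int, num_workers: int) -> None:
    """Serial number 0: answer each request with the next task serial number."""
    next_task = num_tasks - 1
    stopped = 0
    while stopped < num_workers:
        worker = requests.get()            # wait for a communication request
        if next_task >= 0:
            replies[worker].put(next_task)
            next_task -= 1                 # count the task serial number down
        else:
            replies[worker].put(NO_MORE_TASKS)
            stopped += 1


def worker_node(rank: int, requests: queue.Queue, reply: queue.Queue,
                segments: List[list], align: Callable[[list], object],
                results: queue.Queue) -> None:
    """Non-head thread: request a task, align its segment, repeat."""
    while True:
        requests.put(rank)
        task = reply.get()
        if task == NO_MORE_TASKS:
            return
        results.put((task, align(segments[task])))


def run_dispatch(segments: List[list], num_workers: int,
                 align: Callable[[list], object]) -> list:
    """Run S2 with one head and num_workers workers, then merge (S3)."""
    if num_workers < 1:
        raise ValueError("need at least one worker besides the head node")
    requests: queue.Queue = queue.Queue()
    results: queue.Queue = queue.Queue()
    replies = {rank: queue.Queue() for rank in range(1, num_workers + 1)}
    threads = [threading.Thread(
        target=head_node, args=(requests, replies, len(segments), num_workers))]
    threads += [threading.Thread(
        target=worker_node,
        args=(rank, requests, replies[rank], segments, align, results))
        for rank in replies]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    by_task = dict(results.get() for _ in range(len(segments)))
    return [by_task[i] for i in range(len(segments))]  # merge in task order


if __name__ == "__main__":
    fake_blast = lambda seg: [name.upper() for name in seg]  # placeholder only
    print(run_dispatch([["q1", "q2"], ["q3"], ["q4"]], 2, fake_blast))
```

Because workers pull tasks on demand, a slow segment delays only the thread that holds it, which is the usual reason for preferring this pattern over a fixed assignment of segments to threads.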

Description

Technical field

[0001] The invention relates to the technical fields of computers and bioinformatics, and in particular to a distributed gene sequence alignment method based on BLAST.

Background art

[0002] In the past few years, Next Generation Sequencing (NGS) technology has brought tremendous changes to biological research and has made significant progress in sequencing principles, operational details, and technology expansion. Compared with the traditional Sanger sequencing method, the NGS technology platform avoids the cloning process and directly uses adapters to perform parallel PCR (polymerase chain reaction) and sequencing reactions, so its data throughput is greatly improved, and more DNA can be sequenced more accurately in a shorter period of time. For example, it took 13 years and hundreds of sequencers to map the first human genome using Sanger sequencing, whereas NGS can now complete the same work in a few months. In addition, the cost ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F19/22
Inventors: 吴一雷, 闫鹏程, 刘充, 李国锐, 陈禹保, 黄劲松, 谢威
Owner: BEIJING COMPUTING CENT