The present invention provides methods for determining a nucleic acid sequence
by performing successive cycles of duplex extension along a single-stranded template. The cycles comprise steps of extension,
ligation, and, preferably, cleavage. In certain embodiments the methods make use of extension probes containing phosphorothiolate linkages and employ agents appropriate to cleave such linkages. In certain embodiments the methods make use of extension probes containing an abasic residue or a damaged base and employ agents appropriate to cleave linkages between a
nucleoside and an abasic residue and/or agents appropriate to remove a damaged base from a
nucleic acid. The invention provides methods of determining information about a sequence using at least two distinguishably labeled probe families. In certain embodiments the methods acquire less than 2 bits of information from each of a plurality of nucleotides in the template in each cycle. In certain embodiments the sequencing reactions are performed on templates attached to beads, which are immobilized in or on a semi-solid support.
The invention further provides sets of labeled extension probes containing phosphorothiolate linkages or trigger residues that are suitable for use in the method. In addition, the invention includes performing multiple sequencing reactions on a single template by removing initializing oligonucleotides and extended strands and performing subsequent reactions using different initializing oligonucleotides. The invention further provides efficient methods for preparing templates, particularly for sequencing multiple different templates in parallel. The invention also provides methods for performing
ligation and cleavage. The invention also provides new libraries of
nucleic acid fragments containing paired tags, and methods of preparing microparticles having multiple different templates (e.g., containing paired tags) attached thereto and of sequencing the templates individually. The invention also provides
automated sequencing systems, flow cells, image processing methods,
and computer-readable media that store computer-executable instructions (e.g., to perform the image processing methods)
and/or sequence information. In certain embodiments the sequence information is stored in a
database.