Foreword
Preface

Part I. Introduction
  1. Hello BLAST
     What Is BLAST?
     Using NCBI-BLAST
     Alternate Output Formats
     Alternate Alignment Views
     The Next Step
     Further Reading

Part II. Theory
  2. Biological Sequences
     The Central Dogma of Molecular Biology
     Evolution
     Genomes and Genes
     Biological Sequences and Similarity
     Further Reading
  3. Sequence Alignment
     Global Alignment: Needleman-Wunsch
     Local Alignment: Smith-Waterman
     Dynamic Programming
     Algorithmic Complexity
     Global Versus Local
     Variations
     Final Thoughts
     Further Reading
  4. Sequence Similarity
     Introduction to Information Theory
     Amino Acid Similarity
     Scoring Matrices
     Target Frequencies, lambda, and H
     Sequence Similarity
     Karlin-Altschul Statistics
     Sum Statistics and Sum Scores
     Further Reading

Part III. Practice
  5. BLAST
     The Five BLAST Programs
     The BLAST Algorithm
     Further Reading
  6. Anatomy of a BLAST Report
     Basic Structure
     Alignments
  7. A BLAST Statistics Tutorial
     Basic BLAST Statistics
     Using Statistics to Understand BLAST Results
     Where Did My Oligo Go?
  8. 20 Tips to Improve Your BLAST Searches
     8.1 Don't Use the Default Parameters
     8.2 Treat BLAST Searches as Scientific Experiments
     8.3 Perform Controls, Especially in the Twilight Zone
     8.4 View BLAST Reports Graphically
     8.5 Use the Karlin-Altschul Equation to Design Experiments
     8.6 When Troubleshooting, Read the Footer First
     8.7 Know When to Use Complexity Filters
     8.8 Mask Repeats in Genomic DNA
     8.9 Segment Large Genomic Sequences
     8.10 Be Skeptical of Hypothetical Proteins
     8.11 Expect Contaminants in EST Databases
     8.12 Use Caution When Searching Raw Sequencing Reads
     8.13 Look for Stop Codons and Frame-Shifts to Find Pseudo-Genes
     8.14 Consider Using Ungapped Alignment for BLASTX, TBLASTN, and TBLASTX
     8.15 Look for Gaps in Coverage as a Sign of Missed Exons
     8.16 Parse BLAST Reports with Bioperl
     8.17 Perform Pilot Experiments
     8.18 Examine Statistical Outliers
     8.19 Use links and topcomboN to Make Sense of Alignment Groups
     8.20 How to Lie with BLAST Statistics
  9. BLAST Protocols
     BLASTN Protocols
     BLASTP Protocols
     BLASTX Protocols
     TBLASTN Protocols
     TBLASTX Protocols

Part IV. Industrial-Strength BLAST
  10. Installation and Command-Line Tutorial
      NCBI-BLAST Installation
      WU-BLAST Installation
      Command-Line Tutorial
      Editing Scoring Matrices
  11. BLAST Databases
      FASTA Files
      BLAST Databases
      Sequence Databases
      Sequence Database Management Strategies
  12. Hardware and Software Optimizations
      The Persistence of Memory
      CPUs and Computer Architecture
      Compute Clusters
      Distributed Resource Management
      Software Tricks
      Optimized NCBI-BLAST

Part V. BLAST Reference
  13. NCBI-BLAST Reference
      Usage Statements
      Command-Line Syntax
      blastall Parameters
      formatdb Parameters
      fastacmd Parameters
      megablast Parameters
      bl2seq Parameters
      blastpgp Parameters (PSI-BLAST and PHI-BLAST)
      blastclust Parameters
  14. WU-BLAST Reference
      Usage Statements
      Command-Line Syntax
      WU-BLAST Parameters
      xdformat Parameters
      xdget Parameters

Part VI. Appendixes
  A. NCBI Display Formats
  B. Nucleotide Scoring Schemes
  C. NCBI-BLAST Scoring Schemes
  D. blast-imager.pl
  E. blast2table.pl

Glossary
Index
Ian Korf is from Washington University, where he works closely with Warren Gish, one of the creators of BLAST.

Mark Yandell is from Celera Genomics. He and his company make extensive use of BLAST in their gene sequencing and gene discovery efforts. He is an expert BLAST user and is representative of most biologists using BLAST today.

Joseph Bedell is from Incyte Genomics. He and his company make extensive use of BLAST in their gene sequencing and gene discovery efforts. He is an expert BLAST user and is representative of most biologists using BLAST today.