Chapter 6

Where to Start

Hogeweg, P. 2011. The roots of bioinformatics in theoretical biology. PLoS Comp. Biol. 7:e1002021.

Searls, D. B. 2010. The roots of bioinformatics. PLoS Comp. Biol. 6:e1000809.

Books

Claverie, J.-M., and Notredame, C. 2003. Bioinformatics for Dummies. Wiley.

Pevsner, J. 2003. Bioinformatics and Functional Genomics. Wiley-Liss.

Doolittle, R. F. 1987. Of URFS and ORFS. University Science Books.

Sequence Alignment

Schaffer, A. A., Aravind, L., Madden, T. L., Shavirin, S., Spouge, J. L., Wolf, Y. I., Koonin, E. V., and Altschul, S. F. 2001. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res. 29:2994–3005.

Henikoff, S., and Henikoff, J. G. 1992. Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. U.S.A. 89:10915–10919.

Johnson, M. S., and Overington, J. P. 1993. A structural basis for sequence comparisons: An evaluation of scoring methodologies. J. Mol. Biol. 233:716–738.

Eddy, S. R. 2004. Where did the BLOSUM62 alignment score matrix come from? Nat. Biotechnol. 22:1035–1036.

Aravind, L., and Koonin, E. V. 1999. Gleaning non-trivial structural, functional and evolutionary information about proteins by iterative database searches. J. Mol. Biol. 287:1023–1040.

Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. J. 1997. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 25:3389–3402.

Structure Comparison

Orengo, C. A., Bray, J. E., Buchan, D. W., Harrison, A., Lee, D., Pearl, F. M., Sillitoe, I., Todd, A. E., and Thornton, J. M. 2002. The CATH protein family database: A resource for structural and functional annotation of genomes. Proteomics 2:11–21.

Bashford, D., Chothia, C., and Lesk, A. M. 1987. Determinants of a protein fold: Unique features of the globin amino acid sequences. J. Mol. Biol. 196:199–216.

Harutyunyan, E. H., Safonova, T. N., Kuranova, I. P., Popov, A. N., Teplyakov, A. V., Obmolova, G. V., Rusakov, A. A., Vainshtein, B. K., Dodson, G. G., Wilson, J. C., et al. 1995. The structure of deoxy- and oxy-leghaemoglobin from lupin. J. Mol. Biol. 251:104–115.

Flaherty, K. M., McKay, D. B., Kabsch, W., and Holmes, K. C. 1991. Similarity of the three-dimensional structures of actin and the ATPase fragment of a 70-kDa heat shock cognate protein. Proc. Natl. Acad. Sci. U.S.A. 88:5041–5045.

Murzin, A. G., Brenner, S. E., Hubbard, T., and Chothia, C. 1995. SCOP: A structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247:536–540.

Hadley, C., and Jones, D. T. 1999. A systematic comparison of protein structure classification: SCOP, CATH and FSSP. Struct. Fold. Des. 7:1099–1112.

Domain Detection

Marchler-Bauer, A., Anderson, J. B., DeWeese-Scott, C., Fedorova, N. D., Geer, L. Y., He, S., Hurwitz, D. I., Jackson, J. D., Jacobs, A. R., Lanczycki, C. J., et al. 2003. CDD: A curated Entrez database of conserved domain alignments. Nucleic Acids Res. 31:383–387.

Ploegman, J. H., Drent, G., Kalk, K. H., and Hol, W. G. 1978. Structure of bovine liver rhodanese I: Structure determination at 2.5 Å resolution and a comparison of the conformation and sequence of its two domains. J. Mol. Biol. 123:557–594.

Nikolov, D. B., Hu, S. H., Lin, J., Gasch, A., Hoffmann, A., Horikoshi, M., Chua, N. H., Roeder, R. G., and Burley, S. K. 1992. Crystal structure of TFIID TATA-box binding protein. Nature 360:40–46.

Doolittle, R. F. 1995. The multiplicity of domains in proteins. Annu. Rev. Biochem. 64:287–314.

Heger, A., and Holm, L. 2000. Rapid automatic detection and alignment of repeats in protein sequences. Proteins 41:224–237.

Evolutionary Trees

Wolf, Y. I., Rogozin, I. B., Grishin, N. V., and Koonin, E. V. 2002. Genome trees and the tree of life. Trends Genet. 18:472–479.

Doolittle, R. F. 1992. Stein and Moore Award address. Reconstructing history with amino acid sequences. Protein Sci. 1:191–200.

Zuckerkandl, E., and Pauling, L. 1965. Molecules as documents of evolutionary history. J. Theor. Biol. 8:357–366.

Schönknecht, G., Chen, W.-H., Ternes, C. M., Barbier, G. G., Shrestha, R. P., Stanke, M., Bräutigam, A., Baker, B. J., Banfield, J. F., Garavito, R. M., et al. 2013. Gene transfer from bacteria and archaea facilitated evolution of an extremophilic eukaryote. Science 339:1207–1210.

Ancient DNA

Prüfer, K., Racimo, F., Patterson, N., Jay, F., Sankararaman, S., Sawyer, S., Heinze, A., Renaud, G., Sudmant, P. H., de Filippo, C., et al. 2014. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature 505:43–49.

Meyer, M., Kircher, M., Gansauge, M. T., Li, H., Racimo, F., Mallick, S., Schraiber, J. G., Jay, F., Prüfer, K., de Filippo, C., et al. 2012. A high-coverage genome sequence from an archaic Denisovan individual. Science 338:222–226.

Green, R. E., Malaspinas, A.-S., Krause, J., Briggs, A. W., Johnson, P. L. F., Uhler, C., Meyer, M., Good, J. M., Maricic, T., Stenzel, U., et al. 2008. A complete Neandertal mitochondrial genome sequence determined by high-throughput sequencing. Cell 134:416–426.

Pääbo, S., Poinar, H., Serre, D., Jaenicke-Despres, V., Hebler, J., Rohland, N., Kuch, M., Krause, J., Vigilant, L., and Hofreiter, M. 2004. Genetic analyses from ancient DNA. Annu. Rev. Genet. 38:645–679.

Evolution in the Laboratory

Sassanfar, M., and Szostak, J. W. 1993. An RNA motif that binds ATP. Nature 364:550–553.

Gold, L., Polisky, B., Uhlenbeck, O., and Yarus, M. 1995. Diversity of oligonucleotide functions. Annu. Rev. Biochem. 64:763–797.

Wilson, D. S., and Szostak, J. W. 1999. In vitro selection of functional nucleic acids. Annu. Rev. Biochem. 68:611–647.

Hermann, T., and Patel, D. J. 2000. Adaptive recognition by nucleic acid aptamers. Science 287:820–825.

Keefe, A. D., Pai, S., and Ellington, A. 2010. Aptamers as therapeutics. Nat. Rev. Drug Discov. 9:537–550.

Radom, F., Jurek, P. M., Mazurek, M. P., Otlewski, J., and Jeleń, F. 2013. Aptamers: Molecules of great potential. Biotechnol. Adv. 31:1260–1274.

Web Sites

The Protein Data Bank (PDB) site is the repository for three-dimensional macromolecular structures. It currently contains more than 100,000 structures (http://www.pdb.org).

The National Center for Biotechnology Information (NCBI) site contains molecular biological databases and software for analysis (http://www.ncbi.nlm.nih.gov/).