Predicting Function from Sequence

The nucleotide sequence of a gene can be used to predict the amino acid sequence of the protein it encodes. The protein can then be synthesized or isolated and its properties studied to determine its function. However, this biochemical approach to understanding gene function is both time-consuming and expensive. A major goal of functional genomics has been to develop computational methods that allow gene function to be identified from DNA sequence alone, bypassing the laborious process of isolating and characterizing individual proteins.
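The first step above, predicting an amino acid sequence from a nucleotide sequence, can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard genetic code, a coding-strand DNA sequence that is already in the correct reading frame, and no introns. The function name translate and the example sequence in the test are hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding-strand DNA sequence into one-letter amino acid codes.

    Assumes the sequence starts in frame. Translation stops at the first
    stop codon; codons with unrecognized bases are reported as "X".
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

A predicted sequence of this kind says what the protein is made of, not what it does, which is why further biochemical or computational analysis is needed to assign a function.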

One computational method (often the first employed) for inferring gene function is to conduct a homology search, which relies on comparisons of DNA and protein sequences from the same organism and from different organisms. Genes that are evolutionarily related, which are referred to as homologous genes, are likely to have similar sequences. Databases containing gene and protein sequences from a wide array of organisms are available for homology searches. Powerful computer programs, such as BLAST (Basic Local Alignment Search Tool), have been developed for scanning these databases to find sequences that are similar to a query sequence. Suppose that a geneticist sequences a genome and locates a gene that encodes a protein of unknown function. A homology search conducted on databases containing the DNA or protein sequences of other organisms may identify one or more homologous sequences. If a function is known for a protein encoded by one of those sequences, that function may provide information about the function of the newly discovered protein.
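The idea behind a homology search can be illustrated with a small sketch that scores a query sequence against each entry in a database and ranks the entries by similarity. This is not how BLAST itself is implemented: BLAST uses a faster heuristic, whereas the sketch below uses a plain Smith-Waterman local alignment score with simple, assumed scoring values (match +2, mismatch -1, gap -2). The function names and the dictionary format of the database are hypothetical.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score for a and b.

    The scoring values are illustrative assumptions, not those of any
    particular search program.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diagonal, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def rank_hits(query, database):
    """Score the query against every database sequence, best match first.

    database maps a sequence name to its sequence (hypothetical format).
    Returns a list of (score, name) pairs.
    """
    scored = [
        (local_alignment_score(query, sequence), name)
        for name, sequence in database.items()
    ]
    return sorted(scored, reverse=True)
```

If the top-ranked entry has a known function, that function becomes a hypothesis about the query. A high score alone does not establish homology, so real searches also report a statistical measure of how likely such a match would be by chance.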

CONCEPTS

The function of an unknown gene can sometimes be determined by finding genes with similar sequences whose function is known.