A long DNA molecule typically contains thousands of genes, most of them coding for proteins or RNA molecules with specialized functions, and hence thousands of different transcripts are produced. For example, the DNA molecule in the bacterium E. coli has about 4 million base pairs and produces about 4000 RNA transcripts, most of which code for proteins. A typical map of a small part of a long DNA molecule is shown in Fig. 3.15. Each green segment indicates a position where transcription is initiated, and each purple segment indicates a position where it ends.
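As a rough check on these figures, dividing the length of the genome by the number of transcripts gives the average amount of DNA per transcript. The short Python sketch below uses only the rounded numbers quoted above; the variable names are illustrative, and the result is an average, not the length of any particular gene.

```python
# Rounded figures quoted in the text for E. coli.
genome_length_bp = 4_000_000   # about 4 million base pairs
transcript_count = 4_000       # about 4000 RNA transcripts

# Average DNA per transcript. This ignores the fact that transcripts
# differ in length and that some DNA lies outside transcribed regions.
average_bp_per_transcript = genome_length_bp / transcript_count

print(f"About {average_bp_per_transcript:.0f} base pairs per transcript")
```

On these numbers, each transcript corresponds on average to roughly 1000 base pairs of DNA.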
The green segments are promoters, regions of typically a few hundred base pairs where RNA polymerase and associated proteins bind to the DNA duplex. Many eukaryotic and archaeal promoters contain a sequence similar to 5′-TATAAA-
Transcription continues until the RNA polymerase encounters a sequence known as a terminator (shown in purple in Fig. 3.15). Transcription stops at the terminator, and the transcript is released. A long DNA molecule contains the genetic information for hundreds or thousands of genes. For any one gene, usually only one DNA strand is transcribed; however, different genes in the same double-
Transcription does not take place indiscriminately from promoters but is a regulated process. For genes called housekeeping genes, whose products are needed at all times in all cells, transcription takes place continually. But most genes are transcribed only at certain times, under certain conditions, or in certain cell types. In E. coli, for example, the genes that encode proteins needed to utilize the sugar lactose (milk sugar) are transcribed only when lactose is present in the environment. For such genes, regulation of transcription often depends on whether the RNA polymerase and associated proteins are able to bind with the promoter (Chapter 19).
In bacteria, promoter recognition is mediated by a protein called a sigma factor, which associates with RNA polymerase and facilitates its binding to specific promoters. One type of sigma factor is used for transcription of housekeeping genes and many others, but there are other sigma factors for genes whose expression is needed only under special environmental conditions, such as lack of nutrients or excess heat.
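The idea that different sigma factors direct RNA polymerase to different sets of promoters can be pictured as a simple lookup. The Python sketch below is a toy model, not a description of any real regulatory network: the sigma factor labels, condition names, and gene names are all hypothetical placeholders, and the model assumes for simplicity that the housekeeping sigma factor is always available.

```python
# Hypothetical labels: the text names no specific sigma factors.
SIGMA_FOR_CONDITION = {
    "normal": "housekeeping sigma factor",
    "nutrient_starvation": "starvation sigma factor",
    "excess_heat": "heat-stress sigma factor",
}

# Hypothetical promoter classes, each recognized by one sigma factor.
GENES_BY_SIGMA = {
    "housekeeping sigma factor": ["housekeeping_gene_1", "housekeeping_gene_2"],
    "starvation sigma factor": ["starvation_gene_1"],
    "heat-stress sigma factor": ["heat_gene_1"],
}


def transcribed_genes(condition: str) -> list[str]:
    """Return the genes whose promoters are recognized under a condition.

    Simplifying assumption: the housekeeping sigma factor is always
    active, and a special condition adds one more sigma factor.
    """
    active = {"housekeeping sigma factor", SIGMA_FOR_CONDITION[condition]}
    genes = []
    for sigma, sigma_genes in GENES_BY_SIGMA.items():
        if sigma in active:
            genes.extend(sigma_genes)
    return genes


print(transcribed_genes("normal"))
print(transcribed_genes("excess_heat"))
```

In this sketch, the housekeeping genes appear under every condition, while the heat-stress gene appears only when the condition calls for the corresponding sigma factor.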
Promoter recognition in eukaryotes is considerably more complicated. Transcription requires the combined action of at least six proteins, known as general transcription factors, that assemble at the promoter of a gene. Assembly of the general transcription factors is necessary for transcription to occur, but it is not sufficient. Also required is at least one type of transcriptional activator protein, each of which binds to a specific DNA sequence known as an enhancer (Fig. 3.16). Transcriptional activator proteins help control when and in which cells a gene is transcribed. They bind both to enhancer sequences in or near the gene and to proteins that allow transcription to begin. Transcription of a eukaryotic gene therefore begins only when the activator proteins that bind its enhancers are present.
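The phrase "necessary but not sufficient" can be made concrete as a logical AND of two conditions. The Python sketch below is a deliberately simplified illustration: the class and field names are hypothetical, and real initiation involves many more components than two yes/no switches.

```python
from dataclasses import dataclass


@dataclass
class PromoterState:
    # General transcription factors have assembled at the promoter.
    general_factors_assembled: bool
    # Activator proteins are bound to the gene's enhancers.
    activators_bound: bool


def can_initiate(state: PromoterState) -> bool:
    """Both conditions are required; neither alone is sufficient."""
    return state.general_factors_assembled and state.activators_bound


# Enumerate all four combinations of the two conditions.
for factors in (False, True):
    for activators in (False, True):
        state = PromoterState(factors, activators)
        print(factors, activators, can_initiate(state))
```

Only the combination in which both conditions hold allows transcription to begin, which mirrors the requirement stated above.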
Once transcriptional activator proteins have bound to enhancer DNA sequences, they can attract, or recruit, a mediator complex of proteins, which in turn recruits the RNA polymerase complex to the promoter. Because enhancers can be located almost anywhere in or near a gene, the recruitment of the mediator complex and the RNA polymerase complex may require the DNA to loop around as shown in Fig. 3.16. Cells have several different types of RNA polymerase enzymes, but in both prokaryotes and eukaryotes all protein-