For a protein-coding gene to be expressed, it must first be transcribed. In transcription, the code in the gene's DNA is converted into a complementary code in an RNA molecule. The RNA molecule then participates in the second phase of gene expression: translation. In translation, the code in the RNA is converted into an amino acid sequence in a protein. Transcription and translation are the main events of gene expression.
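The two stages described above can be sketched in a few lines of Python. This is a toy model, not a description of the cellular machinery: the template sequence is hypothetical, the function names `transcribe` and `translate` are chosen here for illustration, and the codon table lists only the four codons the example needs (the standard genetic code has 64).

```python
# Toy model of the two stages of gene expression.
# Pairing rule used during transcription: DNA template base -> RNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Partial codon table (assumption: only the codons used below are listed).
# None marks a stop codon.
CODON_TABLE = {"AUG": "Met", "UUU": "Phe", "GGC": "Gly", "UAA": None}


def transcribe(template_3_to_5: str) -> str:
    """Return the complementary RNA (5'->3') for a template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def translate(mrna: str) -> list:
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:  # stop codon: the polypeptide is complete
            break
        protein.append(amino_acid)
    return protein


template = "TACAAACCGATT"  # hypothetical template strand, written 3'->5'
mrna = transcribe(template)
protein = translate(mrna)
print(mrna)     # AUGUUUGGCUAA
print(protein)  # ['Met', 'Phe', 'Gly']
```

Keeping the two steps as separate functions mirrors the text: the RNA produced by the first step is the only input the second step needs.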
In the accompanying animation, we focus on transcription, which occurs in three phases: initiation, elongation, and termination.
Transcription is the first step in the expression of a gene. It is the formation of a specific RNA sequence from a specific DNA sequence. Transcription initiation occurs at a promoter, a region on the DNA where RNA polymerase binds. RNA polymerase attaches to the promoter and begins to unwind the DNA.
At the initiation site, the polymerase begins reading the DNA template strand and building a complementary RNA strand from free nucleoside triphosphates. The RNA strand grows by the addition of these substrate molecules to its 3′ end. This is the elongation phase of transcription.
Note that the bases in the nucleotides of the RNA strand pair with the complementary bases in the DNA template strand. Note also that RNA nucleotides contain the sugar ribose, which has a 2′ hydroxyl group, whereas DNA nucleotides contain the sugar deoxyribose, which lacks this hydroxyl group.
RNA polymerase functions similarly to DNA polymerase: it adds each new nucleotide to the 3′ hydroxyl group of the last nucleotide in the growing strand.
The DNA double helix rewinds behind the RNA polymerase as it moves along the template.
When RNA polymerase reaches the termination site, the RNA transcript is released from the template. The DNA rewinds completely and the RNA polymerase dissociates from it. The DNA and the RNA polymerase can then participate in other rounds of transcription.
Transcription begins with initiation, which requires a promoter, a special sequence of DNA to which the RNA polymerase binds very tightly. A promoter orients the RNA polymerase and thus "aims" it at the correct strand to use as a template. Part of each promoter is the initiation site, where transcription begins.
Once RNA polymerase has bound to the promoter, it begins the process of elongation. RNA polymerase unwinds the DNA about 10 base pairs at a time and reads the template strand in the 3′-to-5′ direction while it forms RNA in the opposite (5′-to-3′) direction. As in the process of DNA replication, base-pairing rules apply. However, RNA contains uracil in place of the thymine found in DNA, so adenine in the template pairs with uracil in the transcript.
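The directionality of elongation can be made concrete with a short Python sketch. It assumes the template strand is given in the conventional 5′-to-3′ written order, so reading it 3′-to-5′ means walking the string backwards. The function name `elongate` and both sequences are hypothetical.

```python
# Pairing between a DNA template base and the incoming RNA nucleotide.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def elongate(template_5_to_3: str) -> str:
    """Build an RNA transcript from a template strand written 5'->3'."""
    rna = ""  # written 5'->3'; each new nucleotide is appended at the 3' end
    # Reading the template 3'->5' means walking the 5'->3' string backwards.
    for base in reversed(template_5_to_3):
        rna += PAIRING[base]
    return rna


template = "TTAGCCAAACAT"    # hypothetical template strand, written 5'->3'
non_template = "ATGTTTGGCTAA"  # its complementary strand, written 5'->3'
rna = elongate(template)
print(rna)  # AUGUUUGGCUAA
```

Because the transcript is complementary to the template, it carries the same sequence as the non-template strand, with uracil wherever that strand has thymine.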
Just as initiation sites in the DNA template strand specify the starting point for transcription, particular base sequences specify its termination. For some genes, the newly formed transcript falls away from the DNA template and the RNA polymerase. For others, a helper protein pulls the transcript away.
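How start and stop signals bound a transcript can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for the example: the promoter motif, the terminator motif, the fixed offset between promoter and initiation site, and the function name `transcribed_region`. The DNA is treated as a single string read left to right, which ignores strand choice and the helper proteins mentioned above.

```python
# Toy signal sequences (assumptions for illustration only).
PROMOTER = "TATAAT"      # motif the polymerase is assumed to recognize
START_OFFSET = 4         # assumed gap between the motif and the initiation site
TERMINATOR = "GGCCTTTT"  # assumed termination signal


def transcribed_region(dna: str):
    """Return (start, end) indices of the transcribed region, or None."""
    promoter_at = dna.find(PROMOTER)
    if promoter_at == -1:
        return None  # no promoter: the polymerase has nowhere to bind
    start = promoter_at + len(PROMOTER) + START_OFFSET
    terminator_at = dna.find(TERMINATOR, start)
    if terminator_at == -1:
        return None  # no termination signal downstream of the start
    # In this model the region runs through the end of the terminator.
    return start, terminator_at + len(TERMINATOR)


dna = "CCGG" + PROMOTER + "ACGT" + "ATGTTTGGCTAA" + TERMINATOR + "AACC"
region = transcribed_region(dna)
print(region)                     # (14, 34)
print(dna[region[0]:region[1]])   # ATGTTTGGCTAAGGCCTTTT
```

Sequence outside the two signals is never reached by the polymerase in this model, which is why the flanking bases do not appear in the output.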
Transcription produces more than mRNA. The same process synthesizes transfer RNA (tRNA) and ribosomal RNA (rRNA), which have important roles in protein synthesis. Like polypeptides, these RNAs are encoded by specific genes.