The protein-coding genes of a eukaryote typically contain regions of DNA that serve no coding function. Noncoding regions, called introns, interrupt the coding regions, called exons.
When the gene is transcribed into RNA, both the coding and noncoding regions are copied. However, a eukaryotic cell has a mechanism for removing the introns from RNA. In a process called RNA splicing, a newly transcribed RNA molecule is cut at the intron-exon boundaries, its introns are discarded, and its exons are joined together. RNA splicing occurs within the nucleus before the RNA migrates to the cytoplasm. In the cytoplasm, ribosomes translate the RNA—now containing uninterrupted coding information—into protein.
After a eukaryotic cell transcribes a protein-coding gene, the RNA transcript, called a pre-mRNA, is processed. The processing takes place in the nucleus, after which the mature mRNA is released into the cytoplasm. Ribosomes in the cytoplasm translate the mRNA into protein.
Here we focus on one of several RNA processing steps, called RNA splicing. The pre-mRNAs, just like the genes from which they are transcribed, contain regions that do not code for the translated protein. These noncoding regions, called introns, interrupt the coding regions, called exons, and must be removed.
As soon as a pre-mRNA is transcribed, it is bound by several complexes called small nuclear ribonucleoprotein particles, or snRNPs (pronounced "snurps") for short. As the name suggests, snRNPs contain RNAs and proteins. The snRNPs are responsible for splicing introns out of pre-mRNAs.
snRNPs bind to sites in a pre-mRNA at or near the intron-exon boundaries. These sites, called consensus sequences, contain nucleotide sequences that are shared by most pre-mRNAs. The snRNPs contain RNA molecules that can bind to these consensus sequences through complementary base pairing.
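Complementary base pairing can be illustrated with a minimal Python sketch. The sequences and the function names (`reverse_complement`, `can_base_pair`) are hypothetical and chosen only for illustration. The sketch assumes strict A-U and G-C pairing between antiparallel strands and ignores the proteins and weaker pairings that matter in a real cell.

```python
# Watson-Crick pairing rules for RNA (simplified: no G-U wobble pairs).
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the RNA strand that pairs with `rna`, written 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(rna))

def can_base_pair(guide_rna, target_site):
    """True if the two strands pair perfectly in antiparallel orientation."""
    return guide_rna == reverse_complement(target_site)

# Toy consensus site in a pre-mRNA and a toy snRNP RNA segment (both hypothetical).
site = "GUAAGU"
guide = reverse_complement(site)          # "ACUUAC"

print(can_base_pair(guide, site))         # True: every base finds its partner
print(can_base_pair(guide, "AUAAGU"))     # False: one changed base breaks the match
```

In this toy model a single changed nucleotide in the site is enough to prevent perfect pairing, which hints at why these sequences are so similar from one pre-mRNA to the next.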
In addition to the snRNPs that have bound to the consensus sequences, other snRNPs (not shown) attach to other sequences within the intron. The snRNPs eventually come together into a large complex called a spliceosome. As the spliceosome forms, the intron loops out.
The spliceosome cuts the pre-mRNA at one intron-exon boundary, where it leaves a reactive free hydroxyl (–OH) group on the exon. The spliceosome uses this hydroxyl group to attack the other end of the intron, and in the process removes the intron and joins the exons together, forming a mature mRNA molecule.
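The net effect of splicing on the sequence (cut at the intron-exon boundaries, discard the intron, join the exons) can be sketched in a few lines of Python. This is a bookkeeping illustration only, not a model of the chemistry. The function name `splice`, the toy sequence, and the intron coordinates are all assumptions made for the example.

```python
def splice(pre_mrna, introns):
    """Return the mature mRNA after removing the given introns.

    `introns` is a list of (start, end) index pairs, end exclusive,
    assumed not to overlap.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron itself
    exons.append(pre_mrna[position:])           # keep the final exon
    return "".join(exons)

# Toy pre-mRNA: exons in upper case, the intron in lower case for readability.
pre_mrna = "AUGGCC" + "guaaguuucag" + "UUUUAA"
mature = splice(pre_mrna, [(6, 17)])
print(mature)  # AUGGCCUUUUAA
```

Here the intron boundaries are simply handed to the function. In the cell, finding those boundaries is the job of the snRNPs and the consensus sequences they recognize.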
The mRNA leaves the nucleus and is translated into protein within the cytoplasm. The excised intron is quickly degraded. The snRNPs remain in the cell and are used to splice introns from other RNA molecules.
Before an RNA can be translated into a protein, its introns must be removed. A eukaryotic cell splices out these introns soon after the RNA is transcribed. If the introns were not removed, the RNA would be translated into a nonfunctional protein.
Although introns are discarded, they do contain important sequences. The cell's splicing machinery, made up of the small nuclear ribonucleoprotein particles (snRNPs), binds to essential sequences, called consensus sequences, within the introns. The snRNPs use these sequences as markers to direct them to the correct splice sites.
A mutation in a consensus sequence can cause serious problems for a cell. For example, this type of mutation is the cause of one form of the genetic disease β-thalassemia, which results in severe, chronic anemia. People with β-thalassemia make defective red blood cells because they cannot properly produce β-globin, a polypeptide component of hemoglobin. In this disease, the pre-mRNA for β-globin is incorrectly spliced, and the resulting mRNA codes for a defective polypeptide.
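The effect of such a mutation can be sketched with a toy Python model. It relies on a simplification that goes beyond the text above: most introns begin with GU and end with AG, and the sketch treats those two dinucleotides as the only markers. The sequences, the pattern, and the function name `splice_by_consensus` are hypothetical. They do not represent the real β-globin gene or any specific β-thalassemia mutation.

```python
import re

# Simplifying assumption: an intron is the first stretch that starts
# with GU and runs to the nearest following AG.
INTRON = re.compile(r"GU[ACGU]*?AG")

def splice_by_consensus(pre_mrna):
    """Remove the first GU...AG stretch, if the markers can be found."""
    return INTRON.sub("", pre_mrna, count=1)

# Toy pre-mRNA: exon, intron (GU...AG), exon.
normal = "AUGGCC" + "GUCUCUUUCAG" + "UUUUAA"

# Point mutation: the G of the leading GU marker is changed to A.
mutant = normal[:6] + "A" + normal[7:]

print(splice_by_consensus(normal))  # AUGGCCUUUUAA
print(splice_by_consensus(mutant))  # AUGGCCAUCUCUUUCAGUUUUAA
```

In this sketch the mutated marker is no longer recognized, so the intron stays in the RNA and the coding information is interrupted. A real cell may respond differently, for example by cutting at a nearby look-alike site, but in either case the resulting mRNA no longer codes for the normal polypeptide.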