14.6 Long Noncoding RNAs Regulate Gene Expression

For many years, our knowledge of RNA was limited to those molecules that play a central role in the synthesis of proteins: mRNAs, tRNAs, and rRNAs. Later, small nuclear RNAs that participate in the post-transcriptional processing of RNA (snRNAs and snoRNAs) were added to the list. Starting in the late 1990s, geneticists began to recognize that numerous small RNAs (siRNAs, miRNAs, piRNAs, and crRNAs) were also abundant and fundamentally important to cell function. More recently, it has become apparent that most of the eukaryotic genome is transcribed: although only about 1% of the human genome directly codes for proteins, over 80% is transcribed, producing many long RNA molecules that do not code for proteins. Called long noncoding RNAs (lncRNAs), these RNAs are typically over 200 nucleotides in length and lack an open reading frame (a sequence that begins with a start codon, ends with a stop codon, and is translated by ribosomes). Thousands of lncRNAs have been discovered in the last five years. The DNA sequences that encode them, along with other DNA of unknown function, have been called “the dark matter of the genome.”
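The two defining criteria above (a minimum length and the absence of an open reading frame) can be illustrated with a minimal Python sketch. It is not an established annotation method. The function names, the forward-strand-only scan, and the idea of tolerating short chance ORFs up to `max_orf_codons` are assumptions made for illustration; the length cutoff is passed in as a parameter rather than fixed.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_orf_codons(seq):
    """Length in codons (start included, stop excluded) of the longest
    ATG-to-stop stretch found in the three forward reading frames."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    best = 0
    for frame in range(3):
        start = None  # position of the first ATG since the last stop codon
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, (i - start) // 3)
                start = None
    return best

def looks_like_lncrna(seq, min_length, max_orf_codons):
    """Hypothetical filter: long enough, and no ORF longer than the
    tolerated size (both thresholds are assumptions chosen by the caller)."""
    return len(seq) > min_length and longest_orf_codons(seq) <= max_orf_codons
```

For instance, `longest_orf_codons("ATGAAATAA")` finds a single two-codon ORF (ATG, AAA) ending at the TAA stop. Real annotation pipelines use additional evidence, such as sequence conservation and ribosome association, so a filter like this would only flag candidates.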

Although the function of many lncRNAs is still unclear, there is increasing evidence that at least some play a role in controlling gene expression. Some lncRNAs interact with proteins that regulate transcription. For example, a lncRNA called lincRNA-p21 interacts with a protein called p53, a transcription factor that activates numerous genes, including genes involved in control of the cell cycle and cancer. By repressing p53, lincRNA-p21 affects the transcription of hundreds of genes. Other lncRNAs modify chromatin structure, which also regulates transcription (see Chapter 17). Some lncRNAs have sites that are recognized by miRNAs, and these lncRNAs serve as decoys for miRNA attachment. Thus, the lncRNAs and mRNAs compete for a limited number of miRNAs and regulate one another’s translation and degradation. Still other lncRNAs are complementary to mRNA sequences and function by base pairing with the mRNA and preventing translation or splicing.

One of the best-studied lncRNAs is Xist RNA, which plays a central role in dosage compensation in mammalian cells (see Chapter 4). To balance expression of X-linked genes in males (with one X chromosome) and females (with two X chromosomes), one of the X chromosomes in each mammalian female cell is inactivated. Which X chromosome is inactivated is random and set early in development; once inactivated, this chromosome remains inactive through multiple rounds of cell division. Xist RNA is transcribed only from the X chromosome destined to become inactive; the RNA coats that chromosome and recruits proteins that methylate histones in the chromatin. This histone methylation then inhibits transcription of genes on the inactive X chromosome. At least two additional lncRNAs act to regulate the expression of Xist RNA.

Evidence suggests that other lncRNAs also bring about genomic imprinting (see Chapters 4 and 21). Imprinting occurs when a gene is expressed differently depending on whether it is inherited from a male or female parent. Many clusters of imprinted genes contain sequences that encode lncRNAs, and some imprinted genes appear to be controlled by these lncRNAs.

CONCEPTS

Long noncoding RNAs are long RNA molecules that do not encode proteins. Evidence increasingly suggests that many of these molecules function in the control of gene expression.