LTR Retrotransposons Behave Like Intracellular Retroviruses

The genomes of all eukaryotes studied, from yeast to humans, contain retrotransposons, mobile DNA elements that transpose through an RNA intermediate using a reverse transcriptase (see Figure 8-8b). These mobile elements are divided into two major categories: those containing and those lacking long terminal repeats (LTRs). LTR retrotransposons, which we discuss here, are common in yeast (e.g., Ty elements) and in Drosophila (e.g., copia elements). Although less abundant in mammals than non-LTR retrotransposons, LTR retrotransposons nonetheless constitute about 8 percent of human genomic DNA. Non-LTR retrotransposons are the most common type of mobile element in mammals; these retrotransposons are described in the next section.

The general structure of LTR retrotransposons found in eukaryotes is depicted in Figure 8-12. In addition to the short 5′ and 3′ direct repeats that are typical of all transposons, these retrotransposons are marked by the presence of LTRs flanking the central protein-coding region. These long direct terminal repeats, containing 250–600 bp depending on the particular LTR retrotransposon, are characteristic of integrated retroviral DNA and are critical to the life cycle of retroviruses. In addition to sharing LTRs with retroviruses, LTR retrotransposons encode all the proteins of the most common type of retroviruses, except for the envelope proteins. Lacking these envelope proteins, LTR retrotransposons cannot bud from their host cell and infect other cells; however, they can transpose to new sites in the DNA of their host cell. Because of their clear relationship with retroviruses, LTR retrotransposons are often called retrovirus-like elements.

FIGURE 8-12 General structure of eukaryotic LTR retrotransposons. The central protein-coding region is flanked by two long terminal repeats (LTRs), which are element-specific direct repeats. Like other mobile elements, integrated retrotransposons have short target-site direct repeats at each end. Note that the different regions are not drawn to scale. The protein-coding region constitutes 80 percent or more of a retrotransposon and encodes reverse transcriptase, integrase, and other retroviral proteins.

A key step in the retroviral life cycle is the formation of retroviral genomic RNA from integrated retroviral DNA (see Figure 5-48). We describe this process in some detail here because it serves as a model for the generation of the RNA intermediate during the transposition of LTR retrotransposons. As depicted in Figure 8-13, the left retroviral LTR functions as a promoter that directs host-cell RNA polymerase to initiate transcription at the 5′ nucleotide of the roughly 20-base R sequence that is repeated at each end of the retroviral RNA. After the entire downstream retroviral DNA has been transcribed, the RNA sequence corresponding to the right LTR directs host-cell RNA-processing enzymes to cleave the primary transcript and add a poly(A) tail at the 3′ end of the R sequence. The resulting retroviral RNA genome, which lacks a complete LTR, exits the nucleus and is packaged into a virion that buds from the host cell.
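The boundaries described above can be made concrete with a small string model. The following Python sketch is illustrative only: the sequences are invented toy strings (real U3 and U5 regions are far longer), the function name `transcribe_provirus` is hypothetical, and the RNA is written in DNA letters without its poly(A) tail. It assumes each LTR has the arrangement U3-R-U5 and simply cuts the integrated DNA from the first nucleotide of the left R to the last nucleotide of the right R.

```python
# Toy sequences (assumptions for illustration, not real retroviral sequences).
U3 = "ccattg"
R = "GGTCTC"
U5 = "ttagca"
BODY = "ATGAAACCCGGGTTTTAA"  # stands in for the protein-coding region
LTR = U3 + R + U5            # each complete LTR is U3-R-U5


def transcribe_provirus(provirus: str, u3: str, r: str, u5: str) -> str:
    """Return the genomic RNA (in DNA letters, poly(A) tail omitted).

    Transcription starts at the first nucleotide of the left R and the
    transcript is cleaved after the last nucleotide of the right R.
    """
    ltr = u3 + r + u5
    if not (provirus.startswith(ltr) and provirus.endswith(ltr)):
        raise ValueError("expected an LTR at each end of the provirus")
    start = len(u3)                # first nucleotide of the left R
    end = len(provirus) - len(u5)  # one past the last nucleotide of the right R
    return provirus[start:end]


provirus = LTR + BODY + LTR
genomic_rna = transcribe_provirus(provirus, U3, R, U5)
```

In this toy model the transcript is R-U5-body-U3-R: it begins and ends with R but contains no complete LTR, which is the feature that reverse transcription must later repair.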

FIGURE 8-13 Generation of retroviral genomic RNA from integrated retroviral DNA. The left LTR directs cellular RNA polymerase to initiate transcription at the first nucleotide of the left R region. The resulting primary transcript extends beyond the right LTR. The right LTR, now present in the RNA primary transcript, directs cellular enzymes to cleave the primary transcript at the last nucleotide of the right R region and to add a poly(A) tail, yielding a retroviral RNA genome with the structure shown at the top of Figure 8-14. The R sequence is repeated precisely at the 5′ and 3′ ends [before the poly(A) tail] of the viral genomic RNA. U5 and U3 refer to sequences at the 5′ and 3′ ends of the viral RNA that are not repeated in the genomic retroviral RNA and hence are unique (see Figure 8-14). A similar mechanism is thought to generate the RNA intermediate during transposition of retrotransposons. The short direct repeat sequences (black) of target-site DNA are generated during integration of the retroviral DNA into the host-cell genome.

After a retrovirus infects a cell, reverse transcription of its RNA genome by the retrovirus-encoded reverse transcriptase yields a double-stranded DNA containing complete LTRs (Figure 8-14). This DNA synthesis takes place in the cytosol. The double-stranded DNA, with an LTR at each end, is then transported into the nucleus in a complex with integrase, another enzyme encoded by retroviruses. Retroviral integrases are closely related to the transposases encoded by DNA transposons and use a similar mechanism to insert the double-stranded retroviral DNA into the host-cell genome. In this process, short direct repeats of the target-site sequence are generated at either end of the inserted viral DNA sequence. Although the mechanism of reverse transcription is complex, it is a critical aspect of the retrovirus life cycle. The process generates the complete 5′ LTR that functions as a promoter for initiation of transcription precisely at the 5′ nucleotide of the R sequence, while the complete 3′ LTR functions as a poly(A) site leading to polyadenylation precisely at the 3′ nucleotide of the R sequence. Consequently, no nucleotides are lost from an LTR retrotransposon as it undergoes successive rounds of insertion, transcription, reverse transcription, and reinsertion at a new site.
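The claim that no nucleotides are lost over successive cycles can be checked with a segment-level model. The Python sketch below is a simplification under stated assumptions: each region is represented by a label rather than a sequence, the function names are hypothetical, and `reverse_transcribe` reproduces only the net outcome of reverse transcription (U3 copied to the 5′ end and U5 copied to the 3′ end), not the individual steps of the mechanism.

```python
def transcribe(provirus):
    """RNA runs from the R of the left LTR through the R of the right LTR."""
    first_r = provirus.index("R")
    last_r = len(provirus) - 1 - provirus[::-1].index("R")
    return provirus[first_r:last_r + 1]


def reverse_transcribe(rna):
    """Net result only: regenerate a complete U3-R-U5 LTR at each end."""
    if rna[0] != "R" or rna[-1] != "R":
        raise ValueError("genomic RNA must begin and end with R")
    u5 = rna[1]    # unique region next to the 5' R
    u3 = rna[-2]   # unique region next to the 3' R
    return [u3] + rna + [u5]


# Integrated element: LTR (U3-R-U5), internal regions, LTR (U3-R-U5).
element = ["U3", "R", "U5", "PBS", "coding", "U3", "R", "U5"]

rna = transcribe(element)
copy = element
for _ in range(3):  # three rounds of transcription and reverse transcription
    copy = reverse_transcribe(transcribe(copy))
```

After any number of rounds, `copy` has the same segments as the starting element, and the DNA product is longer than its RNA template by one U3 and one U5 segment.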

FIGURE 8-14 Model for reverse transcription of retroviral genomic RNA into DNA. In this model, a complicated series of nine events generates a double-stranded DNA copy of the single-stranded RNA genome of a retrovirus. The genomic RNA is packaged in the virion with a retrovirus-specific cellular tRNA hybridized to a complementary sequence near its 5′ end, called the primer-binding site (PBS). The retroviral RNA has a short direct repeat terminal sequence (R) at each end. The overall reaction is carried out by reverse transcriptase, which catalyzes polymerization of deoxyribonucleotides. RNaseH, also encoded in the viral RNA and packaged into the virion particle, digests the RNA strand in a DNA-RNA hybrid. The entire process yields a double-stranded DNA molecule that is longer than the template RNA and has a long terminal repeat (LTR) at each end. The different regions are not shown to scale. The PBS and R regions are actually much shorter than the U5 and U3 regions, and the central coding region is very much longer than the other regions. See E. Gilboa et al., 1979, Cell 18:93.

As noted above, LTR retrotransposons encode reverse transcriptase and integrase. By analogy with retroviruses, these mobile elements move by a “copy-and-paste” mechanism whereby reverse transcriptase converts an RNA copy of a donor element into DNA, which is inserted into a target site by integrase. The experiments depicted in Figure 8-15 provided strong evidence for the role of an RNA intermediate in the transposition of Ty elements in yeast.

EXPERIMENTAL FIGURE 8-15 The yeast Ty element transposes through an RNA intermediate. When yeast cells are transformed with a Ty-containing plasmid, the Ty element can transpose to new sites, although normally this occurs at a low rate. Using the elements diagrammed at the top, researchers engineered two different recombinant plasmid vectors containing recombinant Ty elements adjacent to a galactose-sensitive promoter. Yeast cells transformed with these plasmids were grown in a galactose-containing and a galactose-free medium. In experiment 1, growth of cells in galactose-containing medium resulted in many more transpositions than in galactose-free medium, indicating that transcription into an mRNA intermediate is required for Ty transposition. In experiment 2, an intron from an unrelated yeast gene was inserted into the putative protein-coding region of the recombinant galactose-responsive Ty element. The observed absence of the intron in transposed Ty elements is strong evidence that transposition involves an mRNA intermediate from which the intron was removed by RNA splicing, as depicted in the box on the right. In contrast, eukaryotic DNA transposons, such as the Ac element of maize, contain introns within the transposase gene, indicating that they do not transpose via an RNA intermediate. See J. Boeke et al., 1985, Cell 40:491.

The most common LTR retrotransposons in humans are called ERVs, for endogenous retroviruses. Most of the 443,000 ERV-related DNA sequences in the human genome consist only of isolated LTRs. These sequences are derived from full-length proviral DNA by homologous recombination between two LTRs, resulting in deletion of the internal retroviral sequences. Isolated LTRs such as these cannot be transposed to a new position in the genome, but recombination between homologous LTRs at different positions in the genome has probably contributed to the chromosomal DNA rearrangements leading to gene and exon duplications, the evolution of proteins with new combinations of exons, and, as we will see in Chapter 9, the evolution of complex control of gene expression.
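The formation of an isolated LTR from a full-length provirus can be sketched as a string operation. The Python example below is a toy model: all sequences are invented, the function name `recombine_ltrs` is hypothetical, and homologous recombination is reduced to its net effect, namely deletion of one LTR copy together with everything between the two copies.

```python
# Toy sequences (assumptions for illustration only).
LEFT = "acgtacgt"                       # host DNA to the left
RIGHT = "ttgcaacg"                      # host DNA to the right
TSD = "GATTC"                           # short target-site direct repeat
LTR = "TGTAGTCCAATCA"                   # long terminal repeat
INTERNAL = "ATGGCGCGCTAAATGCCCTTTTGA"   # internal retroviral sequences


def recombine_ltrs(locus: str, ltr: str) -> str:
    """Net effect of recombination between two direct LTR copies.

    Everything from the start of the first copy up to the start of the
    second copy is deleted, leaving a single LTR in place.
    """
    first = locus.find(ltr)
    second = locus.find(ltr, first + len(ltr)) if first != -1 else -1
    if second == -1:
        return locus  # fewer than two copies: nothing to recombine
    return locus[:first] + locus[second:]


full_length = LEFT + TSD + LTR + INTERNAL + LTR + TSD + RIGHT
solo = recombine_ltrs(full_length, LTR)
```

The result keeps one LTR, still flanked by the target-site repeats, but has lost the internal sequences. In this model a second call returns the locus unchanged, since a single LTR has no partner within the element to recombine with.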