3′ Cleavage and Polyadenylation of Pre-mRNAs Are Tightly Coupled

In eukaryotic cells, all mRNAs, except histone mRNAs,* have a 3′ poly(A) tail. Early studies of pulse-labeled adenovirus and SV40 RNA demonstrated that the viral primary transcripts extend beyond the site from which the poly(A) tail extends. These results suggested that A residues are added to a 3′ hydroxyl generated by endonucleolytic cleavage of a longer transcript, but the predicted downstream RNA fragments were never detected in vivo, presumably because of their rapid degradation. However, both predicted cleavage products were observed in processing reactions performed in vitro with nuclear extracts of cultured human cells. Both cleavage/polyadenylation and degradation of the RNA downstream of the cleavage site occur much more slowly in these in vitro reactions, which simplifies detection of the downstream cleavage product.

Early sequencing of cDNA clones from animal cells showed that nearly all mRNAs contain the sequence AAUAAA 15–30 nucleotides upstream from the poly(A) tail (Figure 10-15). Polyadenylation of RNA transcripts is virtually eliminated when the corresponding sequence in the template DNA is mutated to any other sequence except one encoding a closely related sequence (AUUAAA). The unprocessed RNA transcripts produced from such mutant templates do not accumulate in nuclei, but are rapidly degraded. Further mutagenesis studies revealed that a second signal downstream from the cleavage site is required for efficient cleavage and polyadenylation of most pre-mRNAs in animal cells. This downstream signal is not a specific sequence, but rather a GU-rich or simply a U-rich region within about 20 nucleotides of the cleavage site.
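The sequence rules just described can be expressed as a simple scan of a transcript. The following Python sketch is illustrative only: the function names, the 70 percent G+U threshold, and the choice to measure the 15–30 nucleotide spacing from the 3′ end of the hexamer are assumptions made for this example, not values taken from the experiments described above.

```python
import re

# Hexamers named in the text: the canonical signal and the one tolerated variant.
POLY_A_SIGNALS = ("AAUAAA", "AUUAAA")


def to_rna(sequence):
    """Normalize input so that DNA-style sequences (with T) are read as RNA."""
    return sequence.upper().replace("T", "U")


def find_poly_a_signals(sequence):
    """Return (position, hexamer) for every signal occurrence, overlaps included."""
    rna = to_rna(sequence)
    # A lookahead lets the regex report overlapping matches.
    pattern = re.compile("(?=(" + "|".join(POLY_A_SIGNALS) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(rna)]


def candidate_cleavage_windows(sequence, min_gap=15, max_gap=30):
    """For each signal, report the window in which cleavage would be expected.

    Assumption: the gap is counted from the 3' end of the hexamer.
    """
    rna = to_rna(sequence)
    windows = []
    for pos, hexamer in find_poly_a_signals(rna):
        signal_end = pos + len(hexamer)
        start = signal_end + min_gap
        end = min(signal_end + max_gap, len(rna))
        windows.append((pos, hexamer, start, end))
    return windows


def has_downstream_element(sequence, cleavage_site, window=20, min_fraction=0.7):
    """Check for a GU- or U-rich stretch within `window` nt after the cleavage site.

    min_fraction is a hypothetical cutoff chosen for illustration.
    """
    region = to_rna(sequence)[cleavage_site:cleavage_site + window]
    if not region:
        return False
    gu_count = sum(base in "GU" for base in region)
    return gu_count / len(region) >= min_fraction
```

A scan of this kind only flags candidate sites. Whether a given site is actually used depends on the protein factors that recognize these elements.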

FIGURE 10-15 Model for cleavage and polyadenylation of pre-mRNAs in mammalian cells. Cleavage and polyadenylation specificity factor (CPSF) binds to the upstream AAUAAA polyadenylation signal. CStF interacts with a downstream GU- or U-rich sequence and with bound CPSF, forming a loop in the RNA; binding of CFI and CFII helps stabilize the complex. Binding of poly(A) polymerase (PAP) then stimulates cleavage at a poly(A) cleavage site, which usually is 15–30 nucleotides 3′ of the upstream polyadenylation signal. The cleavage factors are released, as is the downstream RNA cleavage product, which is rapidly degraded. Bound PAP then adds about 12 A residues at a slow rate to the 3′-hydroxyl group generated by the cleavage reaction. Binding of nuclear poly(A)-binding protein (PABPN1) to the initial short poly(A) tail accelerates the rate of addition by PAP. After 200–250 A residues have been added, PABPN1 signals PAP to stop polymerization.

Identification and purification of the proteins required for cleavage and polyadenylation of pre-mRNA have led to the model shown in Figure 10-15. A 360-kDa cleavage and polyadenylation specificity factor (CPSF), composed of five different polypeptides, first forms an unstable complex with the upstream AAUAAA polyadenylation signal. Then at least three additional proteins bind to the CPSF-RNA complex: a 200-kDa heterotrimer called cleavage stimulatory factor (CStF), which interacts with the G/U-rich sequence; a 150-kDa heterotetramer called cleavage factor I (CFI); and a second heterodimeric cleavage factor (CFII). A 150-kDa protein called symplekin is thought to form a scaffold on which these cleavage/polyadenylation factors assemble. Finally, poly(A) polymerase (PAP) must bind to the complex before cleavage can occur. This requirement for PAP binding links cleavage and polyadenylation, so that the free 3′ end generated is rapidly polyadenylated and no essential information is lost to exonuclease degradation of an unprotected 3′ end.

Assembly of this large multiprotein cleavage/polyadenylation complex around the AU-rich polyadenylation signal in a pre-mRNA is analogous in many ways to formation of the transcription preinitiation complex at the AT-rich TATA box of a template DNA molecule (see Figure 9-19). In both cases, multiprotein complexes assemble cooperatively through a network of specific protein–nucleic acid and protein-protein interactions.

Following cleavage at the poly(A) site, polyadenylation proceeds in two phases: addition of the first 12 or so A residues occurs slowly, followed by rapid addition of up to 200–250 more A residues. The rapid phase requires the binding of multiple copies of a poly(A)-binding protein containing the RRM motif. This protein is designated PABPN1 to distinguish it from PABPC1, the poly(A)-binding protein present in the cytoplasm of human cells. PABPN1 binds cooperatively to the short A tail initially added by PAP and to CPSF bound to the AAUAAA polyadenylation signal. This binding stimulates PAP to extend the short poly(A) tail rapidly and processively; that is, without releasing the growing poly(A) tail from the complex of PABPN1 and CPSF. Once the poly(A) tail reaches a length of 200–250 adenines, this processivity is lost, and PAP dissociates from the poly(A)-PABPN1 complex, terminating A addition (see Figure 10-15). Binding of PABPN1 to the poly(A) tail is essential for mRNA export into the cytoplasm. As with splicing factors, several subunits of the proteins involved in cleavage and polyadenylation associate with the serine 2–phosphorylated CTD of RNA polymerase II, which concentrates them in the region where polyadenylation signals in the RNA emerge from the elongating polymerase.
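The two-phase character of tail synthesis can be pictured with a toy simulation. In the Python sketch below, the function name and both rate parameters are hypothetical and expressed in arbitrary time steps; only the switch point of about 12 residues and the upper limit of 250 residues come from the description above.

```python
def simulate_tail_growth(slow_phase_length=12, max_length=250,
                         slow_rate=1, fast_rate=25):
    """Return the poly(A) tail length after each time step.

    slow_rate and fast_rate are A residues added per arbitrary time step.
    Both are illustrative assumptions, not measured values.
    """
    length = 0
    history = [length]
    while length < max_length:
        # Before the short tail is complete, PAP works slowly; afterward,
        # PABPN1 binding is modeled as a switch to the fast rate.
        rate = slow_rate if length < slow_phase_length else fast_rate
        # Growth stops at max_length, modeling loss of processivity.
        length = min(length + rate, max_length)
        history.append(length)
    return history
```

With these placeholder rates, the first 12 residues take as many time steps as the much longer remainder of the tail, which is the qualitative point of the two-phase model.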

In wild-type genes, RNA polymerase II terminates transcription at any one of multiple possible sites within about 2 kb downstream of the polyadenylation signal. Experiments with SV40 and adenovirus (both DNA viruses) showed that when the polyadenylation signal is mutated, RNA polymerase II does not terminate transcription, but continues transcription until the next poly(A) site in the viral genome is encountered. Similar results were soon shown for a recombinant human β-globin gene inserted into an adenovirus. These experiments showed that transcription termination by RNA polymerase II is coupled to cleavage and polyadenylation of the transcript. It is hypothesized that this coupling results from the unprotected 5′ end that cleavage leaves on the RNA still emerging from the polymerase. Because this downstream RNA has no 5′ cap, it is susceptible to the XRN2 5′→3′ exoribonuclease. It is thought that when this exoribonuclease reaches the still-transcribing polymerase, it triggers termination, either by pulling the 3′ end of the nascent RNA out of the polymerase active site or by inducing a conformational change in the polymerase that causes transcription termination. Once the nascent RNA is removed from the elongating polymerase, the contacts between the RNA polymerase II clamp and the RNA-DNA hybrid within the polymerase (see Figure 9-15) are lost, allowing the clamp to open and releasing the polymerase from the DNA template. More recent chromatin immunoprecipitation studies (ChIP-seq) (see Figure 9-18) with antibody to RNA polymerase II indicate that the polymerase may be removed from the template DNA at multiple possible sites within about 2 kb downstream from the poly(A) site.
