A Functional Light-Chain Gene Requires Assembly of V and J Gene Segments
Genes encoding intact immunoglobulins do not exist already assembled in the genome, ready for expression. Instead, the required gene segments are brought together and assembled in the course of B-cell development. The organization of the region of the genome containing the immunoglobulin genes is shown in Figure 23-15. In B cells, the DNA in this region is rearranged as described below to generate assembled and fully functional immunoglobulin-encoding genes in each B cell and its descendants. Although the rearrangement of heavy-chain genes occurs before the rearrangement of light-chain genes, we discuss light-chain genes first because of their less complex organization.
FIGURE 23-15 Overview of somatic gene rearrangement in immunoglobulin DNA. The stem cells that give rise to B cells contain multiple gene segments encoding portions of immunoglobulin heavy and light chains. During development of a B cell, somatic recombination of these gene segments yields functional light-chain genes (a) and heavy-chain genes (b). Each V gene segment carries its own promoter. Rearrangement brings an enhancer close enough to the combined sequence to activate transcription. The light-chain variable region (VL) is encoded by two joined gene segments, and the heavy-chain variable region (VH) is encoded by three joined segments. Note that the chromosomal regions encoding immunoglobulins contain many more V, D, and J segments than shown. In addition, the κ light-chain locus contains a single constant (C) segment, as shown, but the heavy-chain locus contains several distinct C segments (not shown) corresponding to the immunoglobulin isotypes.
The immunoglobulin light-chain genes consist of clusters of V gene segments, followed downstream by a single C segment. Each V gene segment carries its own promoter sequence and encodes the bulk of the light-chain variable region, although a small piece of the nucleotide sequence encoding the light-chain variable region is missing from the V gene segment. This missing portion is provided by one of the multiple J segments located between the V segments and the single C segment in the unrearranged κ light-chain locus (see Figure 23-15a). (This J segment is a genetic element, not to be confused with the J chain, a polypeptide subunit of the pentameric IgM molecule and found also in association with IgA; see Figure 23-10.) In the course of B-cell development, commitment of a B-cell precursor to use a particular V gene segment—a random process—results in its physical juxtaposition with one of the J segments, again a random choice, to form an exon encoding the entire light-chain variable region (VL). This DNA rearrangement not only generates an intact and functional light-chain gene, but also places the promoter sequence of the rearranged gene within controlling distance of enhancer elements, located downstream of the light-chain constant-region exon, that are required for its transcription. Only a fully rearranged light-chain gene is transcribed and subsequently translated into protein.
Recombination Signal Sequences
Detailed DNA sequence analysis of the light-chain and heavy-chain regions revealed a conserved sequence element at the 3′ end of each V gene segment. This conserved element, called a recombination signal sequence (RSS), is composed of heptamer and nonamer sequences separated by a spacer of either 12 or 23 bp. At the 5′ end of each light-chain J segment, there is a similarly conserved RSS whose spacer has the other length; at the κ locus, the V segments carry the 12-bp spacer and the J segments the 23-bp spacer (Figure 23-16a). The 12- and 23-bp spacers separate the conserved heptamer and nonamer sequences by one and two turns of the DNA helix, respectively.
FIGURE 23-16 Mechanism of rearrangement of immunoglobulin gene segments via deletional joining. (a) Location of the DNA elements involved in somatic recombination of immunoglobulin gene segments at the light-chain locus (top) and at the heavy-chain locus (bottom). D segments are present in the heavy-chain, but not the light-chain, locus. At the 3′ end of all V gene segments is a conserved recombination signal sequence (RSS) composed of a heptamer, a 12-bp spacer, and a nonamer. Each of the J or D segments with which a V can recombine possesses at its 5′ end a similar RSS with a 23-bp spacer. The nonamer and heptamer sequences at the 5′ end of J or D are complementary and antiparallel to those found at the 3′ end of each V when read on the same (top) strand. The RSSs that flank the D segments have spacers of identical length, preventing the formation of D to D rearrangements. (b) Hypothetical model of how two coding regions to be joined may be arranged spatially, stabilized by the RAG1 and RAG2 recombinase complex. Both strands of the DNA are shown. (c) Events in the joining of V to J (light chain) or to DJ (heavy chain) coding regions. The germ-line DNA (step 1) is folded, bringing the segments to be joined close together, and the RAG1/RAG2 complex makes single-stranded cuts at the boundaries between the coding sequences and RSSs (step 2). The free 3′ –OH groups attack the complementary strands, creating a covalently closed hairpin at each coding end and a clean double-stranded break at each boundary with an RSS (step 3). The hairpins are opened, either symmetrically (step 4), as shown for the J (light chain) or DJ (heavy chain) segment, or asymmetrically (step 5), as shown for the V segment. For D to J and V to DJ rearrangements in the heavy-chain locus, terminal deoxynucleotidyl transferase adds nucleotides in a template-independent manner to opened hairpins (step 6, right), generating an overhang (yellow) of unpaired nucleotides of random sequence (N-region); asymmetric opening automatically creates a palindromic overhang (step 6, left). The unpaired overhangs at the ends of both the V and J (light chain) or DJ (heavy chain) coding regions are filled in by DNA polymerase (step 7) or may be excised by an exonuclease. DNA ligase IV joins the two segments generated from the V and J coding regions (step 8). N-region addition does not take place for V to J (light chain) rearrangements. See text for additional discussion.
Somatic recombination is catalyzed by two enzymes, the RAG1 and RAG2 recombinases, which are expressed only in lymphocytes (Figure 23-17). Thus these rearrangements do not occur in any other cells of the body. Juxtaposition of the two gene segments to be joined is stabilized by the RAG1/RAG2 complex (Figure 23-16b). Only gene segments that possess heptamer-nonamer RSSs with spacers of different lengths can engage in this type of rearrangement (the so-called 12/23-bp spacer rule). The recombinases then make a single-stranded cut at the exact boundary of each coding sequence and its adjacent RSS. Each newly created 3′ –OH group at the site of cleavage then executes a nucleophilic attack on the complementary strand, creating a covalently closed hairpin at each of the two coding ends and a double-strand break at the end of each RSS. Because double-strand breaks in chromosomes must be repaired, protein complexes that include the Ku70 and Ku80 proteins hold the broken ends in close proximity so that resolution and repair of the breaks can proceed. The RSS ends are then covalently joined without loss or addition of nucleotides, creating a circular reaction product (deletion circle) that contains the intervening DNA and is lost altogether. The hairpin ends of the coding segments undergoing recombination are then opened and finally joined as depicted in Figure 23-16c, completing the recombination process.
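The 12/23-bp spacer rule can be written as a simple pairing check. The following Python sketch is illustrative only: the `RSS` class and the `obeys_12_23_rule` function are hypothetical names introduced here, and each RSS is reduced to nothing more than its spacer length.

```python
from dataclasses import dataclass

VALID_SPACERS = (12, 23)  # spacer lengths in bp, as described in the text


@dataclass(frozen=True)
class RSS:
    """A recombination signal sequence, reduced to its spacer length (hypothetical model)."""
    spacer_bp: int

    def __post_init__(self):
        # Reject spacer lengths other than the two described in the text.
        if self.spacer_bp not in VALID_SPACERS:
            raise ValueError(f"spacer must be 12 or 23 bp, got {self.spacer_bp}")


def obeys_12_23_rule(a: RSS, b: RSS) -> bool:
    """Return True only if one RSS has a 12-bp spacer and the other a 23-bp spacer."""
    return {a.spacer_bp, b.spacer_bp} == {12, 23}


if __name__ == "__main__":
    print(obeys_12_23_rule(RSS(12), RSS(23)))  # spacers differ: pairing allowed
    print(obeys_12_23_rule(RSS(12), RSS(12)))  # identical spacers: pairing not allowed
```

Under this rule, two RSSs with spacers of identical length, such as those flanking the D segments, cannot pair with each other, which is why D to D rearrangements do not form.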
FIGURE 23-17 RAG1/RAG2 structure. (a) RAG1/RAG2 is shown in complex with the recombination signal sequences, positioning the 12- and 23-bp spacer sequences to enable cleavage at the boundary of the coding sequence and the heptamer of the RSS. (b) DNA can be cleaved by hairpin-forming bacterial and eukaryotic transposases, the evolutionary precursors of the RAG1/RAG2 complex. Shown here is the generation of a single-strand break, followed by an attack by the newly generated 3′ hydroxyl on the complementary strand to form a hairpin and a double-strand break.
[Data from M. S. Kim et al., 2015, Nature 518:507–511, PDB ID 4wwx; A. B. Hickman et al., 2014, Cell 158:353–367, PDB ID 4d1q; and F. F. Yin et al., 2009, Nat. Struct. Biol. 16:499–508, PDB ID 3gna.]
The recombination mechanism just described, called deletional joining, occurs when the V gene segment involved has the same transcriptional orientation as the other gene segments at the light-chain locus. Some V gene segments, however, have the opposite transcriptional orientation. These segments are joined to J segments by a mechanism, termed inversional joining, in which the V segment is inverted and the intervening DNA and RSSs are not lost from the locus.
Defects in the synthesis of RAG proteins abolish somatic gene rearrangement. As described below, the rearrangement process is essential for B-cell development; consequently, RAG deficiency leads to the complete absence of B cells. People with defects in RAG gene function suffer from severe immunodeficiency. Targeted deletion of RAG genes in mice likewise leads to a complete defect in immunoglobulin (and T-cell receptor) gene rearrangement, resulting in a developmental block in the generation of B and T lymphocytes.
Junctional Imprecision
In addition to the random selection of V and J gene segments, processing of the intermediates created in the course of somatic recombination provides an additional means for expanding the variability of immunoglobulin sequences. This additional variability is created at the junction of the segments to be joined. The opening of the hairpins at the coding ends is a key step in this process: this opening may occur symmetrically or asymmetrically (see Figure 23-16c, steps 4 and 5). The protein Artemis, whose function requires the catalytic subunit of DNA-dependent protein kinase, carries out the opening of the hairpins.
If the opening of a hairpin is asymmetric, a short, single-stranded palindromic sequence is generated. Filling in of this overhang by DNA polymerase results in the addition of several nucleotides, called P-nucleotides, that were not part of the original coding region of the gene segment in question. Alternatively, the overhang may be removed by an exonuclease, resulting in the removal of nucleotides from the original coding region. These possibilities apply equally to the V and the J coding regions. Symmetric opening of a hairpin retains all the original coding information. However, even if the hairpin is opened symmetrically, the ends of the DNA molecule tend to breathe, creating short single-stranded sequences, which may also be attacked by nucleases. Once the hairpins have been opened and the coding ends processed, the ends are ligated by two proteins, DNA ligase IV and XRCC4, generating a functional light-chain gene.
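The origin of P-nucleotides can be illustrated with a short Python sketch. It is a simplified model, not a description of the enzymology: the function names and the example sequence are hypothetical, the coding end is represented by its top strand written 5′ to 3′, and `offset` is the assumed number of nucleotides between the hairpin tip and the nick on the bottom strand (0 corresponds to symmetric opening).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def open_hairpin(coding_end: str, offset: int) -> str:
    """Return the top strand of a coding end after its hairpin is opened.

    coding_end: top-strand sequence (5' to 3') ending at the hairpin tip.
    offset: distance in nucleotides from the tip to the nick on the bottom
            strand; 0 models symmetric opening, which adds nothing.
    """
    if not 0 <= offset <= len(coding_end):
        raise ValueError("offset must lie between 0 and the length of the coding end")
    # Nucleotides that flip over from the bottom strand are the reverse
    # complement of the last `offset` nucleotides of the top strand.
    p_nucleotides = reverse_complement(coding_end[-offset:]) if offset else ""
    return coding_end + p_nucleotides


if __name__ == "__main__":
    print(open_hairpin("GGTCA", 0))  # symmetric opening: GGTCA
    print(open_hairpin("GGTCA", 2))  # asymmetric opening: GGTCATG
```

In the asymmetric case the last four nucleotides, CATG, read the same on both strands, which is the palindrome that gives P-nucleotides their name. Once DNA polymerase fills in the overhang, the added TG becomes part of the coding joint.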
Inherent in this rearrangement process is junctional imprecision resulting in part from the addition and loss of nucleotides at the coding-region joints. When a V and a J segment recombine, the sequence and reading frame of the VJ product cannot be predicted. Only one in three recombination reactions results in a reading frame that is compatible with light-chain synthesis. The others produce frameshifts that do not encode functional proteins.
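The one-in-three figure follows from simple arithmetic on the reading frame. The sketch below enumerates hypothetical outcomes of end processing under assumptions that are not taken from the text: each coding end loses between 0 and `max_trim` nucleotides, between 0 and `max_added` nucleotides are added at the joint, every combination is equally likely, and a joint with no net change in length is in frame.

```python
from fractions import Fraction
from itertools import product


def in_frame_fraction(max_trim: int = 5, max_added: int = 2) -> Fraction:
    """Fraction of enumerated joints whose net length change is a multiple of 3.

    All parameter values are illustrative assumptions, not measured data.
    """
    outcomes = list(product(range(max_trim + 1),    # nucleotides lost from the V end
                            range(max_trim + 1),    # nucleotides lost from the J end
                            range(max_added + 1)))  # nucleotides added at the joint
    in_frame = sum(1 for v_trim, j_trim, added in outcomes
                   if (added - v_trim - j_trim) % 3 == 0)
    return Fraction(in_frame, len(outcomes))


if __name__ == "__main__":
    print(in_frame_fraction())  # 1/3 with the default (assumed) ranges
```

With the default ranges each possible remainder of the net length change, 0, 1, or 2, is equally represented, so exactly one-third of the enumerated joints preserve the reading frame. Other ranges give fractions that differ somewhat from one-third.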
Light-chain diversity therefore arises not only from the combinatorial use of V and J gene segments, but also from junctional imprecision. Inspection of the three-dimensional structure of the light chain shows that the highly diverse joint generated as a consequence of junctional imprecision forms part of a loop—hypervariable region 3 (HV3)—that projects into the antigen-binding site and makes contact with antigen (see Figure 23-13b).