REARRANGING GENES OF THE CHONDRICHTHYAN IMMUNE SYSTEM
by
JONATHAN PATRICK RAST
A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy
Department of Marine Science
University of South Florida
August 1995
Co-Major Professors: Gary W. Litman, Ph.D.; Raymond R. Wilson, Ph.D.
Graduate School
University of South Florida
Tampa, Florida
CERTIFICATE OF APPROVAL
Ph.D. Dissertation
This is to certify that the Ph.D. Dissertation of JONATHAN P. RAST with a major in Marine Science has been approved by the Examining Committee on July 30, 1995 as satisfactory for the dissertation requirement for the Doctor of Philosophy degree.
Examining Committee:
Co-Major Professor: Gary W. Litman, Ph.D.
Co-Major Professor: Raymond R. Wilson, Ph.D.
Member: Robert H. Byrne, Ph.D.
Member: Peter J. Neame, Ph.D.
Member: John H. Paul, Ph.D.
ACKNOWLEDGEMENTS
I would like to thank Michele Anderson, Edward Bridgeford, Robin Buell, Robert Haire, Kristin Hinds-Frey, Noel Hawke, Ronda Litman, Yuko Ota, and Anna Zilch for invaluable help and advice in the laboratory. Thanks also to Barbara Pryor for help with manuscripts and general difficulties. I wish to express my gratitude to Chris Amemiya for guidance and encouragement and to Tatsuya Ota for his help in the perilous field of phylogenetic sequence analysis. I thank my committee members, Robert Byrne, Peter Neame and John Paul, for their patience and advice. Special thanks are due to my co-major professor Raymond Wilson for his help and encouragement to enter the fields of molecular biology and evolution. I would like to thank my committee chairperson, Robert Good, who inspired me by his unwavering interest in the evolution of the immune system. I would especially like to thank Gary Litman for his remarkably positive attitude, willingness to explore new ideas, and guidance throughout my dissertation work. Without his help none of this work could have been accomplished. Finally, I would like to thank my parents for their patience and help throughout my seemingly endless graduate student career.
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS AND ACRONYMS
ABSTRACT
1. INTRODUCTION
2. BACKGROUND
    The Mammalian Immunoglobulin Model
    Mammalian TCRs
    Ig/TCR Accessory Molecules and Coreceptors
    Pre-Ig/TCR Genes
    Vertebrate Systematics with Regard to the Chondrichthyans
    Rearranging Immune System Genes of Nonmammalian Vertebrates
        Immunoglobulin Heavy Chains
            The Heterodontus Prototype
            Raja as a Developmental Model
            Actinopterygians: the Ray-Finned Fishes
            Sarcopterygians: the Lobed-Finned Fishes
            Amphibians
            Reptiles
            Avians
        Immunoglobulin Light Chains
            Chondrichthyan Light Chains
            Teleosts
            Amphibians
            Avians
        Non-Mammalian T Cell Receptors
    Agnathans: The Jawless Vertebrates
    Immune System Organs and Tissues
    Evolution of Immunoglobulin Variable Region Genes
    Non-Ig/TCR Rearranging Gene Systems
    Further Questions
3. METHODS
    General Methods
        Animal Procurement
        DNA and RNA Isolation
        RT-PCR
        Library Construction
            Genomic λ Library Construction
            cDNA Library Construction
        Polymerase Chain Reaction
        M13 Cloning and Sequencing
        Probe Synthesis and Library Screening
        Analysis of DNA Sequences
    Methods of the Hydrolagus Immunoglobulin Heavy Chain Analyses
        Gene Titration
        Primers
        Cosmid Library Construction
    Methods Specific to the Chondrichthyan Light Chain Analyses
        Polymerase Chain Reaction
        Probe Synthesis and Library Screening
        VL to JL PCR Linkage Analysis
        Analysis of DNA Sequences
        GenBank Accession Numbers
    Methods Specific to the Chondrichthyan TCR Analyses
        PCR Amplification, Subcloning and Sequencing
        cDNA and Genomic Library Screening
        DNA Sequence Analysis
        CHEF Analysis
        GenBank Accession Numbers
4. HYDROLAGUS COLLIEI IMMUNOGLOBULIN HEAVY CHAIN
    Introduction
    Results
        Southern Blotting Suggests an Excess of VH Elements
        Genomic Library Screening Isolates Shark-like IgH Clusters
        Linkage of a Second Heavy Chain Type
        cDNA Library Screening and RT-PCR
        Isolation of Transcripts from a VDJ-Joined Gene
        The Excess VH Elements are Pseudogenes
    Discussion
5. CHONDRICHTHYAN LIGHT CHAIN
    Introduction
    Results
        Two Distinct VL Families are Found in Hydrolagus
        The Genomic Structure of Hydrolagus Light Chain Genes is Clustered and the Elements are Germline-Joined
        The Hydrolagus Light Chain Gene is a Member of a Second Family
        Type II Genes can be Detected in Other Elasmobranchs
        Type II Genes are Germline-Joined
        A Third Light Chain Type (III) is Present in Heterodontus
    Discussion
6. HETERODONTUS FRANCISCI TCR
    Introduction
    Results
        Isolation of Novel Variable Region Elements
        A Heterodontus TCRβ Homolog
        The Heterodontus TCRβ Gene Organization Resembles that of the IgH
        Heterodontus Vβ Family Diversity
        A Heterodontus TCRδ Homolog
        The Genomic Organization of Heterodontus TCRδ
    Discussion
7. CONCLUSIONS
    The Origins of Rearranging Genes
    Relationships Among Rearranging Genes
    The Acquisition of Novel Function and Cell Types
REFERENCES
LIST OF TABLES
Table 1. Number of Clones Positive for Various IgH Element Probes in a Screening of Approximately One Genome Equivalent of a Hydrolagus colliei Genomic Library
Table 2. Variable Region Sequences Used in Minimum Evolution Analysis
Table 3. Constant Region Sequences Used in Minimum Evolution Analysis
LIST OF FIGURES
Figure 1. Immunoglobulin Structure
Figure 2. Human Ig and TCR Gene Organization
Figure 3. Mammalian B Cell and T Cell Receptor Complexes
Figure 4. Hypothetical Vertebrate Phylogeny
Figure 5. Non-Mammalian Ig Heavy Chain Gene Organization
Figure 6. Non-Mammalian Ig Light Chain Gene Organizations
Figure 7. Relationships among the Chondrichthyans Discussed in this Text
Figure 8. Genomic Southern Blot Hybridization with Hydrolagus VH and TM Probes
Figure 9. Gene Titrations Using Hydrolagus VH and CH1 Probes
Figure 10. Restriction and Sequence Derived Maps of Hydrolagus Genomic IgH λ Clones
Figure 11. Nucleic Acid Sequence and Translation of Hydrolagus λ20182 Genomic VH Region
Figure 12. Nucleic Acid Sequence and Translation of Hydrolagus λ20501 Genomic VH Region
Figure 13. Alignment of Hydrolagus IgH Constant Region Exon Amino Acid Sequences
Figure 14. Nucleic Acid Alignment of Genomic VH Region Clones 20501 and COS5
Figure 15. Hydrolagus IgH cDNA and λ20182 (Hc182) Genomic Germline-Joined Clone Variable Region Nucleotide Alignment
Figure 16. Alignment of 20182 RT-PCR Product Sequences to the λ20182 Genomic Sequence
Figure 17. Alignment of ψVH Genomic Sequences with Germline-Joined VH from λ20182
Figure 18. Alignment of Hydrolagus colliei Light Chain Nucleotide Sequences from cDNA Clones Hc712 and Hc702
Figure 19. Southern Blot of Heterodontus francisci, Raja eglanteria and Hydrolagus colliei Genomic DNA Hybridized with Type II Light Chain Variable and Constant Region Probes
Figure 20. Hydrolagus colliei Light Chain Genomic Clone Hc717 Variable and Constant Region Sequence
Figure 21. Heterodontus francisci Light Chain Genomic Clone HflB I Variable and Constant Region Sequence
Figure 22. Nucleotide and Amino Acid Alignment of Raja erinacea Type II Light Chain cDNA Clones Rer4 and Rer5
Figure 23. Alignment of Heterodontus francisci Type III Light Chain cDNAs Hfrc4 and Hfrc5
Figure 24. Minimum Evolution Tree of Immunoglobulin Light Chain Variable Region Amino Acid Sequences
Figure 25. Minimum Evolution Tree of Immunoglobulin Light Chain Constant Region Amino Acid Sequences
Figure 26. Diagram of Chondrichthyan Light Chain Gene Arrangements
Figure 27. The Nucleotide and Predicted Amino Acid (Numbered 1-293) Sequence of cDNA Hf6
Figure 28. Alignment of the Predicted Peptide Sequence of Positions 126-293 Corresponding to the Constant Region of the Heterodontus Hf6 TCR Homolog
Figure 29. CHEF Analyses of Heterodontus DNA Using Hf6 V and C Probes
Figure 30. Heterodontus TCRβ Genomic Maps
Figure 31. Alignment of Heterodontus TCRβ cDNA Variable Region Predicted Amino Acid Sequences
Figure 32. Alignments of the Variable Region of Hf2A cDNA with Best Match from a Search of GenBank Using Fastdb (Human TCRδ, HSTCRDR)
Figure 33. Alignment of Putative Horned Shark Clones Hf191 Y and Hf2A Constant Regions with Human, Mouse, Chicken and Pufferfish TCRα, and Human and Mouse TCRδ
Figure 34. Heterodontus Genomic Southern Blots Hybridized with Hf2A cDNA Variable and Constant Region Probes
LIST OF ABBREVIATIONS AND ACRONYMS
C        Constant Region
cDNA     Complementary Deoxyribonucleic Acid
CDR      Complementarity Determining Region
CRE      Cyclic-AMP Responsive Element
D        Diversity Region
DNA      Deoxyribonucleic Acid
FIGE     Field Inversion Gel Electrophoresis
FISH     Fluorescence in Situ Hybridization
FR       Framework Region
GALT     Gut Associated Lymphoid Tissue
Ig       Immunoglobulin
IgH      Immunoglobulin Heavy Chain
IgL      Immunoglobulin Light Chain
IVS      Intervening Sequence
J        Joining Region
MHC      Major Histocompatibility Complex
mIg      Membrane Immunoglobulin
NK       Natural Killer
PCR      Polymerase Chain Reaction
RNA      Ribonucleic Acid
RSS      Recombination Signal Sequence
RT-PCR   Reverse Transcriptase-Polymerase Chain Reaction
SEC      Secretory
TCR      T Cell Receptor
TdT      Terminal Deoxynucleotidyl Transferase
TM       Transmembrane
V        Variable Region
REARRANGING GENES OF THE CHONDRICHTHYAN IMMUNE SYSTEM
by
JONATHAN PATRICK RAST
An Abstract
Of a dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy
Department of Marine Science
University of South Florida
August 1995
Co-Major Professors: Gary W. Litman, Ph.D.; Raymond R. Wilson, Ph.D.
Rearranging immune system genes appear to have originated within the early vertebrates. Phylogenetic investigations of the organization and diversity of these genes will increase the understanding of their early evolution. Immunoglobulin heavy (IgH) and light (IgL) chain gene organization differs substantially between representative elasmobranchs and osteichthyan-tetrapods. Three broad questions regarding rearranging genes of the chondrichthyans are addressed in this dissertation: (1) Is the elasmobranch IgH gene organization present in a representative holocephalan, Hydrolagus colliei? (2) What is the nature of IgL isotypic divergence within the Chondrichthyes and how are their genes related to those of osteichthyans and tetrapods? (3) Do the chondrichthyans have rearranging T cell receptors (TCR) and, if so, do they also differ in their segmental organization from the tetrapods? In answer to the first question, a distinctly elasmobranch-like, cluster-type IgH gene organization, including germline-joined genes, was found in Hydrolagus. In addition, a second IgH cluster family was identified in which the typical vertebrate CH1 exon appears to have been replaced by a duplication of the CH2 exon. In answer to the second question, chondrichthyan IgL genes are diverse. Different gene families were identified among the representative species that were investigated. Some of these IgL families cannot be classed into the Igκ and Igλ groups of modern mammals, although it seems probable that Igκ emerged before the chondrichthyan-osteichthyan divergence. The third question is answered by the findings of TCRβ- and TCRδ-like genes in Heterodontus francisci and TCRα- and TCRδ-like transcripts in Raja eglanteria. TCRβ gene organization in Heterodontus is mammal-like, although the entire complex locus appears to be present in multiple copies in the genome. The degree of Vβ family divergence and multiplicity is similar to that found in mammals. Thus, while it appears that the evolution of Ig has taken very different paths in the chondrichthyans and osteichthyans, it is likely that combinatorial mechanisms for generating diversity were present in the TCR genes of the common ancestor of these two major vertebrate groups.
Abstract Approved:
Co-Major Professor: Gary W. Litman, Ph.D., Professor, Department of Pediatrics, College of Medicine
Co-Major Professor: Raymond R. Wilson, Ph.D., Professor, Department of Marine Science
Date Approved:
1. INTRODUCTION
The emergence of multicellular life established an evolutionarily relevant self on a level above that of the individual cell and conferred an advantage to those organisms that were able to guard against foreign invasion of their tissues. The analogous problem of intrusion at the intracellular level had been addressed previously within the prokaryotes, where mechanisms such as those for the recognition and elimination of viral DNA are well recognized, and in the eukaryotic progenitors of the metazoa, fungi and plants. However, immunity at the multicellular level requires cellular self/nonself distinction in order to (1) guard against invasion by entire foreign cells and (2) eliminate self cells that put the entire organism at risk when forms of viral or intracellular parasite eradication have failed. Mechanisms of cellular self-nonself recognition were present prior to the advent of multicellularity, as is illustrated by the ability of certain amoebae to distinguish prey from other members of their own species and by species-specific mate recognition among protists. Whether any of these forms of recognition were used in the emerging metazoan immune system is unknown, although it is possible that such relationships could be inferred from the genes involved. The development of the self-nonself recognition employed in metazoan immunity may have been driven by selective advantages other than protection from pathogens. This is exemplified in the colonial hydrozoan Hydractinia, where self-nonself recognition is instrumental in
protecting against colonial invasion by the germ cells of competing strains of the same species (Buss et al., 1985), and in the colonial tunicate Botryllus (Scofield et al., 1982) and other sessile colonial invertebrates, where competition for space selects for those colonies that can distinguish self from others. A different but not unrelated driving force may have arisen from the need to maintain a structured balance between cell proliferation and death in organisms composed of up to trillions of individual cells, each with the capacity to multiply (Buss, 1987). Metazoan immune systems may have arisen from a multiplicity of selective factors and may or may not be similar among the various groups. One common manifestation of self-nonself recognition is allograft rejection, the ability to react to tissue grafts from other individuals. This phenomenon can be demonstrated in virtually every metazoan phylum, including the Porifera (Hildemann et al., 1979), and has been used to examine the variability of the immune response among diverse vertebrate representatives (Perey et al., 1968). Carefully planned experiments employing grafts with various levels of relatedness to the host (e.g., self, conspecific, and congeneric) and over various time courses can define phenomena such as recognition of self, recognition of nonself, and memory. These experiments may indicate similarities and differences among metazoan recognition systems; however, determining whether similarities are analogous or homologous may require definition at the molecular level. The metazoan immune system is best understood in the mammals, where immunity involves cooperation among numerous cell types and genetic systems, ranging from generalized physical barriers such as skin and mucus through more specialized
molecular mechanisms directed at specific types of pathogens. To varying degrees, all of these aspects of the mammalian immune system are interdependent. This dissertation focuses on the genes involved in one facet of the vertebrate immune system, adaptive immunity (i.e., the ability to recognize and "remember" precise molecular forms). Adaptive immunity in mammals is directly dependent on three classes of proteins: (1) the major histocompatibility complex (MHC) class I and II antigens, (2) the T cell receptors (TCR) and (3) the immunoglobulins (Ig). The DNA encoding two of these proteins, the TCRs and Igs, whose evolution will be the focus of this work, rearranges during the development of the individual cells in which they eventually will be expressed. While these genes share many features with other multigene families, the phenomenon of segmental rearrangement appears to be entirely unique to the vertebrate immune system. Although somatic rearrangement of DNA was once proposed as a general mechanism for cellular differentiation, it appears to be restricted to only a few other cases among the metazoa (Borst and Greaves, 1987). The origins and evolution of specific immunity are of interest because of both the uniqueness and the adaptive importance of this system. In addition, the genes of the immune system, the structures of the proteins that they encode, and the development of the cells that express immune receptors comprise one of the most fully characterized genetic systems and as such may serve as a model system for the study of the evolution of proteins by gene duplication and diversification. As will be discussed in greater detail below, the domain structure comprising the T cell antigen receptors and immunoglobulins is found in many different proteins. However, direct evidence for rearranging forms of genes in this family has only been
found among the jawed vertebrates (gnathostomes). Although this inference is based on negative findings, it seems that rearranging genes either are restricted to the gnathostomes or are present in an undiversified state in the agnathans. Whichever is the case, the evolution of the segmentally rearranging genes currently can be inferred only from genes found in the jawed vertebrates. To obtain as distant a perspective as possible relative to the mammals, the studies described here have concentrated mainly on representatives of the chondrichthyan fishes, which diverged from the lineage giving rise to the mammals approximately 450 mya during the Ordovician period (Carroll, 1988). The data reported in this dissertation are divided among three sections, each of which is concerned with a different class of immune system rearranging gene. The first section describes analyses of immunoglobulin heavy chain genes in the holocephalan fish Hydrolagus colliei. The Holocephali are a subclass of chondrichthyan fishes which diverged from the present day elasmobranchs about 350 mya (Carroll, 1988). Previously, analyses of heavy chain gene arrangements in representative elasmobranchs have shown major differences in genetic organization and chromosomal distribution relative to mammals. These studies with Hydrolagus colliei were undertaken in order to investigate the extent to which this "elasmobranch condition" is found throughout the Chondrichthyes and in an attempt to assign a character polarity to the chondrichthyan-mammalian divergence. The second section of this study is concerned with the evolution of immunoglobulin light chains among chondrichthyans and how their genes are related to those found among mammals. Specifically, this section involves an analysis of light
chain isotypic diversity and organization in three diverse chondrichthyan fishes: Heterodontus francisci, the horned shark; Raja erinacea, the little skate; and Hydrolagus colliei, the spotted ratfish. The evolutionary relationship between these light chain isotypes and the more comprehensively studied light chains of the tetrapods is discussed. The third section of this study consists of the identification and characterization of T cell receptor homologs and other Ig/TCR-related molecules in Heterodontus and Raja. For a number of reasons, T cell receptor genes have been postulated as the more primitive member of the rearranging gene family. Yet, because they display an extremely rapid evolutionary rate, isolation of corresponding genes in vertebrate species outside of the mammals has proven difficult. A unique PCR (polymerase chain reaction) strategy was developed in the course of these studies that led to the identification of T cell receptor genes in a number of non-mammalian vertebrates, including the chondrichthyan fishes, placing the origin of the TCR system at an early point in vertebrate phylogeny. To put all of this information in perspective, the next section consists of a background literature review. The final section draws together what has been learned regarding all rearranging immune system gene types in a necessarily speculative treatment of their origins and interrelationships.
2. BACKGROUND
Immunoglobulins (Ig) and T cell receptors (TCR), along with a wide variety of both functionally related and unrelated proteins, are united by a characteristic domain structure in an assembly termed the Ig gene superfamily (Williams and Barclay, 1988). A vast array of proteins are members of this family and, although they are most commonly cell surface adhesion and receptor molecules, extracellular matrix and cytoplasmic proteins are also included. The hallmark of this gene family is the "immunoglobulin fold", a protein domain composed of seven to nine β strands arranged into two β sheets that are usually stabilized by an intrachain disulfide bond (Figure 1A). Many of the proteins possessing this fold are cell-surface adhesion molecules, of which a number function in various facets of intercellular recognition in both the immune and nervous systems. Ig superfamily domains can be classified into a number of subsets, including the V- and C1-type domains (by analogy to Ig V and C; see Figure 1A), which are found in Igs and TCRs. This V/C1 division appears to represent a relatively early split in the gene family. The V-type domain is characterized by an additional pair of β strands. Outside of the vertebrates, probable members of the Ig supergene family have been isolated from a variety of metazoans including molluscs (Williams et al., 1988), insects
Figure 1. Immunoglobulin Structure. A. Ig light chain variable region (V-type, left) and constant region (C1-type, right) domains. Arrows represent β strands. The cysteine disulfide bond is shown as a black band. Complementarity determining regions (CDRs) are shown as striped turns (after Abbas et al., 1994). B. Ig monomer structure. Note that either secretory (SEC) or transmembrane (TM) regions would be found on both heavy chains. CDRs are shown as stippled regions. Framework regions (FR1-FR4) are crosshatched. CYT indicates cytoplasmic region.
(Sun et al., 1990) and sponges ([...] et al., 1994). A [...] appears to possess this characteristic fold. It has been shown that the cadherins, a family of [...], may be related to the Ig superfamily based on [...] significant sequence similarity is found to Ig domains ([...]). It remains debatable whether many of these examples, in which [...] or low, are actual homologs of the vertebrate Ig [...] phenomena, it is probable that the protein domain is considerably older than its role in the inducible recognition molecules in Igs and TCRs. Whereas a number of Ig superfamily proteins are involved in the immune system, the Igs and TCRs are unique in that their final protein [...] somatically by the rearrangement of multiple gene fragments at the DNA [...] on the functional, structural and sequence similarities between TCRs and Ig, [...] that they share a close evolutionary relationship and that any [...] origins will be facilitated through studies of the phylogenetic distribution [...] of both protein families. This section will begin with a brief summary [...] of these proteins within the mammals, then continue with a more [...] is known in the nonmammalian jawed and jawless vertebrates. Finally, [...] of the many remaining questions which this thesis [...] to differing extents
The Mammalian Immunoglobulin Model
The biology and structure of the genes that encode variable region proteins were described first, and are understood most thoroughly, for mammalian Igs. Features which are held in common between Igs and TCRs will therefore be illustrated in a description of these genes. Mammalian TCRs will then be described briefly, followed by a review of the current state of knowledge of the structure and function of these genes in the nonmammalian vertebrates. The immunoglobulin monomer protein is composed of four polypeptide chains: two heavy chains, each containing from 4 to 5 immunoglobulin domains, and two light chains, each composed of 2 domains (Figure 1B). Association between chains is stabilized by cysteine disulfide bonds. The two light chains are identical to each other, as are the two heavy chains, giving the basic immunoglobulin molecule two binding sites with identical specificities. Interestingly, the camel, Camelus dromedarius, expresses some forms of Ig as heavy chain dimers with no light chains (Hamers-Casterman et al., 1993). Immunoglobulin is often secreted in the form of multimers (dimers, tetramers, pentamers and hexamers) of this basic unit, which are held together by an additional polypeptide chain termed the J chain. The amino-terminal domain of both the light and heavy chains is termed the variable region and is the site of antigen binding. Variation within the variable region is heightened in three areas termed complementarity determining regions (CDRs). These regions are located on protruding turns between β strands (see Figure 1A). The β strands themselves, along with turns on the other side of the VH domain, make up
four more conserved structural zones termed framework regions (FR1-FR4). The carboxy-terminal domains of the Ig chains are relatively uniform among immunoglobulins of differing binding specificity and are termed constant regions. It is the constant region of the heavy chain that is involved in eliciting the various effector functions (i.e., actions induced after binding) of the vertebrate immune system, and in mammals (and probably all sarcopterygians) a number of constant region isotypes have evolved that impart different immune functions to different Ig types (see discussion below). Immunoglobulin diversity, which is necessary for binding specificity, is achieved by the recombination of a number of repeated but varied gene fragments at the somatic DNA level to form the variable region (Hozumi and Tonegawa, 1976). In the heavy chain, a VH region, which encodes most of the variable domain (including CDRs one and two), is recombined with a short DH region and a JH region. The light chain variable region organization is similar but lacks the short D region (i.e., VL-JL). Joining between these regions is mediated by conserved heptamer-nonamer sequences termed recombination signal sequences (RSSs). The heptamer sequence (consensus: CACTGTG) is separated from the nonamer (consensus: ACAAAAACC) by either 12 or 23 nucleotides of sequence that is only partially conserved phylogenetically and may play a role in joining (Ramsden et al., 1994). Joining occurs only between an RSS with a 12 nucleotide spacer and one with a 23 nucleotide spacer. The joining of these segments is inexact in that, depending on the position of the join, nucleotides may be removed by exonucleolytic activity or added in a non-templated fashion by terminal deoxynucleotidyl transferase (TdT). This nontemplated imprecision is responsible for the greatest portion of generated
diversity. Small runs of nucleotides termed P region additions that are complementary to the coding region may also be added as a result of hairpin turn formation at the coding joints during the joining process. The region encoded by this junction forms the third CDR (CDR3). The exact mechanism of recombination is just beginning to be elucidated, but genes necessary for Ig/TCR-specific recombination have been isolated (Oettinger et al., 1990). Although only a single heavy chain locus is present, it contains nine constant region types (Figure 2A; see below) and numerous variable region V, D and J elements. In humans and mice there are two light chain loci, λ and κ (Figure 2B and 2C), each located on a separate chromosome and differing in elemental arrangement. Ig is first expressed as a membrane-bound antigen receptor on the surface of B lymphocytes (B cells). Subsequent selective proliferation of those cells whose expressed immunoglobulin is reactive with foreign antigen then supplies the directional force necessary for the amplification of useful specificity. B cells can be activated either directly, by the crosslinking of their Ig receptors upon binding multivalent antigen, or indirectly, by T cells with TCRs specific for peptide antigen bound to major histocompatibility class II proteins and presented on their surface. In the latter case, the peptide antigen is obtained from the proteolytic cleavage of Ig-bound and endocytosed foreign proteins. Once activated, B cells proliferate and some may terminally differentiate into plasma cells, whose role it is to produce a secretory form of Ig that is assembled through an alternative mRNA splicing mechanism (see Figure 1B for a diagram of the TM/SEC configuration). Other B cells may remain inactive as memory cells. Ig specificity and affinity may be enhanced further in
Figure 2. Human Ig and TCR Gene Organization. Panels A-C show the IgH, Igκ and Igλ loci; panels D-G show the TCR loci. Numbers in parentheses are estimates of gene number. "ψ" indicates a pseudogene (some V and J pseudogenes are not indicated). Constant regions are shaded. Note that TCRδ is contained within the TCRα locus.
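The 12/23 spacer rule described in the text for recombination signal sequences can be illustrated with a minimal sketch. The heptamer and nonamer strings are the consensus sequences quoted in the text; the function names (make_rss, spacer_length, can_join) are hypothetical, and requiring an exact match to the consensus is a simplifying assumption, since real RSSs vary around it.

```python
# Minimal sketch of the 12/23 joining rule; names are illustrative only.
HEPTAMER = "CACTGTG"    # consensus heptamer as quoted in the text
NONAMER = "ACAAAAACC"   # consensus nonamer as quoted in the text
ALLOWED_SPACERS = (12, 23)


def make_rss(spacer_len, filler="N"):
    """Build an idealized RSS: heptamer + spacer + nonamer."""
    if spacer_len not in ALLOWED_SPACERS:
        raise ValueError("spacer must be 12 or 23 nucleotides")
    return HEPTAMER + filler * spacer_len + NONAMER


def spacer_length(rss):
    """Return 12 or 23 for a well-formed idealized RSS, else None."""
    if not (rss.startswith(HEPTAMER) and rss.endswith(NONAMER)):
        return None
    n = len(rss) - len(HEPTAMER) - len(NONAMER)
    return n if n in ALLOWED_SPACERS else None


def can_join(rss_a, rss_b):
    """Joining is permitted only between a 12-spacer and a 23-spacer RSS."""
    a, b = spacer_length(rss_a), spacer_length(rss_b)
    return a is not None and b is not None and {a, b} == {12, 23}


if __name__ == "__main__":
    rss12, rss23 = make_rss(12), make_rss(23)
    print(can_join(rss12, rss23))  # one 12-spacer and one 23-spacer
    print(can_join(rss12, rss12))  # two 12-spacers
```

Under these assumptions, a pair consisting of one 12-spacer and one 23-spacer signal passes the check, while two signals with the same spacer length do not.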
rounds of somatic point mutation (about 10^6 times the background rate) known as somatic hypermutation, which is directed at the V region. A further dimension of differentiation among mammalian (and possibly all sarcopterygian) Ig heavy chains is the ability to match multiple constant region isotypes to a predeveloped V region specificity. This is accomplished by both alternative mRNA exon splicing, in the case of IgD, and a second mechanism of DNA rearrangement that appears to take advantage of homologous recombination mechanisms, for IgG, IgA and IgE (see Figure 2A). It is noteworthy that two entirely separate mechanisms of DNA rearrangement have evolved as part of the diversifying mechanisms and transcriptional regulation of this gene system, a phenomenon which is not found in other vertebrate genes. In most B and T cells of the mouse and human, and probably in other osteichthyan and tetrapod species, only one allele of each antigen receptor chain is expressed, a process termed allelic exclusion. In this way B and T lymphocytes express receptors with only a single specificity, as opposed to four if all alleles were expressed in T cells, or the complex situation with multiply-specific molecules in the case of immunoglobulins. In addition to allelic exclusion, mammalian Ig and TCR genes are generally isotypically excluded, with only one locus (in the case of Ig light chains) or set of loci (in the case of TCRs) being expressed in a given cell. Other genes are known to be expressed in an allele-specific manner, including those for albumin and α-fetoprotein in hepatocytes (Michelson, 1993) and the vertebrate olfactory receptors (Chess et al., 1994). The mechanism by which these phenomena take place in B and T cells is currently
unknown, but while it may share some common aspects with these other examples, it is probably accomplished through a different mechanism. Lymphocytes can be found that have not undergone isotypic (Ciachino et al. 1995) or allelic (Schittek et al. 1989; Davodeau et al., 1993) exclusion and, while rare, they may help to elucidate the processes by which these phenomena take place in the majority of cells. As will be discussed later, the nature of chondrichthyan antigen receptors raises a number of problems with regard to how allelic and isotypic exclusion is accomplished, if it indeed takes place.

Mammalian TCRs

Mammalian T cell receptors are expressed on the surface of T cells, so named because their differentiation takes place in the thymus. T lymphocytes are similar to B cells morphologically and, like B cells, originate in the bone marrow. The TCR is a heterodimer of two chains, each containing a variable region- and a constant region-type domain, similar to the Ig light chain. Unlike the Ig light chain, however, each TCR peptide possesses a hydrophobic transmembrane region and a short C-terminal cytoplasmic tail similar to that of the Ig heavy chain. Two dimeric TCR types are known from studies of mammals: TCRαβ and TCRγδ. Figure 2D-G illustrates the human genomic organization of these genes. The function of TCRαβ is the better understood of the two. This TCR comprises an α chain composed of a V, J and constant region and a β chain composed of a V, D, J and constant region. The TCRα genomic locus contains multiple V and J region elements associated with a single constant region exon complex.
The genomic TCRβ locus contains multiple Vβ elements and two constant region complexes, each associated with multiple J region elements and a single Dβ element. As with Ig, there are variations in the number and positions of TCR gene elements in humans and mice. Ig and TCR genes exhibit extreme organizational plasticity, which is especially evident when the equivalent genes of nonmammalian vertebrates are considered.

As with Ig, the αβ T cell receptors utilize their diversity to recognize foreign protein antigens; however, they do so only in the context of a third set of cell surface glycoproteins, the products of the major histocompatibility complex (MHC) genes. The αβ TCRs recognize peptides that are presented on the cell surface bound to the MHC class I or class II proteins, forming a trimolecular complex which brings the T cell and antigen presenting cell into contact. Peptides are bound to the MHC molecules in a highly polymorphic pocket. MHC class I molecules present peptides that are generally of endogenous origin (i.e., are the cell's own proteins or those of a virus or intracellular parasite produced within the cell) and typically elicit a cytotoxic (killing) response from the T cell when foreign. Most nucleated cell types in the human and mouse express MHC class I proteins on their membranes. Peptides presented in an MHC class II context are usually of exogenous origin and are the result of the endocytosis and proteolytic degradation of extracellular protein. B cells, macrophages, dendritic cells and endothelial cells express class II MHC glycoproteins and are termed antigen presenting cells. In B cells and macrophages, where endocytosed material is recognized by surface Ig (in the case of macrophages this may take place through an Ig receptor intermediate), this antigen presentation phenomenon offers a direct link between TCR and Ig recognition. αβ T
cells after such an encounter may then be activated as helper cells, which in turn activate B cells with surface Ig specific for the initial antigen. In such a case the actual epitopes (sites of recognition on the antigen) recognized by the Ig and the TCR do not need to be equivalent, but merely must be physically linked to one another at the onset of endocytosis.

The genes encoding the mammalian TCRδ chain are embedded within the TCRα locus (see Figure 2D-E), probably the result of an early duplication. The precise function and mechanism of operation of mammalian γδ TCRs have not been established. It is becoming increasingly evident that T cells bearing these antigen receptors operate in a fundamentally different fashion than do those bearing αβ TCRs. If they function through an MHC-type intermediate, then it is probably not through the classical MHC proteins and may not involve a peptide/MHC intermediate (Schild et al., 1994; Rock et al., 1994). Analyses of TCRγδ CDR3 lengths versus those of other variable region proteins give some indication that antigen binding in these cells may be more closely related to that of the Ig-bearing B cells than to that of the αβ T cells (Rock et al. 1994). A recent study has implicated a TCRγδ subset in a role as epithelial cell nursemaids, and thus as having a function that is entirely unrelated to that of the αβ TCRs (Boismenu and Havran, 1994). It might be noted that the acquisition of secondary function in a system with the plasticity of the variable region proteins seems entirely plausible. Some authors have suggested that the function of these genes as immune regulators is secondary to their original function as regulators of cell proliferation (Stewart, 1992).
Ig/TCR Accessory Molecules and Coreceptors

Both membrane-bound Ig and TCR proteins have very short cytoplasmic tails that do not appear to independently perform any signaling function. These proteins instead are dependent on a number of other molecules with which they are noncovalently associated in the cell membrane. Membrane Ig is expressed with two proteins, Igα and Igβ, which possess longer cytoplasmic tails containing phosphorylatable tyrosine residues that are used in the process of cell signaling. TCRαβ and TCRγδ are associated with the CD3 γ, δ and ε and the ζ and η proteins, which also possess phosphorylatable tyrosines. Figure 3 illustrates these associations. Upon antigen binding these proteins are able to recruit various cytoplasmic signaling proteins that may eventually effect changes in the nuclear transcriptional machinery. The evolution of the antigen receptors is clearly linked to their function, and an explanation of their divergence must take into account how their interaction with the cell signaling machinery is affected.

In addition to the CD3 and ζ chain proteins, mature αβ TCRs are expressed with one of two additional proteins that interact with nonpolymorphic regions of the MHC proteins. Helper T cells express CD4, which interacts with MHC class II molecules, while cytotoxic T cells express CD8, which interacts with MHC class I proteins. Notably, each of these coreceptor chains contains Ig superfamily V-type domains. The CD8 V domains are especially TCR/Ig-like, raising the possibility that they resemble the nonrearranging ancestors of the TCRs and Igs (Davis and Bjorkman, 1988). All these chains contain tyrosine kinase recognition sites and are involved in cell signaling.
Figure 3. Mammalian B Cell and T Cell Receptor Complexes. Extracellular region is at the top; cytoplasm is at the bottom. Loops indicate immunoglobulin gene superfamily domains stabilized by cysteine bonds (s-s). Tyrosines that are part of the tyrosine kinase recognition motif are designated (Y).
[Figure 4 tree. Terminal taxa shown include Agnatha, Raja, Teleostei, Latimeria, Dipnoi, Amphibia, Reptilia, Aves and Mammalia. Numbered clades: 1, Chondrichthyes; 2, Holocephali; 3, Elasmobranchii; 4, Osteichthyes; 5, Sarcopterygii; 6, Tetrapoda.]

Figure 4. Hypothetical Vertebrate Phylogeny. Terminal taxa are listed as they are referred to in the text and do not represent equivalent levels of classification. Placement of the sarcopterygians (Latimeria and the Dipnoi) relative to the tetrapods is uncertain.
Rearranging Immune System Genes of Nonmammalian Vertebrates

Ig genes are easily recognizable as light and heavy chains in all the jawed vertebrates, and these diverged from one another before the divergence of the chondrichthyans and bony fish. Therefore, the phylogeny of each will be discussed separately. Until recently, Ig heavy chains have received the greatest attention, thus enabling a more detailed discussion of their evolution. Recent work on both chondrichthyan and osteichthyan light chain genes has shown that both their isotypic diversity and organizational plasticity are comparable to those of the heavy chains. The evolutionary relationships between heavy and light chain genes are unknown, as there is little context in which to develop hypotheses (i.e., intermediates have not been established). Although comparison to TCRs may be invaluable, it is difficult to align TCR sequences with those of Igs with confidence that evolutionary relationships are reflected in the comparisons. It is notable that all variable region-containing proteins (TCRs and Igs) are made up of dimers in which one chain possesses diversity (D) elements and the other does not. It presently is unclear whether this represents: (1) a vertical evolutionary relationship between quaternary structure in these proteins, (2) the ease of acquisition of additional recombining segments, and/or (3) functional limits in the extent of diversity that can be carried by a single antigen receptor. The relationship between Ig and TCR will be discussed in more detail in the last chapter of this dissertation. Within Igs, light chain constant regions show the greatest sequence identity to the fourth exon of the μ-type heavy chain constant region, which may reflect the origin
of Ig from a monovalent TCR-like molecule where duplication of internal chain domains (exons) within one chain ultimately resulted in its elongation into a heavy chain and a concurrent transition from a mono- to divalent state. Although the peptide chains which comprise the Igs and their corresponding genetic elements are similar in all jawed vertebrates that have been studied, the diversity of their specific sequences, their chromosomal arrangement, and the mechanisms by which they are employed to create diversity appear to be extraordinarily variable phylogenetically. In a sense, this is fortunate, as one vertebrate group may serve as the basis of a natural experiment and help to explain function in another.

Immunoglobulin Heavy Chains

The Heterodontus Prototype. Relative to the mammals, the most divergent extant jawed vertebrates are the chondrichthyan fishes, which include the elasmobranchs (sharks, skates, and rays) and the holocephalans (chimeras; see summary of chondrichthyan interrelationships below). As Chondrichthyes comprise the most phylogenetically distant taxon to man in which specific humoral immune responses to a variety of defined antigenic determinants have been characterized, species of this group have been the focus of significant efforts to elucidate immune function (Litman et al., 1985). The most intensively studied chondrichthyan, Heterodontus francisci (horned shark), does not exhibit affinity maturation and lacks fine specificity differences in the hapten-specific antibody responses among genetically unrelated animals (Makela and Litman, 1980). This
lack of affinity maturation led to the hypothesis that somatic mutation may not take place in the Ig loci of Heterodontus in the same manner observed in higher vertebrate immune responses. However, recent genetic evidence indicates that somatic mutation plays an important role in diversifying specific Ig genes, although it may not be associated with affinity maturation (Hinds-Frey et al., 1993).

The single heavy chain type described in Heterodontus corresponds most closely in its physicochemical characteristics to mammalian IgM, and is designated as μ-type. Like the mammalian counterpart, Heterodontus IgM is expressed in its secreted form as a pentamer, but also is found in high concentration in the monomeric configuration. VH gene homologs in Heterodontus were originally detected using heterologous cross hybridization with a mouse VH probe, which enabled the subsequent definition of the genomic structure of Heterodontus heavy chain loci. The Heterodontus μ-type heavy chain genes exist as closely linked VH-DH1-DH2-JH-CH clusters which are dispersed throughout the genome, at multiple loci and on different chromosomes (Amemiya and Litman, 1991). The VH, DH and JH segments are separated by ~300 base pairs and the overall length of a cluster is estimated at ~18 kb. An organizational model is shown in Figure 5A. The VH, DH, and JH elements share significant sequence identity with the corresponding mammalian elements (Litman et al., 1990); e.g., VH genes are ~60% identical at the DNA level. All of the VH genes belong to a single gene family (VHI) in which the individual members are ~90% related at the nucleotide level, with the exception of one cluster which contains the monotypic representative of a second VH gene family (VHII), which is only 60% related to VHI. The detection of other VH families by negative
Figure 5. Non-Mammalian Ig Heavy Chain Gene Organizations. A. Unrearranged V-D1-D2-J-C cluster arrangement found in Hydrolagus, Heterodontus and Raja. B. Partially rearranged VDD-J-C cluster found in Heterodontus and Raja. C. Fully rearranged VDDJ-C arrangement found in Heterodontus and Hydrolagus. D. Partially joined VD-DJ-C cluster found in Raja. E. IgX cluster found in Raja. F. Tandem repeating V-D organization found in the living coelacanth, Latimeria chalumnae. L = hydrophobic leader sequence; V = variable segment; D = diversity segment; J = joining segment; C1-C4 = constant region exons; S = secretory exon; TM = transmembrane exons. Symbols denote the decamer-nonamer TCR CRE promoter element (found only in Heterodontus) and the Ig octamer promoter element. RSS spacer lengths (12 bp or 23 bp) are indicated by triangles.
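The RSS spacer lengths marked in Figure 5 determine which segments can be joined. Under the 12/23 rule established for mammalian V(D)J recombination, joining takes place only between an RSS with a 12 bp spacer and an RSS with a 23 bp spacer. The short Python sketch below is an illustration only and is not an analysis performed in this work: it enumerates the V-to-J products permitted by that rule for a four-segment cluster. The spacer assignments are assumptions, chosen to be consistent with the joining outcomes reported in the text for the Heterodontus VHI and VHII clusters, and the names CLUSTER_VHI, CLUSTER_VHII, can_join and allowed_products are hypothetical.

```python
from itertools import combinations

# Each segment is (name, 5' RSS spacer, 3' RSS spacer); None means no RSS
# on that side. All spacer values are ASSUMED for illustration.
CLUSTER_VHI = [
    ("V", None, 23),
    ("D1", 12, 23),
    ("D2", 12, 12),
    ("J", 23, None),
]

# Assumed model of the VHII cluster: inverting the D1-D2 intervening
# sequence swaps the 3' RSS of D1 with the 5' RSS of D2.
CLUSTER_VHII = [
    ("V", None, 23),
    ("D1", 12, 12),
    ("D2", 23, 12),
    ("J", 23, None),
]


def can_join(upstream, downstream):
    """12/23 rule: the two facing RSSs must carry one 12 bp and one 23 bp spacer."""
    return {upstream[2], downstream[1]} == {12, 23}


def allowed_products(cluster):
    """List V...J products (in germline order) in which every junction obeys the rule."""
    v, *d_segments, j = cluster
    products = []
    for k in range(len(d_segments) + 1):
        for chosen in combinations(d_segments, k):
            path = [v, *chosen, j]
            if all(can_join(a, b) for a, b in zip(path, path[1:])):
                products.append("-".join(segment[0] for segment in path))
    return products


if __name__ == "__main__":
    print("VHI :", allowed_products(CLUSTER_VHI))
    print("VHII:", allowed_products(CLUSTER_VHII))
```

Under these assumed spacers the VHI arrangement permits V-D2-J and V-D1-D2-J but not V-D1-J, whereas the VHII arrangement permits V-D1-J and V-D1-D2-J but not V-D2-J. Direct V-J joining is excluded in both cases because the two facing RSSs would both carry 23 bp spacers.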
selection strategies employing CH-specific probes has been unsuccessful, as have been efforts to detect additional CH families. The DH and JH elements are remarkably similar between different clusters (Litman et al., 1993c). The recombination signal sequences (RSSs) in the Heterodontus VHI gene cluster are arranged so that either DH2, or both DH1 and DH2 (but not only DH1), may participate in VDJ joining during recombination (Litman et al., 1993a). In the VHII gene cluster the intervening sequence (including the RSSs) between DH1 and DH2 is inverted, permitting either DH1- or DH1-DH2-joining but not DH2-joining alone. Using cluster-specific probing techniques, which permit the identification of transcription products of specific gene clusters, it has been shown that VH rearrangement occurs only within and not between clusters, eliminating combinatorial diversity as a mechanism in the generation of diversity. The high level of identity between VHI family members in different clusters severely restricts this potential source of diversity. Comparisons of genomic parent cluster (VHI and VHII) sequences to their corresponding cDNA sequences indicate that both extensive deletions and N (nontemplated) additions occur in these animals during the joining process (Hinds-Frey et al., 1993). The introduction of a joining site at DH1/DH2, also seen in TCRδ (Chien et al. 1987), provides an additional site for variation which may compensate in part for the reduced levels of VH variability and limited VH family diversity in the elasmobranchs (Hinds-Frey et al. 1993).

In terms of isotypic complexity, all constant region genes are μ-type, consisting of six exons: CH1, CH2, CH3, CH4/SEC, TM1 and TM2 (Figure 5A-D). There are
significant degrees of sequence difference between the CH exons identified in individual clusters, particularly in CH1. Although the entire JH-CH intron has been sequenced (unpublished observations), there is no evidence for the presence of isotype switching sequences associated with these clusters, consistent with the existence of only one heavy chain isotype (Kokubu et al., 1988a).

A second characteristic unique to the chondrichthyans is the presence of germline-joined Ig genes. Approximately half of the germline (nonlymphoid) heavy chain genes examined thus far are partially (VD-J) or fully (VDJ) joined, lacking intervening sequences between some or all of the V region elements, respectively (Litman et al. 1990; Figure 5B-C). The VH coding sequences of these genes closely resemble nonjoined genes at the nucleotide and predicted peptide levels, with the exception of the D regions (Litman et al., 1993c), which exhibit distant relatedness to typical, unjoined D regions. These differences cannot be explained by the typical variation which accompanies V-D-J joining; instead it appears that these D segments either have diversified extensively by typical evolutionary mechanisms, sustained very high levels of regionalized somatic variation, or were formed from a different type of D segment(s). The intact leader and JH-CH intervening sequences (see below), typical JH-CH splice sites (correct reading frame in VDJ-joined genes), upstream (5') putative regulatory sequences (see below), absence of internal termination codons, and presence of additional 5' and 3' sequence identity with unjoined cluster-type genes suggest that these genes are not pseudogenes (Kokubu et al. 1988b). It is unlikely that these genes arise from B cell contamination in the tissues used to construct the genomic libraries because the incidence of such genes is high, and
identical germline-joined genes have been recovered from both liver and gonadal DNA of unrelated animals (Litman et al., 1993a). As will be described below, joined light chain genes also have been described. The existence of joined germline genes has several important implications. 1) The specificity of these genes (unlike the unjoined clusters) may be selected for entirely by conventional evolutionary processes. 2) The inherited specificities of VDJ-joined genes (or to a lesser extent VD-J joined genes), which are compromised in terms of junctional diversity, could be modified by somatic mutation and/or provide antigen binding motifs serving specialized needs. 3) These genes may be substrates for gene conversion, or serve as substrates for secondary recombination events (Kokubu et al., 1988b). Although the joined genes appear to be transcriptionally competent in Heterodontus, extensive searches for transcripts arising from joined heavy chain genes using a variety of direct screening and polymerase chain reaction (PCR)-based approaches have been unsuccessful.

Several characteristics of Heterodontus Ig gene loci are more reminiscent of mammalian TCR loci than mammalian Ig loci, supporting the hypothesis that the cartilaginous fishes possess Ig that retain characteristics of the putative Ig/TCR common ancestor. Shared properties include: 1) proximity of DH, JH, and CH segments (TCR β and δ), 2) closely linked VH, JH, and CH (murine TCR γ), 3) two DH segments closely linked to JH (murine TCR δ), 4) absence of hyperconserved regulatory octamer in VH (all TCR), 5) presence of decamer-nonamer cyclic AMP reactive element (CRE; Anderson et al., 1989), and 6) low affinity of antigen binding (Matsui et al., 1991). However, it must be emphasized that the exon organization of the μ-type heavy chain,
predicted amino acid sequence, and capacity for somatic mutation(s) are characteristic Ig properties. The finding of TCR genes in the cartilaginous fish (see below) provides a basis for defining the genetic mechanisms involved in the divergence of Ig and TCR systems.

Raja as a Developmental Model. Most of the characteristics of Heterodontus heavy chain loci that distinguish them from the osteichthyan heavy chain locus, including: 1) cluster organization, 2) germline-joined genes and 3) restriction(s) in VH family diversity, extend to the other cartilaginous fishes, including a skate of the genus Raja. Raja possesses an advantage over Heterodontus as a model organism in that staged embryos are available for developmental studies of Ig gene expression (Luer 1989). In addition, the skates and rays provide an important group for assessing diversification of the chondrichthyan Ig gene system over an extended evolutionary time period, as they diverged from the lineage leading to Heterodontus ~220 million years ago (Carroll 1988). Studies of the μ-type gene loci in the little skate, Raja erinacea, show that these clusters consist of one VH element, two DH elements, a JH element and six CH constant region exons, including the two TM exons, paralleling the Ig exon arrangement in Heterodontus (Figure 5A). Raja VH elements are related closely to Heterodontus VH elements, with 78% nucleotide identity in framework regions (FR) and 60% overall amino acid identity. Raja VH elements, like those of Heterodontus, exhibit a very high degree of nucleotide identity (~90%) between different clusters. The minimal variability of DH segments in Raja is similar to that observed in Heterodontus, but Raja JH segments are considerably
more variable (Harding et al., 1990b). Preliminary studies indicate that the μ-type heavy chain clusters are present at different chromosomal locations in Raja (Anderson et al. 1994) and in Heterodontus (Litman et al. 1993c). In Raja, germline joining occurs in this locus in ~50% of the clones characterized, although the completely joined VDJ form has not been detected (Harding et al., 1990b). The most common pattern of joining in Raja is VHDH1-DH2JH (Figure 5D) (Litman et al., 1993c).

As indicated above, analyses of Heterodontus Ig genes have established that only one constant region type is found in association with the VH genes described in this shark species. Other unrelated heavy chains may be present, but there is currently no DNA or protein evidence to suggest this to be the case in Heterodontus. However, both protein and nucleic acid analyses have identified a second heavy chain type, IgX, in Raja. There is also evidence from protein studies for a second H chain isotype in the frill shark (Chlamydoselachus anguineus), a member of a more primitive elasmobranch lineage (Kobayashi et al. 1992), suggesting that multiple heavy chain types may occur throughout the elasmobranchs. The Raja Vx elements are ~60% related in nucleotide sequence to the VH elements of Raja IgM (Harding et al., 1990a). Comprehensive analyses of both nucleotide and predicted peptide structure suggest that the IgX heavy chain gene is unique to the Chondrichthyes and unrelated to isotypes found in higher vertebrates. The IgX heavy chain loci are in the cluster configuration, consisting of one Vx, two Dx (Dx1 and Dx2), one Jx element, two Cx exons (Cx1 and Cx2) and a cysteine-rich exon which may correspond to a secretory (SEC) exon (Figure 5E) (Anderson et al. 1994). A putative
promoter sequence (TATTTAAA) is located 18 bp 5' of the start codon. The octamer (ATGCAAAT), which is found invariably 50-70 bp upstream of all higher vertebrate Ig VH and VL element transcriptional start sites (Parslow et al. 1984; Litman et al. 1993c; Atchison et al., 1990), does not occur within 1500 bp upstream of the Raja IgM or IgX start codon. The VH region elements (Vx, Dx1, Dx2, and Jx) of the IgX heavy chain locus are flanked by typical RSSs. The chromosomal distribution of Ig genes, as revealed by fluorescence in situ hybridization (FISH), suggests that, as with IgM, there are several dispersed IgX loci in the genome of Raja and that the clusters for the two isotypes are not contiguous (Anderson et al., 1994).

The presence of two Ig isotypes, both in multiple cluster form and on different chromosomes, raises questions about the regulation of Ig gene expression and the nature of allelic/isotypic exclusion in this species. IgX and IgM are co-expressed on individual embryonic spleen cells in the skate Bathyraja aleutica (Aleutian skate) but not on adult spleen cells, indicating that these loci are able to escape isotypic exclusion during at least one stage of development (Kobayashi et al., 1985). Heavy chain isotypic exclusion in mammals involves a combination of isotype switching and alternative mRNA processing, which is dependent on a linear arrangement of different CH isotype exons. Exceptions to isotypic exclusion are, however, found in mammalian B cells that express both λ and κ light chains at low levels (Ciachino et al., 1995). No sequences resembling higher vertebrate isotype switch sequences have been found in the Raja JH-CH1 intervening sequence, and the failure to detect both IgX heavy chain and IgM cluster elements at the same chromosomal loci is consistent with the absence of conventional class switching
between IgM and IgX (Harding et al., 1990a). Trans-acting factors may inhibit rearrangement of other clusters in a manner analogous to allelic exclusion, perhaps in a developmental stage-specific manner.

The control of Ig gene transcription in mammals is linked directly to rearrangement of the VH elements, which brings the octamer promoter into close proximity to an enhancer element present in the JH-CH1 IVS, resulting in transcriptional activation of the locus. Certain characteristics of the cartilaginous fish Ig loci differ from mammalian Ig loci, including cluster-type organization, multiple dispersed loci, absence of octamer, and germline-joined genes, suggesting a different mechanism of transcriptional regulation. For example, germline-joined gene expression does not appear to be dependent on somatic V region rearrangement, since the V region is already in a rearranged configuration. It is possible, however, that germline-joined genes are under transcriptional regulation that is independent of the unjoined clusters. Further study is necessary to determine the role of isotypic exclusion in the expression of IgM and IgX and its implications for specificity and autoimmunity.

Like mammalian Ig, Raja IgM gene loci generate either SEC or TM forms of heavy chains by alternative splicing and polyadenylation (Harding et al., 1990b). IgX gene loci, however, give rise to at least three different types of transcripts, which do not contain traditional TM exons (Anderson et al., 1994). The putative SEC mRNAs (VxDx1Dx2JxCx1Cx2SEC) are the fully spliced transcription products of fully rearranged gene loci. A second class of IgX mRNAs contains Vx regions from either rearranged or incompletely joined loci, correct Jx-Cx1-Cx2 splicing and ~1250 bp of 3' sequence,
downstream of Cx2, which does not include SEC. This "tail" does not contain traditional TM exons, but instead consists of additional Ig domains which have not yet been defined at the genomic level (M. Anderson, personal communication). Because Ig transcripts containing the SEC exon instead of the TM exon(s) are not expressed in high abundance until activated B cells differentiate into plasma cells (Desiderio 1993), the absence of SEC in the transcripts of incompletely joined genes may indicate a relatively early stage of B-cell development.

Actinopterygians: The Ray-finned Fishes. Ig heavy chain gene organization has been investigated in a number of teleost species and appears to be generally similar to that found in mammals. In Elops saurus, a member of a primitive teleost lineage, closely linked VH genes and pseudogenes have been identified in individual λ genomic clones (Amemiya and Litman, 1990). VH elements contain typical RSSs with 23 bp spacers and are associated with typical mammalian-like upstream octamers. Similar findings are described for the VH elements of the rainbow trout, Oncorhynchus mykiss (Matsunaga et al. 1990). In Elops, several JH elements, each having RSSs with 23 bp spacing, are <3.6 kb upstream of the first CH exon. Studies of the channel catfish (Ictalurus punctatus) genomic JH region revealed nine JH elements in a 2.2 kb region that is ~1.8 kb upstream of CH1. These JH genes are similar to each other except in their 5' regions, where sequence variation can contribute to CDR3 diversity unless it is eliminated during the joining process. Each catfish JH has a typical RSS with 22-24 bp spacing (Hayman et al., 1993). Field inversion gel electrophoresis (FIGE) mapping in Elops indicates that VH
elements are found within ~100 kb of CH. Only a single μ-type CH region isotype has been found in all teleost species thus far investigated, with the exception of the Atlantic salmon, Salmo salar, in which two very similar (98% nucleic acid identity) Cμ-type constant regions are present, consistent with the quasi-tetraploid genome of this species (Hordvik et al., 1992).

In Elops saurus two VH gene families have been identified that share about 50% nucleotide identity, and more families may be present (Amemiya and Litman, 1990). Five VH families have been identified in Ictalurus punctatus (Ghaffari and Lobb, 1991). Southern blot analyses using VH family-specific probes derived from Ictalurus indicate that VH region family diversity is present throughout the teleosts (Jones et al., 1993). DNA sequencing of isolated clones has established the existence of at least three VH families in the pufferfish, Spheroides nephelus (M. Margittai, unpublished observations), which possesses a relatively small genome (about one seventh that of the typical mammalian genome; Hinegardner 1968). It is thus reasonable to conclude that extensive VH family diversity is present among the teleosts.

The splicing of the TM exons to the third rather than the fourth CH exon is a feature of IgM gene expression which appears to be unique to the teleosts. In all of the teleosts studied thus far, including channel catfish (Wilson et al., 1990), rainbow trout (Andersson and Matsunaga 1993), Atlantic cod (Bengten et al., 1991), and Atlantic salmon (Hordvik et al., 1992), the typical consensus 5' splice site embedded in the vertebrate CH4 exon is absent, consistent with the sequences of the TM cDNAs, which demonstrate direct splicing of CH3 to TM. Genomic Southern blot and gene titration
experiments indicate that there is only one copy of the IgM-type heavy chain gene locus in teleost fish, suggesting that this type of transcript must result from alternative splicing events, rather than from the product of a related gene locus (Ghaffari and Lobb, 1989). Although CH4 is absent from the TM form, the transcript is longer than the SEC form owing to the presence of a long 3' untranslated region (Bengten et al. 1991). The loss of the CH4 exon in membrane-bound IgM does not appear to affect function, suggesting that CH4 function is specific to the SEC form of IgM, possibly in the formation of complex covalent structures (Wilson et al. 1990).

Sarcopterygians: The Lobe-finned Fishes. Among the living sarcopterygians, VH gene structure has been investigated most extensively in the coelacanth, Latimeria chalumnae, and closely linked VH elements have been identified, as in Elops. A significant portion of the VH gene population is comprised of pseudogenes, also reminiscent of those described originally in Elops (Amemiya and Litman, 1990). Each non-pseudogene VH element is associated with a typical upstream octamer and an RSS with a 23 bp spacer. However, in contrast to the typical osteichthyan arrangement, a DH element is located ~190 bp downstream of each non-pseudogene VH segment. The DH segments are flanked by RSSs with 12 bp spacers (Figure 5F). Obtaining further information on the heavy chain locus in this species has been problematic owing to the difficulty in procuring lymphoid tissue that is suitable for the extraction of mRNA. The atypical, somewhat chondrichthyan arrangement of the VH gene elements is consistent with a number of other chondrichthyan-like morphological and physiological characters found in Latimeria
(Amemiya et al., 1993); however, the IgH locus is evolutionarily plastic and the similarities to the chondrichthyans may be secondarily derived.

Studies at the protein level have shown that the African lungfish (Protopterus aethiopicus), a dipnoan, possesses three different CH isotypes (Litman et al. 1971). cDNAs encoding two different isotypes have been sequenced (J. Rast, unpublished) and the VH region associated with each CH is representative of a different family. Northern blot analyses of spleen mRNA indicate that each of these VH family-specific probes and their respective constant region-specific probes hybridize to different size bands. These preliminary findings contrast with observations in tetrapods, in which different VH families associate with different constant region isotypes through class switching. This interpretation suggests that the heavy chain locus structure in this species may be complex. Germline arrangements of heavy chain genes in Protopterus presently are unknown, as such analyses are complicated by the extraordinarily large genome of this species (~37 times that of humans). This group is of great interest in terms of future study, owing to these preliminary observations as well as its unique phylogenetic position, high haploid cellular DNA content and putative evolutionary relationship to Latimeria.

Amphibians. Like the teleosts, amphibians possess the single extended locus form of Ig chromosomal organization, including multiple VH gene families. However, the amphibians, as well as the Dipnoi and reptiles, possess multiple CH isotypes. The most extensively studied representative of this group, Xenopus laevis (South African clawed toad), an anuran, possesses three heavy chain isotypes: IgM (Schwager et al., 1988), IgX
(Haire et al., 1989) and IgY (Amemiya et al., 1989). At least two Ig isotypes, which are homologs of Xenopus IgM and IgY (Fellah et al., 1992; Fellah et al., 1993b), are found in a urodele, the Mexican axolotl (Amblystoma mexicanum). Xenopus IgM appears to be homologous to mammalian IgM, which is found throughout the radiations of jawed vertebrates; however, IgX and IgY may lack mammalian counterparts. The observations that IgM and IgY share common JH sequences, and the isolation of cDNAs containing μ- or υ-type constant regions (Wilson et al., 1992) and identical VH regions, are consistent with a class switch occurring between these isotypes. However, the IgM to IgY class switch differs from mammalian class switching (which typically occurs after antigen stimulation) in that IgM levels remain high after the appearance of IgY. There is a weak secondary response to antigen in Xenopus, associated with the IgM to IgY class switch and limited affinity maturation, which is intermediate between chondrichthyan and mammalian levels. The limitations in affinity maturation in Xenopus are not due to a paucity of genetic elements available for rearrangement. Xenopus possesses eleven highly diversified VH families, at least 17 DH elements, and at least 10 different JH elements, which may exceed the level of genomic complexity found in mammalian systems (Haire et al., 1990). The overall relatedness of the VH regions (FR1-FR3) belonging to different families ranges from 35 to 70% nucleotide identity. The Xenopus VH family members, with some exceptions, appear to be interspersed throughout the VH locus (Haire et al., 1991). Some of these families do not contain certain amino acids which are highly conserved in FR1 and FR2 of human, mouse, Heterodontus, caiman, chicken, and Elops VH genes, further
expanding the diversity of the Xenopus VH repertoire (Haire et al., 1990). In addition, Xenopus DH elements can be fused, inverted or truncated, and are used in different reading frames. There is ample evidence that the number of elements available for combinatorial diversity, the level of junctional diversity and the rate of somatic mutation are comparable to those found in mammalian systems. One explanation for the limited affinity maturation of specificities may be the lack of highly developed germinal centers in these animals (Wilson et al., 1992).

Reptiles. Studies of the caiman (Caiman crocodylus) also are consistent with an extended locus form of Ig gene organization. Caiman heavy chain gene coding segments show 65-70% nucleotide identity and 60-65% amino acid identity to mammalian VH genes, most of which is in the framework regions (Litman et al., 1983). These exhibit an RSS spacer which potentially could provide a novel recombination mode: direct joining of JH to VH, bypassing DH. Notably, in situ hybridization using heavy chain probes in the snapping turtle, Chelydra serpentina, has demonstrated the presence of multiple chromosomally dispersed loci (C. Amemiya and G. Litman, personal communication).

Avians. The Ig gene system of birds possesses an entirely unique form of Ig heavy and light chain gene organization. The chicken heavy chain locus consists of a 60-80 kb cluster of VH pseudogenes and a single functional VH gene, VH1. Several DH elements and a single JH element are located downstream. The pseudogenes lack RSSs, exhibit 5' truncation, lack leader sequences, and are in different relative transcriptional polarity.
Some of the VH pseudogenes appear to be fused to closely related potential DH regions, and some of these contain some putative JH codons (Reynaud et al., 1989). These similarities may indicate that these pseudogenes are the result of a previous germline fusion event and subsequent duplication and diversification (Ota and Nei, 1995). VDJ rearrangement involves only the VH1 gene, one of the nearly identical DH elements, and the single JH gene, providing only limited combinatorial diversity and little apparent junctional diversity. Diversification is achieved primarily when gene conversion occurs between the rearranged allele and upstream pseudogenes, which donate blocks of sequence to VH1. This sequence transfer may be partially driven by the pseudogene homology to DH and, in some cases, JH. The use of different pseudogenes as donors is not completely random, as some exchanges are preferred. Multiple exchange events result in a gradual increase in modifications with increasing cell divisions; the greatest degree of diversity accumulates in the DH region (Reynaud et al., 1989). Three constant region isotypes, IgM, IgG (IgY), and IgA, occur downstream of JH. The evolutionary relationship between these isotypes and those denoted similarly in other vertebrates is unclear, although the functional relationships appear equivalent.

Immunoglobulin Light Chains

In higher vertebrates, light chains may not participate in antigen binding to the same extent as heavy chains; however, their class diversity and organizational complexity exceed those of the heavy chains. For example, the presence of two loci (κ and λ) and
the organizational variability of the λ locus within mammals in some ways surpasses heavy chain variation within the entire osteichthyan/tetrapod lineage. As additional light chain genes in lower vertebrates are investigated, it appears that their complex structural and organizational variation may rival that found in mammals. Continuing analysis of patterns of light chain variation in lower vertebrates may shed important new light on the overall mechanisms of evolution of antigen binding receptors.

Chondrichthyan Light Chains. The prototypic Ig light chain genes in the cartilaginous fishes (type I) are arranged in clusters of closely linked VL, JL, and CL elements (Figure 6A). This light chain family has been identified in both Heterodontus and Raja. Their genes exhibit high identity at the nucleotide level to mammalian TCR β chains. However, amino acid sequence and gross constant region structure unequivocally ally these with the Ig light chains (Shamblott and Litman, 1989). There are at least forty unique type I light chain clusters in the Heterodontus genome. The multigene cluster-type arrangement of the type I light chain genes parallels the Heterodontus heavy chain genes. However, unlike shark heavy chain genes, the type I genes possess the invariant light chain octamer (ATTTGCAT) 5' of their V region start codon. No germline-joined type I genes have been detected in Heterodontus (Shamblott and Litman, 1989a). In contrast, extensive studies of the VL-JL regions in over fifty different clusters have established that all of the type I light chain genes of Raja erinacea are VL-JL germline-joined (Figure 6B) (Anderson et al., 1995). Furthermore, there is strong evidence that at least some of these germline-joined genes are expressed at the mRNA
Figure 6. Non-Mammalian Ig Light Chain Gene Organizations. A. Unjoined cluster found in Heterodontus type I and nurse shark (Ginglymostoma) and possibly Heterodontus type III (κ) clusters. B. Germline rearranged cluster arrangement of Raja type I clusters and all known chondrichthyan type II clusters. C. Catfish (Ictalurus) light chain gene cluster organization. Arrows indicate transcriptional orientation (Ghaffari and Lobb, 1993). L = hydrophobic leader sequence; V = variable segment; J = joining segment; C = constant region exon; e = Ig octamer promoter element, which may be present in catfish clusters. RSS spacer lengths (12 bp and 23 bp) are indicated by triangles. Presumably RSSs are present in Ictalurus but have not been defined.
level, suggesting that these are physiologically relevant. The existence of potentially all of the light chain genes in a germline-joined form severely limits the genetic mechanisms available for generating diversity in the antigen-combining sites. Two other cartilaginous fish Ig light chain types have been described. A light chain from the sandbar shark, Carcharhinus plumbeus (Schluter et al., 1989; Hohman et al., 1992; Hohman et al., 1993), differs considerably from that described previously in Heterodontus and Raja. This light chain type has subsequently been isolated from a number of different chondrichthyan representatives (Rast et al., 1994) and is discussed at length in the fourth section of this dissertation. A third chondrichthyan light chain type was isolated from the nurse shark, Ginglymostoma cirratum, and appears to be a homolog of the mammalian κ light chain isotype, setting the divergence of this IgL type prior to the chondrichthyan-osteichthyan divergence (Greenberg et al., 1993). VL region divergence into distinct gene families (isotypes) has been well documented in mammals (Kabat et al., 1991) and amphibians (Schwager et al., 1991). A case of such divergence in the chondrichthyans is described in the fourth section of this dissertation.

Teleosts. Genomic Southern blot and cDNA sequence data for the Ig light chains in two teleost species, the cod (Gadus morhua) and rainbow trout (Oncorhynchus mykiss), indicate multiple copies of both VL and CL region genes. Sequence analyses of light chain cDNAs and Northern blot analyses revealed several different types of aberrant light chain transcripts: 1) a rearranged and spliced cDNA(s) lacking the VL sequence upstream
of the CDR3, 2) an unspliced cDNA containing CL preceded by what is presumably JL-CL intervening sequence, and 3) a JL-CL spliced cDNA containing some VL-JL intervening sequence (Daggfeldt et al., 1993). The presence of these transcripts is consistent with transcription patterns which are found in mammalian pre-B cells. Transcription of unrearranged Ig loci may be an essential component in promoting the accessibility of the locus to recombination machinery. Genomic analyses of the catfish, Ictalurus punctatus, have shown multiple VL regions upstream and downstream of single JL and CL elements (Figure 6C). This pattern is found in multiple copies (>15) in the catfish genome. VL elements are in opposite transcriptional orientation to JL regions and presumably are rearranged by inversion rather than deletion. As in the cases described above, a number of aberrantly processed cDNAs were detected (Ghaffari and Lobb, 1993).

Amphibians. Xenopus possesses at least three types of light chains. The genes that encode two of these light chains are present at two separate loci, one comprised of Vσ and Cσ segments, the other comprised of Vρ and Cρ segments. The σ locus contains multiple Vσ1 family segments, a few Vσ2 family segments, and Cσ1 and Cσ2 segments. The Vσ elements are unusual in several regards and are comprised of 115 residues rather than the typical 110. In addition, the two cysteines that form the intradomain S-S bond are separated by 73 residues rather than the usual 65. The otherwise invariant Trp36 and Trp47 are substituted by Ile or Leu. The two Cσ regions also differ by 26 nucleotides in the coding region, resulting in nine amino acid substitutions located outside the four-stranded sheet which (in mammalian Ig) participates in CL-CH interactions. Cσ shares 29% residue identity with Cρ and 29-33% identity with the CL residues of shark, chicken or mammals, suggesting that these isotypes diverged early. However, the expressed Vσ elements are highly conserved, with little divergence in the CDR regions in outbred animals (Schwager et al., 1991). A third light chain with sequence that is more similar to typical vertebrate light chains has recently been cloned. The existence of this light chain was anticipated in protein studies (Hsu et al., 1991). This isotype appears to have multiple constant region genes (possibly in cluster form) that are associated with a number of highly diverse variable region families (R. Haire, personal communication).

Avians. Avians possess a single light chain locus which shows greatest sequence identity to the mammalian λ-type gene. The chicken Ig light chain gene locus is arranged similarly to the heavy chain locus, as the light chain locus contains 25 VL pseudogenes upstream of one functional VL, one JL, and one CL (Thompson and Nieman, 1987). Like the heavy chain genes, chicken light chain genes are diversified by gene conversion, but there is also evidence indicating that imprecise joining and somatic point mutations are involved in generation of diversity in this system. The avian light chain and heavy chain gene loci represent a clear case of parallel evolution where a unique diversity generating mechanism acts at two loci that diverged before the mechanism became operative.
Non-Mammalian T Cell Receptors

Very little is known regarding the T cell receptors in vertebrates outside of the mammals. Whereas Ig genes can be isolated relatively easily from any jawed vertebrate using typical cross-hybridization or degenerate PCR strategies, TCR genes seemed resistant to these techniques, probably because they tend to evolve at higher rates than the Ig. In addition, their members are more diversified, complicating isolation techniques that are aided by high gene copy. To date, outside of the mammals, only the chicken TCRβ genes have been identified by cross-hybridization, and this was only accomplished after screening with a panel of mammalian Vβ probes (Tjoelker et al., 1990). Unlike the chicken Ig light and heavy chain loci, the chicken TCRβ locus is generally similar to that of mammals, although it possesses fewer and less diverse Vβ segments (Tjoelker et al., 1990). A strategy that employed coprecipitation of TCR using an antibody to CD3 was used recently to isolate a TCRα homolog from the chicken (Gobel et al., 1994). Again, this locus appeared generally similar to that of the mammals. The TCRβ transcripts from the axolotl, a urodele amphibian, were isolated by a degenerate PCR strategy that took advantage of similarities between mammalian and avian constant region sequences and those of various vertebrate Ig light chains (Fellah et al., 1993a). Unfortunately, this same strategy has proven unsuccessful with a number of other vertebrate species and in any event offers no means for identifying other TCR isotype homologs (i.e., α, γ and δ). An analysis of Vβ family diversity in the axolotl demonstrates a diversity comparable to that of the mammals, and comparison of sequences
suggests that some of these Vβ families may have diverged prior to the divergence of the amniote and amphibian tetrapods (Fellah et al., 1993a). More recently, information derived from the axolotl Vβ sequences was used to construct primers that were successfully employed in the isolation of a TCRβ homolog from the rainbow trout (Partula et al., 1994). A similar but broader approach that was developed independently and used to isolate chondrichthyan TCR genes is described in section five of this dissertation.

Agnathans: the Jawless Vertebrates

The jawed vertebrates were derived at least 450 million years ago from the agnathans (Carroll, 1988), which then formed a vast assemblage but, apart from the gnathostomes, are represented today only by hagfishes and lampreys. Little is known at this time about Igs and antibodies in the surviving jawless vertebrates. These two families represent highly divergent forms, although they may share a common ancestor since their divergence from the ancestors of the jawed vertebrates (Stock and Whitt, 1992). Historically, two views have been presented as regards the nature of humoral immune recognition molecules in these species: 1) that the recognition molecules, such as found in the lamprey, were heterodimers resembling labile forms of "modern-type" Ig (Litman et al., 1993c) and 2) that inducible, specific recognition is mediated by molecules of atypical (lacking typical heavy and light chains) structure. These possibilities are not yet resolved; however, it recently was shown that the hagfish molecule (which has been
equated with the "conventional" lamprey recognition protein) possesses a heterodimeric structure with very limited peptide identity to Ig. Molecular genetic studies show that the structural resemblance is coincidental and that the Ig-like molecule is a complement component homolog (Ishiguro et al., 1992). Extensive screening of both lamprey and hagfish genomic DNA and cDNA libraries with a wide variety of Ig gene probes, and various degenerate PCR amplification strategies, have as yet failed to yield products exhibiting a significant degree of sequence identity with the highly conserved segments of higher vertebrate Ig. These findings suggest that authentic homologs of Ig genes do not exist in these species or that they are highly divergent. The recent development of a minimal degeneracy PCR amplification approach may facilitate elucidation of true Ig or TCR gene homologs. Based on the general assumption that cellular immunity precedes humoral immunity, it is possible that these species may possess TCR and not Ig, or may possess a TCR-like gene that can be expressed in an extracellular form. These possibilities will be discussed in more detail in the last section of this text.

Immune System Organs and Tissues

Based on the foregoing descriptions, it is apparent that a considerable amount of data regarding the immunoglobulin and T cell antigen receptor genes now exists for lower vertebrate species. However, in terms of fully understanding cellular mechanisms that generate and select recognition diversity as well as the nature of alternative forms of
antigen receptor genes, it is instructive to consider what is known regarding immune system organs and lymphopoiesis in these species. The phylogenetic distribution of the tissues and cells associated with B cell development and Ig production potentially can provide insight into the evolution of the immune system. Because these analyses are based mainly on histological considerations, where function is inferred by analogy to mammalian systems, care must be taken to avoid interpreting a lack of certain aspects of mammalian immunity in a divergent vertebrate group as an absence of immune system complexity. Unique adaptations of the non-mammalian immune systems may only become evident when these systems are investigated using tools of similar sophistication to those used in mammalian studies. These considerations have been reviewed extensively (Du Pasquier, 1989). In the agnathans, immune system tissue is limited to that associated with the digestive system and is termed gut associated lymphoid tissue (GALT), which is present in all vertebrates that have been studied. The thymus and spleen of the jawed vertebrates can be regarded as specialized forms of GALT. The most detailed hematopoietic studies carried out on the agnathans have employed lampreys. In the ammocoete lamprey larvae, blood production begins at isolated blood islands, but later moves to the typhlosole (a structure of the gut considered by some to represent a spleen homolog), and then to intertubular and fat regions of the nephric fold, which represent the primary regions of blood formation and lymphopoiesis throughout larval life. Involution of these tissues takes place at metamorphosis and hematopoiesis moves to the protovertebral arch in the adult (Percy and Potter, 1976). No definite thymus homolog has been described in either hagfish or lamprey.
A number of cell types that are morphologically similar to the immune cells of the jawed vertebrates have been described. Zapata et al. (1981) have reported cells that morphologically resemble plasma cells in the ammocoete larvae of Petromyzon marinus. In the hagfish, Eptatretus stoutii, two cell types that appear to be involved in mixed leucocyte reactions have been described (Raison et al., 1987). As no gene homologs of the jawed vertebrate Ig or TCRs have been identified in the agnathans, it is difficult to move beyond speculation with regard to possible relations between agnathan tissues and cells and those of the higher vertebrate immune system. All jawed vertebrates that have been studied possess a well defined thymus and spleen (Good et al., 1966). The elasmobranch thymus has been described from a number of species and is delineated by a connective tissue capsule. Typical lobate structure with cortical and medullary zonation is exhibited (Zapata, 1980). Similar thymic structure has been observed in a number of phylogenetically diverse elasmobranch species and distinct age-dependent involution appears to occur in some species (Carl Luer, personal communication). Thymic lymphocytes in the shark Heterodontus japonicus appear to lack surface Ig (Tomonaga et al., 1985). As with mammals, the elasmobranch spleen contains both erythropoietic red pulp and lymphopoietic white pulp, which appears to be the major site of [...] in these taxa. The splenic white pulp lymphatic tissue is organized around blood vessels in a fashion similar to that observed in the mammals and is rich in Ig (Tomonaga et al., 1985). Plasma cells are apparent in elasmobranch splenic tissue but are rare in the peripheral circulation (Zapata, 1980).
In addition to the thymus and spleen, the elasmobranchs possess two unique organs that histologically appear to be associated with immune function: the Leydig's organ (specialized GALT adjacent to the esophagus) and the epigonal organs (Zapata, 1981; Mattison and Fange, 1982). Both of these organs are similar in structure and contain granulocytes, lymphocytes and plasma cells. Some elasmobranch species lack either the epigonal or Leydig's organ, but at least one is always present. The holocephalan cartilaginous fishes (e.g., Hydrolagus) lack these organs, but similar lymphomyeloid tissue can be found in the ocular orbit and the roof of the mouth. Major aggregations of lymphocytes are also associated with the spiral valve of the elasmobranch intestine (Tomonaga et al., 1985). In the osteichthyans (bony fishes), the kidney provides both hematopoietic and lymphopoietic function. GALT is present and probably has important immunological function (Rombout et al., 1993). Both the thymus and spleen of the bony fishes are similar in structure to those found in mammals. Lymphopoietic bone marrow is present only in the amniotes and some anuran amphibians. Interestingly, even among the anurans, the use of bone marrow for hematopoiesis is both seasonally and phylogenetically variable. While it is increasingly evident that all of the jawed vertebrates possess Ig, both classes of MHC, and TCR genes, lymph nodes and germinal centers, which serve as sites for immune cell interactions, appear to be absent in all but the mammals and avians. The lack of significant affinity maturation in vertebrates other than the mammals and avians has been attributed to an absence of germinal centers (Wilson et al., 1992; Hinds-Frey et
al., 1993). An analysis of reptilian lymphoid tissue may indicate whether these structures have arisen independently in these two lines or from the avian-mammalian common ancestor. Inferring immune function from similarities among vertebrate tissues can be problematic; however, two points emerge. (1) The heretofore observed invariance of certain splenic and thymic characteristics among jawed vertebrates representing all classes suggests that the immune processes that take place within these structures, and at least some of their associated functions, were well established prior to the radiation of these taxa. (2) The evident ubiquity of GALT throughout the agnathans and major taxonomic groups of jawed vertebrates, and the possibility that the spleen and thymus of higher vertebrates are derivations of this tissue type, may contain clues as to the origins of the vertebrate immune system. At the same time, there is a certain flexibility with regard to the anatomical sites of lymphopoiesis. If this flexibility extends to sites of immune cell interactions, then some immune functions, analogous to those already well defined in the mammals, or other unique interactions, may go unrecognized with currently available methods. With the identification of the genes encoding Ig, TCR and MHC proteins in all the major jawed vertebrate groups, as well as the development of a method for detecting unique types of antigen receptor genes (Rast and Litman, 1994), it should be possible to significantly advance our understanding of the tissues and cellular interactions involved in the immune systems of these animals.
Evolution of Immunoglobulin Variable Region Genes

The evolution of multigene families is often characterized by non-Mendelian processes collectively termed molecular drive (Dover, 1982). These processes include unequal crossing over and gene conversion, both of which tend to homogenize the members of gene families, although in special cases, such as the chicken Igs, they may enhance diversity. The variable region genes in the mammalian VH locus appear to undergo homogenizing processes at a very low rate, if at all, compared to other gene families such as ribosomal RNA genes. In fact, evolution at the CDR regions of these genes appears to proceed in the absence of purifying selection and at a rate that is among the highest for proteins (Gojobori and Nei, 1984). Tanaka and Nei (1989) demonstrate that positive Darwinian selection occurs at the CDR regions of mammalian VH genes and invoke a diversity-enhancing selection mechanism such as overdominant selection as the major driving force in their evolution. Ota and Nei (1994) also find little evidence for gene conversion in mammalian VH genes and describe the evolution of these genes as a process of birth by gene duplication, divergence by diversifying selection, and death by dysfunctional mutation. Whether these same processes also apply to the cluster type genes of the chondrichthyans is not yet known, as comparisons of the divergence of orthologous (vertically related) gene pairs in closely related species have not been made. It is possible that, because these species have an entirely different Ig gene organization, gene conversion may play a larger role in their evolution.
Non-Ig/TCR Rearranging Gene Systems and Nonrearranging Ig/TCR-like Genes

The clear relationship between TCR and Ig variable regions leaves little doubt that they share a common rearranging ancestral gene. This allows the possibility that rearranging genes exist that emerged before the TCR-Ig divergence and can therefore be classed as neither. Recently, a new rearranging gene type that appears to be entirely different from both Igs and TCRs was isolated from the nurse shark, Ginglymostoma cirratum (Greenberg et al., 1995). This gene is comprised of a variable region and five C1-type constant region domains. Upon phylogenetic analysis, the constant region domains appear more like each other than those of Ig or TCRs. The genomic organization of these genes is cluster-like, with a V-D-D-D-J variable region configuration allowing for the generation of an extraordinary amount of CDR3 diversity. The product of these genes appears to be further diversified by an unprecedented level of somatic mutation. The gene product is expressed as a dimer in both membrane-bound and secretory forms. More recently, two nonrearranging genes with Ig/TCR-like variable region domains have been isolated: a thymocyte membrane protein from Xenopus laevis (Louis Du Pasquier, personal communication) and a cDNA encoding a transmembrane protein, isolated from the spleen of the pufferfish, Spheroides nephelus (Rast et al., 1995).

Further Questions

From the foregoing discussion it is evident that the rearranging genes of the
gnathostome immune system display an extraordinary amount of variety, both with regard to the genes that encode them and the chromosomal organization of these genes. To a large extent this plasticity is probably a consequence of their role as ligands of biological entities which are themselves dynamic in nature (see Murphy, 1993). Sequence plasticity may also result from freedom afforded by a relatively limited terminal function (Kuma et al., 1995). In addition, these genes are partially freed from the constraints of linear order and are dependent on the in situ chromosomal topology, which may allow for increased organizational flexibility. The consequences of these unique characteristics, both with regard to the evolution of the vertebrate adaptive immune system and the molecular evolution of proteins in general, will be discussed in more detail at the end of this dissertation. As demonstrated in this review, the chondrichthyans offer a valuable phylogenetic vantage point from which to view the mammalian adaptive immune system, where the genes are neither so different as to make comparison impossible, as is presently the case among invertebrates, nor so similar as to be uninformative. Sections 4 through 6 should add to our knowledge of the chondrichthyan immune system in a number of ways. Sections 4 and 5, by focusing on the holocephalans, extend investigations of Ig heavy and light chains to the chondrichthyans as a whole. Section 5 begins to address the problem of sequence relationships between tetrapod and chondrichthyan Ig light chains, a problem complicated by the multiplicity of this gene family and for which data are just now reaching the point where meaningful analyses can be made. Finally, section 6 addresses the heretofore unknown nature of chondrichthyan T cell receptors, and sheds new light on the evolution of Igs and rearranging genes in general.
3. METHODS

General Methods

Animal Procurement

Hydrolagus colliei specimens were obtained from Pacific Biomarine (Venice, CA) and Seacology (Vancouver, British Columbia). Raja eglanteria specimens were obtained from Dr. Carl Luer (Mote Marine Laboratory, Sarasota, FL), Raja erinacea specimens were obtained from the Marine Biological Laboratory, Woods Hole, MA, and Heterodontus francisci specimens were obtained from Pacific Biomarine Supply Co., Venice, CA. Tissues (blood cells, muscle, liver, spleen, gonads) were removed and processed or frozen in liquid nitrogen and stored at -70°C immediately after sacrificing animals.

DNA and RNA Isolation

DNA was isolated from liver, testes and red blood cells according to standard methods (Sambrook et al., 1989). Total RNA was isolated using the RNAzol B method (BIOTECX Laboratories, Houston, TX). Briefly, tissue was homogenized in RNAzol B
(200 mg/5 ml), after which 0.1 volume of chloroform was added and the sample was placed on ice for 5 minutes. The sample was then centrifuged at 14,000 x g for 10 minutes and the upper, aqueous phase was removed, added to an equal volume of isopropanol and placed at 0°C for 15 minutes. The precipitated RNA was then pelleted at 14,000 x g for 15 minutes and the pellet brought up in a minimal volume (1-2 ml) of RNase-free H2O. One tenth volume of 3 M sodium acetate (pH 5.2) and 2 volumes of 100% ethanol were then added, the sample precipitated, and the RNA pellet brought up at a final concentration of 1 µg/µl. Polyadenylated mRNA was isolated from total RNA using Dynal oligo-dT25 Magnabeads (Dynal, Oslo, Norway). Between 100 and 200 µl of bead slurry was washed and resuspended in 100 µl 2X binding buffer. 100-200 µg total RNA in 100 µl dH2O was heated to 65°C for five minutes, mixed with the beads and allowed to hybridize at 22°C for 5 minutes. The supernatant was then removed after magnetic separation and the beads washed 3 times in 200 µl of wash buffer. Finally, the beads were resuspended in 10 µl TE (10 mM Tris pH 8.0, 1 mM EDTA), heated to 50°C, and the mRNA collected in the supernatant.

RT-PCR

First strand cDNA was made from 3 to 10 µg of total RNA. The RNA was dissolved in 15 µl of water and 1 µl (200 ng) of an oligo-dT (T17) primer was added. The sample was incubated at 65°C for five minutes, then chilled on ice. 1 µl of RNase
inhibitor (1 U; Inhibitace, 3'-5', Boulder, CO), 2 µl 10X RT buffer (500 mM Tris-HCl pH 8.3; 400 mM KCl; 60 mM MgCl2; 10 mM DTT; 10 mM each dNTP) and 1 µl AMV reverse transcriptase (30 U) were then added and the sample was incubated at 42 °C for 1 hour followed by 52 °C for 30 minutes. The sample was then heated to 99 °C for 5 minutes, then soaked at 65 °C. 400 µl of TE was added, the sample was microfuged for 5 minutes and the supernatant transferred to a Millipore MC-10000 spin filter (Millipore). The volume was reduced to ~10 µl and 400 µl of TE was added. After a second reduction of volume to 20 µl, the sample was stored at -20 °C. One half to one microliter was then used as PCR template using primers and conditions specific to the target DNA.

Library Construction

Genomic λ Library Construction

A Hydrolagus genomic λ library of 5 x 10⁵ primary recombinants was constructed using Sau3A partially-digested liver DNA and the λ-DASH II vector (Stratagene), corresponding to 2.5 haploid genome equivalents (Ohno et al., 1969). A Heterodontus genomic library was constructed in the λDASH vector (Stratagene), corresponding to ~1 haploid genome equivalent.

cDNA Library Construction

A Hydrolagus λ-Zap (Stratagene) cDNA library (~1 x 10⁶ recombinants) was constructed from spleen poly-A+ RNA. Heterodontus and Raja erinacea spleen cDNA libraries were constructed from poly-A+ RNA in the λgt11 vector.
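The relationship between the number of primary recombinants and the genome coverage quoted for the genomic libraries can be illustrated with a short calculation. This is only a sketch under stated assumptions: the 16,000 bp mean insert size and the 3.3 x 10⁹ bp genome equivalent are the figures given later for the light chain library screening, and the probability estimate uses the standard Clarke-Carbon formula, which is not stated in the text to have been used here. Function and variable names are illustrative.

```python
# Sketch: genome coverage of a phage library (assumed inputs, see lead-in).

def genome_equivalents(n_recombinants, mean_insert_bp, haploid_genome_bp):
    """Total cloned DNA expressed in multiples of the haploid genome."""
    return n_recombinants * mean_insert_bp / haploid_genome_bp


def probability_represented(n_recombinants, mean_insert_bp, haploid_genome_bp):
    """Clarke-Carbon estimate: chance that a given single-copy sequence
    is present in at least one of N independent random clones."""
    fraction = mean_insert_bp / haploid_genome_bp
    return 1.0 - (1.0 - fraction) ** n_recombinants


# Assumed values: mean insert and genome size are borrowed from the
# light chain library description; 5e5 is the stated recombinant count.
N_RECOMBINANTS = 500_000
MEAN_INSERT_BP = 16_000
HAPLOID_GENOME_BP = 3.3e9

coverage = genome_equivalents(N_RECOMBINANTS, MEAN_INSERT_BP, HAPLOID_GENOME_BP)
p_present = probability_represented(N_RECOMBINANTS, MEAN_INSERT_BP, HAPLOID_GENOME_BP)

print(f"coverage: {coverage:.2f} genome equivalents")
print(f"P(single-copy sequence present): {p_present:.2f}")
```

With these assumed inputs the coverage comes out near 2.4 genome equivalents, in line with the value of 2.5 quoted above.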
Polymerase Chain Reaction

Unless otherwise specified, PCR mixtures contained 50 mM KCl; 1.5 mM MgCl2; 0.01% gelatin; 10 mM Tris pH 8.3; 0.2 mM of each of the four deoxyribonucleotide triphosphates; primers at 0.1 µM each; and 2.5 U Taq polymerase in a 100 µl volume, which was overlaid with mineral oil. The final cycle was extended for 10 min at 72 °C.

M13 Cloning and Sequencing

The appropriately sized product was cloned into M13 (mp18 and mp19) and sequenced using the dideoxynucleotide chain termination method (Sanger et al., 1977) with 35S-dATP.

Probe Synthesis and Library Screening

Probes were labeled by random priming (Feinberg and Vogelstein, 1983) to a specific activity of 10⁸-10⁹ cpm/µg. Libraries were screened at low stringency: 65 °C for 16 hr in 0.6 M NaCl, 0.02 M EDTA, 0.5% SDS (w/v), 0.1% Na4P2O7·H2O (w/v), 0.2 M Tris pH 8.0 (SET). Nitrocellulose or charged nylon plaque lifts were washed four times for 20 minutes each at 65 °C in 1X SSC (0.15 M NaCl, 0.015 M sodium citrate), 0.05% Na4P2O7·10H2O, 0.1% SDS and subjected to autoradiography. When probes were
heterologous, filters were screened as above except that the washing temperature was reduced to 52 °C. Southern blot hybridization conditions were as described above with a 52 °C wash temperature.

Synthetic oligonucleotide hybridizations were performed according to standard methods (Devlin et al., 1988). Briefly, lifts were prehybridized in 6X SSC, 5X Denhardt's solution (0.02% Ficoll, 0.02% polyvinylpyrrolidone, 0.02% BSA) for one hour. Hybridization was in 6X SSC, 5X Denhardt's solution, 100 µg/ml yeast tRNA, 0.05% Na4P2O7·H2O, 0.1% SDS with a probe activity of ~5 x 10⁶ cpm/ml at temperatures specific for the probe used. Lifts were washed in 6X SSC, 0.05% Na4P2O7·H2O, 0.1% SDS three times for 15 minutes each at the same temperature as hybridization. Hybridization and wash temperatures were optimized according to probe length and G/C content and generally ranged from 37 to 50 °C.

Analysis of DNA Sequences

Initial databank searches used the FASTDB program of IntelliGenetics (Mountain View, CA). Parameter settings for genomic PCR product analyses were: Similarity matrix, unitary; K-tuple = 2; Translation Frame 3; Mismatch penalty = 1; Joining penalty = 20; Gap penalty = 1.00; Window size = 32; Gap size penalty = 0.05; Cutoff score = 10. Initial matches were then sorted by optimized score. Rough alignments were achieved using the Genalign program of IntelliGenetics, Inc. (Mountain View, CA) and were subsequently optimized by manual inspection. Amino acid alignments were also made using the
multiple alignment program in CLUSTAL V (Higgins et al., 1992). The alignments were then adjusted in order to maximize identity at conventionally accepted conserved amino acids.

Methods of the Hydrolagus Immunoglobulin Heavy Chain Analyses

Gene Titration

Genomic Southern blots were run using 10 µg of restriction enzyme digested genomic DNA (3.3 x 10⁶ genome equivalents assuming a haploid genomic DNA content of 3 pg/cell). Restriction digested genomic λ clones containing the restriction fragment of interest were diluted to multiples of 3.3 x 10⁶ copies (1, 5, 10, 15, ...) and run on the same gel as the genomic DNA. The gel was then blotted and hybridized with the probe of interest. Exposures were adjusted according to the intensity of the band of interest.

Primers

Primers used in the Hydrolagus heavy chain study are as follows (shown in 5'-3' orientation; K = T or G, Y = C or T):

RFJ30RC, GTCAGTCACTGTCASCATGGKYCCTTGTCC
RF182SEC, CCAAGCTTGTGTCCATCTGGACAAGG
RF182TM, CCGGATCCACTACAACATACTC
RF82CDR2R, CCAAGCTTACAGGAGATCCCATTCCGCAAT
RATF-FR1, TGTACAACCAGCGGGTTT
HCOLLLP, TGTTGCTGGCTCTCTTCTCC

Cosmid Library Construction

Hydrolagus testes DNA was partially digested with Sau3A for different time periods, and those digests producing the greatest concentration of fragments in the 35 to 50 kb range, as determined by 0.3% agarose gel electrophoresis, were size fractionated on a sucrose gradient. Fractions that, upon determination by a second 0.3% agarose gel, contained fragments of ~40 kb were ligated into the SuperCos I vector (Stratagene; Evans et al., 1989) and packaged according to the manufacturer's instructions. Colonies were plated at 1 x 10⁴ cfu/plate on LB ampicillin plates (25 µg/ml) on nitrocellulose filters. Replica filters were made on Hybond-N+ (Amersham) filters and, after processing (Nizetic et al., 1991), were screened with Hydrolagus CH4 and VH probes.

Methods Specific to the Chondrichthyan Light Chain Analyses

Polymerase Chain Reaction

Degenerate oligonucleotides complementary to the first framework (FR1) region of the light chain variable region of the holocephalan Callorhynchus callorhynchus were constructed from amino terminal sequences reported previously (De Ioannes and Aguila, 1989). The nucleotide sequence degeneracy of this primer was partially restricted in
accordance with vertebrate codon usage.

Amino acid sequence    AVTLTQPSSISPV P G N T V K I T C
Primer sequence        GGAT CCC GGG AAY ACC CTG AAR ATC ACC TG
                       A C C
Reverse translation    CCN GGN AAY ACN GTN AAR ATY ACN TG
                       A

A BamHI restriction site (underlined) was incorporated in order to facilitate cloning. This primer was used in an anchored PCR employing a 17-mer oligonucleotide (5'-AATTAACCCTCACTAAAG-3') complementary to the T3 RNA polymerase promoter flanking the EcoRI cloning site of the λ Zap cloning vector (Stratagene) as a 3' primer and a Hydrolagus colliei spleen cDNA library (~3.3 x 10⁶ pfu) as template material. The thermal cycling protocol was 1 min, 94 °C → 2 min, 55 °C → 4 min, 72 °C for 30 cycles. The final cycle was extended for 10 min at 72 °C. The mixture was then ethanol precipitated, digested with EcoRI and BamHI and electrophoresed in low melting point agarose. The appropriately sized product was cloned into M13 (mp18 and mp19) and sequenced using the dideoxynucleotide chain termination method (Sanger et al., 1977) with 35S-dATP.

Probe Synthesis and Library Screening

Homologous VL- and CL-specific probes were made using PCR primers that were synthesized based on the sequence of the putative Hydrolagus light chain product that was obtained using anchored PCR. The VL probe is a 188 bp segment extending from the
degenerate FR1 primer described above to the conserved FR3 glutamine codon (codon 79). Amino acid codon number designations follow Kabat et al. (1991). The CL probe is a 198 bp segment extending from the first conserved constant region tryptophan codon (codon 148) to the penultimate cysteine codon (codon 214). Approximately one genomic equivalent (a total of 3.3 x 10⁹ bp of insert DNA assuming an average insert size of 16,000 bp) of a Hydrolagus colliei liver genomic library constructed in the λDASH vector (Stratagene) was screened at low stringency. A Hydrolagus colliei spleen cDNA library and Raja erinacea and Heterodontus francisci genomic and spleen cDNA libraries were screened as above except that the washing temperature was reduced to 52 °C when probes were heterologous. Southern blot hybridization conditions were as described above with a 52 °C wash temperature. To improve cross-species hybridization, probes used for Southern blotting were extended to include almost the entire constant region from Ala 112 through Cys 214 for the CL probe (309 bp) and from the degenerate FR1 primer described above through the conserved JL primer (described below) for the VL probe (280 bp). A synthetic oligodeoxynucleotide specific for the Hydrolagus colliei clone Hc712-type VL CDR2 (5'-CACTGATGGAGTGAACAGTTTCACCAGGAT-3') was used to screen the Hydrolagus genomic library as described by Devlin et al. (1988) using a 50 °C hybridization and wash temperature. A 323 bp nurse shark light chain constant region probe extending from glutamic acid (amino acid 112) to the stop codon was synthesized by PCR from genomic DNA and used to screen a Heterodontus francisci spleen cDNA library. From cDNA sequence Hftc5, a 265 bp VL region probe extending from the FR1 threonine (amino acid 18) to the
FR4 leucine (amino acid 104) was made by PCR and used to screen a genomic library.

VL to JL PCR Linkage Analysis

PCR employing conserved VL and JL primers was used to determine the VL-JL linkage distance of Hydrolagus colliei genomic light chain clones. Three PCR reactions were carried out employing: 1) a VL primer homologous to the region of the conserved FR3 cysteine (underlined) (5'-GGCCGATTATTACTGTGGT-3'), 2) the degenerate FR1 primer complementing the N-terminal amino acid sequence described above, and 3) a conserved FR2 primer terminating at the conserved proline (codon 44) (5'-CTGGTACCAGCA(G/A)A(A/G)GCC-3'), each in opposition to a conserved reverse complement JL primer (5'-TTGGTCCCTTTACC(A/G)AA-3') with the 3' end at the conserved FR4 phenylalanine (codon 98). If the template is germline-joined, then the FR1, FR2 and FR3 primers will yield products of 300 bp, 220 bp and 70 bp, respectively. In unjoined clusters, a product of ~350-500 bp would be expected if the V-J intervening sequence (IVS) is equivalent in length to the clusters in Heterodontus (Shamblott and Litman, 1989a). The PCR was performed as indicated above with a 50 °C annealing temperature.

Analysis of DNA Sequences

Nucleic acid sequences were compared against GenBank 74 and amino acid sequences against Swiss-Prot (version 23) using the IFIND program from IntelliGenetics,
Palo Alto, CA. Multiple alignments were made with the GENALIGN program (IntelliGenetics) using the Needleman-Wunsch algorithm (Needleman and Wunsch, 1970). The alignments were then adjusted in order to maximize identity at conventionally accepted conserved amino acids. Phylogenetic analyses were conducted with the minimum evolution method (Rzhetsky and Nei, 1992) with pairwise genetic distances, which were obtained from the aligned amino acid sequences using a Poisson correction. Only framework regions for VL sequences and conserved, sheet-forming regions for CL sequences were used for analysis. No positions with deletions/insertions in any sequence were included.

GenBank Accession Numbers

L25549 (Hco702), L25550 (Hco704), L25551 (Hco710), L25552 (Hco712), L25553 (Hco717C), L25554 (Hco717V), L25555 (Hco718C), L25556 (Hco718V), L25557 (Hfr1A1C), L25558 (Hfr1A1V), L25559 (Hfr1B1C), L25560 (Hfr1B1V), L25561 (HfrK4), L25562 (HfrK5), L25563 (HfrK6), L25564 (Reri02), L25565 (Rer2), L25566 (Rer4), L25567 (Rer5), L25568 (Rer900).
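The Poisson-corrected pairwise distance used for the phylogenetic analyses in this section can be sketched as follows. The proportion of differing residues p is taken over alignment columns that contain no gap in any sequence, and the distance is d = -ln(1 - p). The toy alignment below is invented purely for illustration, the function names are hypothetical, and the code is not the program actually used for the analyses.

```python
import math

GAP_CHARS = "-."


def gap_free_columns(alignment):
    """Indices of columns with no gap character in any sequence."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [i for i in range(length)
            if all(seq[i] not in GAP_CHARS for seq in alignment)]


def poisson_distance(seq_a, seq_b, columns):
    """Poisson-corrected amino acid distance, d = -ln(1 - p)."""
    if not columns:
        raise ValueError("no comparable columns")
    differences = sum(1 for i in columns if seq_a[i] != seq_b[i])
    p = differences / len(columns)
    if p >= 1.0:
        raise ValueError("distance undefined when every site differs")
    return -math.log(1.0 - p)


# Invented three-sequence alignment; column 2 holds a gap and is dropped.
toy_alignment = ["ACDEFGHIK", "ACDQFGHIK", "AC-EFGHLK"]
usable = gap_free_columns(toy_alignment)

d_ab = poisson_distance(toy_alignment[0], toy_alignment[1], usable)
d_bc = poisson_distance(toy_alignment[1], toy_alignment[2], usable)
print(f"d(a,b) = {d_ab:.4f}, d(b,c) = {d_bc:.4f}")
```

The resulting distance matrix would then be passed to a tree-building method; the minimum evolution search itself is not reproduced here.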
Methods Specific to the Chondrichthyan TCR Analyses

PCR Amplification, Subcloning and Sequencing

The sequence of a 5' primer complementing the conserved TCR and immunoglobulin light chain framework region 2 (FR2) WYRQ and related motifs is CCGAATTCTGGTA(TC)C(GA)NCA. The sequence of a 3' (A) primer complementing the conserved (YYCA) in FR3 is CGGATCCGC(GA)CA(GA)TA(GA)TA and a 3' (B) primer complementing the conserved (YFCA) in FR3 is CGGATCCGC(GA)CA(GA)AA(GA)TA. The thermal cycling protocol was 1 min, 94 °C; 1 min, 45 °C; 15 sec, 72 °C for 30 cycles. The final cycle was extended for 10 min at 72 °C. The mixture was subjected to agarose gel electrophoresis (NuSieve, FMC); appropriately sized bands were plugged with Pasteur pipets and used as templates for a second amplification using the same conditions for 25 cycles. The mixture was then ethanol precipitated, digested with EcoRI and BamHI, electrophoresed in low melting point agarose and cloned into M13 (mp18 and mp19).

cDNA and Genomic Library Screening

A Heterodontus spleen cDNA library (18) was lifted onto charged nylon filters and hybridized with the random-labeled TCR-homologous PCR product HFYYCIS2. The filters were washed at 52 °C and subjected to autoradiography. A 214 bp probe was
generated from the putative first exon of the constant region of cDNA HF6 using PCR. Approximately 0.8 haploid genome equivalents of the Heterodontus liver genomic DNA library were screened with the same probe and under the same conditions described for the cDNA library. Oligonucleotides complementing the putative HF5 transmembrane (5'-TTTGCAGATGAGGATCAGGTA-3') and cytoplasmic/3'UT (5'-GGITTGACTGATCGTCGCAA-3') exons were end-labeled and used as probes for genomic clone mapping.

DNA Sequence Analysis

Searches were made against GenBank 79. Initial matches were then sorted by optimized score. Initial alignments of CT segments of TCR gene homologs from various species were achieved using the Genalign™ program of IntelliGenetics, Inc. (Mountain View, CA) and were subsequently optimized by manual inspection.

CHEF Analysis

Agarose-embedded erythrocytes (5 µg DNA per lane) were digested for 16 hrs using 40 U of the respective enzymes. Pulsed field gels were run on a CHEF-DR II system (Bio-Rad) using a 1% LE (SeaKem) agarose gel in 0.05 M Tris, 0.05 M boric acid, 1 mM EDTA at 200 V; switch time = 60 s for 15 hr, 90 s for 8 hr. Following visualization of standards, the gel was blotted onto BioDyne B membrane (Pall), hybridized with random
nonamer-labeled VT- and CT-specific probes and exposed for five days. Between hybridizations, the blot was stripped and exposed to confirm complete removal of the probes.

GenBank Accession Numbers

U07624 (HF6cDNA), U07623 (genomic VT segment, HF2VA), U07622 (genomic JT segment, HFlJA), U22670 (Hj2A), U22671 (Hfl91Y), U22672 (Hfl91YcDNA), U22673 (Hj2AcDNA).
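The degenerate primers described in this chapter each stand for a pool of distinct sequences. A minimal sketch of how such a pool can be enumerated is given below. It assumes the parenthesized alternatives in the TCR primers are rewritten with standard IUPAC ambiguity codes, e.g. (TC) as Y and (GA) as R, so that the FR2 primer becomes CCGAATTCTGGTAYCRNCA. The helper names are illustrative only.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# FR2 (WYRQ) primer from the text, rewritten with IUPAC codes (assumption).
FR2_PRIMER = "CCGAATTCTGGTAYCRNCA"

fr2_pool = expand(FR2_PRIMER)
print(degeneracy(FR2_PRIMER), "sequences, for example", fr2_pool[0])
```

The same functions apply to the heavy chain primer RFJ30RC, which contains the K, Y and S codes.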
4. HYDROLAGUS COLLIEI IMMUNOGLOBULIN HEAVY CHAIN

Introduction

The observation of extensive organizational variation among jawed vertebrate heavy chain gene loci raises a number of questions regarding the nature of the selective factors and driving forces that have contributed to this diversification. Why does this gene system, which presumably functions similarly throughout the vertebrates, differ in ways which in some cases appear to contradict central tenets of the mammalian functional model? How is allelic exclusion accomplished, if at all, in a multilocus system? Does the regulation of joined genes differ from that of their unjoined counterparts? Are the seemingly profound differences between the chondrichthyan and osteichthyan Ig heavy chain gene systems reflective of an essentially different mode of operation, or are they merely obviated at an early level in the regulatory organization of the entire system, leaving the rest of the system unchanged? The answers to these questions represent some of the ultimate goals of this research and could eventually lead to an increased understanding of the mammalian immune system and of the processes by which complex genetic systems evolve.

As both positive and negative evidence mounts, it is beginning to appear plausible that the Ig system of the jawed vertebrates originated in a common ancestor separate from
that of the living cyclostomes (see discussion below). If this system arose near to the divergence of the osteichthyan and chondrichthyan lineages, then it may have been present as a single cluster-like locus that subsequently increased in complexity through two series of duplications, one resulting in the present day mammalian-type locus and the other in the elasmobranch multicluster loci. The explanation for this vast difference could involve instability in an emerging gene system that was "captured" at the chondrichthyan/osteichthyan divergence. Alternatively, the observation that IgH genes exhibit a considerable amount of functional and organizational diversity, even among the tetrapods, suggests that a degree of plasticity is inherent in the system. Defining the Ig heavy chain gene organization in phylogenetically diverse representative species is a first step in addressing these questions and possibilities.

The polarity of the evolutionary development of the major division in this system (i.e., whether the elasmobranch or osteichthyan system resembles the primitive state most closely) is undefined. Although the simplicity of the cluster type organization must in some ways resemble the primitive antigen receptor genes (i.e., a simple locus must have been present before it could duplicate and diverge into the complex osteichthyan-type IgH locus), the structure of the elasmobranch TCR genes, as discussed later in this dissertation, establishes that a combinatorial-type locus existed at an early stage in vertebrate evolution. Thus, the elasmobranch IgH cluster could be secondarily simplified if it originally evolved from a complex TCR-like locus. In this section the Ig heavy chain genes of the spotted ratfish Hydrolagus colliei, a representative of the chondrichthyan subclass Holocephali, are investigated as a step towards expanding the data base with which to address some of
these questions.

The holocephalans (chimeras and ratfishes) are one of the earliest diverging groups of jawed vertebrates whose immunoglobulin heavy chain gene organization has not been characterized. The phylogeny of this chondrichthyan subclass and its exact affinity to the elasmobranchs is uncertain. However, fossil evidence demonstrates the existence of holocephalans in the Devonian and a probable divergence from the common ancestor with the living elasmobranchs about 350 million years ago (Carroll, 1988; see figure 7 for chondrichthyan interrelations). The Holocephali are predominantly deep-sea species and are thought to form an outgroup to the extant elasmobranchs and be paraphyletic to the sharks as a whole (i.e., extinct and extant species; Maisey, 1984). A less widely held view maintains that the holocephalans originated separately from the other chondrichthyans (e.g., Ørvig, 1985). Regardless of their phylogenetic affinities, the holocephalans occupy a divergent phylogenetic position and the study of their Ig heavy chain gene organization should offer new insight into the evolution of the vertebrate Ig heavy chains.

Results

Southern Blotting Suggests an Excess of VH Elements

Preliminary hybridizations of Hydrolagus colliei genomic Southern blots employing Heterodontus IgH constant and variable region probes (data not shown)
Figure 7. Relationships Among the Chondrichthyans Discussed in this Text. Numbers at the bottom of the cladogram represent estimated time since divergence in millions of years.
produced band patterns that were unlike those found in either Raja (Harding et al., 1990b) or Heterodontus (Hinds and Litman, 1986). Homologous VH, CH1 and CH4 probes were isolated from Hydrolagus genomic λ clones after library screening with the Heterodontus probes, and were used in subsequent Southern hybridizations (Figure 8). Whereas the variable region probe produced a typical multiband pattern indicative of the presence of a large number of genes, the constant region screening resulted in a pattern exhibiting lower complexity using both CH1 and TM probes.

The distinct possibility that single bands were in fact representative of multiple, comigrating DNA fragments was addressed using gene titrations. A VH gene titration employing both Hydrolagus genomic clones λ20182 and λ20501 VH+ clones as standards is shown in figure 9A. In each case, the corresponding bands appear to represent between 5 and 10 genes. In figure 9B, clone 20182 is used as a CH1 standard and the corresponding genomic band appears to be represented about five times in the genome. A cosmid clone (COS2) was used in figure 9C to standardize a COS5-type CH1 screening. A secondary 3 kb band resulting from cross-hybridization with the cosmid type CH2 is evident. In each case, corresponding bands appear to represent between 10 and 20 genes. Nucleotide sequence variation among genes can cause underestimation of gene number when probe and standard sequences are identical (e.g., in the 20182 CH hybridization) and both under- and overestimations when the sequences are not (e.g., the COS5 CH2 band). Although it is likely that multiple genes are represented in the genomic constant region bands, estimation of gene number using only gene titrations needs to be interpreted cautiously. Nonetheless, there is evidently a significant discrepancy between V and C region numbers.
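The arithmetic behind the gene titration standards can be sketched as follows. The number of haploid genomes in a lane follows from the DNA mass loaded and the 3 pg haploid DNA content given in the Methods; the mass of cloned DNA required for a given copy-number standard then follows from the clone length. The 45,000 bp clone length and the average of 650 g/mol per base pair are assumptions made only for this illustration and are not values taken from the text.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MOLAR_MASS = 650.0        # g/mol per base pair (common approximation)


def genome_equivalents(dna_mass_g, haploid_content_pg):
    """Number of haploid genomes contained in a given mass of genomic DNA."""
    return dna_mass_g / (haploid_content_pg * 1e-12)


def standard_mass_ng(copies_per_genome, n_genomes, clone_length_bp):
    """Mass (ng) of cloned DNA giving the requested copies per genome."""
    molecules = copies_per_genome * n_genomes
    grams = molecules * clone_length_bp * BP_MOLAR_MASS / AVOGADRO
    return grams * 1e9


# 10 micrograms of genomic DNA at 3 pg per haploid genome (from the Methods).
lane_genomes = genome_equivalents(10e-6, 3.0)

# Hypothetical phage clone length (vector plus insert), for illustration only.
CLONE_LENGTH_BP = 45_000

for copies in (1, 5, 10, 15):
    mass = standard_mass_ng(copies, lane_genomes, CLONE_LENGTH_BP)
    print(f"{copies:>2} copies per genome: {mass:.3f} ng of clone DNA")
```

A genomic band whose intensity falls between two standards is then read as lying between the corresponding copy numbers, subject to the sequence-divergence caveats noted above.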
Figure 8. Genomic Southern Blot Hybridization with Hydrolagus VH and TM Probes. Enzymes used for digestion are SstI (S) and EcoRI (E). Size standards are designated in kb.
Figure 9. Gene Titrations Using Hydrolagus VH and CH1 Probes. Numbers on top indicate genome-equivalents that are represented by the phage digests relative to the genomic DNA digest. A. λ20182-VH probe against Hydrolagus genomic clones λ20182 and VH+ clones cut with EcoRI. B. λ20182-type CH1 probe against λ20182 cut with EcoRI. C. COS5-type CH1 probe against COS2 digested with EcoRI. Upper lighter band represents cross hybridization with CH2.
Genomic Library Screening Isolates Shark-like IgH Clusters

Genomic library screening was used both as a second means of estimating gene number and to determine linkage distance between IgH gene elements. Results consistent with those of the genomic Southern blotting were found when VH, CH1 and CH4 probes were used to screen a Hydrolagus genomic λ library (Table 1).

Table 1. Number of Clones Positive for Various IgH Element Probes in a Screening of Approximately One Genome Equivalent of a Hydrolagus colliei Genomic Library.

Probe              Positive clones
VH                 571
CH1-Type 1         13
CH1-Type 2         53
CH4                60
JH                 52
TM                 66
VH/CH1-Type 1      4

Only four CH+/VH+ clones were isolated, and all were found to be identical to clone λ20182. A map of clone λ20182 along with a putatively overlapping constant region clone, λ20457 (as determined from both mapping and sequence data), is shown in figure 10A. The λ20182 variable region gene was found to be completely VDJ joined upon sequencing, as has been described previously with Heterodontus. The sequence and translation of this gene is shown in figure 11.

When a Hydrolagus JH 30-mer oligonucleotide complementing the λ20182 JH sequence was used to screen the genomic library in conjunction with the VH probe, only
Figure 10. Restriction and Sequence Derived Maps of Hydrolagus Genomic IgH λ Clones. A. Putatively overlapping clones λ20182 and λ20457. The variable region of λ20182 is completely VDJ joined. B. Sequence derived map of λ20501 variable region elements. C. Cosmid clone COS5. Variable region is unjoined. Note scale is different than that of λ clones. D. Maps of two pseudogene containing clones, λ20420 and λ20421. Note that 20421 contains two VH elements.
CTGCAGGTGATCAGAACATCACTAAACCATAGGAAATAATTAAGGCACAATGTTAATTATTGTACATAAA ATAAACAGTTACTACAATTTAAAGAGCATTTTGTGAAGTTCCTTTAGACATTTCAATAACATGAAAAGGC ATTATGTCGAATGCAAGGTTTATGATCCAAAAGATGATTATATAAACATAAGAAAATACTAATTGCAGCA CAAAAGCCAATGGTCCATTTGGATTGCCTTCCACCATCCTGGCCGTCACACATACTTGTCATGTCCTCAA CAATATACTGTATCACAAAATATTATTTTCAAAAGTAACCCAGGGCTTCTATATTTTTCTGCTGCCACTG TCTCCCTAACTTTAACAGTCACTCACTGAAATAGGGTTCACGTAGGTTTGTGTTGAACTTACCAATTACN CCACTCACACACACACACACACATTAAAGGAAACAATTCATGAGGCTTCAGGGTCGGTGATGGGGCTAAT ACTATNGACACTTATATTAANATGAATTTGCGGTCAATTGCCGAATNGTAAATGGGCACAATTATTNGAT CTGTGCAGGTCCAAGGGAATGTGAGGCGTGACTCCCCNCAGTCCTGTAGCTATTGGTCCATTCTCCAAAC ACCTGGCCCCACTGCCTCTTGTTAAATAGCGGTCACTGGGGACTGAGGCCATCTGTCCATCCGGCAGATC M R S L I S TTTGTCCCTCGGTTATTGGGCTCAAGGGTCAGTTCTGACAATTGCCCAGAAACATGAGGTCCCTGATAT C L S L L L A L F S TCTCAGTCTGTTGCTGGCTCTCTTCTCCQIDGAGTTCTATTATTGGGACAAGTCTGACTGTGTGTGTGTG TGATTCTCTCTGCAATGTCACAGAGCCGGGATATCTCACAATAATTAATCTTTGGTCATTTGTTCTTTAC R V 0 A D V V L T 0 P S S L T G K P D 0 A L R L T C T T S G F D L S N K F M F W V R R V P G CTCACCTGTACAACCAGCGGGTTTGATCTCAGCAACAAATTTATGTTCTGGGTCCGACGGGTTCCTGGGA K G L E W L L R Y Y R R S H S D N L Y A P G I R AGGGGCTGGAATGGCTGCTCCGGTATTACAGGAGATCCCATTCCGACAATCTTTATGCGCCCGGCATCCG N R V T V I K D Y S D H I C D L T I N S L T V AAACCGTGTCACTGTCATCAAAGATTATTCCGATCACATCTGCGATCTGACCATCAACAGCCTGACAGTT 0 D T A I Y Y C A R G Y N V Y S V Y T G V L I CAAGACACCGCCATCTATTACTGTGCAAGAGGGTACAATGTATACAGTGTGTACACTGGAGTAC TCATTG D Y W G 0 G T M V T V T D GAGTTTATATTTATATGTTTATTTTATTCTCTTTAACAGCTATTTAATT TAATGACCTGTGAGTAGCCAG GCCTATCTCAGAGTTTAGTGCGAGTTAAATTCATTTCACTGCTATTGCTTTCAACAGCCACTGTTGATTT AAGGATTATTCTGTCCATGTTTAGTTCATTGCTGGTGATATTTGTGGTGGTTAGTGACTGTACAGCTGAA GTTACACTGATGGACTTTCTGCAATGTCTGATCACAGTGACTGGTGACTGCTGTTTTCGGGGTATAATGT GATCAGGATTGATAAAAGTGTTTGAAATTTTGCCCACATGTTGGGAATCTTGGCCTGAATGTTGTGGGAT GAGCGACTGGAAGTTATCTGATCGGTTCCAATATGTATTTCAATCCATTATTTAATATAATGAAAGT 80 L v DJ Figure 11. 
Nucleic Acid Sequence and Translation of Hydrolagus Clone λ20182 Genomic VH Region. Splice donor and acceptor sites are shown in reverse image. Note that the clone is completely VDJ joined. L = first leader exon; V = variable region; DJ = position of putative diversity and joining region.

10% of VH+ clones were also found to be JH+. A sequence derived map of one such VH+/JH+ genomic segment, clone λ20501, is shown in figure 10B; the organization of VH, two DH and JH segments is typical of that found in elasmobranchs. RSS spacing was as
AAGCTTGTGATTTCAGTAAATGCAGGGACTGAGAACAGAGAGGATTATTAATWAGTGGACTGAAGGCTGA ATTCATTCCCATTATCTTCCAATCCAGAGATAATTATTGTGGCCACGTTTGAC ACCGAC ACAAACACTAA 81 AATAGACTTCCTCAAATGTAAGAGAACAGAGAGGAAAGTAGGAGTAAGGACACATCAACAAGTGATAAAA TAGAAAAAAAGAGATTAAAATTGCCAGGAACATAATTATTTTTCATTATAAATATATCACATCAAAATAC CAGTTGAGGTAACAAGAGGATTCATGCCTTCATTCCCATATATATGTGAAATCAGAATTGGAAGAGCAGT AAACAGGGAATTATCATCTGCAGGTGATCAGAACATCACTCTACCATAGGAAATAATAAATATATTAAT T AACCAGGATATTAATTATTGCTCATAAAATACACAGTCACTACACTTTACAATGCATTCTGTGCTGGTTC CTTCAGACAATTCATCGACATAATGAGGTGTTGTGTCAAATGCACAAAGTTTATGATAATAACATTTATT ATTATATGAGCATAAGAAAATAAGAATTGCGGCAGGACAAAATAACAAGGTCCATCTGGATTGGCTCCCA GCATCCTGTCCTAAACAATATATTGTATCACACAATAGTATTTCAGACAGTCACCTATCTCATTTATTTT TAAGATGAATCGACACTTTCTGCTTCCAACGTCCCCTAACTTTAACCATTGCTCACTGAATCAGTGTTCG CTAGTTGTGTTGAACATAACCTAATCCACACACACACACAAACACATTCAAGGAAACAATTCATGAGGCT TCAGGGTCAGTGCTGGGGCTAATACTATTGACACTTATATTAATATGAAGTTGCTGTCAATGCAGAATGT AAAGGGTACAATTATTGATCTGTGCAGGTCAGAGGGAATGTGAGGCGTGACTCCCCTCAGTCCTGTAGCT GTTGGTCCATTCTCCAAACACCTGGCCCCACTGCCCCTTGTTAAATAGCGGTCACTGGGGACTGAGGCCA TCTGTCCATCCGGCAGTCTTTGTCCCTCGGTTATTGGGCTCAAGGGTCAGTTCTGACAATTGCCCAGAAC M R S L I S L S L L L A L F S L I S L S L L L ATGAGGTCCCTGATATCTCTCAGTCTGTTGCTGGCTCTCTTCTCCCTGATATCTCTCAGTCTGTTGCTGG A L F S CTCTCTTCTCCCm8CGAGTTTCTATTATTGAGACAAGTCTGACTGTGCGTGTGTGTGGGTGTGATTCTCT R V 0 A D V V L T 0 P S S L T G K P G 0 A L R I T C GGCCGATGTGGTGCTGACACAGCCATCGTCTCTAACCGGGAAGCCTGGCCAGGCCCTGAGAATCACCTGT R T S G F 0 L S K Y G M N W V W 0 V P G K W L AGAACCAGCGGGTTTGATCTCAGCAAATATGGCATGAACTGGGTCTGGCAGGTTCCTGGGAAGTGGCTGG E W L L Y I S Y T S A S N S R Y T P 0 I 0 S R V AATGGCTGCTCTACATTTCGTACACATCTGCGTCCAACAGTCGTTACACACCTGACATTCAAAGCCGTGT TVNSNNSNNI IOLNIKSPTGHNT CACTGTTAACAGTAATAATTCCAATAACATCATCGATCTGAACATCAAAAGCCCAACAGGTCACAACACC A I S Y C A S 0 AGGAAAGGCAGCCCCTCCCACCCTAAACTGAAGGGCAGTTATGGTGTCAATGCTGTGTCTGTAGCAAAGA TTAAAGAAAAAGCTTAAGATGGATTTCTTATCATTCCAAACCAATACAGTGAACGGTGGTTTGATGGGGG l)l 
GGACATATTTGTTTTCTGTTTCTGTGGCTGTGAAGAAAGTGTTAGACAGACTCTGTGCCATGCCACAGGA AAATAGTCCATTTCAGTGAATGTTGGCAGCCAGTATAGAGCAGACACTGAGGTdAGTATGTG]ATGGGAC TGT G TICAC TGTGITATACGGGGC]CACAGCATT AACTGTCAA J1ACAAAAACdCTTGAACACCAAACCAGAT !)2 AATCTCTCCCACTGCTGACGTGTGTGGGTTCCAATTAACGTGAGGAGATTTGATATGTTTATAATGGGGC CTAACACTGAGGATTTTACCAGATTTTAATCGACATAAAAACATAAAATATAATCTTGAATCTGTGGGTG ACAAGTGAAGGTGTTTGACGGGAACATTGCAAACTGGCAAAGATACAAACATTTAACAGCTAAACACAGA GCTGTGCAATTGTTTTCAGAACATTACAGAACTATACGGTTCCCGGTATTTAGTTTGGGGAAATCACCAT Y L D C W G 0 G T M V T V T N J ACTCACTCACAATGCTGAGTTTGGATTTATATGTTTATTTTATTCTCTTTAACAGCTATTTAATTTAATG ACCTGTGAGTTSCAGGGCTATCTCAGAGTTTAGTGCAAGTTAAATTCATTTCACTGCTATTGCTTTCAAC AGCCACTGTTGATTTAAGGATTATTCTGTCCATGTTTAGTTCATTGCTGGTGATATTTGTGGTGGTTAGT GACTGTACAGCTGAAGTTACACTGATGGACTTTCTGCAATGTCTGATCACGGTGACTGGCGGCTGCTGAT TTTGGGGTATAATCAGGAATTGTATAAAAGTGTTTGAAACTTTGCCCACATGTTGGGAATCAGAGCCTGA ATGTTGTGGGATGAGCGACTGGAAGGTTATCTGATCGGTTCTAATATGTATTTCAATCCGTTATTTAATA

Figure 12. Nucleic Acid Sequence and Translation of Hydrolagus λ20501 Genomic VH Region Clone. L = first leader exon; V = variable region; D1 and D2 = diversity regions; J = joining region. DH regions are underlined. Recombination signal sequence heptamers and nonamers are boxed. Splice donor and acceptor sites are shown in reverse image.
described previously in Heterodontus (Kokubu et al., 1988b) and Raja (Harding et al., 1990b), putatively allowing the same rearrangement configurations. The sequence of clone 20501 is shown in figure 12. Three other VH+/JH+ clones were partially sequenced and found to be very similar to λ20501, both in the position of VH elements and in sequence.

Interestingly, CH4, TM, and VH/JH containing clones appear to be present in nearly equal numbers (see Table 1). The slightly lower numbers of JH clones may reflect both the increased difficulty of oligonucleotide hybridization and greater variation in JH sequence. It is notable that in all cases when elements from these clones were sequenced they were found to be genuine Ig elements. Therefore, spurious hybridization does not appear to be a factor in these estimates of gene number.

Linkage of a Second Heavy Chain Type

A Hydrolagus cosmid library was constructed in the SuperCos I vector (Stratagene, La Jolla, CA) in order to establish linkage between VH/JH elements and the constant region exons, which could not be achieved within the size limit of λ genomic cloning (~18 kb). A number of clones were isolated that were both VH and CH4 positive. Three representative clones were analyzed which did not hybridize to the CH1 probe derived from the 20182 ratfish prototype. Sequencing of the exons of one of these clones (COS5) and partial sequencing of two others (COS1 and COS2; data not shown) demonstrate that variable region and certain constant region exons differ significantly
from the prototypic 20182/20457 cluster, especially at exons CH1 and CH2. The COS5-type CH1 exhibits greater identity with the CH2 of the other Hydrolagus, Heterodontus and Raja IgH sequences than with the CH1 sequences of the typical chondrichthyan IgH clusters (Figure 13). A map of cosmid clone COS5 is shown in figure 10C. It is possible that the COS5-type heavy chain is the result of a duplication of the more typical CH2 exon and that the original CH1 has been lost or diverged to the point where cross hybridization with the λ20182 CH1 probe (which exhibits interspecies cross hybridization with Heterodontus CH1 exons) does not take place. The implications of this replacement in terms of the Ig structure are significant and are discussed below.

The variable region sequence of COS5 heavy chain-type cDNAs most closely resembles that of the previously described VH unjoined genes isolated from the λ library (e.g., λ20501) and the typical heavy chain transcripts that were isolated from the cDNA library (e.g., 20442; see figure 15 below). The variable region of COS5 contains two D regions and a single J region with typical RSSs. A 398 bp insertion is present in the COS5 DH region sequence when compared to that of clone λ20501 (Figure 14). This insertion is found at the point corresponding to the 3' end of the 20501 D2 3' heptamer and spans the region between the 3' end of the COS5 3' D1 heptamer and the 3' end of the 3' D2 heptamer. Thus, D1 in COS5 corresponds to D2 in 20501 and the COS5 D2 is found at the end of the insertion. The regions corresponding to the 20501 D1 RSSs in COS5 appear to be incapacitated by point mutations, while the putative 20501 D1 region itself is identical between the two sequences. As no COS5 specific transcripts could be recovered using specific RT-PCR, the possibility remains that this gene is a pseudogene, although there are no overt flaws in its coding sequence.
Figure 13. Alignment of Hydrolagus IgH Constant Region Exon Amino Acid Sequences. Hc442 is from cDNA 20442, Hc182 is from VDJ joined genomic clone λ20182 and HcCOS5 is from genomic cosmid clone COS5. All sequences are compared to 20442. Dashes indicate identity with 20442. Spaces indicate gaps introduced to optimize alignment. The CH4 associated secretory signal region is boxed. HcCOS5 CH1 is aligned to both CH1 and CH2 (boxed) to emphasize its CH2-like sequence. Triangle indicates position of conserved CH1 cysteine.
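The alignment convention used in Figure 13 (a dash marks identity with the reference row, a space marks a gap) can be made concrete with a short sketch. The function names and the ten-column toy rows below are hypothetical and are not taken from the figure; the sketch only illustrates how a dash-notation row can be written out in full and scored against its reference.

```python
def expand_row(reference: str, row: str) -> str:
    """Write out an aligned row in full by copying the reference residue
    into every column where the row shows a dash (identity)."""
    if len(reference) != len(row):
        raise ValueError("aligned rows must have the same number of columns")
    return "".join(ref if ch == "-" else ch for ref, ch in zip(reference, row))


def percent_identity(reference: str, row: str) -> float:
    """Percent of columns in which the row matches the reference.
    Columns holding a gap (space) in either sequence are not counted."""
    compared = 0
    matches = 0
    for ref, ch in zip(reference, expand_row(reference, row)):
        if ref == " " or ch == " ":
            continue  # skip gap columns
        compared += 1
        if ref == ch:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy ten-column example (hypothetical residues, not taken from Figure 13).
reference = "QPSPPVVLLT"
row = "--A--I- -S"
print(expand_row(reference, row))
print(round(percent_identity(reference, row), 1))
```

Skipping gap columns is one of several possible conventions for percent identity; counting gaps as mismatches would give lower values, so the choice should be stated whenever such figures are compared.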
Figure 14. Nucleic Acid Alignment of Genomic VH Region Clones λ20501 and COS5. Dashes (-) indicate identity except in RSSs, where both sequences are written out. Asterisks (*) indicate gaps introduced to optimize the alignment. VH and JH coding elements are boxed. D regions are underlined. The VH second cysteine codon is marked with a C. The JH splice donor site is shown in reverse image.
cDNA Library Screening and RT-PCR

In an attempt to isolate additional divergent constant region isotypes which may have been associated with the excess genomic VH hybridizing elements and to identify new VH gene families, a Hydrolagus spleen cDNA library was constructed in the λZap vector (Stratagene, La Jolla, CA) and screened using the Hydrolagus 20182 VH and CH4 probes. Clones that were either VH-/CH+ or VH+/CH- were targeted. Although a number of clones of these types were identified, VH-/CH+ clones were found to represent 5' truncated clones containing no VH region and VH+/CH- clones to represent unjoined VH transcripts without an associated constant region. Thus, although these analyses do not exclude the presence of heavy chain loci that are extensively diversified in both VH and CH sequence, they show that, unlike the osteichthyans and tetrapods, extensive VH gene family or constant region isotype diversification is absent (within the ~60% boundary of nucleotide divergence that can be recognized by this technology), at least in genes that are expressed at levels detectable by cDNA library screening.

It was not possible to detect spleen cDNAs corresponding to the COS5-type gene by library screening (~4 x 10^5 pfu). However, when gene specific oligonucleotide probes were employed, corresponding transcripts were recovered by the more powerful method of RT-PCR (see sequences in figure 15). The domain structure of the constant region of these cDNAs was as predicted from the COS5 genomic sequence, with two CH2-like domains and no recognizable CH1 domain.

Junctional diversity was analyzed by sequencing the products of PCR reactions
utilizing primers complementing conserved FR1 and constant region sequences (Figure 15). Also included is the corresponding region of germline joined clone 20182. As in other vertebrates, variation appears to be concentrated in the CDR regions. Interestingly, although the COS5-type genes are unique in their constant region sequence, no unique sequences separate their variable region genes from those of the more typical IgH sequences (Hc442, Hc441 and Hc182). The length of the CDR3 junction among these genes varies by greater than a factor of two, and it is notable that the germline joined gene (Hc182) possesses the longest CDR3.

Isolation of transcripts from a VDJ joined gene

The clones that were isolated from the cDNA library contained VH regions that were related most closely to the 20501-type gene. Clones hybridizing with a specific oligonucleotide corresponding to the 20182 CDR2 were absent. It should be noted, however, that the typical spleen cDNA screening of 6 x 10^5 pfu identifies only ~150 CH4+ clones. If joined genes perform specialized, antigen specific functions, then their expression may be expected to be limited in both tissue-specific and temporal manners, and may not be present at the levels examined in a cDNA screening (i.e., typically only 100 to 200 IgH constant region clones are identified in a library screening, requiring expression levels of 0.5 to 1% of total IgH message for a cluster to be detected). To address this problem, an RT-PCR strategy employing both 5'-UT and CDR2 specific primers directed against both SEC and TM specific primers was devised. An alignment
Figure 15. Hydrolagus IgH cDNA and 20182 (Hc182) Genomic Germline Joined Clone Variable Region Nucleotide Alignment. All sequences are compared to cDNA 20442. Dashes indicate identity with 20442. Hc442, Hc441 and Hc2659 are typical of spleen cDNAs. Hc182 is a germline joined gene. Sequences beginning with C are cDNAs associated with COS5-type constant regions. Spaces in the sequences represent gaps inserted to optimize alignment. All nucleotides are shown in DH junctional regions (boxed). No attempt was made to align D region nucleotides.
[Figure 15 alignment. Rows: Hc442, Hc441, Hc2659, Hc182, C3436, C3433, C3431, C3435, C3434, C3432, C3430. Regions: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.]
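The detection limit quoted for cDNA library screening (100 to 200 IgH constant region clones per screen, hence 0.5 to 1% of total IgH message) follows from simple arithmetic: a cluster must contribute at least one clone among those recovered. The sketch below restates that arithmetic; the function names are hypothetical, and the Poisson estimate of the chance of seeing at least one clone is an added assumption about sampling, not something stated in the text.

```python
import math


def min_detectable_fraction(positive_clones: int, min_hits: int = 1) -> float:
    """Smallest share of total IgH message that would be expected to yield
    at least `min_hits` clones among `positive_clones` recovered clones."""
    if positive_clones <= 0:
        raise ValueError("positive_clones must be positive")
    return min_hits / positive_clones


def prob_at_least_one(positive_clones: int, fraction: float) -> float:
    """Chance of recovering one or more clones from a cluster that makes up
    `fraction` of the message, assuming Poisson sampling (an assumption)."""
    return 1.0 - math.exp(-positive_clones * fraction)


for n in (100, 150, 200):
    threshold = 100 * min_detectable_fraction(n)
    chance = prob_at_least_one(n, 0.01)
    print(f"{n} clones: threshold {threshold:.2f}% of message; "
          f"P(detect a 1% cluster) = {chance:.2f}")
```

Under these assumptions, a cluster sitting exactly at the threshold is still missed in a sizable share of screens, which is consistent with the motivation given for turning to RT-PCR.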
of the sequence of two of these amplified products, representing both the secretory and transmembrane transcripts of the 20182-type gene, with the germline exon sequences taken from λ clones 20182 and 20457 is shown in figure 16. In both cases CDR3 junctional sequences were identical to the prototypic germline gene. While the TM transcript constant region sequence was identical to the germline sequence, three differences are evident in the SEC transcript. These may have resulted from reverse transcriptase and Taq error or allelic/isotypic variation.

The Excess VH Elements are Pseudogenes

Four VH+/JH- genomic λ clones were isolated and their variable region hybridizing elements were subcloned and sequenced. The sequences of these VH genes are truncated by comparison to 20182 and presumably represent pseudogenes. An alignment of three of these genes with the λ20182 VH is shown in figure 17. ψVH sequence identity ends abruptly about midway through the VH coding sequence. All three ψVH sequences contain in frame stop codons shortly downstream of this point. The sequences of Hc420ψVH and Hc421LψVH are nearly identical to each other and to λ20182 VH, extending at least several hundred bases upstream of the start codon. The sequence of Hc421SψVH is very similar to all of these sequences beginning about 24 bp past the λ20182 VH start codon. Notably, ψVH λ20421 was the only clone identified that contained two V hybridizing elements. A restriction enzyme map of clones 20420 and 20421, from which these sequences were derived, is shown in figure 10D. In order to determine if the
Figure 16. Alignment of 20182 RT-PCR Product Sequences (top) to 20182 Genomic Sequence (bottom). Both secretory (SEC) and transmembrane (TM) type cDNA sequences are shown. Dashes indicate identity to genomic clone. Intervening genomic sequence is indicated by ellipses (...).
[Figure 16 alignment. Rows: RT-PCR products SEC3173 and TM3171 (top) and the 20182 genomic sequence (bottom), beginning in the VDJ region.]
Figure 17. Alignment of ψVH Genomic Sequences with Germline Joined VH from λ20182. Rows: Hc182VH, Hc420ψVH, Hc421LψVH, Hc421SψVH. Dashes indicate gaps. Asterisks under the alignment represent conservation throughout all four sequences. Region of greatest identity is boxed. Translational start site (M), splice junctions (spl) and conserved second VH region cysteine motif (YYC), which is missing in the pseudogenes, are designated. Triangle indicates point at which pseudogenes lose identity to λ20182 VH.
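The observation that the ψVH sequences carry in frame stop codons can be checked mechanically once a reading frame is fixed. The following is a minimal sketch, assuming the standard stop codons (TAA, TAG, TGA) and a frame set by a caller-supplied start offset; the function name and the toy sequence are hypothetical and are not drawn from Figure 17.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def first_in_frame_stop(seq: str, start: int = 0):
    """Return the 0-based index of the first stop codon in the reading frame
    that begins at `start`, or None if the frame stays open to the end."""
    seq = seq.upper()
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Toy sequence (hypothetical): ATG GCT TGG TAA GGC
toy = "ATGGCTTGGTAAGGC"
print(first_in_frame_stop(toy))           # frame set by the ATG
print(first_in_frame_stop(toy, start=1))  # shifted frame
```

For a genomic VH element the frame would have to be carried across the leader intron before this scan is meaningful, so in practice the spliced coding sequence, not the raw genomic sequence, would be supplied.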
truncated nature of these genes could be generalized to all or most of the VH+/JH- genomic λ clones, 24 of these clones were used as templates for a PCR from a conserved FR1 primer to a conserved FR3 primer. In all but one of the 24 clones no product was formed, consistent with most of the clones representing the type of pseudogene described above. The single non-pseudogene clone was shown to represent a clone in which the JH region was excluded in the cloning process (i.e., the VH element mapped near a λ arm).

Discussion

Although unique in many ways, the Ig heavy chain loci of Hydrolagus colliei are unequivocally elasmobranch-like in their sequence, cluster-type organization, multiplicity and in the presence of joined representatives. These molecular findings are consistent with a close phylogenetic relationship between the Holocephali and the Elasmobranchii, although a separate jawed vertebrate ancestor from the elasmobranchs and osteichthyans technically cannot be excluded because the cluster-type IgH gene arrangement may represent a primitive character lost in the osteichthyans. Sequence similarity would tend to argue against a separate origin from the elasmobranchs. Unfortunately, data from the outgroup Agnatha or more divergent groups would be necessary to properly analyze these relationships. In any event, it is clear that the observed cluster-type arrangement was established at an early point in chondrichthyan evolution, as was the use of two DH regions in each locus. The increased junctional variation associated with the incorporation of two DH regions into the IgH CDR3 should exceed that lost by the lack
of combinatorial diversity relative to the osteichthyan IgH locus, at least with regard to the potential number of different IgH transcripts that can be produced. Of course, the osteichthyan combinatorial system produces diversity throughout the entire VH region, whereas the increase in diversity associated with the additional DH region is concentrated only in CDR3. It is possible that the use of multiple DH elements may be allowed only in systems with little variation in the other VH elements if an upper limit to diversity is defined by a limiting number of B cells and the need for a certain threshold number of cells carrying a specificity in order to mount a timely immune response (see theoretical discussion of limits in repertoire size in Cohn and Langman, 1990).

Recently, Ig light chain transcripts were found in the skate, Raja erinacea, that have identical sequence to genomic joined genes (Anderson et al., 1995). Although joined and partially joined heavy chain genes are found throughout the elasmobranchs, no RNA transcripts from these loci have been identified. Their ubiquity and apparent lack of sequence differences that would signify them as pseudogenes lead to the conclusion that they are functional; however, in the case of Heterodontus, a number of properties of these genes have confounded the isolation of joined gene transcripts. Specifically, horned shark joined IgH loci are numerous and remarkably similar to both one another and to their unjoined counterparts, complicating cDNA library screening and RT-PCR strategies. The Hydrolagus colliei 20182 joined VH gene was an ideal candidate for transcript isolation as it exhibits only ~75% identity in both variable and constant region nucleotide sequence with the more typical cDNA sequences that presumably derive from unjoined loci (assessed by sequence similarity to known unjoined loci). In addition,
cDNA and 20182 VH sequence alignments contain multiple gaps (see Figure 15) that were exploited in the design of gene-specific PCR primers. cDNAs that were nearly or entirely identical to the 20182/20457 germline sequence were recovered by RT-PCR (Figure 16). As suggested by the 20182 VH and CH1 gene titrations, similar genes are probably present in 5 to 10 copies per haploid genome. The three differences between the SEC3173 cDNA sequence and the Hc182/457 genomic sequence probably reflect inter-locus or allelic variation. PCR or reverse transcriptase errors, although possible, are unlikely to be the cause of all three differences as they are confined to one transcript, while none appear in others. Whether or not these genes derive from λ20182/20457 is relatively unimportant in light of the fact that perfect VH, CDR3 and JH matches suggest that, at the very least, a very similar and probably functionally identical joined gene is transcribed. The fact that TM and SEC transcripts were recovered suggests that the gene may function like other Igs.

The CDR3 of the Hc182-like transcripts is the longest of the Hydrolagus cDNA sequences (see Figure 15). This could exemplify an additional advantage of germline joined genes. The high number of possible CDR3 peptide sequences of this length may be beyond the diversity that the system can usefully employ. Time limitations for mounting an immune response probably set a lower bound for the number of cells with a particular specificity. An upper limit of useful diversity may be defined by the total number of B cells divided by the threshold number of cells necessary to mount a timely immune response. Above this level of diversity, not enough B cells are present with any one specificity to mount a timely immune response. However, this is by no means a
simple situation, and phenomena such as immunological memory, population considerations, and the nonrandom nature of the joining process and resultant Ig specificities, which are to some extent influenced by Darwinian selection, can increase the usable or apparent diversity. In addition, the necessary conditions are probably different for each antibody-antigen interaction, complicating generalizations. It is interesting that all known antigen receptors appear to be made up of a VJ and VDJ pair of polypeptides. It is tempting to speculate that two VDJ chains would exceed the diversity that the system is able to accommodate, although structural constraints as well as feedback and regulatory considerations could be more significant.

The COS5-type Hydrolagus IgH transcripts are unique in their lack of a typical CH1-type domain. Structural evolution of immunoglobulins occurred through domain duplication and subsequent diversification (Hunkapiller and Hood, 1989). The COS5 Ig gene offers an example of a relatively recent duplication and a diversification which possibly is still in progress. The sequences of the more C-terminal CH3 and CH4 domains of COS5 are very similar to those of the typical Hydrolagus IgH cDNAs. The VH regions of cDNAs with the COS5-type constant region exhibit no obvious ubiquitous nucleotide substitutions relative to the VH region sequences of the more typical λ20182, λ20442 and λ20441 VH sequences. One of the VH genes (C3432) is nearly an exact match to the λ20442 sequence, yet possesses the COS5-type constant region. To determine whether this results from a recent divergence from the typical Hydrolagus IgH or involves non-Mendelian processes such as exon shuffling or gene conversion would require a more in depth investigation.
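The upper limit of useful diversity described earlier in this discussion (total B cells divided by the threshold number of cells needed per specificity) can be illustrated numerically. The sketch below is hedged accordingly: the cell counts are purely illustrative placeholders, not measured values, the function names are hypothetical, and treating all 20 amino acids as equally likely at every CDR3 position is a simplifying assumption that overstates realized diversity.

```python
def max_useful_diversity(total_b_cells: int, threshold_cells: int) -> int:
    """Upper limit on distinct specificities: total B cells divided by the
    number of cells needed per specificity for a timely response."""
    if total_b_cells < 0 or threshold_cells <= 0:
        raise ValueError("cell counts must be non-negative and the threshold positive")
    return total_b_cells // threshold_cells


def shortest_length_exceeding(bound: int, alphabet_size: int = 20) -> int:
    """Shortest peptide length whose number of possible sequences
    (alphabet_size ** length) is greater than `bound`."""
    length = 0
    possible = 1
    while possible <= bound:
        length += 1
        possible *= alphabet_size
    return length


# Illustrative placeholder values only (not measurements from this study).
bound = max_useful_diversity(total_b_cells=10**8, threshold_cells=100)
print(bound)
print(shortest_length_exceeding(bound))
```

With these placeholder values the number of possible peptides passes the bound at a length of only a few residues, which is the sense in which a long CDR3 could offer more sequence possibilities than the system can usefully sample.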
The implications of the replacement of a CH1 exon by a CH2 exon could be significant in terms of Ig quaternary structure. In a typical Ig heterodimer the CH1 exon interacts with the IgL constant region, while the CH2 exon interacts with the same from the corresponding IgH chains. Typically the CH1 contains a cysteine which forms a disulfide bond with the penultimate or antepenultimate cysteine of the light chain constant region. Such a cysteine is missing from the sequence of the COS5 CH1. This raises the possibility that a transition from a typical Ig structure to a heavy chain dimer resulted as a consequence of the replacement of CH1 with a CH2-like exon. It is difficult to predict the repercussions of such a transition, but the gene appears to have duplicated and diversified and is transcribed from a multitude of loci. This may represent one of the best examples of a recent duplication of an Ig domain and the subsequent divergence of the resultant gene. As such, it should offer a unique opportunity for further study. Notably, an IgG of camels appears to be devoid of light chains and is also missing the CH1 (Hamers-Casterman et al., 1993).

As discussed in the background section, while an octamer promoter is located upstream of the start codon of all studied osteichthyan and tetrapod heavy and light Ig variable region genes, it is present only with the light chain genes of the elasmobranchs so far observed. The next chapter will demonstrate that this is also the case for the light chains of Hydrolagus. The heavy chain genes of Heterodontus have a decamer nonamer, cyclic AMP reactive element (CRE) just upstream of their start codons. This same element is found upstream of mammalian TCRβ variable regions. Hydrolagus has neither an octamer nor the CRE element upstream of its VH genes within about 600 bp of the
three genes that have been looked at in detail at the genomic level (20182, 20501 and COS5). Although the exact consequences of the phylogenetic variability of promoter elements in these cases are unclear, such flexibility could play a role complementary to the variation in gene structure and should be considered with regard to the evolution of this system.

Although pseudogenes are common in mammalian Ig systems, making up 40 to 50% (Matsuda et al., 1993) of VH genes in humans, they have proven extremely rare in the elasmobranchs studied, with only one example in >500 genes analyzed from Heterodontus. In Hydrolagus, pseudogenes appear to be prevalent, yet rather than representing the disabled members of diverse gene subfamilies, they are very similar to one another and appear to be the result of a massive duplication of a DNA fragment containing a part of a VH region that is nearly identical to the 20182 VDJ germline joined gene. It is unclear whether these gene fragments are used in the generation of IgH diversity as are those of the chicken. However, it seems unlikely that their similarity to 20182 VH and each other would allow for the generation of diversity. As is evident in the next section, more typical pseudogenes are not uncommon in the light chain clusters of the chondrichthyans.

By extending the multicluster-type IgH gene organization to all the extant chondrichthyans, this investigation sets a minimal time of 350 MYA for the emergence of this organizational form. The existence of joined genes in this species also demonstrates the widespread nature of this phenomenon. As in the osteichthyan system, the chondrichthyans have diversified the gene cluster. Both Raja and Hydrolagus are now
known to possess multiple constant region types and it is possible that other non-cross-hybridizing gene clusters remain to be found.

The questions outlined in the beginning of this section will not be answered by broad scale phylogenetic surveys alone. Such investigations are, however, an efficient means to uncover unique circumstances which may serve as models upon which to test hypotheses. Two such opportunities arise from the investigation of Hydrolagus: (1) while expression of joined heavy chain genes has been extremely difficult to study in the elasmobranchs, the divergent nature of Hydrolagus VH gene 20182 makes the demonstration of its expression a relatively straightforward task, and (2) while it is generally believed that Ig gene superfamily evolution consists of domain duplication and divergence, examples of recent duplications involved in the generation of novel genes are rare. The Hydrolagus COS5-type IgH can now be studied as such an example. While broad scale investigations may be criticized as merely descriptive, they both create the questions that subsequent hypotheses will address and uncover the natural experiments that can best be used to test these hypotheses.
5. CHONDRICHTHYAN LIGHT CHAIN

Introduction

Immunoglobulin heavy chain gene organization has been investigated in representatives from most major vertebrate groups. The study of Hydrolagus in the last section, along with numerous other investigations targeting phylogenetically diverse jawed vertebrates, has illustrated a major organizational division between the chondrichthyans, in which segmental elements are arranged in numerous independent clusters, and the bony fishes and tetrapods, where extended tandem arrangements are found (Litman et al., 1993c). The phylogeny of light chain genes has been studied less thoroughly. In the four elasmobranch species so far investigated, the horned shark, Heterodontus francisci (Shamblott and Litman, 1989a; Shamblott and Litman, 1989), the sandbar shark, Carcharhinus plumbeus (Schluter et al., 1989; Hohman et al., 1992), the nurse shark, Ginglymostoma cirratum (Greenberg et al., 1993), and the little skate, Raja erinacea (Anderson et al., 1995), light chain genes are organized in clusters reminiscent of the elasmobranch heavy chain gene organization. To investigate whether this arrangement might be present in the entire class Chondrichthyes, this chapter focuses on the light chain gene structure of Hydrolagus colliei. Initial observations of sequence similarity between the light chain found in Hydrolagus colliei and that reported for Carcharhinus plumbeus
and the relative dissimilarity between these and that previously reported for Heterodontus francisci led to the hypothesis that the Hydrolagus light chain type may be widespread throughout the cartilaginous fishes. Using the Hydrolagus light chain gene as a probe, homologs of this light chain type were identified in both Heterodontus francisci and Raja erinacea, representing a second distinct light chain type in each of these species. In addition, cDNAs from a third light chain type in Heterodontus francisci that resemble the kappa-like light chains isolated from the nurse shark, Ginglymostoma cirratum, have been cloned and sequenced. Collectively, these data suggest that a significant level of light chain gene diversification was present at an early stage in the phylogenetic diversification of the jawed vertebrates.

Results

Two Distinct VL Families are Found in Hydrolagus

An anchored PCR strategy using the Hydrolagus colliei spleen cDNA library as template yielded amplification products of 1.0 and 0.8 kb as judged by agarose gel electrophoresis. DNA sequencing demonstrated that the different band sizes correspond to light chain cDNAs with different lengths of 3' untranslated sequence; however, a spleen RNA Northern blot that was hybridized with the Hydrolagus light chain probes produced only a single band of about 1.2 kb (data not shown). Thus, the PCR heterogeneity is considered a priming artifact. Homologous VL- and CL-specific probes, derived from the
message isolated by PCR, were then used to screen a spleen cDNA library. A number of clones (Hc702, Hc704, Hc710) were sequenced. Both Hc702 and Hc710 are full length and include some 5' untranslated sequence. Hc704 is truncated within FR2 of the variable region but contains a complete JL and constant region. Most VL-/CL+ clones contain truncated CL fragment inserts (probably cloning artifacts); however, one VL-/CL+ clone was isolated (Hc712) that contained the complete coding region of a cDNA with a different type of VL region. A nucleic acid sequence alignment of clones Hc702 and Hc712 is shown in figure 18.

The Genomic Structure of Hydrolagus Light Chain Genes is Clustered and the Elements are Germline Joined

To characterize the germline organization of these genes, approximately one genomic equivalent (2 × 10^5 pfu) of a Hydrolagus colliei liver genomic library was screened with the homologous VL- and CL-specific probes at reduced stringency; these hybridized with 166 and 204 clones, respectively. Sixty-nine of these clones hybridized with both probes. Genomic Southern blotting was utilized as a second method to assess genomic complexity. Both VL- and CL-specific probes hybridize to multiple bands, consistent with library screening results (figure 19). Two genomic VL+/CL+ λ clones were isolated and the VL- and CL-containing regions were subcloned and sequenced (Hc717 and Hc718). Both clones are VJ germline-joined (i.e., there is no intervening sequence separating VL and JL). The
[Figure 18: aligned nucleotide and predicted amino acid sequences of cDNA clones Hc712 and Hc702; regions labeled LEADER, FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4 and CONSTANT REGION]

Figure 18. Alignment of Hydrolagus colliei Light Chain Nucleotide Sequences from cDNA Clones Hc712 and Hc702. Framework, CDR and constant regions begin where indicated. Hc702 FR4 nucleotides shown in bold are deleted in Hc717 (see Figure 3). Stop codons are indicated by periods (.). Hc702 amino acids shown in bold represent positions used in minimum evolution analysis.
Figure 19. Southern Blot of Hydrolagus colliei (Hc), Heterodontus francisci (Hf) and Raja eglanteria (Re) Genomic DNA Hybridized with Hydrolagus Type II Light Chain Variable and Constant Region Probes. Position and size (in kilobases) of size standards is shown at left.

variable region of clone Hc717 possesses a typical leader sequence with a 109 bp intron. Two octamer motifs (ATTTGCAT) are found 109 and 136 bp upstream of the putative start codon; however, no well-defined TATA homology is evident. Clone Hc717 is considered a pseudogene as the constant region contains an in-frame stop codon and the JL region has a frameshift caused by an insertion. The sequence of the variable region
of clone Hc718 is truncated in FR1 due to a cloning artifact but extends in a correct reading frame (based on identity with cDNAs) through JL. PCR from the VL to CL regions reveals that these elements are separated by 2.6 kb and 2.0 kb in clones Hc717 and Hc718, respectively, similar to earlier findings in Heterodontus (Shamblott and Litman, 1989b). The nucleic acid sequences of clone Hc717 VL and CL segments are shown in figure 20. VL to JL linkage distance was assessed in 21 additional VL+/CL+ clones using a PCR strategy. Of these, 18 gave products consistent in size with germline-joining using various FR1, FR2, and FR3 variable region primers and a conserved JL primer. The remaining three clones did not yield products even though the PCR thermal cycling scheme was designed to amplify large products, and known unjoined chondrichthyan light chain genes would be expected to produce fragments of between 500 and 800 bp. Either these three clones possess a different genomic organization or the JL primer is not sufficiently homologous to prime the PCR. Thus at least ~85% (18 of 21) of the clones analyzed are in germline-joined configuration; however, some of these may represent duplicate clones. Among the Hydrolagus genomic and cDNA sequences, constant region amino acid sequence identity varies from 83 to 85%, consistent with the presence of multiple copies of a single CL type. In contrast, two distinct VL types have been identified. The variable region of cDNA Hc712 exhibits only ~49-55% amino acid identity to the other four genes sequenced if only FR1-3 regions are considered. Overall nucleotide similarity between Hc712 VL and the other four VL sequences ranges from 71-73%. Variable region amino
[Figure 20: nucleotide sequence of genomic clone Hc717 with predicted amino acid translation; regions labeled LEADER, FR1, CDR1, FR2, CDR2, FR3, CDR3 and FR4; the variable and constant region segments are separated by 2150 bp of undetermined intervening sequence]

Figure 20. Hydrolagus colliei Light Chain Genomic Clone Hc717 Variable and Constant Region Sequence. Octamer motifs and polyadenylation signal sequence are shown in bold type. Slashes indicate splice donor and acceptor sites. Stop codons are indicated by periods (.). Note stop codon within constant region coding sequence. A deletion in the joining region (*; see Figure 1) causes a translational frameshift.

acid (FR1-3 excluding CDRs) and overall nucleotide sequence identities among the four similar genes range from ~85-92% and ~94-97%, respectively, defining a closely related
VL gene family. In order to determine the genomic arrangement of the Hc712-type VL family, an oligonucleotide complementing the Hc712 VL-type CDR2 was used to screen the Hydrolagus colliei genomic library. A number of CL+ clones that were negative for the Hydrolagus VL probe but positive for the Hc712-type CDR2 probe were detected. V-J and V-C linkage were analyzed in two of these clones by PCR, employing the CDR2-specific oligonucleotide in opposition to the conserved JL oligonucleotide or a conserved CL oligonucleotide as primers. The sizes of the fragments generated indicate that both genes contain pre-rearranged V and J segments. V-C linkage distances in these two clones were ~1.3 kb and ~1.8 kb, smaller than those determined for clones Hc717 and Hc718.

The Hydrolagus Light Chain Gene is a Member of a Second Chondrichthyan Light Chain Family

A complete GenBank database search indicated that the Hydrolagus light chain nucleic acid sequences are most similar to a light chain identified previously in the sandbar shark, Carcharhinus plumbeus (Schluter et al., 1989). At the predicted amino acid level, CL identity is 69-83% and VL identity is 65-70% between the Hydrolagus colliei and Carcharhinus plumbeus light chain genes, excluding the VL of clone Hc712, which is only ~48% identical and, as indicated above, represents a unique VL family. In contrast, the Hydrolagus sequences are significantly less related, at the predicted amino acid level, to the VL and CL segments of the Heterodontus francisci light chain gene, i.e., CL identity is only ~44-47% and VL identity is only ~55-57%. In addition, the
Hydrolagus CL genes are characterized by an additional amino acid N-terminal to the first cysteine (codon 134) and by only a single predicted amino acid between the final cysteine (codon 214) and the stop codon, as opposed to two in the Heterodontus and Raja genes. The Hydrolagus light chain defines a gene family that we here designate as type II (the Heterodontus family described previously is type I).

Type II Genes can be Detected in Other Elasmobranchs

The high degree of sequence identity between the Hydrolagus colliei and Carcharhinus plumbeus light chain sequences suggests that the type II light chain gene family was present in the elasmobranch ancestor, in that it was present in the common lineage giving rise to the holocephalans and elasmobranchs. In order to further determine the phylogenetic distribution of this light chain gene type, Heterodontus francisci and Raja erinacea genomic and spleen cDNA libraries were screened with Hydrolagus colliei light chain probes. In Heterodontus, a spleen cDNA library that yielded multiple clones positive for a Heterodontus type I light chain probe failed to yield positives with the Hydrolagus (type II) probe. However, screening of a genomic library yielded multiple clones that hybridized with both Hydrolagus VL and CL probes. Two of these genomic clones were isolated and the corresponding VL and CL regions were subcloned and sequenced (HflA1 and HflB1). In both clones, the VL region is V-J joined as in Hydrolagus colliei. The VL and CL regions were shown by restriction mapping to be separated by 2900 and 3800 bp, respectively. Clone HflA1 contains a stop codon within
the constant region and probably is a pseudogene. Both clones possess octamer motifs and putative TATA homologs 130 bp and ~120 bp upstream of the start codon, respectively. The sequences of the VL and CL regions from clone HflB1 are shown in figure 21. A Heterodontus francisci genomic Southern blot hybridized with Hydrolagus colliei VL and CL probes is shown in figure 19.

Type II Genes are Germline-Joined

Both spleen cDNA and genomic clones in the little skate, Raja erinacea, hybridize with the Hydrolagus VL and CL type II light chain probes. Two cDNAs (clones Rer4 and Rer5) and one genomic VL region from a VL+/CL+ clone (Rer2) were sequenced. The genomic sequence contains a typical split leader as well as an octamer and a TATA homology located ~115 bp and ~94 bp upstream of the putative start codon, respectively. As in Heterodontus francisci and Hydrolagus colliei, this clone is V-J joined (data not shown). The Rer5 cDNA sequence is truncated within the hydrophobic leader, but both cDNA sequences are otherwise complete, with uninterrupted reading frames. Polyadenylation signal sequences are located 185 bp (Rer4) and 179 bp (Rer5) downstream of the termination codon. A sequence alignment for Rer4 and Rer5 is shown in figure 22. The sequence of genomic clone Rer2 VL is given in figure 6. A Raja eglanteria genomic Southern blot which was hybridized with Hydrolagus colliei VL and CL probes is shown in figure 19.
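Upstream distances of the kind quoted above for octamer motifs can be checked directly against a sequence. The following is a minimal sketch, not the procedure used in this work: the function name `motif_offsets_upstream`, the toy sequence, and the convention of measuring from the first base of the motif to the first base of the start codon are all assumptions made for illustration.

```python
from typing import List

OCTAMER = "ATTTGCAT"  # octamer motif sequence as given in the text


def motif_offsets_upstream(seq: str, motif: str, start_codon_index: int) -> List[int]:
    """Return the distance in bp from the first base of each motif
    occurrence to the first base of the start codon.

    Only occurrences lying entirely upstream of the start codon are reported.
    The measuring convention is an assumption made for this sketch.
    """
    seq = seq.upper()
    motif = motif.upper()
    offsets = []
    pos = seq.find(motif)
    while pos != -1 and pos + len(motif) <= start_codon_index:
        offsets.append(start_codon_index - pos)
        pos = seq.find(motif, pos + 1)
    return offsets


# Toy sequence (hypothetical, not taken from any clone): 2 bp of padding,
# one octamer, 20 bp of filler, then a start codon.
upstream = "GG" + OCTAMER + "C" * 20
toy_seq = upstream + "ATGGCT"
toy_offsets = motif_offsets_upstream(toy_seq, OCTAMER, len(upstream))
print(toy_offsets)
```

Applied to a real genomic clone, the start codon index would have to be supplied from the annotated sequence; the sketch does not attempt to predict it.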
[Figure 21: nucleotide sequence of genomic clone HflB1 with predicted amino acid translation; regions labeled LEADER, FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4 and CONSTANT REGION; the variable and constant region segments are separated by 2800 bp of undetermined intervening sequence]

Figure 21. Heterodontus francisci Light Chain Genomic Clone HflB1 Variable and Constant Region Sequence. Octamer motif and polyadenylation signal sequences are shown in bold type. Slashes indicate splice donor and acceptor sites. Stop codon is indicated by a period (.).
A Third Light Chain Type (III) is Present in Heterodontus

A third light chain type has recently been found in the nurse shark, Ginglymostoma cirratum (Greenberg et al., 1993). To see if this light chain type is also present in Heterodontus francisci, we hybridized lifts from a spleen cDNA library with a nurse shark light chain constant region probe. Three clones that were positive were sequenced. Alignments of two of these sequences (Hfrκ4 and Hfrκ5) are shown in figure 23.

In order to further demonstrate that the three types of elasmobranch light chain genes are members of separate families, genetic distances calculated from amino acid alignments of VL and CL regions (see Tables 2 and 3 for a list of sequences analyzed) were used in phylogenetic analyses employing the minimum evolution method (Rzhetsky and Nei, 1992). Heavy chain variable regions and CH4 domains were used as outgroups in the VL and CL comparisons, respectively. In both VL (figure 24) and CL (figure 25) analyses, the type I and type II genes cluster as separate monophyletic groups, with the exception of the Hc712 variable region. The Hc712 variable region clusters with the mouse and human preB genes. Within the type II cluster, the elasmobranch genes form a monophyletic group with respect to the Hydrolagus genes for both VL and CL comparisons, but only the CL genes cluster with high probability (>88%). In the analysis of constant region genes, avian, mammalian and amphibian genes group into traditional λ and κ groupings with significant branch lengths; however, both chondrichthyan type light chains fall outside of these groups. In the variable region analysis, the type I
[Figure 22: aligned nucleotide and predicted amino acid sequences of cDNA clones Rer5 and Rer4; regions labeled LEADER, FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4 and CONSTANT REGION]

Figure 22. Nucleotide and Amino Acid Alignment of Raja erinacea Type II Light Chain cDNA Clones Rer4 and Rer5. Polyadenylation signal sequence is shown in bold type. Stop codons are indicated by periods (.). An insertion/deletion is indicated by three periods (...).
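Percent identity values such as those quoted in this chapter are computed from aligned sequence pairs. The sketch below shows one common way to do so, under stated assumptions: the function name `percent_identity` is hypothetical, positions where either sequence carries an insertion/deletion marker (`-` or `.`) are skipped, and the two strings are arbitrary toy data rather than sequences from any clone. Other conventions for the denominator exist and give slightly different values.

```python
def percent_identity(a: str, b: str, gap_chars: str = "-.") -> float:
    """Percent identity between two aligned sequences of equal length.

    Columns in which either sequence has a gap character are ignored
    (an assumed convention; the denominator is the number of compared columns).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in gap_chars or y in gap_chars:
            continue  # skip insertion/deletion positions
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy aligned pair (hypothetical): 9 comparable columns, 8 identical.
toy_a = "ACDEFGHIK-"
toy_b = "ACDQFGHIKL"
toy_identity = percent_identity(toy_a, toy_b)
print(round(toy_identity, 1))
```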
[Figure 23: aligned nucleotide and predicted amino acid sequences of cDNA clones Hfrκ4 and Hfrκ5; regions labeled LEADER, FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4 and CONSTANT REGION]

Figure 23. Alignment of Heterodontus francisci Type III Light Chain cDNAs Hfrκ4 and Hfrκ5. Sequences are truncated in leader region by two amino acids. Framework, CDR and constant regions are delineated. Putative polyadenylation signal sequence is shown in bold type. Stop codons are indicated by periods (.). An insertion/deletion is indicated by three periods (...).
Table 2. Variable Region Sequences Used in Minimum Evolution Analysis. GenBank LOCUS Name (or Accession Number) Is Shown in Parentheses.

Outgroup (heavy chain V region)
  Mus musculus: MmuVII (MUSIGHVP)
  Heterodontus francisci: HfrF101 (HFIGF101)
Light chain
  Carcharhinus plumbeus: Cpl (CPLIGL)
  Heterodontus: Hfr122 (HFL122), Hfr141 (HFL141), Hfr1301 (HEFIGLAAA)
  Raja: Rer900 (SKTIGCVE)
κ chain
  Ginglymostoma cirratum: GciK (L16765)
  Oncorhynchus mykiss: OmyKE (OMIGLC), OmyK3 (OMIGLAI), OmyK10 (OMIGLAG)
  Mus musculus: MmuK1 (MUSIGKACN), MmuK2 (MUSIGKVH2), MmuK2 (MUSIGKWNN), MmuK3 (MUSIGKVR2), MmuK4 (MUSIGKVI), MmuK4 (MMIG29), MmuK5 (MUSIGKAE), MmuK51 (MUSIGKAI), MmuK52 (MMIG71), MmuK53 (MUSIGKCPD), MmuK54 (MUSIGKC85), MmuK6
  Rattus norvegicus: RnoK1 (RNIGKCCIG), RnoK2 (RNIGKY3)
  Homo sapiens: HspK1 (HUMIGKVB), HspK2 (HSIGVA23), HspK3 (HUMIGKVL), HspK4 (HSIGKI7)
  Oryctolagus cuniculus: OcuK (OCIG06)
Pre B
  Homo sapiens: HspPreB (HUMVPREBA)
  Mus musculus: MmuPreB (MMVPREB2)
λ chain
  Gallus gallus: GgaL (CHKIGLA26)
  Cairina moschata: CmoL1 (CHIGLVA5), CmoL5 (CHIGLVA1)
  Mus musculus: MmuL1 (MUSIGLVA), MmuL2 (MUSIGLVB), MmuLx (MUSIGLAZ)
  Rattus norvegicus: RnoL (RATIGLVA)
  Homo sapiens: HspL1 (HSIGLVII), HspL2 (HSIGLK5B), HspL2 (HSIGVL21), HspL3, HspL4 (HUMIGLAM2), HspL6 (HSIGLBV), HspL4a (HSIGVL), HspLT1 (HUMIGLAM3)
  Oryctolagus cuniculus: OcuL (M27840)
  Sheep (Ovis): OvisL
Table 3. Constant Region Sequences Used in Minimum Evolution Analysis. GenBank LOCUS Name (or Accession Number) Is Shown in Parentheses.

Outgroup (heavy chain C4 domain)
  Mus musculus: MmuM4 (MUSIGCD10)
  Heterodontus francisci: HfrScrM4 (HFIGHVC)
Light chain
  Carcharhinus plumbeus: Cpl (CPLIGL)
  Heterodontus francisci: Hfr141 (HFL141), Hfr1301 (HEFIGLAAA), Hfr1501 (HEAGLAAB)
  Raja erinacea: Rer900 (SKTIGCVE), Rer102 (SKTIGCA)
κ chain
  Ginglymostoma cirratum: GciK (L16765)
  Gadus morhua: GmoK (GMIGLAF)
  Oncorhynchus mykiss: OmyK (OMIGLAL)
  Xenopus laevis: XlaK (XELIGLCI)
  Homo sapiens: HspK (HUMIGKC3)
  Mus musculus: MmuK (MUSIGKJC2)
  Rattus norvegicus: RnoK (RATIGKJCB)
  Oryctolagus cuniculus: OcuK (RABIGKCA)
  Sus domesticus (Sus scrofa): SdoK (PIGIGKVJC)
λ chain
  Gallus domesticus: GgaL (CHKIGLVJC)
  Cairina moschata: CmoL (CHIGLVA7)
  Homo sapiens: HspL1 (HUMIGLCB1), HspL2 (HUMIGLCB2), HspL3 (HUMIGLC84), HspL6 (HUMIGLC88), HspL14 (HUMIGLC14), HspL16 (HUMIGLC16)
  Mus musculus: MmuL1 (MUSIGLCA5), MmuL2 (MUSIGLC83), MmuL3 (MUSIGLCA3)
  Mus spretus: MspL5 (MUSIGLCD)
  Rattus norvegicus: RnoL1 (RATIGLCA), RnoL2 (RATIGLCB)
  Oryctolagus cuniculus: OcuL1 (RABIGLCA), OcuL2 (RABIGLCB), OcuL3 (RABIBLCC), OcuL4 (RABIGLCD), OcuL5 (RABIGCMC2), OcuL6 (RABIGCMD2)
  Bos taurus: BtaL (BOVIGLAA)
  Sus domesticus (Sus scrofa): SdoL (PIGIGLVJC)
  Sheep (Ovis): OvisL
[Figure 24: minimum evolution tree of light chain variable regions. Labeled groups, top to bottom: Mammalian Kappa, Bony Fish Kappa, Mammalian Kappa, Cartilaginous Fishes III, Mammalian Lambda, Avian Lambda, Cartilaginous Fishes II, Mammalian Lambda, Pre B, Ratfish, Cartilaginous Fishes I, Outgroup]

Figure 24. Minimum Evolution Tree of Immunoglobulin Light Chain Variable Region Amino Acid Sequences. The tree was rooted with heavy chain variable region sequences. Only amino acids involved in FR1, FR2 and FR3 were used in the analysis. Positions having insertions/deletions in any sequence were excluded from the analysis. Numbers by internal branches represent probabilities of difference from zero length (only those greater than 70% are shown). Scale bar on bottom represents the number of amino acid substitutions per site estimated by Poisson correction. (Alignment and tree constructed by Tatsuya Ota.)
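The distance measure named in the caption can be illustrated briefly. The Poisson correction converts the observed proportion p of differing amino acid sites into an estimated number of substitutions per site, d = -ln(1 - p). The sketch below combines this with the exclusion of any alignment column containing an insertion/deletion, as the caption describes. It is only an illustration under stated assumptions: the function names and the three toy sequences are hypothetical, and it covers the pairwise distance step, not the minimum evolution tree search itself.

```python
import math
from typing import List

GAP_CHARS = "-."  # characters treated as insertion/deletion markers (assumption)


def complete_deletion(alignment: List[str]) -> List[str]:
    """Drop every column in which any sequence has a gap character."""
    lengths = {len(s) for s in alignment}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    keep = [i for i in range(n_cols)
            if all(s[i] not in GAP_CHARS for s in alignment)]
    return ["".join(s[i] for i in keep) for s in alignment]


def p_distance(a: str, b: str) -> float:
    """Proportion of sites at which two gap-free aligned sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    diffs = sum(1 for x, y in zip(a, b) if x != y)
    return diffs / len(a)


def poisson_distance(p: float) -> float:
    """Poisson-corrected substitutions per site: d = -ln(1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log(1.0 - p)


# Toy alignment (hypothetical residues); column 7 contains a gap and is removed.
toy = ["ACDEFG-HIK", "ACDQFGLHIK", "ACDEYGLHIR"]
trimmed = complete_deletion(toy)
p = p_distance(trimmed[0], trimmed[1])
d = poisson_distance(p)
print(trimmed, p, d)
```

For small p the corrected distance stays close to p itself; the correction grows as sequences diverge, which is why it matters for comparisons as distant as those between light chain types.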
Figure 25. Minimum Evolution Tree of Immunoglobulin Light Chain Constant Region Amino Acid Sequences. The tree was rooted with μ-type heavy chain constant region fourth (CH4) domains.
[Figure 25 tree. Clade labels: Mammalian Lambda; Avian Lambda; Mammalian Kappa; Amphibian Kappa; Bony Fish Kappa; Cartilaginous Fishes III; Cartilaginous Fishes II; Cartilaginous Fishes I; Outgroup.]
chondrichthyan genes fall outside a cluster consisting of the tetrapod λ and κ and the chondrichthyan type II genes. The Heterodontus francisci type III gene variable regions cluster with the mammalian κ variable regions. Type III constant regions, however, cluster as a sister group to the mammalian κ constant regions. The κ variable region genes cluster with a significant branch length, but the λ variable region genes fall into a number of lower probability groupings that, taken as a whole, are paraphyletic with respect to the κ genes.

Discussion

In comparing a limited amount of peptide sequence data from the light chains of Callorhynchus callorhynchus, a holocephalan, we noted a marked difference relative to the predicted amino acid sequences of light chain genes found in Heterodontus francisci (Shamblott and Litman, 1989) and in Raja erinacea (Anderson et al., 1995). Using anchored PCR with a degenerate FR1 primer, transcripts of this light chain type were identified in a spleen cDNA library from another holocephalan, Hydrolagus colliei. The sequences of these genes are most similar to the light chain gene reported for Carcharhinus plumbeus and differ significantly from those described previously in Heterodontus francisci (Shamblott and Litman, 1989a; Shamblott and Litman, 1989) (VL ~48% and CL ~43% amino acid identity). Heterodontus francisci and Carcharhinus plumbeus belong to different galeomorph suborders which diverged from a common ancestor at least 150 million years ago (Carroll, 1988). Previously the dissimilarity of
their light chains was attributed to their long independent history (Hohman et al., 1992). With the discovery of light chains that are relatively similar to those of Carcharhinus plumbeus in 1) a holocephalan, which probably diverged from the elasmobranchs some 300 mya (Carroll, 1988), 2) Raja erinacea, a member of a separate elasmobranch order, and 3) Heterodontus francisci, it is now evident that this conclusion was drawn from the comparison of paralogous genes (i.e., the two light chain types diverged prior to the divergence of the elasmobranch suborders compared). It is unknown whether C. plumbeus also has the type I light chain, but because this light chain gene family is found in both a batoid and a second galeomorph species, it would have been present in an ancestor of Carcharhinus. These findings serve to illustrate the potential complications of phylogenetic inferences drawn from pairwise comparisons of members of multigene families in the absence of an organized phylogenetic framework that includes relevant outgroups.

In contrast to the immunoglobulin variable region genes that have been sequenced so far in the bony fishes and tetrapods, both light and heavy chain variable region genes of the chondrichthyans exhibit evidence of germline-joining. To date, this type of joining has been observed in the heavy chains of Heterodontus francisci (Kokubu et al., 1988b), Raja erinacea (Harding et al., 1990b) and in Hydrolagus colliei (J. Rast and C.T. Amemiya, unpublished data). All type I light chains thus far sequenced have been found to be germline-joined in the skates Raja eglanteria and Raja erinacea (Anderson et al., 1995). However, in Heterodontus francisci, the type I light chain is unjoined in the germline (Shamblott and Litman, 1989a). Although it cannot be stated unequivocally that
all of the light chain genes of Hydrolagus colliei are germline-joined, no unrearranged genes have been identified. Indeed, all type II variable regions in the four species from which they have been described are germline-joined. These observations suggest that joining of some light chain families is phylogenetically variable, i.e., some gene types that appear to be exclusively joined in one species are unjoined in another. It might be argued that the joined genes represent somatically rearranged genes that are detected in genomic libraries made from liver tissue, which could contain a small percentage of lymphoid elements; however, this is highly unlikely owing to the high frequency of joined genes. In addition, VL-JL linkage analyses carried out on Hydrolagus colliei testis genomic DNA, employing PCR with the conserved VL and JL primers described above, yield only fragments which are consistent with germline joining. It is apparent that joined and unjoined light chains are associated with a light chain class distinction in Heterodontus. The widespread observations of germline joining in elasmobranchs and the species-specific variation in the expression of these genes lend support to the view that these pre-rearranged genes play a role in maintaining immunological diversity in the chondrichthyans. Figure 26 summarizes the organization of known chondrichthyan light chain types.

Recently it has become evident that heavy chain class complexity, formerly associated only with the bony fishes and tetrapods, also is present in the most phylogenetically distant jawed vertebrates. At least two heavy chain constant region types are present in the skate Raja erinacea (Kobayashi et al., 1984; Harding et al., 1990a), the ratfish Hydrolagus colliei (J. Rast and C.T. Amemiya, unpublished data) and probably in
the frill shark Chlamydoselachus anguineus (Kobayashi et al., 1992). From the studies reported here, it is apparent that at least two light chain families also are present in Raja and three in Heterodontus. Attempts to detect the type I light chain in Hydrolagus, both in spleen cDNA and genomic libraries, thus far have been unsuccessful. Both genomic and spleen cDNA libraries have been screened with Heterodontus Type-III variable and constant region probes without success in either Hydrolagus colliei or Raja erinacea. The phylogenetic distribution of κ genes suggests, however, that the κ variable region was present at least in an ancestor of these groups.

VL region divergence into distinct gene families (isotypes) has been well documented in mammals (Kabat et al., 1991) and in amphibians (Schwager et al., 1991). By consideration of both nucleotide and predicted amino acid sequence, cDNA clone Hc712 represents an equivalent divergence from the other VL regions in Hydrolagus. In addition, this cDNA represents the first example of a chondrichthyan immunoglobulin gene cluster in which variable segments representing different gene families are found in association with the same CL type. In Heterodontus, the one example of family-level VH region divergence is associated with significant differences in constant region structure (Kokubu et al., 1988b). Such co-divergence is also evident in Hydrolagus colliei heavy chain clusters (J.P. Rast and C.T. Amemiya, unpublished observations). In the case of the Hydrolagus colliei light chain, genomic clones that are positive for CL hybridize with either VL family probe, but no clones hybridize with all three probes. Thus, the different VL types seem to be associated with the same CL type in characteristic clusters. The 3' untranslated sequences associated with cDNAs encoding the two different VL families are
almost identical, indicating that the constant region genes are the result of recent duplications or unequal crossing over, or are the targets of gene conversion (Figure 18).

[Figure 26 diagram.
Unjoined: Type I: Heterodontus francisci. Type III (κ-like): Ginglymostoma cirratum, Heterodontus francisci.
Germline joined: Type I: Raja erinacea. Type II: Hydrolagus colliei, Heterodontus francisci, Carcharhinus plumbeus, Raja erinacea.]

Figure 26. Diagram of Chondrichthyan Light Chain Gene Arrangements. Note that while Heterodontus type III genes are presently known from cDNAs, their joining state is unknown.

The relative levels of expression of the different light chain types vary throughout the cartilaginous fishes investigated. In Hydrolagus colliei and Carcharhinus plumbeus (Hohman et al., 1992), the type II light chain family is a major expressed form. In Heterodontus francisci, expression of the type II light chain is reduced or absent, as indicated by a failure to detect the gene in a spleen cDNA library in which type I light chain genes are expressed abundantly. In Raja erinacea, the type II light chain gene
family is expressed in a spleen cDNA library at about the same level as the type I family. The reason for variation in expression presently is unknown, but the possibility exists that heavy and light chain isotype expression within the cartilaginous fishes is coordinate and that certain light chain types have coevolved with their heavy chain counterparts. Alternatively, the differential expression of light chain classes in the cartilaginous fishes may be analogous to the highly variable levels of relative κ and λ expression evident in different mammalian species (Hood et al., 1966; Home et al., 1992). In this case, light chain isotype expression is independent of heavy chain type.

In the phylogenetic analyses, the chondrichthyan light chain VL and CL types cluster as separate monophyletic families, as would be expected from a cursory examination of their sequences. While the κ genes from all species form a monophyletic group, the λ genes are paraphyletic with respect to the κ light chains, especially if the chondrichthyan genes are included. From the present analysis, the tetrapod λ CL genes form a monophyletic sister-group to the κ only if the chondrichthyan genes are excluded, whereas the phylogeny of the λ VL genes is more complex and cannot be monophyletic even if the chondrichthyan genes are excluded. The chondrichthyan type I genes form an outgroup to the type II and tetrapod light chains if the variable region is considered. If the constant region is considered, the same relationship holds, with the possible exception that the type II genes may group with the type I genes rather than with the tetrapod sequences. In any event, neither chondrichthyan VL nor CL of Type I or Type II can be classed as λ in any evolutionarily meaningful sense.
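The distinction drawn above between a monophyletic group (κ) and a paraphyletic one (λ) can be stated operationally: on a rooted tree, a set of sequences is monophyletic only if some clade contains exactly those sequences and no others. The sketch below checks this on a rooted tree written as nested tuples. The toy tree and its leaf names are hypothetical and only mimic the pattern described in the text; they are not the trees of Figures 24 and 25.

```python
def leaves(tree):
    """Set of leaf names under a node; a leaf is a plain string."""
    if isinstance(tree, str):
        return {tree}
    names = set()
    for child in tree:
        names |= leaves(child)
    return names


def is_monophyletic(tree, group):
    """True if some clade of the rooted tree has exactly `group` as its leaves."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


# Hypothetical rooted tree: the kappa sequences nest inside the lambda sequences.
toy_tree = (
    "outgroup",
    ("lambda_1", ("lambda_2", ("kappa_1", "kappa_2"))),
)

print(is_monophyletic(toy_tree, {"kappa_1", "kappa_2"}))    # True
print(is_monophyletic(toy_tree, {"lambda_1", "lambda_2"}))  # False
```

In the toy tree the smallest clade holding both λ sequences also holds the κ sequences, which is the sense in which λ is described as paraphyletic with respect to κ.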
The findings presented have demonstrated unequivocally that light chain class divergence has occurred within or prior to the emergence of the elasmobranchs. The early evolution of immunoglobulin gene diversity supports the hypothesis that germline diversity is a significant factor in the evolution of immunoglobulin genes, possibly due to the significant role that these genes play in the fitness of individuals. Because most multigene families have undergone rapid concerted evolution, which homogenizes members of a gene family, maintenance of diversity over long evolutionary periods may be the result of strong selection for diversity. In support of this hypothesis, phylogenetic analysis of heavy chain variable segments reveals an early origin of diversity and long persistence of several distinct groups of variable segments (T. Ota and M. Nei, unpublished). Exceptionally high levels of diversity among a number of proteins involved in the immune system have been observed for mammals, possibly in response to pathogenic subversion of the immune system by molecular mimicry (Murphy, 1993). Alternatively, the emergence of multiple light chain types may result from chromosomal translocation or other events which render ineffective the homogenizing processes of gene conversion or unequal crossing over. The diversity-generating mechanisms of the chondrichthyan immune system have previously been thought to be limited. However, at least as far as immunoglobulin light chains are concerned, these animals appear to possess a complexity comparable to the mammals.
6. HETERODONTUS FRANCISCI TCR GENES

Introduction

It generally is accepted that cellular immunity evolved prior to the development of inducible, antibody-based recognition. In higher vertebrates, specific cellular immune function is mediated by TCRs, which are encoded by an extensive, diversified family of segmented genes and are rearranged and expressed in tissue- and developmental stage-specific manners (Davis and Bjorkman, 1988; Jorgensen et al., 1992; Haas et al., 1993). TCR function involves recognition of antigenic peptides bound to major histocompatibility complex (MHC) class I or II molecules (Jorgensen et al., 1992), and diversity is achieved by many, but not all, of the mechanisms utilized in generating B cell immunity (Tonegawa, 1983; Blackwell and Alt, 1989). Understanding the evolution of TCRs largely has been limited to interpretation of findings regarding gene structure and organization in several mammalian, an avian (Tjoelker et al., 1990; McCormack et al., 1991; Gobel et al., 1994) and an amphibian (Fellah et al., 1993a) species. TCR genes have also recently been identified in bony fishes, consistent with the presence of distinct T cell functions (Miller et al., 1986; Miller et al., 1985) and both MHC I (Hashimoto et al., 1990) and MHC II (Hashimoto et al., 1990; Ono et al., 1992) genes. MHC genes have been identified in the more phylogenetically distant (relative to mammals)
cartilaginous fishes (Hashimoto et al., 1992; Kasahara et al., 1992; Bartl and Weissman, 1994; Kasahara et al., 1993); however, the chronic nature of allograft rejection and lack of other demonstrable T cell-dependent responses question the existence of a homologous form of T cell immunity in these species (Bartl and Weissman, 1994; Smith and Davidson, 1992). The objective of the investigation described in this section was to determine whether TCRs could be identified in a modern representative of the cartilaginous fishes, which are the most phylogenetically primitive extant jawed vertebrates relative to the mammals.

Results

Isolation of Novel Variable Region Elements

A previously described immunoglobulin light chain gene in Heterodontus francisci exhibits significant nucleotide sequence identity with certain mammalian TCR genes (Shamblott and Litman, 1989). The data base of light chain gene sequences from lower vertebrate species has been expanded recently, both by the findings described in the last chapter and those of others. Comparisons of the predicted peptide sequences of these light chains with higher vertebrate immunoglobulin light chain, TCR and CD8 (Davis and Bjorkman, 1988; Johnson, 1987) genes indicate that two four-amino acid stretches, in FR2 of both TCR and immunoglobulin light chain (WYRQ) and in FR3 (Y[Y/F]CA), exhibit little phylogenetic variation. Specific primers complementing 11 nucleotides in each of these
sequences were synthesized; nucleotide sequence degeneracy of the 5' (FR2) primer is 16-fold and the two different 3' primers (A) and (B) are both 8-fold degenerate. The primer sets were expected to yield PCR products in the ~179-197 bp range, and the 5' primer was designed specifically to not fully complement immunoglobulin heavy chain genes, which typically possess the W(V/I)RQ motif. The sequences of three different 193-197 bp products resulting from amplification of Heterodontus genomic DNA in which the 3' (A) primer was employed, and one product resulting from amplification in which the 3' (B) primer was employed, had sequence identities at the predicted peptide level that were highest with TCRs, ranging from 30-45%. After subtracting the primer contributions, the products are 20-40% related to each other at the predicted peptide level and may represent different TCR classes or extensively diversified families.

A Heterodontus Homolog

Three of these PCR fragments hybridized to cDNAs when used to screen a Heterodontus spleen cDNA library. The complete nucleotide and predicted peptide sequences of one cDNA clone, HF6, which hybridized to the HFYYC182 probe, are shown in figure 27. Thirty-seven of the 40 highest optimized peptide alignment scores are with TCR genes, of which the highest 34 ranking alignments are TCRs. Leader, variable (VT), joining (JT) (including possible diversity [D] segment contributions) and constant (CT) regions can be recognized by comparison to higher vertebrate TCRs. When the VT- and CT-homologous portions of cDNA HF6 were analyzed separately against the
Figure 27. The Nucleotide and Predicted Amino Acid (Numbered 1-293) Sequence of cDNA HF6. A stop codon is indicated by a period (.). Uppercase nucleotides correspond to the predicted coding sequences; lowercase nucleotides designate the 5' and 3' untranslated regions. The predicted boundaries of the leader, variable, joining and constant regions correspond to functional regions of mammalian TCRs. The regions of nucleotide sequence of cDNA HF6 that correspond to the 5' (WYRQ) and 3' (YYCA) primers are in boxes.
     aatcagctttgagacacaattcctgttgatatcgaagggctccaatcagttagagatttactggtttggttttgctgttgaggacaa
     gtgtcgttatttagcaccgctgcagttgtgcccagcgattctgggaatcagcacatgacccctgtgctggggcggcagcgaaatgat
     cgactgtggtgtgcatcgtgagctttaccattaaatctctctctctctccttattgtgtctgttctctttgacccgatctgtttctc
  1  M  F  L  H  S  P  V  Q  E  R  S  R  Y  P  V  G  N  R  L  T  V  A  E  G  K  T  V  E  M
     ATGTTCCTCCACAGCCCGGTCCAGGAGCGCTCTCGTTACCCAGTGGGGAACCGGCTGACGGTCGCAGAGGGGAAAACGGTGGAAATG
 30  H  C  F  Q  N  D  T  S  D  S  Y  M  Y  W  Y  R  Q  Q  S  G  A  G  L  L  L  I  V  T  S
     CACTGTTTCCAGAATGACACCAGCGACAGTTACATGTACTGGTATCGCCAGCAGAGCGGAGCAGGGTTACTGCTCATAGTGACCTCG
 59  I  G  T  S  D  T  S  P  E  E  G  F  K  E  R  F  K  V  T  R  P  D  L  K  T  C  S  L  K
     ATTGGCACCTCTGATACCAGCCCTGAAGAGGGTTTCAAGGAGAGATTTAAGGTCACCAGACCCGATCTTAAAACCTGCAGCCTGAAA
 88  I  L  R  V  0  0  T  0  R  A  V  Y  Y  C  A  A  S  G  H  P  S  0  S  N  S  E  A  Y  F
117  G  D  G  T  K  L  V  V  L  G  E  N  D  T  I  R  P  A  K  V  T  V  F  E  P  S  P  E  E
     GGAGATGGGACCAAACTGGTTGTTTTAGGCGAAAACGATACGATTAGGCCTGCCAAAGTTACTGTCTTTGAGCCCTCTCCCGAAGAG
146  I  R  E  K  K  K  A  T  V  V  C  L  V  S  D  F  Y  P  D  N  I  K  I  H  W  L  V  D  G
     ATTAGAGAAAAGAAAAAAGCCACTGTGGTCTGCCTCGTCAGCGACTTCTACCCCGACAACATCAAGATCCACTGGCTTGTCGATGGC
175  K  E  K  D  A  N  D  T  N  I  H  T  D  L  N  A  I  L  S  K  E  N  T  S  Y  S  I  S  S
     AAGGAGAAAGATGCCAATGACACGAACATTCACACTGATCTCAATGCCATCCTATCAAAAGAAAATACATCTTACAGCATCAGCAGC
204  R  L  R  F  D  A  L  D  W  A  R  S  K  N  V  E  C  R  V  D  L  Y  T  N  E  S  V  P  T
     CGGCTGAGATTCGACGCCCTAGATTGGGCTCGGTCCAAGAACGTTGAGTGCAGAGTAGATCTCTACACTAACGAATCAGTGCCTACA
233  T  S  S  S  T  L  A  V  K  A  E  M  C  G  I  S  K  E  A  K  I  Q  S  M  A  T  A  K  L
     ACAAGCAGTTCAACATTAGCTGTCAAAGCAGAAATGTGCGGTATAAGTAAAGAGGCGAAAATCCAAAGCATGGCAACGGCAAAACTG
262  T  Y  L  I  L  I  C  K  S  I  F  Y  T  I  F  I  S  T  I  A  W  K  T  K  T  S  Y  S  K
     ACATACCTGATCCTCATCTGCAAAAGCATCTTTTATACAATTTTCATCTCAACAATTGCTTGGAAAACTAAGACGTCTTACAGCAAA
291  R  F  D  .
     AGGTTTGACTGAtcgtcgcaaattaatgaactcaaaagatggataagcagcagaac
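The fold-degeneracy figures quoted for the FR2 and FR3 primers can be reproduced by back-translating the conserved motifs into IUPAC-coded oligonucleotides and multiplying the number of bases allowed at each position. The sketch below does this for 11-nucleotide stretches covering WYRQ, YYCA and YFCA. The three oligonucleotides are hypothetical back-translations written in the sense orientation; they are assumptions chosen for illustration and are not the primer sequences actually synthesized, which are not given in the text. The FR2 test fragment is read from Figure 27, where the boxed bases are partly illegible, so TGGTAT is assumed from the W and Y residues.

```python
import re

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligo."""
    fold = 1
    for base in oligo.upper():
        fold *= len(IUPAC[base])
    return fold


def to_regex(oligo):
    """Regular expression matching every sequence the oligo represents."""
    parts = []
    for base in oligo.upper():
        allowed = IUPAC[base]
        parts.append(allowed if len(allowed) == 1 else "[" + allowed + "]")
    return "".join(parts)


# Hypothetical 11-nt back-translations (sense strand), assumptions only:
FR2_SENSE = "TGGTAYMGNCA"    # W  Y  R  Q : TGG TAY MGN CA
FR3_A_SENSE = "TAYTAYTGYGC"  # Y  Y  C  A : TAY TAY TGY GC
FR3_B_SENSE = "TAYTTYTGYGC"  # Y  F  C  A : TAY TTY TGY GC

# FR2 stretch of cDNA HF6 as read from Figure 27 (TGGTAT assumed, see lead-in).
HF6_FR2_FRAGMENT = "AGTTACATGTACTGGTATCGCCAGCAGAGC"

for name, oligo in [("FR2", FR2_SENSE), ("FR3 A", FR3_A_SENSE), ("FR3 B", FR3_B_SENSE)]:
    print(name, oligo, degeneracy(oligo))

match = re.search(to_regex(FR2_SENSE), HF6_FR2_FRAGMENT)
print(match.group() if match else "no match")
```

A 3' primer would be the reverse complement of the FR3 oligonucleotide; taking the reverse complement does not change the degeneracy, so the same counts apply.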
available data bases, the 50 highest scoring alignments are with the respective mammalian TCR V and C regions; maximum nucleotide identities are 57% and 56%, respectively. A sequence region which is homologous to TCR/immunoglobulin J segments is recognized; however, the short length precludes a more exact homology assignment (see below).

Owing to the considerable sequence relatedness between immunoglobulin and TCR variable regions (Yanagi et al., 1984; Hedrick et al., 1984), cDNA HF6 is designated as a TCR homolog based on the alignment of the putative constant region with the corresponding portions of tetrapod TCR prototypes (Figure 28). The constant region of the shark TCR homolog consists of 168 amino acids, as compared with the 154 to 178 amino acid range for the other constant regions; this length excludes classification of the cDNA as a light chain (Rast et al., 1994). In this alignment, 31 positions are fully conserved and an additional 22 are shared by the shark sequence and TCR genes from at least three of the five other species. A total of 60 HF6 residues are shared with at least two of the five sequences. The highest number of identities is found in the extracellular domain. A functionally significant lysine residue, which interacts with CD3 in mammals (Alcover et al., 1990), is present at position 269 in HF6, as well as in all TCR genes.

The Heterodontus Gene Organization Resembles that of the IgH

Southern blot analyses of CHEF-separated Heterodontus DNA are shown in figure 29. A high degree of concordance in hybridization with the VT- and CT-specific probes,
[Figure 28 alignment. Rows: HF6, axo, chk, mus, rab, hum. Region labels: Extracellular, Hinge, Transmembrane, Cytoplasmic.]

Figure 28. Alignment of the Predicted Peptide Sequence of Positions 126-293, Corresponding to the Constant Region of the Heterodontus HF6 TCR Homolog. The constant region sequences of five higher vertebrate TCR from axolotl (axo), chicken (chk), mouse (mus), rabbit (rab) and human (hum) are compared. Absolute identity between HF6 and at least three additional TCRβ sequences is shown. Assignment of functional regions (boundaries indicated by triangle) is based on mammalian TCR structure. Stop codons are designated by a period (.). Gaps (-) were introduced to maximize sequence identity. The GenBank identifiers for the specific sequences shown are: axo (AMTCRBC), chk (CHKTCRBCC), mus (MUSTCBXH), rab (RABTCBXB), hum (HUMTCBYY).
which are unrelated at the nucleotide level, is evident in the Nru I and Mlu I digests (as well as with Eag I and Sfi I, not illustrated), consistent with multiple VT-CT linkage groups. In contrast, Not I digestion reveals some non-concordant hybridization and is consistent with the detection of Not I sites in a significant number of VT+ λ clones (see below; not illustrated). Based on concordant hybridization at ~50 kb in the Nru I digest, at least some VT segments are likely within this linkage distance.

A Heterodontus genomic λ library (Kokubu et al., 1988b) was screened with PCR-generated probes complementing the putative VT segment and the first predicted putative CT exon of the HF6 cDNA. 191 VT+ and 103 CT+ plaques were identified. No dual hybridizing plaques were detected. Twelve VT+ genomic clones were restriction mapped and each is unique; eight of these clones contain two VT segments and 11 hybridize with a JT probe. Of 12 CT+ clones characterized similarly, 10 are unique and each contains a single CT region. Seven clones are JT- and five are JT+. Representative maps of four clones are shown in figure 30A. The large number of CT-containing clones is reminiscent of the large numbers of immunoglobulin CH- (25) and CL-containing (Shamblott and Litman, 1989a) clones identified previously in this species, a distinguishing feature of the multi-cluster form of gene organization. From the λ genomic map analyses, it is evident that: 1) single VT-containing clones contain the greatest number of JT-hybridizing elements, up to four if a single VT maps to a λ arm; 2) the five JT+ λ clones each lack the 3' cytoplasmic exon (determined by hybridization with an exon-specific probe generated from sequence information contained in Figure 27), consistent with the map placements near a λ arm; and 3) clones containing
the 3' cytoplasmic exon do not contain a detectable JT segment. Separate screening with JT-specific probes identified JT+/VT-/CT- clones, consistent with the other screening and mapping results.

[Figure 29 blot. Lanes: Nru I, Mlu I and Not I digests, each hybridized with V and C probes. Size standards (kb): 1900, 1100, 745, 610, 450, 375, 295, 225, 145, 97, 48.]

Figure 29. CHEF Analyses of Heterodontus DNA with HjU V and C Probes. Agarose-embedded erythrocytes were digested to completion with the various restriction endonucleases indicated. V, variable region; C, constant region; numbers to left represent size standards in kb. Standards are select Saccharomyces cerevisiae chromosomes and a λ ladder.

Nucleotide sequence analyses of representative genomic VT and JT segments indicate typical recombination signal sequences (4): 3'-VT HF2VA = CACAGTG-23 bp spacer-TCAAAAACA and 5'-JT HF1JA = GATTTTGTG-12 bp spacer-CAGTGTG. Genomic VT clone HF2VA is 96% identical (corresponding portion) at the nucleotide
[Figure 30 maps. Panel A: partial restriction maps of genomic clones V9, V13, C1 and C14, with Eco RI (E) and Hind III (H) sites, V, J and C elements, transmembrane (TM) and cytoplasmic (CYT) exons, and a 1 kb scale bar. Panel B: composite locus with two V segments, D?, J1 to J4, J? and C; overall length <50 kb.]

Figure 30. Heterodontus TCRβ Genomic Maps. A. Partial restriction maps of four Heterodontus genomic clones selected by hybridization with VT- (V9 and V13) and CT- (C1 and C14) specific probes. Cleavage sites for Eco RI (E) and Hind III (H) are indicated. Length estimation is by conventional electrophoresis relative to standards and from sizing of products amplified by PCR. VT, JT and CT are each designated by a distinct symbol. A probe specific for the extracellular portion of the constant region (C1) was generated using PCR, based on the sequence relationships shown in Figure 2; oligodeoxynucleotide probes specific for the transmembrane (TM) and cytoplasmic (CYT) portions of the constant regions were used directly. B. Tentative organization of a Heterodontus TCR-homolog locus based on conventional restriction mapping, intergenic distance estimates determined using PCR and Southern blot-CHEF analyses (Figure 29). The map is a composite of results that are entirely consistent with available data for 20 unique λ genomic clones. V (variable), D? (possible diversity), J (joining) segments and C (constant) region exons are indicated; symbols as in Figure 30A. Placement of D? (presence and number of segments not known) is based on analogy with higher vertebrate systems and is consistent with sequences of recombination signal sequences described in this manuscript. Overall length is shown below and is estimated at <50 kb, based on CHEF analyses (Figure 29). A λ genomic clone has been identified which contains only four J segments (no V or C); additional Js, not recognized by the J-specific probe, are indicated as (J?).
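The recombination signal sequences quoted in the Results for the genomic VT and JT segments can be checked against two standard expectations: a 23 bp spacer should pair with a 12 bp spacer (the 12/23 rule), and the heptamer and nonamer should lie close to the commonly cited consensus elements CACAGTG and ACAAAAACC. The sketch below parses the notation used in the text and counts mismatches. The two signal strings are transcribed from a partly garbled scan, so the V heptamer CACAGTG is an assumed reading; function names are hypothetical.

```python
import re

CONSENSUS_HEPTAMER = "CACAGTG"    # commonly cited consensus, written 5' to 3'
CONSENSUS_NONAMER = "ACAAAAACC"


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def parse_rss(text):
    """Split 'ELEMENT-<n> bp spacer-ELEMENT' into (left, spacer_length, right)."""
    m = re.fullmatch(r"([ACGT]+)-(\d+) bp spacer-([ACGT]+)", text)
    if m is None:
        raise ValueError("unrecognized signal notation: " + text)
    return m.group(1), int(m.group(2)), m.group(3)


def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def obeys_12_23(spacer_a, spacer_b):
    """True if one spacer is 12 bp and the other 23 bp."""
    return {spacer_a, spacer_b} == {12, 23}


# As quoted in the text (V heptamer is an assumed reading of the scan).
V_RSS = "CACAGTG-23 bp spacer-TCAAAAACA"  # 3' of the V segment: heptamer first
J_RSS = "GATTTTGTG-12 bp spacer-CAGTGTG"  # 5' of the J segment: nonamer first

v_hept, v_spacer, v_non = parse_rss(V_RSS)
j_non, j_spacer, j_hept = parse_rss(J_RSS)

print("12/23 rule satisfied:", obeys_12_23(v_spacer, j_spacer))
print("V heptamer mismatches:", mismatches(v_hept, CONSENSUS_HEPTAMER))
print("V nonamer mismatches:", mismatches(v_non, CONSENSUS_NONAMER))
# The J-side elements lie on the opposite strand, so compare reverse complements.
print("J heptamer mismatches:", mismatches(revcomp(j_hept), CONSENSUS_HEPTAMER))
```

Comparing the J-side elements as reverse complements reflects the inverted orientation of the signal 5' of a J segment relative to the signal 3' of a V segment.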
level to cDNA HF6. A leader intron is absent from the shark genomic VT segment, similar to the chicken TCRα locus (Gobel et al., 1994).

Heterodontus Region Family Diversity

The question of overall diversity of the TCR homologs has been approached by screening the same cDNA library with PCR-generated CT- and HF6 VT-specific probes. A large number of hybridization plaques, CT+/VT HF6-, were detected, of which four were selected for further characterization. The predicted amino acids of the four VT segments, of which the sequence of HF15 was truncated, are indicated in figure 31. It is evident that a high degree of sequence difference exists among these VT segments, including HF6. In that these clones were selected solely on the basis of CT+, VT HF6- hybridization, it is likely that this family of TCR homologs is extensively diversified. HF6 V region cross-hybridizing clones account for 30% of CT+ cDNA clones in this library. VT region amino acid identity ranges from 24 to 41%, similar to that seen in mammals (Davis, 1985), but in contrast to the very closely related sequences of Heterodontus VH genes (Kokubu et al., 1988b).

A Heterodontus TCRδ Homolog

The two remaining Heterodontus variable region PCR probes, Hf2A and Hf191Y, for which cDNAs could be found hybridized with two cDNA clones, both of which were
[Figure 31 alignment. Rows: HF6, HF11, HF16, HF17, HF15.]

Figure 31. Alignment of Heterodontus TCR cDNA Variable Region Predicted Amino Acid Sequences (through Y[Y/F]C). Amino acid identities among three or more sequences are boxed. HF15 is experimentally deleted, i.e., a 5'-truncated clone.
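The two summaries used for these variable regions, boxing positions where three or more sequences share a residue and reporting pairwise percent identity, can be computed directly from an alignment. The sketch below is a minimal illustration; the four aligned strings are hypothetical toy sequences, not the sequences of Figure 31, and gaps are simply skipped when counting.

```python
from collections import Counter


def shared_columns(alignment, min_shared=3, gap="-"):
    """Map column index -> residue for columns where one residue occurs
    in at least `min_shared` of the aligned sequences."""
    boxed = {}
    for i in range(len(alignment[0])):
        counts = Counter(seq[i] for seq in alignment if seq[i] != gap)
        if counts:
            residue, n = counts.most_common(1)[0]
            if n >= min_shared:
                boxed[i] = residue
    return boxed


def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identical residues over positions where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != gap and y != gap]
    same = sum(x == y for x, y in pairs)
    return 100.0 * same / len(pairs)


# Hypothetical toy alignment (not data from this study).
toy = [
    "WYRQDSG-AL",
    "WYRHLQREPE",
    "WYRQAGERRC",
    "WFRQ-SGKGL",
]

print(shared_columns(toy))
print(percent_identity(toy[0], toy[3]))
```

Whether gapped positions are skipped or counted as differences changes the reported percentage, so the convention should be stated alongside any identity range.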
sequenced. The variable and putative joining regions of the Hf2A cDNA (accession #U22673) are shown in figure 32, and the constant regions of Hf2A cDNA and Hf191Y cDNA (accession #U22672) are shown in figure 33. The sequence of the corresponding variable and joining regions of Hf191Y cDNA (deposited in GenBank) is truncated but nevertheless is dissimilar to that of Hf2A cDNA over a significantly informative segment. Presumably, this represents a sequence family difference, as has been observed for Vβ segments in this species (Rast and Litman, 1994) (N. Hawke and J. Rast, unpublished observations). Forty-three identities are evident between the variable regions of Hf2A cDNA and human TCR
[Figure 32 alignment. Rows: Hf2A, TCRD. Region labels: Variable Region, Joining Region.]

Figure 32. Alignments of the Variable Region of Hf2A cDNA with Best Match from a Search of GenBank Using FASTDB (human TCRδ, HSTCRDR). Highest scoring matches are illustrated to establish that the sequence is TCRδ-like and are not intended to establish homology. Joining regions are included in the alignment; however, due to the relatively independent nature of recombination between variable region elements, this may not be the best joining region match.
[Figure 33 alignment. Rows: pufferfish TCRα, human TCRα, mouse TCRα, chicken TCRα, Hf191Y, Hf2A, mouse TCRδ, human TCRδ. Region labels include Extracellular Region and Transmembrane region.]

Figure 33. Alignment of Putative Horned Shark TCR Clones Hf191Y and Hf2A Constant Regions with Human, Mouse, Chicken and Pufferfish TCRα, and Human and Mouse TCRδ. Dashes indicate gaps introduced to optimize alignment. Residues that are shared between fish and TCRα sequences and between shark and TCRδ sequences are enclosed in boxes. Asterisks indicate residues that are conserved among all shown TCRα, TCRδ, and query sequences. Periods at the end of the sequences indicate stop codons. A lysine that has been described in all TCRs and has been postulated to interact with the γ chain of CD3 is shown in reverse image in the transmembrane region. All sequences are aligned to each other as it is difficult to establish homology to tetrapod TCRα vs TCRδ. GenBank accession numbers of the sequences used in the alignment are as follows: human TCRα, HSTCELAI; mouse TCRα, MMU07658; chicken TCRα, GDU04611; mouse TCRδ, MMTRDNU; human TCRδ, HUMTCRGC.
The Genomic Organization of Heterodontus TCRδ

The genomic organization of the Hf2A-type gene has been partially characterized using both Southern blot and library screening techniques. Traditional Southern blots of restriction enzyme-digested genomic DNA hybridized with Hf2A-specific probes indicate the presence of numerous variable regions and fewer constant region genes (figure 34A). The lighter constant region band may represent the related Hf191Y-type constant region. CHEF blots employing these same probes produce a similar pattern and are indicative of a lack of linkage between the two constant region bands (Figure 34B). Genomic library screening produces results consistent with the above findings. In a screening of one haploid genomic equivalent, the Hf2A V region probe hybridized with 810 clones while the Hf2A constant region probe hybridized with 16 clones. Presumably a number of the 16 clones that hybridized with the Hf2A constant region probe represent the Hf191Y-type constant region. In a similar screening, the Hf191Y probe hybridized to 120 clones. cDNA library screening with both constant region types identified numerous clones, many of which presumably contain V regions from different families.

Discussion

DNA cross-hybridization approaches have proven effective in identifying immunoglobulin gene homologs among lower vertebrates (Litman et al., 1993c; Litman et al., 1985); however, outside of mammals, only the avian TCR gene has been
Figure 34. Heterodontus Genomic Southern Blots Hybridized with Hf2A cDNA Variable and Constant Region Probes. A. Traditional Southern blot. B. CHEF blot. Size standards are indicated in kb.
identified in this manner (Tjoelker et al., 1990). The considerable sequence diversity among TCR genes in mammalian species and within the TCR multigene families in a single species (Wilson et al., 1988) suggests that direct screening or PCR amplification with degenerate oligodeoxynucleotide primers may not be generally applicable, although this latter approach was successful with one amphibian species (Fellah et al., 1993a). An approach similar to that reported here was recently used independently to isolate a homolog from the rainbow trout (Partula et al., 1994). A number of direct screening approaches, including the use of degenerate oligodeoxynucleotide probes as well as PCR amplification using highly degenerate primers, have failed to identify TCR homologs in cartilaginous fishes (Litman et al., 1993b; Litman et al., 1993c). In addition to the sequences reported here, similar PCR amplifications of genomic DNA from two other cartilaginous fish (Raja, skate; Hydrolagus, ratfish) and additional vertebrate species (Gallus, chicken; Xenopus, African clawed frog; Brachydanio, zebrafish; Spheroides, pufferfish) yield ~190-200 bp products with significant sequence identity to TCRs. This approach has broader applications than these phylogenetic analyses, as it has the potential for detecting related genes and pseudogenes, including additional TCR families, in which these conserved sequence elements (or variants thereof) are separated by relatively short distances. Taken together, the library screening, conventional restriction enzyme mapping, intersegmental linkage distance estimates using PCR (not illustrated) and the CHEF analyses are consistent with multiple TCR-like clusters, each consisting of multiple VT and JT segments and a CT region. A model of one such cluster which includes the
possibility of additional linked VT and CT segments is shown (Figure 29B), including multiple JT-CT segments analogous to mammalian systems. Accurate estimations of the number of clusters, as well as the number of VT and CT elements in a cluster, are confounded by limitations in the insert sizes of existing genomic libraries and the possibility that, like immunoglobulin in this species, TCR genes are present at multiple loci on different chromosomes (Litman et al., 1993b; Litman et al., 1993c). The estimated inter V-C linkage distance and the presence of multiple VT and JT segments associated with a CT segment differ from immunoglobulin heavy (Kokubu et al., 1988b; Hinds and Litman, 1986) and light (Shamblott and Litman, 1989a) chain gene clusters, which contain single copies of the recombining elements and are ~18 kb and ~2.7 kb in overall length, respectively. The close linkage of VT and JT differs from mammalian TCR loci (Lindsten et al., 1987; Wilson et al., 1988). From the sequence shown in Figure 1 and by analogy to higher vertebrate TCR genes, it is reasonable to assume that one or more D segments possessing a 12 nucleotide 5' and a 23 nucleotide 3' recombination signal sequence may be present between VT and JT (McCormack et al., 1991; Siu et al., 1984), in addition to junctional insertion of nucleotide sequence (Figure 4B); alternatively, direct V-J joining may occur. The overall complexity and genomic organization of the TCR homologs obtained thus far are characteristic of both the immunoglobulin heavy (Kokubu et al., 1988b; Hinds and Litman, 1986) and light chain (Rast et al., 1994; Shamblott and Litman, 1989a) multicluster gene families described in this species and other cartilaginous fishes (Litman et al., 1993c; Harding et al., 1990a). Parallel evolution of systems which may possess
independent physiological roles raises questions about broad aspects of chromosomal organization of genes in this species as well as adaptive advantages and genetic regulation of cluster-type genes. Although it is possible that the Hf2A cDNA and Hf191Y cDNA genes are of the α-type or could be representative of another type unique to the cartilaginous fishes, the identities of both V and constant equivalent regions to TCRδ are the tentative basis for their assignment to this TCR class. Nevertheless, the distinct possibility exists that the α/δ divergence took place after the divergence of the ancestor of the cartilaginous fishes and the lineage leading to the mammals, rendering the α/δ distinction in a shark an inappropriate one. In ongoing investigations aimed at characterizing the TCR diversity in Raja eglanteria, cDNAs representing four very different TCR isotypes have been identified (J. Rast and M. Anderson, unpublished). Neighbor joining analyses of the V and C region amino acid sequences of these transcripts indicate that they may represent homologs of TCRαβ and TCRγδ from mammals. Sequence comparisons to the two most similar Heterodontus TCRs demonstrate that two of these genes are homologs of the shark TCRβ and TCRδ genes. Further analyses aimed at characterizing more fully the variable region diversity of these genes should allow for better definition of their relationship to mammalian genes. It now appears likely that the divergence took place prior to the divergence of chondrichthyans and osteichthyans. Further investigations are necessary to determine whether the corresponding TCRγδ functional characteristics are also present in the chondrichthyans.
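Neighbor joining analyses of the kind mentioned above start from a matrix of pairwise distances between aligned sequences. The following Python sketch is an illustration only and is not taken from the analyses reported here: it assumes a simple p-distance (the proportion of differing residues over ungapped columns) as the distance measure, and the sequence names and residues are hypothetical toy data rather than real TCR sequences.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_matrix(alignment):
    """Symmetric pairwise p-distance matrix as a dict of dicts keyed by sequence name."""
    names = list(alignment)
    dist = {name: {name: 0.0} for name in names}
    for x, y in combinations(names, 2):
        d = p_distance(alignment[x], alignment[y])
        dist[x][y] = d
        dist[y][x] = d
    return dist


# Hypothetical toy alignment (not real TCR data), used only to show the call.
toy_alignment = {
    "seqA": "MKV-LT",
    "seqB": "MKVALT",
    "seqC": "MRV-LS",
}
toy_matrix = distance_matrix(toy_alignment)
```

A matrix such as `toy_matrix` is the usual input to a neighbor joining implementation; corrected distances (for example a Poisson correction) could be substituted for the p-distance without changing the structure of the sketch.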
The presence of TCR-like genes at an early stage of vertebrate evolution provides strong evidence for an integral component of T cell-type immunity in the primitive piscine vertebrates. The remarkable degree of intergenic variation associated with the respective variable regions further suggests functional homology. Assigning a physiological role for the TCR homologs is less secure because two properties characteristic of T cell immunity, acute allograft rejection (Smith and Davidson, 1992) and affinity maturation (T cell-dependent B cell maturation) (Makela and Litman, 1980; Litman et al., 1982), are not associated with immune responses in these chondrichthyan species. Although convincing evidence has been presented for somatic mutation or some type of gene conversion in Heterodontus (Hinds-Frey et al., 1990), it appears that the mechanism(s) for selecting such mutants is not operative, at least within the context of the analyses that have been carried out to date. We have ascribed this to an absence of germinal centers (Hinds-Frey et al., 1990), which are the selection site of somatic mutation in higher vertebrates (Jacob et al., 1991; Berek et al., 1991). Thus it seems that the cellular selection mechanisms for somatic mutations were a later adaptation than the mechanism(s) that actually generate diversity. The physiological roles of both the and TCR-like systems in the primitive vertebrates remain uncertain, but their evident presence at such an early stage of vertebrate phylogeny corroborates the hypothesis that the structurally homologous gene may be responsible for other forms of historecognition in the more phylogenetically distant jawless vertebrates, protochordates, and invertebrates. The results of these studies, as well as other data being accumulated in ongoing parallel investigations of other species, are providing information as to the
nature and extent of variation in short, shared sequence regions that are amenable to the type of amplification reactions described here, affording a possible means for detecting TCR-like homologs in more distant forms.
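The search for short, shared sequence regions separated by a relatively short distance can also be sketched computationally. The following Python sketch is an illustration under stated assumptions and is not the procedure used in this work: it expands standard IUPAC degenerate base codes into a regular expression and reports candidate amplicons bounded by a forward site and the reverse complement of a reverse primer. The function names, the primer sequences and the template are hypothetical, and exact matching is assumed (no mismatches are tolerated).

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Complement table that also maps each ambiguity code to its complement.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def degenerate_to_regex(primer):
    """Turn a degenerate primer into a regex, e.g. TGGTAY -> TGGTA[CT]."""
    return "".join(
        base if len(IUPAC[base]) == 1 else "[" + IUPAC[base] + "]"
        for base in primer.upper()
    )


def reverse_complement(seq):
    """Reverse complement of a (possibly degenerate) nucleotide sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_amplicons(template, forward, reverse, max_len=300):
    """Return (start, end, length) for each forward/reverse site pair within max_len.

    Sites are matched exactly on the given strand; the reverse primer is
    searched for as its reverse complement. Lookahead is used so that
    overlapping sites are all reported.
    """
    template = template.upper()
    fwd = re.compile("(?=(" + degenerate_to_regex(forward) + "))")
    rev = re.compile("(?=(" + degenerate_to_regex(reverse_complement(reverse)) + "))")
    fwd_starts = [m.start() for m in fwd.finditer(template)]
    rev_ends = [m.start() + len(reverse) for m in rev.finditer(template)]
    hits = []
    for start in fwd_starts:
        for end in rev_ends:
            length = end - start
            if len(forward) + len(reverse) <= length <= max_len:
                hits.append((start, end, length))
    return hits


# Hypothetical template and primers, for illustration only.
toy_template = "AATGGTACGATTACAGATTACATACTGTCC"
toy_hits = find_amplicons(toy_template, forward="TGGTAY", reverse="RCAGTA")
```

In practice the size filter would be set near the expected product length (for example a window around the ~190-200 bp products described in this chapter), and a tolerance for mismatches would probably be needed for highly divergent sequences.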
7. CONCLUSIONS

While it is not difficult to draw inferences regarding the evolutionary history within each class of immune system protein discussed in this text, higher order conclusions regarding the origin of these genes or the interrelationships among the classes are speculative. Because these genes evidently originated and diversified into their major classes prior to the divergence of the vertebrate classes in which they have been found, a phylogenetic analysis, which takes advantage of information from evolutionary intermediates trapped within the genomes of diverging species, cannot be used to elucidate their evolution. Instead, relationships among these genes must be inferred from sequence comparisons, a difficult task given the extreme evolutionary rates associated with these genes and the length of time since their divergence. The findings of TCR gene homologs in cartilaginous fishes described previously, along with the discovery of novel rearranging genes (Greenberg et al., 1995), have the potential for contributing new data to the resolution of this longstanding issue. Possibly, additional related and potentially informative genes can be found among the jawed vertebrates, even if analyses in the agnathans do not yield demonstrable homologs. In addition, the ever increasing data base of TCR and Ig sequences from diverse jawed vertebrate species may lead to stronger inferences in the future. Presently, these higher order analyses are mainly speculation.
The Origins of Rearranging Genes

The emergence of the Ig/TCR gene systems may have involved the segmentation of a primordial exon encoding a single V-type Ig domain into separate "V" and "J" elements. This initial event may have been mediated by a transposon bordered by RSS-like sequences. If this integration were reversible, then one outcome of such an event would be a genetic system that is capable of rearrangement. In spite of the extensive variation in the organization of rearranging gene systems among the jawed vertebrates, the genetic elements and recombination mechanisms appear to be highly conserved. The absence of introns in the RAG genes is a prokaryotic and lower eukaryotic genetic characteristic, and it has been speculated that these genes may have been acquired through a process of horizontal gene transfer, possibly through a viral intermediate (Bartl et al., 1994). Thus far, while the RAG genes appear to be relatively conserved even between chondrichthyans and mammals (Greenhalgh and Steiner, 1995), no homologs have been identified in the agnathans. The introduction of enzymatic deletion and nontemplated nucleotide addition processes would regionally diversify the recombined gene. Investigations on the phylogeny of the enzyme responsible for N region addition (TdT), which appears to be used only in the VDJ joining process, and its relation to other known genes should shed light on the origins of this phenomenon. The primary segmentation event may have been followed by a second event resulting in the formation of diversity regions. This secondary segmentation may have occurred only once in the ancestor of both TCRβ and δ chains and the Ig heavy chains
or more than once in the lines leading to each. The acquisition and loss of D elements is likely over evolutionary time scales. Variation in the number of diversity segments employed in recombination can be seen between systems with one D (osteichthyan-tetrapod IgH), two Ds (chondrichthyan IgH, TCRδ) and three Ds (nurse shark antigen receptor, NSAR). Further processes resulting in the loss of D elements are evident in the various joining states of chondrichthyan IgH. In addition, sequence similarities between TCRα and δ, and TCRγ and β, both pairs with and without D regions, would support a gain or loss of D elements. Further rounds of duplication and divergence ultimately resulted in the emergence of the Ig heavy chain, Ig light chain and the TCRαβ and TCRγδ gene systems, although their order of appearance is unclear. The acquisition of additional domains in the constant region of one member of a heterodimer could transform a monovalent TCR-type protein into a divalent, Y-shaped Ig-type protein. The emergence of a SEC mechanism could lead to a system of humoral immunity like that of the Igs either before or after the transition to a divalent state, although it would seem likely that the transition to a divalent state would be more useful after a secretory mechanism had evolved. Notably, there is some evidence that TCRs may function to a limited extent in a soluble form (Onda et al., 1995). Throughout the evolution of this system, processes that allow the selection of advantageous specificities and the elimination of detrimental ones also were evolving. Paradoxically, theoretical considerations have suggested that the considerable diversity of these systems is the product of avoidance of self specificity as opposed to a necessity for universal binding capability (De Boer and Perelson, 1993).
Combinatorial diversity is the least universal diversifying mechanism among Igs. The finding of TCR homologs in Heterodontus reported here demonstrates that a highly diversified, multiple-element VT and JT locus is probably associated with a single CT region, establishing that combinatorial systems were present prior to the emergence of the Igs and/or have evolved multiple times (as may be the case with mammalian VL and VH systems). Preliminary data regarding the Heterodontus TCRδ locus indicate that it also employs recombinatorial diversity. Phylogenetic analyses comparing Heterodontus Vβ families with those of mammals suggest that some families were established prior to the osteichthyan/chondrichthyan divergence. Thus, combinatorial mechanisms of diversity generation in some instances are probably very old.

Relationships Among Rearranging Genes

While it may never be possible to reconstruct the events that led to the emergence of the initial rearranging variable region gene, the relationships among the TCR and Ig genes may be inferred as more data from a variety of diverse representative species accumulate. Among the TCRs, TCRα and TCRδ share greater identity with each other than either does with TCRβ or TCRγ. The organization of the mammalian TCR α/δ locus (see Figure 3D) can be explained by a duplication of the (D)-J-C portion of a primordial gene, as is the case for mouse and human TCRβ. Subsequent divergence of these constant regions, along with a gain of a D region in TCRδ or loss in TCRα, may have then taken place. In sequence analyses, TCRβ appears to be most closely related
to TCRγ. Notably, a significant number of receptors are the result of "trans" rearrangements between the TCRβ and TCRγ loci in which a TCRγ V segment is joined to a β (D)J region and spliced to Cβ (Davodeau et al., 1994). These TCRs are expressed on the cell surface and appear to be functional, strengthening the notion that β and γ TCRs are related structurally. An initially unrestricted assortment among TCRα/TCRδ and TCRβ/TCRγ proteins, as is the case for Igκ, Igλ, and IgH, could eventually develop into a co-segregating repertoire through a process of coevolution (a phenomenon which may be partially precedented by the Xenopus Ig light chain segregation relative to IgH isotypes; Hsu et al., 1991). The relationships between the Igs and TCRs are much harder to decipher. Presumably, a divergence of this type would require the duplication of a number of genes simultaneously, as might be expected in a polyploidization event (Ohno, 1970; Ruddle et al., 1994). exhibits clear similarities to some light chains (Shamblott and Litman, 1989b); however, heavy chains are difficult to align with any TCRs. Answers to these problems may be aided by analyses involving large numbers of variable region genes as more data are collected from diverse species. Unfortunately, the derived nature of constant region sequences allows for very few amino acid positions to be compared in IgH/TCR alignments. The emergence of Ig from TCR, if that is indeed the order of its appearance, may serve as a model for the divergence of not just individual genes but complex genetic systems. As illustrated in Figure 3, membrane Ig and TCRs appear to have no cell signaling ability of their own but act in concert with a number of other cell surface and cytoplasmic molecules. To varying extents these accessory proteins are
specific to TCR or Ig. Whether these groups of genes actually coevolved with the antigen receptors since their divergence or have become associated more recently remains an open question. Sequence divergence in a representative chondrichthyan has been demonstrated to take place more slowly than in mammals (Martin et al., 1992). If this is the case, it is possible that information regarding the interrelationships of these genes is better retained in the chondrichthyans, allowing for more informative comparisons among their Ig and TCR genes.

The Acquisition of Novel Function and Cell Types

Cell types in the mammalian immune system are often defined by the proteins that are expressed on their membranes. The relationship between the diversification of the receptors described here and the emergence of new cell types and functions may serve as a model for evolutionary cell differentiation. Clues to the order of emerging function possibly will come from an understanding of the ontogeny of the cells with which the functions are associated. However, there is no necessary relationship between progressive acquisition of functional specificity and ontogenetic development of cell types, and any conclusions must be made with caution. While the divergence of species and genes is a process that requires the loss and gain of protein coding information, cell types within an organism are typically equal, at least with regard to the DNA that they possess. This makes reversals with regard to protein expression much more likely to take place. Thus genetic systems can be "commandeered" and used for purposes and at developmental
stages that are separate from the evolutionary circumstances in which they arose. An example of such an event is the targeting of the complement system by antibody complexes (Farries and Atkinson, 1991). The complement system most likely emerged long before the Ig system, yet an intimate and systematic interrelationship between these two systems has subsequently developed. Much of the evolution of immunity may involve novel targeting of pre-existing defense systems (Janeway, 1992). Four mammalian cell types and their relation to the evolutionary emergence of the receptors discussed here are of interest: αβ T cells, γδ T cells, B cells and natural killer (NK) cells. NK cells are lymphocytes that, like T cells, possess cytotoxic function but express neither TCRs nor CD3 proteins and appear to operate by a more generalized recognition mechanism. In part, recognition may involve "seeing" the absence of MHC class I molecules on the surface of target cells. The receptors for this cell type are beginning to be isolated. Although an Ig gene superfamily protein has been described, it bears little resemblance to Igs or TCRs (Colonna and Samaridis, 1995). It is possible that this same cytotoxic system was coupled to TCR signaling following the emergence of TCRs and thus became redirected at new targets. Alternatively, the reverse order may be true. Similar arguments for effector function redeployment can be made for γδ and αβ T cells, in light of differences in the antigens to which they are directed. New clues to the origins of these systems will arise as NK and TCRγδ function is clarified. Phylogenetic analyses comparing the proteins involved in the signaling pathways of these systems between cell types may also help to answer some of these questions. This dissertation has outlined the similarities and differences between the Ig and
TCR genes within chondrichthyans in comparison to species representing divergent points in osteichthyan/tetrapod evolution. It is remarkable that the gene elements encoding TCRs and Ig heavy and light chains, which have nearly parallel structures and functions, exhibit such varied chromosomal arrangements among different taxa. In a recent survey of the molecular sequence data bases, a comparison of divergence between homologous mouse and human proteins revealed that those associated with the immune system evolve at a higher rate than other protein categories, and it is suggested that this is in avoidance of molecular mimicry and subsequent subversion of the immune system by pathogenic organisms (Murphy, 1993). Thus the immune system may evolve especially quickly because it is reacting to organisms which, unlike the typical factors to which living systems must adapt (e.g. changes in the physical environment), are able to counter-adapt in response. Knowledge of the molecular genetics of antigen-binding molecules such as Igs and TCRs from widely divergent species has important implications in understanding the overall nature and developmental regulation of adaptive immunity. Detection of invertebrate analogs to Ig and their corresponding genes represents a formidable challenge and may require functional definition before structural characterization.
REFERENCES

Abbas, A.K., Lichtman, A.H., and Pober, J.S. (1994). Cellular and Molecular Immunology (Philadelphia, PA: W.B. Saunders Co.).
Alcover, A., Mariuzza, R.A., Ermonval, M., and Acuto, O. (1990). Lysine 271 in the transmembrane domain of the T-cell antigen receptor beta chain is necessary for its assembly with the CD3 complex but not for alpha/beta dimerization. J. Biol. Chem. 265, 4131-4135.
Amemiya, C.T., Haire, R.N., and Litman, G.W. (1989). Nucleotide sequence of a cDNA encoding a third distinct Xenopus immunoglobulin heavy chain isotype. Nucleic Acids Res. 17, 5388.
Amemiya, C.T., Ohta, Y., Litman, R.T., Rast, J.P., Haire, R.N., and Litman, G.W. (1993). VH gene organization in a relict species, the coelacanth Latimeria chalumnae: evolutionary implications. Proc. Natl. Acad. Sci. USA 90, 6661-6665.
Amemiya, C.T. and Litman, G.W. (1991). Early evolution of immunoglobulin genes. Amer. Zool. 31, 558-569.
Amemiya, C.T. and Litman, G.W. (1990). Complete nucleotide sequence of an immunoglobulin heavy-chain gene and analysis of immunoglobulin gene organization in a primitive teleost species. Proc. Natl. Acad. Sci. USA 87, 811-815.
Anderson, M., Amemiya, C.T., Luer, C.A., Litman, R.T., Rast, J.P., Niimura, Y., and Litman, G.W. (1994). Complete genomic sequence and patterns of transcription of a member of an unusual family of closely related, chromosomally dispersed immunoglobulin gene clusters in Raja. International Immunol. 11, 1661-1670.
Anderson, M.K., Shamblott, M.J., Litman, R.T., and Litman, G.W. (1995). Diversity of germline-joined type I light chain genes in Raja erinacea is not associated with somatic rearrangement, an exception to a central paradigm of B cell immunity. J. Exp. Med.
Anderson, S.J., Miyake, S., and Loh, D.Y. (1989). Transcription from a murine T-cell receptor V-beta promoter depends on a conserved decamer motif similar to the cyclic AMP response element. Mol. Cell. Biol. 9, 4835-4845.
Andersson, E. and Matsunaga, T. (1993). Complete cDNA sequence of a rainbow trout IgM gene and evolution of vertebrate IgM constant domains. Immunogenetics 38, 243-250.
Atchison, M.L., Delmas, V., and Perry, R.P. (1990). A novel upstream element compensates for an ineffectual octamer motif in an immunoglobulin V promoter. EMBO J. 9, 3109-3117.
Bartl, S., Baltimore, D., and Weissman, I.L. (1994). Molecular evolution of the vertebrate immune system. Proc. Natl. Acad. Sci. USA 91, 10769-10770.
Bartl, S. and Weissman, I.L. (1994). Isolation and characterization of major histocompatibility complex class IIB genes from the nurse shark. Proc. Natl. Acad. Sci. USA 91, 262-266.
Bengten, E., Leanderson, T., and Pilstrom, L. (1991). Immunoglobulin heavy chain cDNA from the teleost Atlantic cod (Gadus morhua L.): nucleotide sequence of secretory and membrane form show an unusual splicing pattern. Eur. J. Immunol. 21, 3027-3033.
Berek, C., Berger, A., and Apel, M. (1991). Maturation of the immune response in germinal centers. Cell 67, 1121-1129.
Blackwell, T.K. and Alt, F.W. (1989). Mechanism and developmental program of immunoglobulin gene rearrangement in mammals. Ann. Rev. Genet. 23, 605-636.
Boismenu, R. and Havran, W.L. (1994). Modulation of epithelial cell growth by intraepithelial γδ T cells. Science 266, 1253-1255.
Borst, P. and Greaves, D.R. (1987). Programmed gene rearrangements altering gene expression. Science 235, 658-667.
Buss, L.W., Moore, J.L., and Green, D.R. (1985). Autoreactivity and self-tolerance in an invertebrate. Nature 313, 400-402.
Buss, L.W. (1987). The Evolution of Individuality (Princeton University Press).
Carroll, R.L. (1988). Vertebrate Paleontology and Evolution (New York: W.H. Freeman and Co.).
Chess, A., Simon, I., Cedar, H., and Axel, R. (1994). Allelic inactivation regulates olfactory receptor gene expression. Cell 5, 823-834.
Chien, Y.-H., Iwashima, M., Wettstein, D.A., Kaplan, K.B., Elliott, J.F., Born, W., and Davis, M.M. (1987). T-cell receptor δ gene rearrangements in early thymocytes. Nature 330, 722-727.
Ciachino, C., Padovan, E., and Lanzavecchia, A. (1995). κ+λ+ dual receptor B cells are present in the human peripheral repertoire. J. Exp. Med. 181, 1245-1250.
Cohn, M. and Langman, R.E. (1990). The protecton: the unit of humoral immunity selected by evolution. Immunol. Rev. 7-147.
Colonna, M. and Samaridis, J. (1995). Cloning of immunoglobulin superfamily members associated with HLA-C and HLA-B recognition by human natural killer cells. Science 268, 405-408.
Daggfeldt, A., Bengten, E., and Pilstrom, L. (1993). A cluster type organization of the loci of the immunoglobulin light chain in Atlantic cod (Gadus morhua L.) and rainbow trout (Oncorhynchus mykiss Walbaum) indicated by nucleotide sequences of cDNAs and hybridization analysis. Immunogenetics 38, 199-209.
Davis, M.M. (1985). Molecular genetics of the T cell-receptor beta chain. Ann. Rev. Immunol. 3, 537-560.
Davis, M.M. and Bjorkman, P.J. (1988). T-cell antigen receptor genes and T-cell recognition. Nature 334, 395-402.
Davodeau, F., Peyrat, M.-A., Houde, I., Hallet, M.-M., De Libero, G., Vie, H., and Bonneville, M. (1993). Surface expression of two distinct functional antigen receptors on human γδ cells. Science 260, 1800-1802.
Davodeau, F., Peyrat, M.A., Gaschet, J., Hallet, M.M., Triebel, F., Vie, H., Kabelitz, D., and Bonneville, M. (1994). Surface expression of functional T cell receptor chains formed by interlocus recombination on human lymphocytes. J. Exp. Med. 180, 1685-1691.
De Boer, R.J. and Perelson, A.S. (1993). How diverse should the immune system be? Proc. R. Soc. Lond. B 252, 171-175.
De Ioannes, A.E. and Aguila, H.L. (1989). Amino terminal sequence of heavy and light chains from ratfish immunoglobulin. J. Immunogenet. 30, 175-180.
Desiderio, S.V. (1993). Organization and Assembly of Immunoglobulin Genes. In Developmental Immunology, E. Cooper and E. Nisbet-Brown, eds. (New York: Oxford Univ. Press), pp. 129-152.
Devlin, P.E., Ramachandran, K.L., and Cate, R.L. (1988). Southern analysis of genomic DNA with unique and degenerate oligonucleotide probes: a method for reducing probe degeneracy. DNA 7, 499-507.
Dover, G. (1982). Molecular drive: a cohesive mode of species evolution. Nature 299, 111-117.
Du Pasquier, L. (1989). Evolution of the immune system. In Fundamental Immunology, W.E. Paul, ed. (Raven Press).
Evans, G.A., Lewis, C., and Rothenberg, B.E. (1989). High efficiency vectors for cosmid microcloning and genomic analysis. Gene 79, 9-20.
Farries, T.C. and Atkinson, J.P. (1991). Evolution of the complement system. Immunology Today 12, 295-300.
Feinberg, A.P. and Vogelstein, B. (1983). A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 132, 6-13.
Fellah, J.S., Wiles, M.V., Charlemagne, J., and Schwager, J. (1992). Evolution of vertebrate IgM: complete amino acid sequence of the constant region of Ambystoma mexicanum μ chain deduced from cDNA sequence. Eur. J. Immunol. 22, 2595-2601.
Fellah, J.S., Kerfourn, F., Guillet, F., and Charlemagne, J. (1993a). Conserved structure of amphibian T cell antigen receptor beta chain. Proc. Natl. Acad. Sci. USA 90, 6811-6814.
Fellah, J.S., Kerfourn, F., Wiles, M.V., Schwager, J., and Charlemagne, J. (1993b). Phylogeny of immunoglobulin heavy chain isotypes: structure of the constant region of Ambystoma mexicanum upsilon chain deduced from cDNA sequence. Immunogenetics 38, 311-317.
Ghaffari, S.H. and Lobb, C.J. (1989). Nucleotide sequence of channel catfish heavy chain cDNA and genomic blot analyses: implications for the phylogeny of immunoglobulin heavy chains. J. Immunol. 143, 2730-2739.
Ghaffari, S.H. and Lobb, C.J. (1993). Structure and genomic organization of immunoglobulin light chain in the channel catfish. J. Immunol. 151, 6900-6912.
Ghaffari, S.H. and Lobb, C.J. (1991). Heavy chain variable region gene families evolved early in phylogeny: Ig complexity in fish. J. Immunol. 146, 1037-1046.
Gobel, T.W.F., Chen, C.-L.H., Lahti, J., Kubota, T., Kuo, C.-L., Aebersold, R., Hood, L., and Cooper, M.D. (1994). Identification of T cell receptor alpha-chain genes in the chicken. Proc. Natl. Acad. Sci. USA 91, 1094-1098.
Gojobori, T. and Nei, M. (1984). Concerted evolution of the immunoglobulin VH gene family. Mol. Biol. Evol. 1, 195-212.
Good, R.A., Finstad, J., Pollara, B., and Gabrielsen, A.E. (1966). Morphological studies on the evolution of the lymphoid tissues among the lower vertebrates. In R.T. Smith, P.A. Miescher, and R.A. Good, eds. (Gainesville, FL: University of Florida Press), pp. 149-170.
Greenhalgh, P. and Steiner, L.A. (1995). Recombination activating gene 1 (Rag 1) in zebrafish and shark. Immunogenetics 41, 54-55.
Greenberg, A.S., Steiner, L., Kasahara, M., and Flajnik, M.F. (1993). Isolation of a shark immunoglobulin light chain cDNA clone encoding a protein resembling mammalian type kappa light chains: implications for the evolution of light chains. Proc. Natl. Acad. Sci. USA 90, 10603-10607.
Greenberg, A.S., Avila, D., Hughes, M., Hughes, A., McKinney, E.C., and Flajnik, M.F. (1995). A new antigen receptor gene family that undergoes rearrangement and extensive somatic diversification in sharks. Nature 374, 168-173.
Haas, W., Pereira, P., and Tonegawa, S. (1993). γδ cells. Ann. Rev. Immunol. 11, 637-685.
Haire, R., Shamblott, M.J., Amemiya, C.T., and Litman, G.W. (1989). A second Xenopus immunoglobulin heavy chain constant region isotype gene. Nucleic Acids Res. 17, 1776.
Haire, R.N., Ohta, Y., Litman, R.T., Amemiya, C.T., and Litman, G.W. (1991). The genomic organization of immunoglobulin VH genes in Xenopus laevis shows evidence for interspersion of families. Nucleic Acids Res. 19, 3061-3066.
Haire, R.N., Amemiya, C.T., Suzuki, D., and Litman, G.W. (1990). Eleven distinct VH gene families and additional patterns of sequence variation suggest a high degree of immunoglobulin gene complexity in a lower vertebrate, Xenopus laevis. J. Exp. Med. 171, 1721-1737.
Hamers-Casterman, C., Atarhouch, T., Muyldermans, S., Robinson, G., Hamers, C., Songa, E.B., Bendahman, N., and Hamers, R. (1993). Naturally occurring antibodies devoid of light chains. Nature 363, 446-448.
Harding, F.A., Amemiya, C.T., Litman, R.T., Cohen, N., and Litman, G.W. (1990a). Two distinct immunoglobulin heavy chain isotypes in a primitive, cartilaginous fish, Raja erinacea. Nucleic Acids Res. 18, 6369-6376.
Harding, F.A., Cohen, N., and Litman, G.W. (1990b). Immunoglobulin heavy chain gene organization and complexity in the skate, Raja erinacea. Nucleic Acids Res. 18, 1015-1020.
Hashimoto, K., Nakanishi, T., and Kurosawa, Y. (1992). Identification of a shark sequence resembling the major histocompatibility complex class I α3 domain. Proc. Natl. Acad. Sci. USA 89, 2209-2212.
Hashimoto, K., Nakanishi, T., and Kurosawa, Y. (1990). Isolation of carp genes encoding major histocompatibility complex antigens. Proc. Natl. Acad. Sci. USA 87, 6863-6867.
Hayman, J.R., Ghaffari, S.H., and Lobb, C.J. (1993). Heavy chain joining region segments of the channel catfish. Genomic organization and phylogenetic implications. J. Immunol. 151, 3587-3596.
Hedrick, S.M., Nielsen, E.A., Kavaler, J., Cohen, D.I., and Davis, M.M. (1984). Sequence relationships between putative T-cell receptor polypeptides and immunoglobulins. Nature 308, 153-158.
Higgins, D.G., Bleasby, A.J., and Fuchs, R. (1992). CLUSTAL V: improved software for multiple sequence alignment. Comput. Appl. Biosci. 8, 189-191.
Hildemann, W.H., Johnson, I.S., and Jokiel, P.L. (1979). Immunocompetence in the lowest metazoan phylum: Transplantation immunity in sponges. Science 204, 420-422.
Hinds, K.R. and Litman, G.W. (1986). Major reorganization of immunoglobulin VH segmental elements during vertebrate evolution. Nature 320, 546-549.
Hinds-Frey, K.R., Nishikata, H., Litman, R.T., and Litman, G.W. (1993). Somatic variation precedes extensive diversification of germline sequences and combinatorial joining in the evolution of immunoglobulin heavy chain diversity. J. Exp. Med. 178, 825-834.
Hinegardner, R.T. (1968). Evolution of cellular DNA content in teleost fishes. Amer. Nat. 102, 517-523.
Hohman, V.S., Schluter, S.F., and Marchalonis, J.J. (1992). Complete sequence of a cDNA clone specifying sandbar shark immunoglobulin light chain: gene organization and implications for the evolution of light chains. Proc. Natl. Acad. Sci. USA 89, 276-280.
Hohman, V.S., Schuchman, D.B., Schluter, S.F., and Marchalonis, J.J. (1993). Genomic clone for the sandbar shark lambda light chain: generation of diversity in the absence of gene rearrangement. Proc. Natl. Acad. Sci. USA 90, 9882-9886.
Holmgren, A., Kuehn, M.J., Branden, C.-I., and Hultgren, S.J. (1992). Conserved immunoglobulin-like features in a family of periplasmic pilus chaperones in bacteria. EMBO J. 11, 1617-1622.
Horne, W.A., Ford, J.E., and Gibson, D.M. (1992). L chain isotype regulation in the horse. I. Characterization of Ig lambda genes. J. Immunol. 149, 3927-3936.
Hood, L.E., Gray, W.R., and Dreyer, W.J. (1966). On the mechanism of antibody synthesis: a species comparison of L-chains. Proc. Natl. Acad. Sci. USA 55, 826.
Hordvik, I., Voie, A.M., Glette, J., Male, R., and Endresen, C. (1992). Cloning and sequence analysis of two isotypic IgM heavy chain genes from Atlantic salmon, Salmo salar L. Eur. J. Immunol. 22, 2957-2962.
Hozumi, N. and Tonegawa, S. (1976). Evidence for somatic rearrangement of immunoglobulin genes coding for variable and constant regions. Proc. Natl. Acad. Sci. USA 73, 3628-3632.
Hsu, E., Lefkovits, I., Flajnik, M., and Du Pasquier, L. (1991). Light chain heterogeneity in the amphibian Xenopus. Mol. Immunol. 28, 985-994.
Hunkapiller, T. and Hood, L. (1989). Diversity of the immunoglobulin gene superfamily. Adv. Immunol. 44, 1-62.
Ishiguro, H., Kobayashi, K., Suzuki, M., Titani, K., Tomonaga, S., and Kurosawa, Y. (1992). Isolation of a hagfish gene that encodes a complement component. EMBO J. 11, 829-837.
Jacob, J., Kelsoe, G., Rajewsky, K., and Weiss, U. (1991). Intraclonal generation of antibody mutants in germinal centres. Nature 354, 389-392.
Janeway Jr., C.A. (1992). The immune system evolved to discriminate infectious nonself from noninfectious self. Immunology Today 13, 11-16.
Johnson, P. (1987). A human homolog of the mouse CD8 molecule, Lyt-3: genomic sequence and expression. Immunogenetics 26, 174-177.
Jones, J.C., Ghaffari, S.H., and Lobb, C.J. (1993). Immunoglobulin heavy chain constant and heavy chain variable region genes in phylogenetically diverse species of bony fish. J. Mol. Evol. 36, 417-428.
Jorgensen, J.L., Reay, P.A., Ehrich, E.W., and Davis, M.M. (1992). Molecular components of T-cell recognition. Ann. Rev. Immunol. 10, 835-873.
Kabat, E.A., Wu, T.T., Foeller, C., Perry, H.M., and Gottesman, K. (1991). Sequences of Proteins of Immunological Interest (Washington, D.C.: U.S. Dept. Health and Human Services).
Karasuyama, H., Rolink, A., Shinkai, Y., Young, F., Alt, F.W., and Melchers, F. (1994). The expression of VpreB/lambda5 surrogate light chain in early bone marrow precursor B cells of normal and B cell-deficient mutant mice. Cell 77, 133-143.
Kasahara, M., Vazquez, M., Sato, K., McKinney, E.C., and Flajnik, M.F. (1992). Evolution of the major histocompatibility complex: isolation of class II A cDNA clones from the cartilaginous fish. Proc. Natl. Acad. Sci. USA 89, 6688-6692.
Kasahara, M., McKinney, E.C., Flajnik, M.F., and Ishibashi, T. (1993). The evolutionary origin of the major histocompatibility complex: polymorphism of class II alpha-chain genes in the cartilaginous fish. Eur. J. Immunol. 23, 2160-2165.
Kobayashi, K., Tomonaga, S., and Kajii, T. (1984). A second class of immunoglobulin other than IgM present in the serum of a cartilaginous fish, the skate, Raja kenojei: Isolation and characterization. Mol. Immunol. 21, 397-404.
Kobayashi, K., Tomonaga, S., Teshima, K., and Kajii, T. (1985). Ontogenetic studies on the appearance of two classes of immunoglobulin-forming cells in the spleen of the Aleutian skate, Bathyraja aleutica, a cartilaginous fish. Eur. J. Immunol. 15, 952-956.
Kobayashi, K., Tomonaga, S., and Tanaka, S. (1992). Identification of a second immunoglobulin in the most primitive shark, the frill shark. Dev. Comp. Immunol. 16, 295-299.
Kokubu, F., Hinds, K., Litman, R., Shamblott, M.J., and Litman, G.W. (1988a). Complete structure and organization of immunoglobulin heavy chain constant region genes in a phylogenetically primitive vertebrate. EMBO J. 7, 1979-1988.
Kokubu, F., Litman, R., Shamblott, M.J., Hinds, K., and Litman, G.W. (1988b). Diverse organization of immunoglobulin VH gene loci in a primitive vertebrate. EMBO J. 7, 3413-3422.
Kuma, K.-I., Iwabe, N., and Miyata, T. (1995). Functional constraints against variations on molecules from the tissue level: slowly evolving brain specific genes demonstrated by protein kinase and immunoglobulin supergene families. Mol. Biol. Evol. 12, 123-130.
Lindsten, T., Fowlkes, B.J., Samelson, L.E., Davis, M.M., and Chien, Y.-H. (1987). Transient rearrangements of the T cell antigen receptor α locus in early thymocytes. J. Exp. Med. 166, 761-775.
Litman, G.W., Wang, A.C., Fudenberg, H.H., and Good, R.A. (1971). N-Terminal amino acid sequence of African lungfish immunoglobulin light chains. Proc. Natl. Acad. Sci. USA 68, 2321-2324.
Litman, G.W., Stolen, J., Sarvas, H.O., and Mäkelä, O. (1982). The range and fine specificity of the anti-hapten immune response: Phylogenetic studies. J. Immunogenet. 9, 465-474.
Litman, G.W., Berger, L., Murphy, K., Litman, R.T., Hinds, K.R., Jahn, C.L., and Erickson, B.W. (1983). Complete nucleotide sequence of an immunoglobulin VH gene homologue from Caiman, a phylogenetically ancient reptile. Nature 303, 349-352.
Litman, G.W., Berger, L., Murphy, K., Litman, R., Hinds, K.R., and Erickson, B.W. (1985). Immunoglobulin VH gene structure and diversity in Heterodontus, a phylogenetically primitive shark. Proc. Natl. Acad. Sci. USA 82, 2082-2086.
Litman, G.W., Amemiya, C.T., Hinds-Frey, K.R., Litman, R.T., Kokubu, F., Suzuki, D., Shamblott, M.J., Harding, F.A., and Haire, R.N. (1993a). Evolutionary origins of immunoglobulin genes. In Developmental Immunology, E.L. Cooper and E. Nisbet-Brown, eds. (New York: Oxford University Press), pp. 108-128.
Litman, G.W., Rast, J.P., Hulst, M.A., Litman, R.T., Shamblott, M.J., Haire, R.N., Hinds-Frey, K.R., Buell, R.D., Margittai, M., Ohta, Y., Zilch, A.C., Good, R.A., and Amemiya, C.T. (1993b). Evolutionary origins of immunoglobulin gene diversity. In Progress in Immunology (Heidelberg: Springer-Verlag), pp. 107-114.
Litman, G.W., Rast, J.P., Shamblott, M.J., Haire, R.N., Hulst, M., Roess, W., Litman, R.T., Hinds-Frey, K.R., Zilch, A., and Amemiya, C.T. (1993c). Phylogenetic diversification of immunoglobulin genes and the antibody repertoire. Mol. Biol. Evol. 10, 60-72.
Litman, G.W., Amemiya, C.T., Haire, R.N., and Shamblott, M.J. (1990). Antibody and immunoglobulin diversity. Biosci. 40, 751-757.
Luer, C.A. (1989). Elasmobranchs (sharks, skates, and rays) as animal models for biomedical research. In Nonmammalian Animal Models for Biomedical Research, A.D. Woodhead, ed. (Boca Raton: CRC Press, Inc.), pp. 121-147.
Maisey, J.G. (1984). Chondrichthyan phylogeny: a look at the evidence. J. Vert. Paleont. 4, 359-371.
Martin, A.P., Naylor, G.J., and Palumbi, S.R. (1992). Rates of mitochondrial DNA evolution in sharks are slow compared with mammals. Nature 357, 153-155.
Matsuda, F., Shin, E.K., Nagaoka, H., Matsumura, R., Haino, M., Fukita, Y., Taka-ishi, S., Imai, T., Riley, J.H., Anand, R., Soeda, E., and Honjo, T. (1993). Structure and physical mapping of 64 variable segments in the 3' 0.8-megabase region of the human immunoglobulin heavy-chain locus. Nature Genetics 3, 88-94.
Matsui, K., Boniface, J.J., Reay, P.A., Schild, H., Fazekas de St. Groth, B., and Davis, M.M. (1991). Low affinity interaction of peptide-MHC complexes with T cell receptors. Science 254, 1788-1791.
Matsunaga, T., Chen, T., and Tormanen, V. (1990). Characterization of a complete immunoglobulin heavy-chain variable region germ-line gene of rainbow trout. Proc. Natl. Acad. Sci. USA 87, 7767-7771.
Mattison, A. and Fange, R. (1982). The cellular structure of the Leydig organ in the shark, Etmopterus spinax L. Biol. Bull. 162, 182-194.
Mäkelä, O. and Litman, G.W. (1980). Lack of heterogeneity in anti-hapten antibodies of a phylogenetically primitive shark. Nature 287, 639-640.
McCormack, W.T., Tjoelker, L.W., Stella, G., Postema, C.E., and Thompson, C.B. (1991). Chicken T-cell receptor beta-chain diversity: an evolutionarily conserved Dbeta-encoded glycine turn within the hypervariable CDR3 domain. Proc. Natl. Acad. Sci. USA 88, 7699-7703.
Michelson, J. (1993). Biology of disease: cellular selection in the genesis of multicellular organization. Laboratory Investigation 69, 136-151.
Miller, N.W., Sizemore, R.C., and Clem, L.W. (1985). Phylogeny of lymphocyte heterogeneity: The cellular requirements for in vitro antibody responses of channel catfish leukocytes. J. Immunol. 134, 2884-2888.
Miller, N.W., Deuter, A., and Clem, L.W. (1986). Phylogeny of lymphocyte heterogeneity: the cellular requirements for the mixed leucocyte reaction with channel catfish. Immunology 59, 123-128.
Murphy, P.M. (1993). Molecular mimicry and the generation of host defense protein diversity. Cell 72, 823-826.
Needleman, S. and Wunsch, C. (1970). A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48, 444-453.
Nizetic, D., Drmanac, R., and Lehrach, H. (1991). An improved bacterial colony lysis procedure enables direct DNA hybridization using short (10, 11 bases) oligonucleotides to cosmids. Nucleic Acids Res. 19.
Oettinger, M.A., Schatz, D.G., Gorka, C., and Baltimore, D. (1990). RAG-1 and RAG-2, adjacent genes that synergistically activate V(D)J recombination. Science 248, 1517-1523.
Ohno, S., Muramoto, J., Stenius, C., Christian, L., Kittrell, W.A., and Atkin, N.B. (1969). Microchromosomes in holocephalan, chondrostean and holostean fishes. Chromosoma 26, 35-40.
Ohno, S. (1970). Evolution by Gene Duplication (Heidelberg: Springer-Verlag).
Onda, T., Brunner, T., Messier, H., Bissonnette, R., Echeverri, F., Baier, G., Fotedar, A., and Green, D.R. (1995). Immunoregulatory activity of a T-cell receptor alpha chain demonstrated by in vitro transcription and translation. Proc. Natl. Acad. Sci. USA 92, 3004-3008.
Ono, H., Klein, D., Vincek, V., Figueroa, F., O'hUigin, C., Tichy, H., and Klein, J. (1992). Major histocompatibility complex class II genes of zebrafish. Proc. Natl. Acad. Sci. USA 89, 11886-11890.
Ørvig, T. (1985). Histologic studies of the ostracoderms, placoderms and fossil elasmobranchs. 5. Ptyctodontid tooth plates and their bearing on holocephalan ancestry: the condition of chimaerids. Zoologica Scripta 14, 55-78.
Ota, T. and Nei, M. (1994). Divergent evolution and evolution by the birth-and-death process in the immunoglobulin VH gene family. Mol. Biol. Evol. 11.
Ota, T. and Nei, M. (1995). Evolution of immunoglobulin VH pseudogenes in chickens. Mol. Biol. Evol. 12, 94-102.
Overduin, M., Harvey, T.S., Bagby, S., Tong, K.I., Yau, P., Takeichi, M., and Ikura, M. (1995). Solution structure of the epithelial cadherin domain responsible for selective cell adhesion. Science 267, 386-389.
Parslow, T.G., Blair, D.L., Murphy, W.J., and Granner, D.K. (1984). Structure of the 5' ends of immunoglobulin genes: A novel conserved sequence. Proc. Natl. Acad. Sci. USA 81, 2650-2654.
Partula, S., Fellah, J.S., de Guerra, A., and Charlemagne, J. (1994). Caracterisation d'ADNc de la chaine de recepteur des lymphocytes T chez la truite arc-en-ciel. C. R. Acad. Sci. Paris, Sciences de la vie 317, 765-770.
Percy, R. and Potter, I.C. (1976). Blood cell formation in the river lamprey, Lampetra fluviatilis. J. Zool. Lond. 178, 319-340.
Perey, D.Y.E., Finstad, J., Pollara, B., and Good, R.A. (1968). Evolution of the immune response. VI. First and second set skin homograft rejections in primitive fishes. Laboratory Investigation 19, 591-597.
Raison, R.L., Gilbertson, P., and Wotherspoon, J. (1987). Immunol. Cell. Biol. 65, 183-188.
Ramsden, D.A., Baetz, K., and Wu, G.E. (1994). Conservation of sequence in recombination signal sequence spacers. Nucleic Acids Research 22, 1785-1796.
Rast, J.P., Anderson, M.K., Ota, T., Litman, R.T., Margittai, M., Shamblott, M.J., and Litman, G.W. (1994). Immunoglobulin light chain class multiplicity and alternative organizational forms in early vertebrate phylogeny. Immunogenetics 40, 83-99.
Rast, J.P., Haire, R.N., Litman, R.T., Pross, S., and Litman, G.W. (1995). Identification and characterization of T cell antigen receptor-related genes in phylogenetically diverse vertebrate species. Immunogenetics.
Rast, J.P. and Litman, G.W. (1994). T cell receptor gene homologs are present in the most primitive jawed vertebrates. Proc. Natl. Acad. Sci. USA 91, 9248-9252.
Reynaud, C.-A., Dahan, A., Anquez, V., and Weill, J.-C. (1989). Somatic hyperconversion diversifies the single VH gene of the chicken with a high incidence in the D region. Cell 59, 171-183.
Rock, E.P., Sibbald, P.R., Davis, M.M., and Chien, Y.-H. (1994). CDR3 length in antigen-specific immune receptors. J. Exp. Med. 179, 323-328.
Rombout, J.H.W.M., Taverne-Thiele, A.J., and Villena, M.I. (1993). Dev. Comp. Immunol. 17, 55-66.
Ruddle, F.H., Bentley, K.L., Murtha, M.T., and Risch, N. (1994). Gene loss and gain in the evolution of vertebrates. Development 155-161.
Rzhetsky, A. and Nei, M. (1992). A simple method for estimating and testing minimum evolution trees. Mol. Biol. Evol. 9, 945-967.
Saint-Ruf, C., Ungewiss, K., Groettrup, M., Bruno, L., Fehling, H.G., and von Boehmer, H. (1994). Analysis and expression of a cloned pre-T cell receptor gene. Science 266, 1208-1212.
Sambrook, J., Fritsch, E.F., and Maniatis, T. (1989). Molecular Cloning (New York: Cold Spring Harbor Laboratory Press).
Sanger, F., Nicklen, S., and Coulson, A.R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.
Schacke, H., Müller, W.E.G., Gamulin, V., and Rinkevich, B. (1994). The Ig superfamily includes members from the lowest invertebrates to the highest vertebrates. Immunology Today 15, 497-498.
Schild, H., Mavaddat, N., Litzenberger, C., Ehrich, E.W., Davis, M.M., Bluestone, J.A., Matis, L., Draper, R.K., and Chien, Y.-H. (1994). The nature of major histocompatibility complex recognition by γδ T cells. Cell 76, 29-37.
Schittek, B., Unkelbach, E., and Rajewsky, K. (1989). Violation of allelic exclusion of the T cell receptor genes in a helper T cell clone. International Immunol. 1, 274-280.
Schluter, S.F., Hohman, V.S., Edmundson, A.B., and Marchalonis, J.J. (1989). Evolution of immunoglobulin light chain genes: cDNA clones specifying sandbar shark constant region. Proc. Natl. Acad. Sci. USA 86, 9961-9965.
Schwager, J., Mikoryak, C.A., and Steiner, L.A. (1988). Amino acid sequence of heavy chain from Xenopus laevis IgM deduced from cDNA sequence: Implications for evolution of immunoglobulin domains. Proc. Natl. Acad. Sci. USA 85, 2245-2249.
Schwager, J., Burckert, N., Schwager, M., and Wilson, M. (1991). Evolution of immunoglobulin light chain genes: analysis of Xenopus IgL isotypes and their contribution to diversity. EMBO J. 10, 505-511.
Scofield, V.L., Schlumpberger, J.M., West, L.A., and Weissman, I.L. (1982). Protochordate allorecognition is controlled by a MHC-like gene system. Nature 295, 499-502.
Shamblott, M.J. and Litman, G.W. (1989a). Genomic organization and sequences of immunoglobulin light chain genes in a primitive vertebrate suggest coevolution of immunoglobulin light and heavy chain gene organization. EMBO J. 8, 3733-3739.
Shamblott, M.J. and Litman, G.W. (1989b). Complete nucleotide sequence of primitive vertebrate immunoglobulin light chain genes. Proc. Natl. Acad. Sci. USA 86, 4684-4688.
Siu, G., Kronenberg, M., Strauss, E., Haars, R., Mak, T.W., and Hood, L. (1984). The structure, rearrangement and expression of Dβ gene segments of the murine T-cell antigen receptor. Nature 311, 344-350.
Smith, L.C. and Davidson, E.H. (1992). The echinoid immune system and the phylogenetic occurrence of immune mechanisms in deuterostomes. Immunology Today 13, 356-362.
Stewart, J. (1992). Immunoglobulins did not arise in evolution to fight infection. Immunology Today 13, 396-399.
Stock, D.W. and Whitt, G.S. (1992). Evidence from 18S ribosomal RNA sequences that lampreys and hagfishes form a natural group. Science 257, 787-789.
Sun, S., Lindstrom, I., Boman, H.G., Faye, I., and Otto, S. (1990). Hemolin: an insect-immune protein belonging to the immunoglobulin superfamily. Science 250, 1729-1732.
Tanaka, T. and Nei, M. (1989). Positive Darwinian selection observed at the variable region genes of immunoglobulins. Mol. Biol. Evol. 6, 447-459.
Thompson, C.B. and Neiman, P.E. (1987). Somatic diversification of the chicken immunoglobulin light chain gene is limited to the rearranged variable gene segment. Cell 48, 369-378.
Tjoelker, L.W., Carlson, L.M., Lee, K., Lahti, J., McCormack, W.T., Leiden, J.M., Chen, C.-L.H., Cooper, M.D., and Thompson, C.B. (1990). Evolutionary conservation of antigen recognition: the chicken T cell receptor beta chain. Proc. Natl. Acad. Sci. USA 87, 7856-7860.
Tomonaga, S., Kobayashi, K., and Hagiwara, K. (1985). Dev. Comp. Immunol. 9, 617-626.
Tonegawa, S. (1983). Somatic generation of antibody diversity. Nature 302, 575-581.
Williams, A.F., Tse, A.G.D., and Gagnon, J. (1988). Squid glycoproteins with structural similarities to Thy-1 and Ly-6 antigens. Immunogenetics 27, 265-272.
Williams, A.F. and Barclay, A.N. (1988). The immunoglobulin superfamily-domains for cell surface recognition. Ann. Rev. Immunol. 6, 381-405.
Wilson, M., Hsu, E., Marcuz, A., Courtet, M., Du Pasquier, L., and Steinberg, C. (1992). What limits affinity maturation of antibodies in Xenopus-the rate of somatic mutation or the ability to select mutants? EMBO J. 11, 4337-4347.
Wilson, M.R., Marcuz, A., van Ginkel, F., Miller, N.W., Clem, L.W., Middleton, D., and Warr, G.W. (1990). The immunoglobulin M heavy chain constant region gene of the channel catfish, Ictalurus punctatus: an unusual mRNA splice pattern produces the membrane form of the molecule. Nucleic Acids Res. 18, 5227-5233.
Wilson, R.K., Lai, E., Concannon, P., Barth, R.K., and Hood, L.E. (1988). Structure, organization and polymorphism of murine and human T-cell receptor α and β chain gene families. Immunol. Rev. 101, 149-172.
Yanagi, Y., Yoshikai, Y., Leggett, K., Clark, S.P., Aleksander, I., and Mak, T.W. (1984). A human T cell-specific cDNA clone encodes a protein having extensive homology to immunoglobulin chains. Nature 308, 145-149.
Zapata, A. (1980). Ultrastructure of elasmobranch lymphoid tissue. I. Thymus and spleen. Dev. Comp. Immunol. 4, 459-472.
Zapata, A. (1981). Ultrastructure of elasmobranch lymphoid tissue. II. Leydig's and epigonal organs. Dev. Comp. Immunol. 5, 43-52.
Zapata, A., Ardavin, C.F., Gomariz, R.P., and Leceta, J. (1981). Plasma cells in the ammocoete of Petromyzon marinus. Cell Tissue Res. 221, 203-208.