Diabetes 51:3318-3325, 2002
© 2002 by the American Diabetes Association, Inc.
Linkage and Association With Type 1 Diabetes on Chromosome 1q42
1 Department of Genetics, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania
Type 1 diabetes is a complex disorder with multiple genetic loci and environmental factors contributing to disease etiology. In the current study, a human type 1 diabetes candidate region on chromosome 1q42 was mapped at high marker density in a panel of 616 multiplex type 1 diabetic families. To facilitate the identification and evaluation of candidate genes, a physical map of the 7-cM region surrounding the maximum logarithm of odds (LOD) score (2.46, P = 0.0004) was constructed. Genes were identified in the 500-kb region surrounding the marker yielding the peak LOD score and evaluated for polymorphism by resequencing. Single-nucleotide polymorphisms (SNPs) identified in these genes as well as other anonymous markers were tested for allelic association with type 1 diabetes by both family-based and case-control methods. A haplotype formed by common alleles at three adjacent markers (D1S225, D1S2383, and D1S251) was preferentially transmitted to affected offspring in type 1 diabetic families (nominal P = 0.006). These findings extend the evidence supporting the existence of a type 1 diabetes susceptibility locus on chromosome 1q42 and identify a candidate region amenable to positional cloning efforts.
Type 1 diabetes arises from tissue-specific autoimmune destruction of the insulin-secreting pancreatic islet cells and results in life-long dependence on exogenously administered insulin. Both genetic and environmental factors contribute to disease etiology. Genome-wide scans for linkage in type 1 diabetes have identified >20 candidate susceptibility regions (1–6). However, IDDM1 (the HLA region on chromosome 6p21) and IDDM2 (the insulin gene region on chromosome 11p15) remain the only sites for which consistent and significant evidence supporting the presence of one or more diabetes susceptibility genes has been reported (rev. in 7).
We have previously reported the results of a genome-wide multipoint linkage analysis of 438 microsatellite markers in type 1 diabetic families. This genome scan was performed in two stages. In the initial stage, 212 affected sib pairs (ASPs) were studied for linkage with markers spaced at Even after genotyping the additional 5 markers and 467 ASPs, the region of localization on chromosome 1 was still quite large. In a subsequent study (6), we merged the raw genotype data from our genome scan with those derived from a collection of type 1 diabetic multiplex families of U.K. origin (4). This merged dataset contained more families than our initial genome scan (831 ASPs in 767 families, 667 with full genome-scan data) but did not greatly increase the density of markers genotyped in the chromosome 1q42 region. The maximum LOD score in the region was reduced to 2.2, and there were two peaks with this LOD score located within 5 cM.
In the current study, we sought to refine the localization of the putative type 1 diabetes susceptibility locus on 1q42 by genotyping additional microsatellite and single-nucleotide polymorphism (SNP) markers in the same collection of ASPs previously used (5,6). To establish the correct marker order, we constructed a genomic map of an
Linkage analysis. The type 1 diabetic multiplex families used in the linkage study were obtained from the Human Biological Data Interchange (8), the British Diabetic Association-Warren Registry (9), and The Children's Hospital of Philadelphia (10). This material was previously described in detail (5,6). In addition to 6 markers previously genotyped, 31 new microsatellite markers and 4 SNP markers on chromosome 1q were genotyped and scored in these families using standard methods (5). Multipoint linkage analysis was carried out using the pairs option in MERLIN (11). Marker order for most of chromosome 1 was as described (6). For newly genotyped markers, map distances and orders were obtained from the Marshfield Center for Medical Genetics (http://research.marshfieldclinic.org/genetics/) sex-averaged map of chromosome 1. Marker order was adjusted to agree with the physical map derived in the current study, and map distances were estimated as being proportional to physical distance within intervals flanked by markers with known genetic map positions.
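The interpolation described above (genetic distance taken as proportional to physical distance between two flanking markers of known map position) can be sketched as follows. This is a minimal illustration, not the authors' code; the function name and the coordinates in the example are hypothetical.

```python
def interpolate_cm(phys_bp, left_bp, left_cm, right_bp, right_cm):
    """Estimate a marker's genetic position (cM) by linear interpolation.

    Assumes recombination is uniform per base pair within the interval
    bounded by two flanking markers whose genetic positions are known.
    """
    if not left_bp < right_bp:
        raise ValueError("flanking markers must be in increasing physical order")
    if not left_bp <= phys_bp <= right_bp:
        raise ValueError("marker lies outside the flanking interval")
    # Fraction of the physical interval covered up to the new marker.
    fraction = (phys_bp - left_bp) / (right_bp - left_bp)
    return left_cm + fraction * (right_cm - left_cm)


# Hypothetical coordinates, for illustration only (not taken from the study).
example_cm = interpolate_cm(
    phys_bp=1_250_000,
    left_bp=1_000_000, left_cm=250.0,
    right_bp=2_000_000, right_cm=252.0,
)
```

A marker one quarter of the way along the physical interval is placed one quarter of the way along the genetic interval.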
Sequence-tagged site markers and isolation of bacterial artificial chromosome and P1 artificial chromosome clones. Bacterial artificial chromosome (BAC) clones (n = 104) were isolated by screening high-density filters of the Roswell Park Cancer Institute (RPCI)-11 segment 2 Human Male BAC library (http://www.chori.org/bacpac/) with probes prepared from the initial 36 markers. Filter hybridizations were carried out as described by Cheung et al. (12). P1 artificial chromosome (PAC) clones (n = 309) that map to the region of chromosome 1 bounded by SHGC-30224 and D1S437 were isolated from the RPCI-4 and -5 Human Male PAC libraries and provided to us by The Wellcome Trust Sanger Institute. Additional BAC clones were isolated from the California Institute of Technology (CITB) library by PCR screening of clone pools as per the manufacturer's instructions (Research Genetics). Initial screening was performed using the D1S1617 marker. All identified clones were end-sequenced. New primer sets were generated from end sequences and used to screen all previously isolated clones. Primer sets that failed to amplify from previously isolated clones were used in subsequent rounds of library screening. Additional contigs were initiated by library screening with the following microsatellites: D1S1644, D1S439, D1S1656, and D1S2712.
Sequencing of BAC and PAC clones. Internal sequence from BAC and PAC clones for which the corresponding genomic sequence was not available in GenBank at the time of isolation was obtained by sample sequencing. Briefly, BAC or PAC clones were separately digested to completion with several different restriction enzymes having 6-nucleotide recognition sequences (EcoRI, BamHI, and HindIII). The digested products were size fractionated on agarose gels and products >2 kb in size were cloned into plasmids. Colonies (n = 30–40) were picked at random from each library, and single-pass sequences were determined using flanking SP6 and T7 primers. Sequences were used for STS and SNP development as described below. All sequences were also repeat-masked and used in BLASTN and BLASTX searches (15) of GenBank to identify possible coding regions.
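The digestion and size-selection step can be illustrated with a small in silico digest. This is a simplified sketch: it treats the insert as a linear sequence (BAC and PAC clones are circular) and uses a toy sequence, and the function names are hypothetical. The recognition sites and cut positions are the standard ones for these three enzymes. For a 6-nucleotide site in random sequence with equal base frequencies, the expected spacing between sites is 4^6 = 4,096 bp, so fragments above the 2-kb cutoff are common.

```python
# Recognition sites; each enzyme cuts after the first base of its site
# on the top strand (G^AATTC, G^GATCC, A^AGCTT).
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}
CUT_OFFSET = 1


def digest(sequence, site, cut_offset=CUT_OFFSET):
    """Return fragment lengths from a complete digest of a linear sequence."""
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cuts + [len(sequence)]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]


def clonable(fragments, min_bp=2000):
    """Keep only fragments larger than the size cutoff (>2 kb in the text)."""
    return [length for length in fragments if length > min_bp]


# Toy sequence with two EcoRI sites, for illustration only.
toy = "A" * 2500 + "GAATTC" + "C" * 500 + "GAATTC" + "T" * 3000
fragments = digest(toy, SITES["EcoRI"])
selected = clonable(fragments)
```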
STS content mapping.
SNP identification.
Gene resequencing.
Testing for association.
Physical map of chromosome 1q42. To find new SNP markers, confirm the order of markers used in the linkage analysis, and identify candidate genes in the 1q42 region, we constructed a physical map of the region of maximum LODs surrounding D1S1617. A total of 104 BAC clones were isolated by hybridizing high-density filters with a panel of 36 STS markers mapping to the 1q42 region. These clones were supplemented with 309 PAC clones from the same region isolated by The Wellcome Trust Sanger Institute. The 413 BAC and PAC clones were tested for the presence of 104 STS markers by dot blot hybridizations and PCR. The order was confirmed and refined by a combination of PCR analyses with new STSs derived from the BAC and PAC clone end sequences and BLASTN (15) searches of finished and draft sequence databases. A detailed view of the map covering 10.5 Mb on 1q42 corresponding to the region of maximum LOD scores is available at http://genomics.med.upenn.edu/clonedb/index.htm. The portion of this map containing the region between D1S1617 and D1S251 appears in Fig. 1 (also see the section below entitled "Multilocus TDT Analyses"). The figure indicates the relative positions of the 17 microsatellite markers and SNPs and 21 known or hypothetical genes mapping to this region.
Multipoint linkage analysis of type 1 diabetes and 37 microsatellite markers mapping to chromosome 1q42. To delineate more precisely the region of linkage on 1q42, we genotyped 31 new microsatellite markers in the same collection of U.K. and U.S. type 1 diabetic families and added the data to those used by Cox et al. (6). Because many of these markers were not reliably positioned in publicly available genetic maps, we used our physical map of the region to establish the map order for linkage analysis. When the complete data set for chromosome 1 was analyzed for linkage to type 1 diabetes, the maximum multipoint LOD score detected was 2.46 (P = 0.0004) and occurred between markers D1S1617 and D1S2847 at 1q42 (Fig. 2 and Table 1).
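The relationship between the reported LOD score and its P value can be checked with the usual asymptotic approximation: 2 ln(10) × LOD is treated as a chi-square statistic with 1 degree of freedom, and the P value is one-sided. The text does not state how the P value was computed, so this is an assumption rather than a description of the authors' method; the function names are illustrative.

```python
import math


def lod_to_chi2(lod):
    """Convert a LOD score (log10 likelihood ratio) to a chi-square statistic."""
    return 2.0 * math.log(10.0) * lod


def lod_to_pvalue(lod):
    """One-sided asymptotic P value for a LOD score (chi-square, 1 df)."""
    if lod <= 0:
        return 0.5
    chi2 = lod_to_chi2(lod)
    # The upper tail of a 1-df chi-square is erfc(sqrt(x / 2));
    # halve it because linkage is a one-sided alternative.
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


p_peak = lod_to_pvalue(2.46)
```

Under these assumptions, a LOD of 2.46 gives a P value that rounds to the reported 0.0004.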
Resequencing of candidate genes. To identify potential etiologic variants, we searched for polymorphism within coding regions in the 1q42 region by resequencing genes that met either of two criteria: 1) genes were considered candidates if they were located in a 500-kb region centered around D1S1617 (Fig. 1); and 2) genes located within the broader 1q42 region but not so close to D1S1617 were considered candidates if their function provided adequate rationale for a possible role in type 1 diabetes susceptibility. For each of these candidates, the coding regions and putative promoter regions were amplified and sequenced in genomic DNA derived from eight affected members of type 1 diabetic families used for linkage analysis. Table 3 lists the genes that were resequenced for this study and a description of their products, where known. Only some exons were sequenced for the candidate genes ADPRT, SERPINA8/AGT, PSEN2, and CHS1.
Testing for association. Thirty SNPs identified by resequencing were genotyped in 160 unrelated type 1 diabetes cases and 160 control subjects and evaluated for association with type 1 diabetes (Table 2). Cases were selected from type 1 diabetic families previously used for linkage analysis. Among the 30 SNPs, 18 were located in genes. Nominally significant evidence of association was found at three markers: KIAAx4.11 in exon 4 of the KIAA0133 gene, VM005P1 in an intron of the PAF65B histone deacetylase complex gene, and VM041P1 in an intron of the CAPN9 cysteine protease subunit gene. Only the results for VM005P1 remained significant after correction for the testing of multiple markers. We sought confirmation of the finding for VM005P1 by genotyping type 1 diabetic families that were not included in the preceding analysis and testing for association by TDT. No excess of transmissions was observed in this analysis (51% of 647 transmissions, χ² = 0.26). Additional SNP markers (labeled as anonymous in Table 2) identified by resequencing but not located within any known or predicted transcription unit were genotyped in type 1 diabetes cases and control subjects. Two of these markers displayed nominal evidence of association, which was not significant after correction for the number of markers tested.
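The TDT statistic for the VM005P1 replication can be reproduced from the reported figures. The text gives only the percentage and the total, so the counts used here (330 transmitted, 317 not transmitted) are reconstructed by rounding 51% of 647 and should be read as an assumption. The statistic is the standard single-allele TDT, (b − c)² / (b + c), with 1 degree of freedom.

```python
import math


def tdt_chi2(transmitted, untransmitted):
    """TDT statistic from heterozygous-parent transmissions: (b - c)^2 / (b + c)."""
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    return (transmitted - untransmitted) ** 2 / total


def chi2_1df_pvalue(chi2):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Counts reconstructed from "51% of 647 transmissions" (an assumption).
chi2_vm005p1 = tdt_chi2(330, 317)
p_vm005p1 = chi2_1df_pvalue(chi2_vm005p1)
```

With these counts the statistic rounds to the reported 0.26, far from any conventional significance threshold.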
Multilocus TDT analyses.
We next performed the TDT on all markers in the 20-cM interval surrounding the peak in our multipoint linkage analysis and found modest evidence of linkage disequilibrium at five markers whose locations span
For the original TDT to provide a valid test for association (as opposed to linkage), families with more than one affected offspring must not be included. To solve this problem, we used an alternative formulation (18) that treats each sib pair as a unit. In the 445 sibships with exactly two affected sibs, the results were: allele 4 at D1S225:
We also considered the possibility that preferential transmission occurs in the region studied but is unrelated to type 1 diabetes. We genotyped members of 40 large families from CEPH (Centre d'Étude du Polymorphisme Humain), in which there are no known familial diseases. For the alleles of interest, the values of
An examination of founder haplotypes in our collection of pedigrees revealed significant evidence of linkage disequilibrium overall both between alleles at D1S225 and D1S2383 and between alleles at D1S2383 and D1S251 (P < 10⁻⁷). To explore the relationship between type 1 diabetes and these associated markers as a block, we assessed the transmission of the three-marker haplotype containing the 4, 2, and 1 alleles at D1S225, D1S2383, and D1S251, respectively, by the TDT. This haplotype was preferentially transmitted to affected offspring (65.6% of 102 transmissions,
Type 1 diabetes was the first genetically complex disease to be studied by the approach of genome-wide scanning for linkage in ASPs (1), and a number of independent type 1 diabetes genome scans have been reported (3–6). Although >20 putative type 1 diabetes loci have been proposed based on these genome scans and on more targeted studies of specific chromosomal regions, the underlying susceptibility genes have not been identified. In part, progress has been limited by the difficulty in obtaining confirmation for loci reported in different studies. We have carried out two previous linkage analyses in type 1 diabetic families. In the first (5), with 679 ASPs, the only region besides HLA (IDDM1) with LOD scores >1.8 was a novel region on chromosome 1q42 (LOD = 3.31). A second study (6) added more families to the analysis but did not increase the marker density in the 1q42 region. In this second analysis, with data from 831 ASPs, the region of localization on 1q42 broadened and there were two peaks with LOD scores of 2.2 within a 5-cM region. Because these results were obtained with the 831 ASPs that constitute essentially all of the multiplex type 1 diabetic families available in public repositories, it is unlikely that we can soon obtain a collection of families with sufficient power to confirm our initial finding. Therefore, in the current study, we focused on two other ways to extend the findings on chromosome 1. First, we increased the information content for linkage by genotyping a dense map of markers spanning the region. Second, we sought evidence of allelic association with type 1 diabetes at these and other markers.
In the current study, we constructed a physical map of the 1q42 region and used it to establish the map order for 35 new markers (31 microsatellites and 4 SNPs). With the addition of genotypes for these markers, the information content statistic for this region, as calculated by Genehunter, ranges from 0.83 to 0.91 for the full collection of 831 ASPs. Multipoint linkage analyses in this dataset revealed a single peak in the region with a maximum LOD score of 2.46 (P = 0.0004). A 1-LOD support interval for the localization spans To undertake a systematic search of the region for genes that might be involved in type 1 diabetes susceptibility, we constructed a physical map spanning the 7 cM that flanked D1S1617, which had the peak LOD in our linkage analysis. In the immediate surrounding region, we resequenced the coding regions of nine genes in affected and unaffected individuals in order to identify polymorphisms. With the exception of SPHAR and CAPN9, all of these genes (Table 3) are expressed in either pancreas or lymphoid tissue (blood, bone marrow, thymus, and spleen), where it might be anticipated that a gene involved in autoimmune destruction of islet cells would be expressed. A function is known for seven of the genes, but none suggests an obvious connection with type 1 diabetes. Therefore, we also considered a broader region and carried out partial resequencing of four additional genes for which some functional rationale as type 1 diabetes candidates could be found. For example, ADPRT is expressed in pancreas, and it has been reported that null alleles of Adprt in mice protect against streptozotocin-induced diabetes (20–22). CHS1 is mutated in Chediak-Higashi syndrome, a disorder with immune manifestations (23). These results were also negative.
Because our physical mapping efforts had identified 34 putative transcripts in the region of interest, and because a survey of the nine genes closest to the site of the peak LOD score in the region did not yield evidence of association with type 1 diabetes, we broadened our search for linkage disequilibrium to include anonymous markers spanning the 1q42 region. In addition to testing the SNP markers found by resequencing, we also tested for linkage disequilibrium at each of the markers genotyped in families for our linkage studies. Since these latter markers had already been genotyped in >600 families, we could test for linkage disequilibrium by the TDT, eliminating concerns about population structure. Three of these markers yielded nominally significant results for a common allele. Whereas none of these findings would be significant if corrected for the 40 markers tested, it is striking that the three markers are adjacent in our physical map (Table 1 and Fig. 1). This finding suggested that the TDT results might reflect the inheritance of a three-marker haplotype that contains these specific alleles and confers elevated risk of type 1 diabetes. A test for transmission of the entire haplotype to affected offspring in type 1 diabetic families was consistent with this possibility. The three markers and the haplotype that show association with type 1 diabetes are located 735–1,200 kb telomeric to the region (D1S1617, D1S2847) with the maximum multipoint LOD, although still within a 1-LOD support interval surrounding the peak. Unlike mapping in a Mendelian disease, where recombinants can be identified and provide precise, though possibly broad, localization, the peak LOD for a disease like type 1 diabetes defines a region that is both broad and imprecise.
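The effect of correcting for 40 markers can be made concrete with a Bonferroni adjustment. The text does not name the correction procedure, so Bonferroni is an assumption here, and the nominal P value in the example is hypothetical. The sketch only shows how stringent the per-marker threshold becomes when 40 tests are performed.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def bonferroni_adjust(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return min(1.0, p_value * n_tests)


N_MARKERS = 40  # number of markers tested, as stated in the text

per_marker_threshold = bonferroni_threshold(0.05, N_MARKERS)

# Hypothetical nominal P value, for illustration only (not from the study).
adjusted_example = bonferroni_adjust(0.01, N_MARKERS)
```

A marker would need a nominal P value below 0.00125 to remain significant at the 0.05 level after this correction, which is why nominally significant single-marker results can fail to survive it.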
Thus, a single putative susceptibility locus for type 1 diabetes could be responsible for the association found near D1S225-D1S251 and also account for the evidence for linkage seen as an LOD with a maximum that occurs 735–1,200 kb away. From the current findings, it is not possible to determine whether the association observed near D1S225-D1S251 accounts for all, or only some, of the evidence of linkage observed in the 1q42 region. Analysis of the linkage data for chromosome 1 in the 56 families segregating for the associated (4-2-1) haplotype yields a regional maximum LOD score of 0.94, compared with the LOD of 2.46 obtained in the full panel of 767 families. None of the three markers that make up the associated haplotype are likely to contribute directly to type 1 diabetes susceptibility. Therefore, the association of this haplotype with one or more putative etiologic variants in the region is likely to be incomplete, and analysis with just the associated haplotype might well underestimate the contribution of a putative type 1 diabetes locus in this region to the evidence for linkage. Genes in the vicinity of D1S251 have been studied previously because of the report of cosegregation between a chromosomal translocation in the region and major psychiatric disorders in an extended Scottish pedigree (24,25). As a result of these studies, two genes in the D1S225 through D1S251 interval that are expressed in the pancreas have been described: TSNAX, a translin-associated factor (26), and EGLN1, a putative prolyl hydroxylase (27). Neither these genes nor the one other known gene in the interval, GNPAT (glyceronephosphate-O-acyltransferase) (28), are obvious candidates for type 1 diabetes susceptibility genes. However, current genome sequence data suggest that there may be as many as seven additional transcription units in this interval and several more immediately flanking it (NCBI, Ensembl, and Celera).
These genes will need to be evaluated as candidate type 1 diabetes susceptibility genes in future studies.
This work was supported by grants from the Juvenile Diabetes Research Foundation (to P.C.) and from the National Institutes of Health (DK46635 to P.C. and DK46618 to R.S.S.). The authors thank Melissa Arcaro for sequencing some BAC ends, Nancy Cox and Warren Ewens for advice regarding biostatistical issues, Bob Hemphill and Lucy Southworth for help with computing, and Mary West for expert assistance with manuscript preparation. In addition we thank the Human Biological Data Interchange, the British Diabetic Association, and the many type 1 diabetic patients and their families who contributed to these repositories.
Address correspondence and reprint requests to Patrick Concannon, Molecular Genetics Program, Virginia Mason Research Center, 1201 Ninth Ave., Seattle, WA 98101. E-mail: patcon{at}vmresearch.org; or Richard Spielman, Department of Genetics, University of Pennsylvania School of Medicine, Philadelphia, PA 19104. E-mail: spielman{at}pobox.upenn.edu. Received for publication 5 April 2002 and accepted in revised form 9 July 2002. C.S. is employed by Merck Pharmaceuticals. ASP, affected sib pair; BAC, bacterial artificial chromosome; LOD, logarithm of odds; PAC, P1 artificial chromosome; RPCI, Roswell Park Cancer Institute; SNP, single-nucleotide polymorphism; STS, sequence-tagged site; TDT, transmission/disequilibrium test.