Next Generation Sequencing Reveals the Association of DRB3*02:02 With Type 1 Diabetes
The primary associations of the HLA class II genes, HLA-DRB1 and HLA-DQB1, and the class I genes, HLA-A and HLA-B, with type 1 diabetes (T1D) are well established. However, the role of polymorphism at the HLA-DRB3, HLA-DRB4, and HLA-DRB5 loci remains unclear. In two separate studies, one of 500 case subjects and 500 control subjects and one of 366 DRB1*03:01–positive samples from selected multiplex T1D families, we used Roche 454 sequencing with Conexio Genomics ASSIGN ATF 454 HLA genotyping software analysis to analyze sequence variation at these three HLA-DRB loci. Association analyses were performed on the two-locus HLA-DRB haplotypes (DRB1-DRB3, -DRB4, or -DRB5). Three common HLA-DRB3 alleles (*01:01, *02:02, *03:01) were observed. DRB1*03:01 haplotypes carrying DRB3*02:02 conferred a higher T1D risk than did DRB1*03:01 haplotypes carrying DRB3*01:01: among DRB1*03:01/*03:01 homozygotes, those with two DRB3*01:01 alleles had an odds ratio (OR) of 3.4 (95% CI 1.46–8.09), whereas those carrying one or two DRB3*02:02 alleles had an OR of 25.5 (3.43–189.2) (P = 0.033). For DRB1*03:01/*04:01 heterozygotes, however, the HLA-DRB3 allele did not significantly modify the T1D risk of the DRB1*03:01 haplotype (OR 7.7 for *02:02; 6.8 for *01:01). These observations were confirmed by sequence analysis of HLA-DRB3 exon 2 in a targeted replication study of 281 informative T1D family members and 86 affected family-based association control (AFBAC) haplotypes. The frequency of DRB3*02:02 was 42.9% in the DRB1*03:01/*03:01 patients and 27.6% in the DRB1*03:01/*04 patients (P = 0.005) compared with 22.6% in AFBAC DRB1*03:01 chromosomes (P = 0.001). Analysis of T1D-associated alleles at other HLA loci (HLA-A, HLA-B, and HLA-DPB1) on DRB1*03:01 haplotypes suggests that DRB3*02:02 on the DRB1*03:01 haplotype can contribute to T1D risk.
The Type 1 Diabetes Genetics Consortium (T1DGC) (1) is an international collaboration that has collected thousands of multiplex family samples, as well as case and control samples, and has carried out linkage and association analysis for genome-wide single nucleotide polymorphisms (SNPs), candidate gene SNPs, and major histocompatibility complex (MHC) SNPs, as well as for alleles and haplotypes at the highly polymorphic HLA class I and class II genes (2–7). More than 40 different loci have been associated with T1D (3); however, linkage and association analyses have identified the HLA region as the major genetic determinant of T1D risk. The most strongly T1D-associated MHC markers are defined by the HLA-DRB1, -DQA1, and -DQB1 haplotypes (4), although alleles at other HLA loci, notably HLA-A and HLA-B, as well as HLA-DPB1 (5–11) and other non-HLA loci across the genome, also contribute to T1D risk (3).
Although the association of HLA-DRB1 alleles with T1D is well established, the role of polymorphism at the HLA-DRB3, HLA-DRB4, and HLA-DRB5 loci has been studied less frequently, partly due to technical issues in genotyping. Most high-resolution genotyping strategies for HLA-DRB1 depend on DRB1-specific PCR primers to minimize confounding signals from coamplified secondary DRB loci. All copies of chromosome 6 have a DRB1 locus, and most, but not all, have a functional second DRB locus. This second DRB locus is DRB3 for DRB1*03, *11, *12, *13, and *14 haplotypes, DRB4 for DRB1*04, *07, and *09 haplotypes, and DRB5 for DRB1*15 and *16 haplotypes. DRB1*01, DRB1*08, and DRB1*10 haplotypes do not have a second DRB locus.
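The relationship between the DRB1 allele group and the secondary DRB locus described above can be written as a simple lookup. The following is a minimal sketch; the names `SECONDARY_DRB_LOCUS` and `secondary_locus` are hypothetical and are introduced only for illustration.

```python
# Secondary DRB locus expected for each DRB1 allele group, as listed in the text.
# None marks DRB1 groups whose haplotypes carry no second DRB locus.
SECONDARY_DRB_LOCUS = {
    "03": "DRB3", "11": "DRB3", "12": "DRB3", "13": "DRB3", "14": "DRB3",
    "04": "DRB4", "07": "DRB4", "09": "DRB4",
    "15": "DRB5", "16": "DRB5",
    "01": None, "08": None, "10": None,
}


def secondary_locus(drb1_allele):
    """Return the expected secondary DRB locus for an allele such as 'DRB1*03:01'.

    Returns None for haplotypes without a second DRB locus and raises KeyError
    for allele groups that the text does not list.
    """
    # 'DRB1*03:01' -> '03:01' -> '03'
    group = drb1_allele.split("*", 1)[1].split(":", 1)[0]
    return SECONDARY_DRB_LOCUS[group]
```

For example, `secondary_locus("DRB1*03:01")` gives `"DRB3"`, and `secondary_locus("DRB1*08:01")` gives `None`.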
The clonal sequencing property of next generation sequencing systems, such as the Roche 454 GS FLX and GS Junior Systems, allows the use of generic DRB primers to coamplify and sequence exon 2 from DRB1, DRB3, DRB4, and DRB5 loci (12–14). We have used the Roche 454 amplicon sequencing system with “fusion primers” containing the 454 adaptor (A and B) sequences and 10 base multiplex ID tags (MIDs) to amplify and sequence exon 2 from different individuals (13,14). The genotyping software consolidates these clonal sequence readings, sorts them to individual DRB loci and to individual samples, and compares them to the IMGT/HLA Sequence Database to assign specific genotypes at the HLA-DRB1, HLA-DRB3, HLA-DRB4, and HLA-DRB5 loci.
To assess the potential role of these secondary HLA-DRB loci in T1D, association analyses were carried out at the level of two-locus DRB haplotypes, focusing on the DRB1*03:01 haplotypes bearing DRB3*01:01 or DRB3*02:02. The role of HLA-DRB3 polymorphism in T1D risk was initially evaluated in a case/control study and then examined in a targeted replication study of informative patients from multiplex HLA-genotyped families, allowing analysis of HLA haplotypes and the potential role of T1D-associated alleles at other HLA loci.
RESEARCH DESIGN AND METHODS
The initial set of DNA samples from 500 case and 500 control subjects was provided by the JDRF/Wellcome Trust Diabetes and Inflammation Laboratory (case subjects) and the British 1958 Birth Cohort (control subjects) from a study described previously (15).
To test the hypothesis generated in the case/control study, DNA samples from the T1DGC and the Human Biological Data Interchange (HBDI) multiplex family collections were used. Samples from 280 T1D-affected family members were selected if they had a DRB1*03 haplotype. Samples were selected based on the known T1D risk DRB1 genotypes, DRB1*03:01/DRB1*03:01 (n = 54 T1D subjects; 105 independent DRB1*03:01 haplotypes), DRB1*03:01/DRB1*04:01 (n = 154 DRB1*03:01 haplotypes), and DRB1*03:01/DRB1*04:04 (n = 73 DRB1*03:01 haplotypes), with the aim of testing the hypothesis that DRB3*02:02 and *01:01 alleles may confer differential T1D risk. Parent samples with a DRB1*03:01 allele known not to be transmitted to an affected T1D patient were also selected as control subjects from the T1DGC and the HBDI collections (affected family-based association controls [AFBACs], n = 115 haplotypes) (16). These AFBAC control haplotypes were used to determine population frequencies of DRB3*02:02 and DRB3*01:01 alleles on DRB1*03:01 haplotypes.
HLA genotyping using next generation sequencing (Roche 454).
HLA sequence data were generated on the Roche 454 GS FLX and GS Junior Systems and analyzed using Conexio Genomics HLA ASSIGN ATF genotyping software to interpret the sequence files as HLA genotypes (13,14). Amplicons were generated from genomic DNA using DRB generic exon 2 454 fusion primers. The 454 fusion primers consist of a locus-specific sequence on the 3′ end, a 10-bp MID tag, and an “A” or “B” 454-specific primer sequence on the 5′ end. The MID tag serves as a sample barcode recognized by Conexio ASSIGN ATF genotyping software.
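The role of the MID tag as a sample barcode can be illustrated with a short demultiplexing sketch. This is not the Conexio ASSIGN ATF implementation, whose internals are not described here. It assumes, as a simplification, that each read still begins with the adaptor followed by the 10-base MID. The function name, the adaptor argument, and any sequences passed to it are hypothetical.

```python
from collections import defaultdict

MID_LENGTH = 10  # the text describes 10-bp MID tags


def demultiplex(reads, adaptor, mid_to_sample):
    """Group reads by sample using the MID tag that follows the 5' adaptor.

    reads: iterable of read sequences written 5' -> 3'.
    adaptor: adaptor sequence assumed to be present at the 5' end of each read.
    mid_to_sample: dict mapping each 10-base MID to a sample name.
    Reads lacking the adaptor or carrying an unknown MID are kept, untrimmed,
    under the key None.
    """
    by_sample = defaultdict(list)
    for read in reads:
        sample = None
        insert = read
        if read.startswith(adaptor):
            start = len(adaptor)
            mid = read[start:start + MID_LENGTH]
            if mid in mid_to_sample:
                sample = mid_to_sample[mid]
                # Keep only the part after the MID (primer plus exon 2 sequence).
                insert = read[start + MID_LENGTH:]
        by_sample[sample].append(insert)
    return dict(by_sample)
```

A real pipeline would also need to tolerate sequencing errors in the MID and handle reads from both the "A" and "B" directions; both are omitted here.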
Amplicons were purified with AMPure beads (Becton Dickinson, Franklin Lakes, NJ), quantified using the Quant-iT PicoGreen dsDNA reagent (Life Technologies, Foster City, CA), and mixed with capture beads after dilution. Individual DRB exon 2 amplicon molecules captured by these beads were amplified by emulsion PCR, and the DNA-containing beads were subsequently analyzed by pyrosequencing to obtain sequence reads originating from a single molecule (12–14). Sequencing on the GS FLX and GS Junior Systems was performed as described (13,14). HLA genotypes were assigned to samples using Conexio ASSIGN ATF genotyping software, as described (13,14).
DRB3 allele frequencies on DR3 haplotypes in DR3 homozygotes, DR3/DR4 heterozygotes, and AFBACs were compared using a Fisher exact test where the overall sample size was less than 50 and a Pearson χ² statistic otherwise.
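The test-selection rule can be sketched for a 2x2 table of allele counts using only the Python standard library. This sketch makes assumptions that the text does not state: the Fisher test is two-sided, and the Pearson statistic has 1 degree of freedom with no continuity correction. All function names are hypothetical.

```python
import math


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= observed * (1 + 1e-9))
    return min(1.0, total)


def pearson_chi2(a, b, c, d):
    """Pearson chi-square statistic and P value (1 df, no continuity correction).

    Raises ZeroDivisionError if any row or column total is zero.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the survival function is erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


def compare_frequencies(a, b, c, d, small_n=50):
    """Apply the rule in the text: Fisher exact test if n < 50, else Pearson."""
    if a + b + c + d < small_n:
        return "fisher", fisher_exact_two_sided(a, b, c, d)
    return "chi2", pearson_chi2(a, b, c, d)[1]
```

Here `a` and `b` would be the counts of DRB3*02:02 and other DRB3 alleles on DRB1*03:01 haplotypes in one group, and `c` and `d` the same counts in the comparison group.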
RESULTS
A total of 500 case and 500 control DNA samples from the JDRF/Wellcome Trust Diabetes and Inflammation Laboratory collection were amplified with 454 DRB fusion primers containing 32 MID tags and sequenced in two GS FLX System runs using PicoTiterPlates fitted with 16-region gaskets. The long read lengths of >400 bp spanned the amplicon in both directions and allowed setting phase (haplotyping) for the polymorphisms within exon 2. This provided, in most cases, unambiguous genotype assignments for HLA-DRB1, HLA-DRB3, and HLA-DRB5. For HLA-DRB4, however, several different genotypes were consistent with the observed sequence reads (ambiguity string). (The HLA-DRB4 exon 2 sequence reads were identical for the DRB4*01:01, DRB4*01:03, and DRB4*01:06 alleles.) This DRB4 exon 2 sequence was present on all DRB1*04, *07, and *09 haplotypes; thus, the role of DRB4 alleles for association with T1D or for defining specific linkage disequilibrium (LD) patterns could not be evaluated.
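The multiplexing scheme can be checked with simple arithmetic. The sketch below assumes that each MID tag identifies one sample within each gasket region, which the text implies but does not state explicitly.

```python
# Multiplexing capacity of the sequencing design described in the text.
mid_tags = 32           # distinct MID barcodes
regions_per_plate = 16  # PicoTiterPlate gasket regions per run
runs = 2                # GS FLX System runs

# Assumption: one sample per MID tag per region.
capacity = mid_tags * regions_per_plate * runs
samples = 500 + 500     # case plus control subjects

print(capacity, samples, capacity >= samples)
```

Under that assumption the design accommodates 1,024 barcoded samples, enough for the 1,000 case and control samples.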
For HLA-DRB3 and HLA-DRB5, the allele assignments were unambiguous and, as expected, a pattern of very strong LD between HLA-DRB1 and the secondary DRB locus was observed. The T1D association analyses were performed on the two-locus haplotypes (Table 1). In these data, DRB1*15:01 was linked exclusively to DRB5*01:01, and although the numbers were small (n = 6), DRB1*16:01 was linked exclusively to DRB5*02:02. Three common DRB3 alleles (*01:01, *02:02, *03:01) were observed. Of those haplotypes carrying a DRB3 locus, most DRB1 alleles were linked to a unique DRB3 allele in both case and control subjects (Table 1). The DRB1*03:01 and the DRB1*13:01 haplotypes, however, were linked to one of two different DRB3 alleles (*01:01 or *02:02).
Thus, the role of DRB3*01:01 versus DRB3*02:02 on DRB1*03:01 haplotypes could be evaluated. DRB1*03:01 haplotypes carrying DRB3*02:02 had a nominally higher risk for T1D; this difference in T1D risk was observed in the DRB1*03:01/*03:01 homozygotes with two DRB3*01:01 alleles (odds ratio [OR] 3.4 [95% CI 1.46–8.09]) compared with those with one or two DRB3*02:02 alleles (25.5 [3.43–189.2]; P = 0.033; Table 2). For DRB1*03:01/*04:01 heterozygotes, however, there was no difference in the T1D risk between DRB1*03:01 haplotypes distinguished by the DRB3 allele (OR 7.7 vs. 6.8; P = 0.29; Table 2). That the apparent difference in risk for DRB1*03:01 haplotypes bearing DRB3*02:02 versus DRB3*01:01 is not evident in DRB1*03:01/DRB1*04 heterozygotes may reflect the very high risk associated with this genotype and attributed, in part, to the trans-complementing DQ-α (*05:01) and DQ-β (*03:02) heterodimer (4).
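Odds ratios and 95% CIs of the kind quoted here are computed from a 2x2 table of genotype counts in case and control subjects. The sketch below uses the Woolf (log-based) interval, which is an assumption because the text does not name the method used. The function name is hypothetical, and the counts underlying Table 2 are not reproduced.

```python
import math


def odds_ratio_ci(case_with, case_without, control_with, control_without, z=1.96):
    """Odds ratio with a Woolf (log-based) confidence interval.

    The four arguments are the cells of a 2x2 table: case and control subjects
    with and without the genotype of interest. z = 1.96 gives a 95% interval.
    All four counts must be nonzero.
    """
    odds_ratio = (case_with * control_without) / (case_without * control_with)
    # Standard error of log(OR) under the Woolf approximation.
    se = math.sqrt(
        1 / case_with + 1 / case_without + 1 / control_with + 1 / control_without
    )
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

Because the standard error depends on the reciprocal of each cell count, a genotype that is rare among control subjects produces a very wide interval, as seen for the 25.5 (3.43–189.2) estimate.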
Although many samples were sequenced in this case/control study, the number of informative haplotypes (DRB1*03:01) and genotypes was limited; therefore, the statistical power of the association with DRB3*02:02 was very modest. Access to the T1DGC collection of HLA-genotyped families allowed selective targeting of informative genotypes to directly address the hypothesis of an effect of DRB3*02:02 in DRB1*03:01/*03:01 homozygotes and replicate the results of the case/control study.
Targeted replication study of informative samples from the T1DGC family collection.
To further evaluate the putative role of HLA-DRB3 polymorphism on DRB1*03:01 haplotype risk, a panel of DRB1*03:01/*03:01 homozygotes (n = 54 T1D family members, 105 nonshared chromosomes) and DRB1*03:01/DRB1*04-DQB1*03:02 (n = 227) heterozygous T1D members were selected from the T1DGC families. Of the DRB1*03:01/DRB1*04 subjects, 154 were *04:01 and 73 were *04:04. The distribution of DRB3 alleles in these T1D subjects is reported in Table 3. The frequency of the DRB3*02:02 allele in the DRB1*03:01/*03:01 T1D members was significantly greater than in those with DRB1*03:01/*04-DQB1*03:02 (42.9 vs. 27.6%, P < 0.005), consistent with the observations in the previous case/control study (Table 2). The distribution of DRB3 alleles on control DRB1*03:01 chromosomes was evaluated by examining the nontransmitted AFBAC (16) DRB1*03:01 chromosomes from heterozygous parents. The frequency of DRB3*02:02 in those chromosomes not transmitted to a T1D-affected child was 22.6%, significantly less than the 42.9% (P = 0.001) observed in the DRB1*03:01/DRB1*03:01 homozygous patients but not significantly different from the frequency in DRB1*03:01/DRB1*04:01 heterozygous T1D patients (27.3%).
Does the DRB3*02:02 allele modify risk or is it simply a marker for a high-risk DRB1*03:01 haplotype?
On the basis of these two studies, the DRB3*02:02 allele appears to mark a higher-risk DRB1*03:01 haplotype than the DRB1*03:01-DRB3*01:01 haplotype, and the effect of this high-risk haplotype can be discerned primarily in the DRB1*03:01/*03:01 homozygotes. T1D risk heterogeneity in DRB1*03:01 haplotypes has been previously reported with HLA-A*30:02, HLA-B*18:01, and DPB1*02:02 alleles distinguishing the higher-risk from the lower-risk DRB1*03:01 haplotypes (17). To investigate whether the DRB3*02:02 allele might contribute to T1D risk or simply mark a high-risk DRB1*03:01 haplotype on which HLA class I (or other) alleles modify risk, we examined the distribution of these and other T1D-associated alleles at other HLA loci on these two (DRB3*01:01 and DRB3*02:02) DRB1*03:01 haplotypes. The increase of DRB3*02:02 on DRB1*03:01 haplotypes in DRB1*03:01/*03:01 case subjects versus DRB1*03:01/DRB1*04 case subjects and AFBAC control subjects could reflect LD with alleles at other high-risk loci. The analysis of the case subjects selected from T1DGC families with informative HLA-DRB1 genotypes included parents; thus, haplotypes across the HLA region could be determined.
Eight locus haplotypes were phased and assigned based on familial inheritance. In the absence of parental genotyping data for the DRB3 locus, only a fraction of these could be assigned phase unambiguously with regard to the remaining eight loci. In total, 327 DRB1*03:01 haplotypes were used for this analysis of extended HLA haplotypes. The DRB1*03:01 haplotype counts in the three groups (AFBAC, DRB1*03:01/DRB1*04, DRB1*03:01/DRB1*03:01) for selected DPB1, HLA-B, -C, and -A alleles are reported in Table 4 and Supplementary Table 1. These alleles were selected based on previously published reports of T1D association. Table 4 and Supplementary Table 1 compare the distribution of DRB3*01:01 and *02:02 alleles on DRB1*03:01 haplotypes bearing various high-risk alleles at other HLA loci in DRB1*03:01/*03:01 versus DRB1*03:01/*04 patients.
We note that the alleles A*01:01, B*08:01, C*07:01, and DPB1*01:01 are in very strong LD with DRB3*01:01. These alleles on this extended haplotype mark a “lower-risk” DRB1*03:01 haplotype. In addition, the LD between alleles on the extended DRB1*03:01 haplotype, A*30:02-B*18:01-DRB1*03:01-DRB3*02:02-DPB1*02:02, is sufficiently strong that it does not permit evaluation of the independent contribution of DRB3*02:02. One haplotype, the DRB1*03:01-DPB1*03:01, however, is informative; DRB1*03:01 haplotypes carrying both DPB1*03:01 and DRB3*02:02 are significantly increased (P = 0.0006) in DRB1*03:01/*03:01 versus DRB1*03:01/*04 case subjects compared with those that carry DRB3*01:01 (Table 4). This effect cannot be explained by the presence, on these 16 DPB1*03:01-DRB3*02:02 haplotypes, of A*30:02 or B*18:01, which are also markers of high-risk DRB1*03:01 haplotypes. Only 4 of these 16 haplotypes carry A*30:02, and of the 9 DPB1*03:01-DRB3*02:02 haplotypes that also carry B*18:01, 5 are found in DR3/DR4 and 4 in DR3/DR3 case subjects. These observations, taken together, suggest that the DRB3*02:02 allele may contribute to T1D risk rather than simply marking a high-risk DRB1*03:01 haplotype.
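The strength of LD between two alleles on phased haplotypes is commonly summarized by D′ and r². The following sketch computes both from haplotype counts. These are standard measures, but the text does not say which measure the authors used, so treat the choice as an assumption. The function name is hypothetical.

```python
def ld_stats(n_both, n_first_only, n_second_only, n_neither):
    """Return (D', r^2) for two alleles from counts of phased haplotypes.

    n_both: haplotypes carrying both alleles (e.g., B*18:01 and DRB3*02:02).
    n_first_only / n_second_only: haplotypes carrying only one of the two.
    n_neither: haplotypes carrying neither allele.
    D' is signed: negative values mean the alleles co-occur less often than
    expected. Both values are 0.0 when either allele is absent or fixed.
    """
    n = n_both + n_first_only + n_second_only + n_neither
    p_a = (n_both + n_first_only) / n
    p_b = (n_both + n_second_only) / n
    d = n_both / n - p_a * p_b  # deviation from independence
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d_prime, r2
```

When two alleles always occur together, both D′ and r² equal 1, and the effect of one allele cannot be separated statistically from the effect of the other. That is the situation described for the A*30:02-B*18:01-DPB1*02:02 haplotype.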
DRB1*13:01 is the only other DRB1 haplotype that can carry DRB3*01:01 or DRB3*02:02 (Table 1). We note that the OR of the DRB1*13:01-DRB3*02:02 haplotype is nominally higher than the DRB1*13:01-DRB3*01:01 haplotype (0.29 vs. 0.15; Table 1). However, the CIs overlap, indicating that much larger sample sizes will be necessary to evaluate the risk of these two haplotypes.
DISCUSSION
Long-read next generation clonal sequencing allows genotyping the secondary HLA-DRB loci (HLA-DRB3, HLA-DRB4, and HLA-DRB5) as well as HLA-DRB1 by using DRB generic 454 fusion primers for exon 2. Using the Roche 454 GS FLX and GS Junior Systems, we have used this capability to examine the role of the secondary HLA-DRB loci in T1D susceptibility. Most HLA-DRB1 alleles are linked exclusively to a specific allele at the secondary DRB locus (Table 1); however, DRB1*03:01 and DRB1*13:01 can carry DRB3*01:01 or *02:02. DRB1*03:01 haplotypes carrying DRB3*02:02 appear to confer higher risk than those carrying DRB3*01:01. The difference in T1D risk is observed in DRB1*03:01/*03:01 homozygotes but not in DRB1*03:01/DRB1*04-DQB1*03:02 heterozygotes. The very high risk for T1D associated with this heterozygous genotype has been attributed to the trans-complementing DQ heterodimer formed by the DQ-α chain encoded by the DQA1*05:01 allele on the DRB1*03:01 haplotype and the DQ-β chain encoded by the DQB1*03:02 on the DRB1*04-DQB1*03:02 haplotype (4). One possible explanation of the different effect of HLA-DRB3 in this heterozygote and in the DRB1*03:01/DRB1*03:01 homozygote is that the putative risk conferred by the trans-complementing DQ heterodimers in the heterozygote is sufficiently large so that the risk difference between the “higher-risk” and “lower-risk” DRB1*03:01 haplotypes has a minimal effect on the overall T1D risk for the heterozygote. A recent study on the effect of other MHC markers (HLA-B, HLA-A, HLA-DPB1, and TNF-α) on DRB1*03:01 haplotype risk reported a similar observation—these markers were associated with differential risk in DRB1*03:01/DRB1*03:01 homozygotes but not in the DRB1*03:01/DRB1*04-DQB1*03:02 heterozygotes (17).
Does DRB3*02:02 play a role in T1D risk or does it only mark the higher-risk DRB1*03:01 haplotype? To address this question, eight locus DRB1*03:01 haplotypes were phased and assigned based on familial inheritance, and the distribution of the DRB3*01:01 and *02:02 alleles on extended haplotypes bearing high-risk alleles at other HLA loci was compared (Table 4 and Supplementary Table 1). The linkage disequilibrium between A*30:02, B*18:01, DPB1*02:02, and DRB3*02:02 was so great (virtually 100%) that the effect of these alleles on this high-risk DRB1*03:01 haplotype could not be distinguished. Some DRB1*03:01-DRB3*02:02 haplotypes, however, do not carry these alleles, associated with the high-risk DRB1*03:01 haplotype, but still demonstrate a significant increase in DRB1*03:01/*03:01 case subjects versus DRB1*03:01/*04. In particular, DRB1*03:01 haplotypes bearing DPB1*03:01 can carry DRB3*01:01 or DRB3*02:02; those that carry DRB3*02:02 are dramatically increased among the DRB1*03:01/*03:01 case subjects (P = 0.0006). These observations are consistent with the hypothesis that DRB3*02:02 is not simply a marker of high-risk haplotypes but, in fact, increases the risk of DRB1*03:01 haplotypes, particularly in DRB1*03:01/*03:01 homozygotes.
The amino acid sequence differences between DRB3*01:01 and DRB3*02:02 are substantial (10 differences). Consequently, DRB3*02:02 is likely to have a different repertoire of peptide binding and presentation and may thus confer greater T1D risk than DRB3*01:01 in certain genotype contexts (i.e., DRB1*03:01/*03:01 homozygotes) through altered peptide binding and T-cell repertoire. The crystallographic structures of the DRB3*01:01 (DR52a) and the DRB3*03:01 (DR52c) molecules have been reported recently (18,19). The P9 pocket of DRB3*02:02 (DR52b) differs substantially from that of DRB3*01:01 (DR52a); in particular, it contains Tyr-37, Ala-38, Asp-57, and Tyr-60, making the pocket more accommodating to smaller, polar, or charged peptide residues. A structural model of the DRB3*02:02 molecule with bound peptide, with the differences highlighted, is shown in Supplementary Fig. 1.
The results presented here support the conclusion that the T1D risk of a given HLA haplotype is determined by specific combinations of alleles at a variety of HLA loci with genotype-dependent effects and support a role for the DRB3 locus in T1D susceptibility conferred by the DRB1*03:01 haplotype.
This research used resources provided by the T1DGC, a collaborative clinical study sponsored by the National Institute of Diabetes and Digestive and Kidney Diseases, National Institute of Allergy and Infectious Diseases, National Human Genome Research Institute, National Institute of Child Health and Human Development, and JDRF and was supported by U01 DK-062418. This work also was supported in part by National Institutes of Health Grant DK-61722 (J.A.N.).
No potential conflicts of interest relevant to this article were reported.
H.A.E. designed the initial study, designed the second study with the support of the T1DGC Steering Committee, and drafted the manuscript. A.M.V. provided statistical analyses and edited the manuscript. S.L.M. contributed to the manuscript and, with L.A.B. and K.R.M., contributed to generating sequence data. B.B.S. contributed to generating sequence data and edited the manuscript. J.A.T. provided samples for the first study and edited the manuscript. S.S.R. edited the manuscript. J.A.N. designed the second study with the support of the T1DGC Steering Committee and edited the manuscript. H.A.E. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
The authors are grateful for the Wellcome Trust and JDRF support for J.A.T., and also to Neil Walker and Helen Stevens, of the JDRF/Wellcome Trust Diabetes and Inflammation Laboratory, for providing DNA samples and HLA typing information.
This article contains Supplementary Data online at http://diabetes.diabetesjournals.org/lookup/suppl/doi:10.2337/db12-1387/-/DC1.
H.A.E. is currently affiliated with the Children's Hospital Oakland Research Institute, Oakland, California.
*A complete list of the members of the T1DGC can be found in the Supplementary Data online.
- Received October 5, 2012.
- Accepted February 27, 2013.
- © 2013 by the American Diabetes Association.
Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. See http://creativecommons.org/licenses/by-nc-nd/3.0/ for details.