Polygenic Type 2 Diabetes Prediction at the Limit of Common Variant Detection
- Jason L. Vassy1,2,3,
- Marie-France Hivert1,4,5,
- Bianca Porneala6,
- Marco Dauriz1,6,7,
- Jose C. Florez1,8,9,
- Josée Dupuis10,11,
- David S. Siscovick12,
- Myriam Fornage13,
- Laura J. Rasmussen-Torvik14,
- Claude Bouchard15 and
- James B. Meigs1,6
- 1Harvard Medical School, Boston, MA
- 2Section of General Internal Medicine, VA Boston Healthcare System, Boston, MA
- 3Division of General Internal Medicine and Primary Care, Brigham and Women’s Hospital, Boston, MA
- 4Department of Population Medicine, Harvard Pilgrim Health Care Institute, Boston, MA
- 5Division of Endocrinology, Department of Medicine, Université de Sherbrooke, Sherbrooke, Quebec, Canada
- 6General Medicine Division, Massachusetts General Hospital, Boston, MA
- 7Division of Endocrinology and Metabolic Diseases, Department of Medicine, University of Verona Medical School and Hospital Trust of Verona, Verona, Italy
- 8Diabetes Research Center (Diabetes Unit), and Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA
- 9Program in Medical and Population Genetics, Broad Institute, Cambridge, MA
- 10Department of Biostatistics, Boston University School of Public Health, Boston, MA
- 11National Heart, Lung, and Blood Institute's Framingham Heart Study, Framingham, MA
- 12Cardiovascular Health Research Unit, Departments of Medicine and Epidemiology, University of Washington, Seattle, WA
- 13Center for Human Genetics, University of Texas Health Science Center at Houston, Houston, TX
- 14Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL
- 15Human Genomics Laboratory, Pennington Biomedical Research Center, Louisiana State University System, Baton Rouge, LA
- Corresponding author: James B. Meigs.
Genome-wide association studies (GWAS) may have reached their limit of detecting common type 2 diabetes (T2D)–associated genetic variation. We evaluated the performance of current polygenic T2D prediction. Using data from the Framingham Offspring (FOS) and the Coronary Artery Risk Development in Young Adults (CARDIA) studies, we tested three hypotheses: 1) a 62-locus genotype risk score (GRSt) improves T2D prediction compared with previous less inclusive GRSt; 2) separate GRS for β-cell (GRSβ) and insulin resistance (GRSIR) independently predict T2D; and 3) the relationships between T2D and GRSt, GRSβ, or GRSIR do not differ between blacks and whites. Among 1,650 young white adults in CARDIA, 820 young black adults in CARDIA, and 3,471 white middle-aged adults in FOS, cumulative T2D incidence was 5.9%, 14.4%, and 12.9%, respectively, over 25 years. The 62-locus GRSt was significantly associated with incident T2D in all three groups. In FOS but not CARDIA, the 62-locus GRSt improved the model C statistic (0.698 and 0.726 for models without and with GRSt, respectively; P < 0.001) but did not materially improve risk reclassification in either study. Results were similar among blacks compared with whites. The GRSβ but not GRSIR predicted incident T2D among FOS and CARDIA whites. At the end of the era of common variant discovery for T2D, polygenic scores can predict T2D in whites and blacks but do not outperform clinical models. Further optimization of polygenic prediction may require novel analytic methods, including less common as well as functional variants.
Type 2 diabetes (T2D) is a common complex disease with genetic and environmental determinants. Risk factors, including overnutrition, sedentary behavior, and lack of physical exercise, make the disease amenable to prevention through lifestyle modification (1,2), but the most effective behavior change programs can be cost-intensive (3). Because the genome-wide association study (GWAS) era has discovered dozens of genetic loci associated with T2D risk, there has been hope that genotype might help clinicians and public health practitioners target limited prevention resources to those at greatest risk. Although genotype predicts incident T2D (4–9), studies using limited genetic information from the first waves of GWAS have demonstrated that the addition of genotype to T2D prediction models based on routinely measured clinical risk factors (6,10,11) does not substantively improve risk stratification (4,8,9).
The DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) Consortium recently published the largest T2D GWAS meta-analysis to date (DIAGRAMv3), identifying many additional common variants associated with T2D and bringing the total number of independent T2D loci to 65 (12). Together, these loci explained ∼5.7% of the variance in genetic susceptibility to T2D. DIAGRAMv3 also modeled the theoretical existence of 488 additional common variants likely associated with T2D on the arrays used in their analyses but with effect sizes too small for detection. These hundreds of single nucleotide polymorphisms (SNPs) would increase the proportion of explained T2D susceptibility to 10.7%. Subsequent models using genome-wide complex trait analysis suggested that 63% of T2D susceptibility might be attributable to common genetic variation in the full set of GWAS SNPs (12). Still, current GWAS methodology is likely nearing the limit of its ability (13,14) to identify the additional specific common SNPs associated with T2D. Recent analyses have suggested that even a tripling of the GWAS discovery sample size would not materially increase the C statistic of polygenic T2D models (15). Ongoing next-generation sequencing efforts may identify additional variants with minor allele frequency >1%, although SNP genotype and imputation data from GWAS arrays have likely already captured most of this common variation.
Thus, the 65 DIAGRAMv3 loci may represent the majority of common and significant T2D-associated genetic variants expected to be identified. If so, it is opportune to evaluate the performance of currently available genetic information for T2D risk prediction and classification. The additional loci discovered in DIAGRAMv3 may improve polygenic T2D prediction over previous attempts using polygenic models with fewer loci (4,5,9,16). Because GWAS use a cross-sectional case-control design, it is important to determine how well these loci prospectively predict incident T2D. Moreover, polygenic models may be improved by taking into consideration the biological pathways underlying these T2D-associated loci. Although most of these remain to be elucidated, some functional studies and analyses of more specific metabolic phenotypes have implicated some loci in pancreatic β-cell dysfunction or, less commonly, insulin resistance (IR) (17,18). Individuals carrying a high genetic burden for β-cell dysfunction and IR might be at especially high risk of developing T2D. Finally, although DIAGRAMv3 used data from populations of mostly European ancestry, it is important for clinical practice and public health to know whether these associations hold in nonwhite populations.
Research Design and Methods
We used data from the Framingham Offspring (FOS) and the Coronary Artery Risk Development in Young Adults (CARDIA) studies to examine the performance of updated polygenic prediction models for T2D among young and middle-aged adults of European and African ancestry. We tested three primary hypotheses. First, we hypothesized that an updated total genotype risk score (GRSt) with up to 65 T2D-associated risk loci improves the prediction of incident T2D in young and middle-aged adulthood compared with previously published scores with fewer loci. We examined genotype-only and also genotype-plus-clinical prediction models. Second, because β-cell dysfunction and IR represent two distinct pathways in the pathogenesis of T2D, we hypothesized that separate GRS consisting of SNPs postulated to influence β-cell (GRSβ) or IR (GRSIR) independently predict incident T2D. In subsidiary analyses, we investigated whether GRSβ and GRSIR together exhibit a multiplicative effect on T2D risk and whether the association between T2D risk and GRSβ or GRSIR varies between lean and obese individuals. Third, we hypothesized that the relationships between incident T2D and GRSt, GRSβ, or GRSIR do not differ between black and white individuals.
FOS and CARDIA are both large well-described prospective cohort studies (19–21). The FOS began in 1971 and consists of offspring of the original Framingham Heart Study participants and their spouses. At the first examination, FOS participants were between 5 and 70 years of age. They were examined again after 8 years and then every 4 years thereafter through examination 8 (2005–2008). The CARDIA Study is a multicenter prospective study of 5,115 white and black participants recruited in 1985–1986 from four United States cities (20,21). Participants were aged 18 to 30 years at the baseline examination and were invited to participate in serial follow-up examinations over the subsequent 25 years. Written informed consent was obtained from all FOS and CARDIA participants, and the institutional review board at each participating center approved the original studies. We limited the present analyses to FOS and CARDIA participants with at least two study examinations, genotype information, and baseline data available for all predictors of interest. We excluded any participant with diabetes or pregnancy at the baseline examination. CARDIA participants who reported diabetes treatment exclusively with insulin during the observation period were considered to have type 1 diabetes and were also excluded from analyses. We did not apply this exclusion to the older FOS cohort; greater than 99% of the FOS diabetes cases are T2D (11). The Partners Human Research Committee approved these analyses.
The primary outcome was incident T2D during the observation period. Each FOS examination included an assessment of medical history, a physical examination, and a fasting blood sample (22). All CARDIA Study visits included an updated medical history assessment, including medications, and fasting glucose was measured at years 0, 7, 10, 15, 20, and 25. We defined T2D in FOS and CARDIA by a fasting plasma glucose ≥7.0 mmol/L (≥126 mg/dL) or report of taking diabetes medications (9,10).
Clinical Risk Factors and Covariates
Data collection methods in FOS and CARDIA have been described previously (19,21). We considered a study participant to have a positive parental history of diabetes if he or she reported on a family history questionnaire that one or both parents had diabetes (23). Fasting plasma glucose and lipid levels were measured as described previously (22,24). All FOS participants were white, and in CARDIA, black or white race was determined by self-report.
Genotyping and GRS
Details of the genotyping and quality control of FOS and CARDIA samples have been published previously (25–27). In previous reports, we calculated GRSt consisting of all the T2D-associated loci known at the time: 17- and 40-SNP GRSt in FOS and a 38-SNP GRSt in CARDIA (4,9,16). In the present analyses, we updated these GRSt to include as many of the 65 index SNPs or their proxies as were available at the confirmed or newly identified loci from DIAGRAMv3 (12) (Table 1 and Fig. 1), using previously reported methods (4,9,16). For each locus for each individual, we prioritized inclusion of the following information into the GRSt, in order: genotyped data at the index SNP, imputed data at the index SNP, and then genotyped data at a suitable proxy for the index SNP. We used SNP Annotation and Proxy Search (SNAP; http://www.broadinstitute.org/mpg/snap/) to identify proxy SNPs, as needed, defined as being in linkage disequilibrium with the index SNP (r² ≥ 0.5) in the HapMap II release 22 Northern and Western Europe (CEU) reference population. Of the 65 loci, genotyped or imputed data were available for 62 of the index SNPs for the FOS and CARDIA studies. No genotype information was available for rs11063069 at CCND2, rs11651052 at HNF1B (TCF2), or rs8108269 at GIPR. Whites and blacks in CARDIA had genotyped or imputed data for these same 62 loci. For FOS and CARDIA whites, we calculated GRSt as the weighted sum of the number of risk alleles (zero, one, or two) at each of the available loci, weighted by its effect size (β) from DIAGRAMv3. Because no sufficiently large T2D GWAS in people of African ancestry exists from which to derive locus effect sizes, we used an unweighted GRSt for CARDIA blacks, calculated by summing the risk alleles across the loci.
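As a rough illustration of the scoring described above, the following Python sketch picks a data source in the stated priority order and computes a weighted or unweighted score from per-locus risk-allele dosages. It is not the study's code: the function names, SNP identifiers, and weights are hypothetical placeholders, and any rescaling applied to the weighted score is not shown.

```python
from typing import Mapping, Optional


def pick_dosage(
    genotyped_index: Optional[float],
    imputed_index: Optional[float],
    genotyped_proxy: Optional[float],
) -> Optional[float]:
    """Return the first available dosage: genotyped index SNP, then
    imputed index SNP, then genotyped proxy SNP."""
    for dosage in (genotyped_index, imputed_index, genotyped_proxy):
        if dosage is not None:
            return dosage
    return None  # no usable data at this locus


def genotype_risk_score(
    dosages: Mapping[str, float],
    weights: Optional[Mapping[str, float]] = None,
) -> float:
    """Sum risk-allele dosages (0 to 2) across loci.

    With `weights` (per-locus effect sizes), each dosage is multiplied by
    its weight; without them, the score is a plain count of risk alleles.
    """
    total = 0.0
    for snp, dosage in dosages.items():
        if not 0.0 <= dosage <= 2.0:
            raise ValueError(f"dosage for {snp} must lie between 0 and 2")
        weight = 1.0 if weights is None else weights[snp]
        total += weight * dosage
    return total


# Hypothetical three-locus example; identifiers and weights are made up.
example_dosages = {"snp_a": 2.0, "snp_b": 1.0, "snp_c": 0.0}
example_weights = {"snp_a": 0.10, "snp_b": 0.05, "snp_c": 0.20}
weighted_score = genotype_risk_score(example_dosages, example_weights)
unweighted_score = genotype_risk_score(example_dosages)
```

Passing no weights gives the simple allele count used where no ancestry-specific effect sizes are available.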
Additionally, we used prior genetic and physiologic evidence to categorize the loci as associated predominantly with β-cell function or IR (Supplementary Table 1). We identified 20 predominantly β-cell–associated SNPs by 1) their significant effect on homeostasis model assessment (HOMA)-β (β < –0.008; P < 0.05) in the most recent Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC) (12) and/or 2) a significant effect (P < 0.05) on one of the β-cell function indices (18): insulinogenic index or acute insulin response. We identified 10 predominantly IR–related SNPs by 1) their significant association with HOMA-IR (P < 0.05) in the MAGIC data (12), 2) significant association with fasting insulin in the MAGIC GWAS conditional on BMI or BMI-SNP interaction (28), and/or 3) evidence of association with IR-related traits such as lower HDL cholesterol, higher triglycerides, higher BMI, and higher waist-to-hip ratio (18). Similar to the GRSt, we calculated separate GRSβ and GRSIR, with each locus weighted in whites by the same effect size as in the GRSt. For CARDIA blacks, we calculated unweighted GRSβ and GRSIR.
We constructed logistic and proportional-hazards regression models for incident T2D using statistical methods similar to those in our previous FOS and CARDIA analyses, respectively (Supplementary Methods) (4,9,16). In each study, we constructed regression models for incident T2D as a function of GRS, sex, and age (demographic model) and GRS, sex, age, and risk factors routinely measured in clinical practice (clinical model: parental history of diabetes [yes vs. no], BMI, systolic blood pressure, fasting plasma glucose, and log-transformed HDL cholesterol and triglyceride levels). We used C statistics and continuous net reclassification improvement (NRI) indices to compare prediction models with and without genotype information (29–32). To examine the relationship between β-cell and IR genotype, we also fitted the models above with 1) GRSβ alone, 2) GRSIR alone, 3) GRSβ and GRSIR, and 4) GRSβ, GRSIR, and a GRSβ × GRSIR interaction term. Further, we examined the relationship between genotype and BMI in two ways: 1) the inclusion of an interaction term between each GRS and an indicator variable for obesity (BMI ≥30 kg/m2 vs. BMI <30 kg/m2) and 2) analyses stratified by BMI category (BMI ≥30 kg/m2 vs. BMI <30 kg/m2). To test the hypothesis that the association between each GRS and T2D risk does not differ between whites and blacks, we meta-analyzed the regression β coefficients from FOS and CARDIA whites and then used a t test to compare the result with the corresponding β in CARDIA blacks. We considered odds ratios and hazard ratios to be statistically significant at P < 0.05.
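The between-race comparison can be sketched as follows. This is a minimal illustration that assumes a fixed-effect inverse-variance meta-analysis and a large-sample normal approximation to the test statistic; the text specifies neither detail, and the function names and any numbers passed to them are hypothetical.

```python
import math
from typing import Sequence, Tuple


def inverse_variance_meta(
    betas: Sequence[float], ses: Sequence[float]
) -> Tuple[float, float]:
    """Pool regression coefficients with weights 1 / SE^2 (fixed effect)."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


def compare_betas(
    beta_1: float, se_1: float, beta_2: float, se_2: float
) -> Tuple[float, float]:
    """Two-sided test that two independent coefficients are equal.

    Uses a normal approximation (an assumption made here for simplicity).
    """
    z = (beta_1 - beta_2) / math.sqrt(se_1 ** 2 + se_2 ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

In use, the pooled coefficient and standard error for the two white cohorts would be passed to `compare_betas` together with the coefficient and standard error estimated in the black cohort.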
Participant Characteristics and Incident T2D
Among the 3,869 FOS participants, 11,358 person-periods from 3,471 individuals were eligible for the present analyses. In CARDIA, 1,650 white and 820 black individuals with 50,309 total person-years of follow-up were eligible. Table 2 reports the baseline participant characteristics. In FOS, there were 446 incident cases of T2D (cumulative incidence 12.9%) over a mean 25.6 years of follow-up. In the younger CARDIA cohort, there were 97 T2D cases (cumulative incidence 5.9%) among whites over a mean follow-up of 24.2 years, and 118 cases (cumulative incidence 14.4%) among blacks over a mean follow-up of 23.4 years.
GRSt and Prediction of Incident T2D
The mean 62-SNP GRSt was greater among T2D cases than noncases in FOS (P < 0.001), CARDIA whites (P < 0.001), and CARDIA blacks (P = 0.01; Table 3). Among all three cohorts, each GRSt was significantly associated with incident T2D in both the demographic and clinical prediction models (Tables 4 and 5). In the demographic models in FOS, each additional weighted allele in the 17-, 40-, and 62-SNP GRSt was associated with an increased odds for incident T2D of 11% (7–15%), 8% (6–11%), and 8% (6–10%), respectively. Among CARDIA whites, each additional weighted allele in the 38- and 62-SNP GRSt was associated with an increase in the adjusted hazard for incident T2D of 12% (6–18%) and 7% (3–12%); the corresponding increases among CARDIA blacks were 5% (0–11%) and 5% (1–9%). The addition of each successive SNP to the GRSt lowered the per-allele odds ratio for incident T2D in FOS (Fig. 1). The addition of the 62-SNP GRSt to the demographic and clinical prediction models in FOS weakly improved risk reclassification (continuous NRI 0.286 [95% CI 0.192–0.380] and 0.256 [95% CI 0.162–0.351], respectively) (Table 4).
Reclassification was moderate among FOS individuals younger than 50 years and weak among those 50 years or older (Table 4). Reclassification was not markedly higher in the younger CARDIA cohort. Among CARDIA whites, the addition of the 62-SNP GRSt to the demographic and clinical models resulted in a continuous NRI of 0.311 (95% CI 0.088–0.525) and 0.306 (95% CI 0.073–0.517), respectively. Similarly, the resulting NRI among CARDIA blacks were 0.243 (95% CI 0.031–0.455) and 0.296 (95% CI 0.098–0.513), respectively. Compared with our previous GRSt consisting of fewer loci, the 62-SNP GRSt increased model C statistics but did not increase the NRI in FOS (Table 4); NRI indices in CARDIA whites and blacks were generally higher than with the 38-SNP GRSt but still indicated weak reclassification improvement (Table 5). The effect size of the 62-SNP GRSt did not differ between whites (meta-analyzed between FOS and CARDIA) and CARDIA blacks in the demographic or clinical model (all P > 0.05) (Supplementary Table 1). The demographic models with the 17-, 40-, and 62-SNP GRSt explained only 2.0%, 2.1%, and 2.2% of the variance in T2D risk in FOS. The 38- and 62-SNP GRSt explained 1.7% and 1.5% of risk variance, respectively, in CARDIA whites, and explained 1.5% and 1.6%, respectively, in CARDIA blacks. Figure 2 shows the C statistics for the demographic and clinical models with and without the 62-SNP GRSt.
GRSβ and GRSIR and T2D Prediction
Among FOS and CARDIA whites, those with incident T2D had a higher mean GRSβ (P < 0.05 for both cohorts), but not GRSIR, compared with noncases. In contrast, CARDIA blacks with incident T2D had a higher mean GRSIR (P = 0.03), but not GRSβ, than noncases (Supplementary Table 1). Among whites in FOS and CARDIA, GRSβ was associated with incident T2D in the demographic and clinical models (Supplementary Tables 3 and 4). The GRSβ was not associated with T2D among CARDIA blacks, although the between-race difference in effect size was not statistically significant (Supplementary Tables 5 and 6). The GRSIR was associated with T2D among whites after meta-analysis of the FOS and CARDIA results in the demographic model only. It was not associated with T2D among CARDIA blacks, although this effect did not statistically differ from that in whites (Supplementary Tables 3–6). We found no evidence of a multiplicative interaction between GRSβ and GRSIR for T2D risk (all P > 0.05).
In BMI-stratified models in FOS and CARDIA, GRSβ was associated with incident T2D among nonobese and obese subgroups (Supplementary Tables 7 and 8). In contrast, GRSIR was not significantly associated with T2D in either subgroup in either study. In models adjusted for age, sex, and (for CARDIA) race, there were no statistically significant interactions between obesity and GRSt, GRSβ, or GRSIR (Supplementary Tables 9 and 10). The effect sizes of GRSβ were 1.14 (95% CI 1.09–1.19) and 1.10 (95% CI 1.05–1.15) in lean and obese individuals in FOS, respectively, and 1.08 (95% CI 1.04–1.11) and 1.10 (95% CI 1.06–1.14) in the lean and obese in CARDIA, respectively.
In clinical medicine and public health, there is great interest in identifying individuals and population subgroups at increased T2D risk before disease onset. Genotype has a certain appeal as a risk predictor because the germline genetic code is fixed from birth. The largest T2D GWAS meta-analysis to date (12) may include all of the common T2D-associated loci of at least modest effect size that can be expected to be specifically identified. If so, it marks an appropriate time to evaluate the contribution of known common genetic variation to such risk stratification. Using data from two large well-characterized prospective cohort studies, we have shown that a polygenic score, GRSt, consisting of 62 of the known T2D-associated loci, is significantly associated with incident T2D during 25 years of observation.
First, we hypothesized that the inclusion of a greater number of T2D-associated loci in the GRSt would improve T2D prediction compared with less inclusive GRSt and with a clinical prediction model. Our prior analyses in FOS and CARDIA demonstrated that GRSt consisting of up to 40 loci do predict incident T2D from young and middle adulthood but do not improve upon clinical models, as measured by C statistics and NRI indices (4,9,16). An updated risk score might improve prediction for at least two reasons. First, a greater number of loci should explain a larger proportion of the heritability of T2D. Second, we updated the weight we used for each locus in our GRSt based on the effect sizes from the largest T2D GWAS meta-analysis to date (12). For each locus discovered in previous smaller GWAS, the larger sample size of the DIAGRAMv3 discovery set should reduce the error around its effect size on T2D risk (33). The greater precision of these weights might improve the ability of the composite GRSt to distinguish future T2D cases from noncases.
In the present analyses, we found that the addition of a greater number of loci to the GRSt steadily improved the C statistic of the simple demographic prediction model in FOS but not in CARDIA. These polygenic models, using only data available from birth (sex, genotype, and age), achieved C statistics of 0.6–0.7, comparable with other nongenetic T2D prediction models (5–7). However, the inclusion of multiple clinical risk factors to the prediction models overwhelmed any additional improvement in discrimination from genotype information, even though all GRSt remained significantly associated with incident T2D after adjustment for these factors.
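For readers less familiar with the C statistic, the sketch below shows the binary-outcome version: the proportion of case and noncase pairs in which the case received the higher predicted risk, with ties counted as one half. It is only an illustration under that simplifying assumption; it ignores follow-up time and censoring, and the function name is hypothetical rather than taken from the study's analysis code.

```python
from typing import Sequence


def c_statistic(risks: Sequence[float], events: Sequence[int]) -> float:
    """Concordance between predicted risks and a binary outcome."""
    cases = [r for r, y in zip(risks, events) if y]
    noncases = [r for r, y in zip(risks, events) if not y]
    if not cases or not noncases:
        raise ValueError("need at least one case and one noncase")
    concordant = 0.0
    for case_risk in cases:
        for noncase_risk in noncases:
            if case_risk > noncase_risk:
                concordant += 1.0
            elif case_risk == noncase_risk:
                concordant += 0.5  # ties count as half-concordant
    return concordant / (len(cases) * len(noncases))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from noncases.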
Moreover, we did not find evidence that additional SNPs improved risk reclassification over the less inclusive GRSt. Indeed, among FOS participants, the updated 62-SNP GRSt lowered the NRI in the demographic and clinical models compared with a 40-SNP GRSt, although it did perform better than the 17-SNP GRSt. An exception to this observation occurred among black young adults in CARDIA. Compared with our previous 38-SNP GRSt, the 62-SNP GRSt increased the NRI from 0.083 to 0.243 in the demographic model and from 0.164 to 0.296 in the clinical model. Nonetheless, the magnitudes of these NRI indices still indicate weak reclassification improvement. Moreover, the relatively small number of cases among CARDIA blacks likely makes these NRI estimates more susceptible to imprecision.
Compared with demographic and clinical prediction models without genotype information, the addition of the 62-SNP GRSt resulted in relatively small risk reclassification in most of the subgroups examined. Prediction models use risk factors to assign each individual a probability of having the event of interest: here, incident T2D. The continuous NRI measures one model’s ability to improve upon the risk classification predicted by another model. Compared with nongenetic models, the addition of a 62-SNP GRSt generally achieved NRI indices of 0.1 to 0.3, indicative of weak reclassification improvement. The exception was among FOS participants younger than 50 years old at baseline, among whom the 62-SNP GRSt achieved moderate reclassification improvement (NRI 0.376 compared with the clinical model). Reclassification was much weaker among older FOS participants. This observation suggests that, when added to routine clinical risk factors, genotype information may have greater predictive utility among younger age-groups in whom risk factors, such as obesity and impaired fasting glucose, might not yet be fully manifest compared with among older adults. However, we did not observe that the addition of a GRSt to prediction models among even younger adults in CARDIA resulted in similar reclassification improvement. Because T2D-associated loci included in the GRSt were discovered in cohorts of largely middle-aged and older adults, they may exert their greatest effect on T2D risk in those decades of life. These loci may only improve T2D prediction among younger adults when the prediction time horizon is extended beyond the 25 years of follow-up available in the CARDIA Study.
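The continuous NRI discussed above can be sketched for a binary outcome as the net proportion of cases whose predicted risk rises under the new model plus the net proportion of noncases whose predicted risk falls. The code below is a simplified illustration with a hypothetical function name; it ignores censoring, so it is not necessarily the estimator used for the time-to-event analyses.

```python
from typing import Sequence


def continuous_nri(
    old_risks: Sequence[float],
    new_risks: Sequence[float],
    events: Sequence[int],
) -> float:
    """Category-free NRI: (up - down among cases) / n_cases
    plus (down - up among noncases) / n_noncases."""
    up_case = down_case = up_non = down_non = 0
    n_case = n_non = 0
    for old, new, event in zip(old_risks, new_risks, events):
        if event:
            n_case += 1
            if new > old:
                up_case += 1
            elif new < old:
                down_case += 1
        else:
            n_non += 1
            if new > old:
                up_non += 1
            elif new < old:
                down_non += 1
    if n_case == 0 or n_non == 0:
        raise ValueError("need at least one case and one noncase")
    return (up_case - down_case) / n_case + (down_non - up_non) / n_non
```

The index ranges from -2 to 2, with 0 meaning that the new model moves risks in the correct direction no more often than in the wrong one.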
Our second hypothesis was that separate β-cell and IR polygenic scores independently predict incident T2D. The earliest discoveries among common T2D-associated genetic variants pointed toward genes involved in β-cell function. With the DIAGRAMv3 publication and examination in MAGIC of more refined phenotypes among individuals without diabetes, there are now ∼10 loci possibly implicated in insulin action as well (18). We also hypothesized that GRSβ might have a stronger effect in leaner individuals than in obese individuals. In 2010, the DIAGRAM investigators reported that 23 of 30 T2D loci investigated showed greater effect sizes among individuals with a BMI ≤30 kg/m2 compared with those with a BMI >30 kg/m2, although this difference was statistically significant only for TCF7L2 and BCL11A (34). BMI-stratified GWAS analyses by Perry et al. (35) replicated different sets of previously identified T2D associations among the lean compared with the obese and identified a novel association with T2D at LAMA1 only among lean individuals. A polygenic score of 36 known T2D loci had a stronger association with T2D among the lean compared with the obese. On the other hand, genetic variants associated with fasting insulin were more easily detected in MAGIC data when BMI was included in the models, and the effect sizes were generally larger in individuals with a higher BMI (28). Given this heterogeneous genetic architecture of T2D and related traits, we examined whether the association between T2D risk and GRSβ and GRSIR might differ by obesity status. Among whites in FOS and CARDIA, GRSβ and GRSIR were associated with incident T2D. Neither score reached statistical significance among CARDIA blacks, but the between-race differences were not statistically significant. In contrast to the cross-sectional analyses by Perry et al. (35) that examined subgroups with BMI <25 kg/m2 and BMI ≥30 kg/m2, we found no evidence that GRSt has a different effect size on incident T2D among individuals with a BMI <30 kg/m2 compared with those with a BMI ≥30 kg/m2. This difference may be due to the lower power from the smaller sample sizes of our analyses, the larger number of loci used in our GRSt, or our use of prospective data instead of the case-control design used by Perry et al. (35).
The third aim of our analyses was to examine whether polygenic prediction of T2D differs between individuals of self-reported white and black race. The DIAGRAMv3 meta-analysis consisted predominantly of populations of European ancestry (12). Genome-wide analyses in African populations have been limited by smaller sample sizes (25,36). First efforts have replicated the association between TCF7L2 and T2D in populations of African ancestry (36) but have otherwise been largely unrevealing about the genetic architecture in this group. Examinations of the association between individual European-derived loci and T2D among African populations have inconsistently replicated only a small fraction of these (37,38), but polygenic scores consisting of these same European-derived loci are nonetheless associated with T2D among African Americans (8,9,38). The biracial composition of the CARDIA Study allowed us to compare the association of the 62-SNP GRSt with T2D between the two subgroups. The GRSt was significantly associated with incident T2D among blacks and whites in the demographic and clinical models, and the effect sizes of the GRSt, GRSβ, and GRSIR did not differ between the two racial groups. We observed this consistency of effect despite the higher BMI among CARDIA blacks compared with whites (17.3% vs. 6.6% with baseline obesity) and their higher cumulative incidence of T2D (14.4% vs. 5.9%). Most individual European-derived SNPs are only proxies for the true causal variants driving the associations between given loci and T2D, and differences in linkage disequilibrium between ancestral groups likely magnify this imprecision when examining the relationship between these SNPs and T2D in populations in which they were not originally discovered. 
Although this imprecision may explain why individual European-derived SNPs may not replicate in populations of African ancestry, why a composite polygenic score consisting of these imprecise markers would significantly predict T2D in these same populations remains unclear. It is likely that the same loci, if not the specific SNPs themselves, are implicated in T2D across ancestral groups (39) and that our unweighted GRSt in CARDIA blacks essentially represents a count of these loci.
Some key lines of inquiry may overcome the limitations of the present analyses and move the field of polygenic risk prediction forward. Polygenic scores such as ours are simple weighted counts of T2D risk alleles across the genome. Such scores significantly predict incident T2D in a number of studies (40). However, other methods of combining genetic risk markers, which do not assume the independence of loci or the additivity of their effects, may improve the performance of prediction models (41,42). Improved polygenic models may also need to account for epistatic genetic effects and the interactions between loci and environmental factors such as diet and physical activity, although some analyses have suggested that the incremental predictive value of such models may be limited (43). Our examination of the differential effects of β-cell and IR polygenic scores on T2D risk is a first attempt to account for potential differences at a physiologic level, but more complex molecular pathways may need to be considered. The use of sequencing to identify the causal variants at each T2D-associated locus, for which most of the SNPs included in our GRS are imperfect proxies, should also further improve the predictive ability of polygenic models (33). In the meantime, except perhaps in younger subgroups, polygenic prediction of T2D using most of the common genetic variation expected to be found in the GWAS era has modest clinical value.
Funding. This work was supported by National Institutes of Health grants (R01-DK-078616 and K24-DK-080140 to J.B.M. and U01-HG-006500 and L30-DK089597 to J.L.V.). The CARDIA Study is conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with the University of Alabama at Birmingham (HHSN268201300025C and HHSN268201300026C), Northwestern University (HHSN268201300027C), University of Minnesota (HHSN268201300028C), Kaiser Foundation Research Institute (HHSN268201300029C), and Johns Hopkins University School of Medicine (HHSN268200900041C). CARDIA is also partially supported by the Intramural Research Program of the National Institute on Aging. This manuscript was reviewed by CARDIA personnel for scientific content. The Framingham Heart Study was supported by the NHLBI (contract number N01-HC-25195) and its contract with Affymetrix, Inc., for genotyping services (contract number N02-HL-6-4278). A portion of this research was conducted using the Linux Clusters for Genetic Analysis (LinGA) computing resources at the Boston University Medical Campus.
Duality of Interest. No potential conflicts of interest relevant to this article were reported.
Author Contributions. J.L.V., M.-F.H., B.P., M.D., J.C.F., J.D., and J.B.M. conceived the analyses. J.L.V., M.-F.H., B.P., and J.D. performed the analyses. J.L.V., M.-F.H., B.P., M.D., J.C.F., J.D., D.S.S., M.F., L.J.R.-T., C.B., and J.B.M. analyzed the results. J.L.V., M.-F.H., and J.B.M. wrote the manuscript. J.L.V., M.-F.H., B.P., M.D., J.C.F., J.D., D.S.S., M.F., L.J.R.-T., C.B., and J.B.M. reviewed the manuscript. J.B.M. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
This article contains Supplementary Data online at http://diabetes.diabetesjournals.org/lookup/suppl/doi:10.2337/db13-1663/-/DC1.
- Received October 31, 2013.
- Accepted February 3, 2014.
- © 2014 by the American Diabetes Association.
Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. See http://creativecommons.org/licenses/by-nc-nd/3.0/ for details.