OBJECTIVE Genome-wide association studies have begun to elucidate the genetic architecture of type 2 diabetes. We examined whether single nucleotide polymorphisms (SNPs) identified through targeted complementary approaches affect diabetes incidence in the at-risk population of the Diabetes Prevention Program (DPP) and whether they influence the response to preventive interventions.
RESEARCH DESIGN AND METHODS We selected SNPs identified by prior genome-wide association studies for type 2 diabetes and related traits, or capturing common variation in 40 candidate genes previously associated with type 2 diabetes, implicated in monogenic diabetes, encoding type 2 diabetes drug targets or drug-metabolizing/transporting enzymes, or involved in relevant physiological processes. We analyzed 1,590 SNPs for association with incident diabetes and their interaction with response to metformin or lifestyle interventions in 2,994 DPP participants. We controlled for multiple hypothesis testing by assessing false discovery rates.
RESULTS We replicated the association of variants in the metformin transporter gene SLC47A1 with metformin response and detected nominal interactions in the gene encoding the AMP kinase (AMPK) kinase STK11, in the AMPK subunit genes PRKAA1 and PRKAA2, and at a missense SNP in SLC22A1, which encodes another metformin transporter. The most significant association with diabetes incidence occurred in the AMPK subunit gene PRKAG2 (hazard ratio 1.24, 95% CI 1.09–1.40, P = 7 × 10−4). Overall, there were nominal associations with diabetes incidence at 85 SNPs and nominal interactions with the metformin and lifestyle interventions at 91 and 69 mostly nonoverlapping SNPs, respectively. The lowest P values were consistent with experiment-wide 33% false discovery rates.
CONCLUSIONS We have identified potential genetic determinants of metformin response. These results merit confirmation in independent samples.
The number of common genetic variants reproducibly associated with type 2 diabetes is growing (1). Well-powered candidate gene association studies and, more recently, genome-wide association studies (GWASs) have identified over two dozen loci robustly and reproducibly associated with type 2 diabetes or related quantitative glycemic traits. While these discoveries have advanced our understanding of the genetics of type 2 diabetes, they only explain a small fraction of the overall genetic contribution to the disease. Furthermore, in most cases, the genes involved in type 2 diabetes risk have not yet been identified: the majority of the associations detected thus far merely mark genomic regions where a certain variant is overrepresented in diseased cases versus unaffected controls, and subsequent fine-mapping and functional studies are necessary before a molecular mechanism can be ascribed to each locus.
Nevertheless, progress in the translation of genetic discoveries to clinical practice can advance along parallel paths. On the one hand, knowledge of the specific gene or variant causing the molecular phenotype is not needed to determine whether the associated region can aid prediction or affect the response to therapy; and on the other hand, targeted approaches can be applied in vivo in humans that shed light on the function of genes of interest, as a way to narrow the regions to study. These two objectives can be achieved by controlled interventions in randomized clinical trials. Such pharmacogenetic or gene-environmental strategies can fulfill both complementary roles, testing whether genetic variants predict response to therapy and whether a particular pharmacologic or lifestyle intervention affects the mode of action of specific risk loci.
In type 2 diabetes, the field of pharmacogenetics remains in its infancy. Although pharmacogenetic investigation has yielded clinically actionable results in neonatal diabetes and maturity-onset diabetes of the young, extending these studies to common type 2 diabetes has been more arduous (2). With regard to metformin, intriguing results were obtained by Shu et al. (3) in their investigation of common variants in SLC22A1, which encodes the liver-specific organic cation transporter 1 responsible for the absorption of metformin into hepatocytes: in a study of 20 human participants, carriers of reduced-function polymorphisms of SLC22A1 had a 17% higher area under the glucose curve after an oral glucose tolerance test when treated with metformin, indicating decreased responsiveness. Unfortunately, these results have not been confirmed in the long-term follow-up of a large observational cohort of patients treated with metformin monotherapy (4). Recently, a preliminary association was discovered between a variant in SLC47A1, which encodes the multidrug and toxin extrusion protein 1 (involved in the excretion of metformin into the bile and urine) and the glucose-lowering effect of metformin (5).
The Diabetes Prevention Program (DPP) can help answer some of these questions (6). The strengths of this randomized clinical trial include its enrollment of participants at high risk of developing diabetes, multiethnic composition, comprehensive longitudinal measures, and standardized behavioral and pharmacologic interventions. Together, the in-depth phenotyping and the randomized interventions allow characterization of the effects of known type 2 diabetes variants on diabetes incidence and response to therapy. We therefore designed a large-scale genotyping study in which we tested 1,590 variants identified through prior genetic studies of type 2 diabetes or related traits, as well as those capturing all common variation in 40 biological candidate genes, for association with diabetes incidence or response to preventive interventions (lifestyle modification or metformin) in the DPP.
RESEARCH DESIGN AND METHODS
The DPP was a 27-center randomized clinical trial in the U.S. that assessed whether metformin or lifestyle interventions prevent or delay development of diabetes in high-risk individuals. The DPP enrolled 3,234 overweight or obese people without diabetes but with impaired glucose tolerance and elevated fasting glucose and randomized them to placebo, metformin (850 mg twice daily), or a lifestyle intervention program consisting of individual and group counseling sessions conducted by dietary and exercise professionals aimed at ≥7% weight loss and ≥150 min of physical activity per week. A fourth arm of 585 subjects assigned to troglitazone (400 mg daily) was stopped because of hepatotoxicity (7). The primary end point was development of diabetes, ascertained by semi-annual measurement of fasting glucose or an annual 75-g oral glucose tolerance test, either of which was confirmed on a second occasion. The metformin and lifestyle interventions reduced the incidence of diabetes by 31% (95% CI 17–43) and 58% (95% CI 48–66), respectively, versus placebo (6). The 2,994 participants in the placebo, metformin, and lifestyle arms who gave informed consent for genetic investigation are the subjects of this study, which was approved by institutional review boards at each of the 27 participating sites. Their demographic characteristics are shown in Table 1.
We selected SNPs in two ways: 1) SNPs in high-likelihood candidate genes and 2) SNPs identified by ongoing GWASs for type 2 diabetes or related metabolic traits. The 40 candidate genes were tentatively associated with type 2 diabetes, implicated in monogenic forms of diabetes, known to encode type 2 diabetes drug targets or drug-metabolizing/transporting enzymes, or involved in cellular metabolism, hormonal regulation, or response to exercise (Table 2). We used Tagger (8) to capture (at r2 ≥ 0.8) all common (minor allele frequency >5%) variations in European (CEU) and African (YRI) HapMap populations in these candidate genes. For seven additional genes (ACE, CASQ1, GCKR, IRS1, KCNQ1, LIPC, and NOS3), rather than attempting full coverage of genetic variation, we selected a limited number of SNPs previously associated with the phenotypes of interest. As the study evolved, it became obvious that previous reports of genetic association provided an equally compelling—or perhaps even higher—prior probability of true association with type 2 diabetes traits than biological function alone; thus, we also focused on GWASs whose results were available at the time this custom-made genotyping array was designed: SNPs associated with type 2 diabetes in the Diabetes Genetics Initiative (9), DIAGRAM (10), or three smaller 100K SNP GWASs in which we participated (11–13); SNPs tentatively associated with quantitative glycemic traits (fasting glucose, the insulinogenic index, and insulin resistance by homeostasis model assessment) in the Diabetes Genetics Initiative; or SNPs associated with obesity (14,15) or lipid traits (16–18). For quality control and analytical reasons, we also included some SNPs previously genotyped in these samples, as well as ancestry-informative markers to derive a global proportion of geographic ancestry in African American (19) or Hispanic (20) participants. 
Finally, we included a small number of SNPs provided by investigators leading ancillary studies approved by the DPP ancillary studies and genetics subcommittees. The total number of SNPs analyzed for each category is shown in Table 2.
We initially designed a 1,536-SNP oligonucleotide pool array for the Illumina BeadArray platform (Illumina, San Diego, CA). Among the 1,445 SNPs that passed quality control metrics, the sample pass rate was 99.8% and the average genotyping call rate per SNP was 98.5%. Because 91 SNPs failed genotyping on the oligonucleotide pool array, we assessed the adequacy of the coverage afforded by the successfully genotyped SNPs in each region. To rescue relevant SNPs, we used linkage disequilibrium (LD) to select proxy SNPs highly correlated with those that had failed and genotyped them on a Sequenom iPLEX platform. After quality control, 1,590 SNPs were available for analysis.
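Proxy selection of this kind rests on the pairwise r2 between a failed SNP and each candidate replacement. The following is a minimal sketch of that calculation, assuming phased haplotypes coded as 0/1 alleles. The function names, the toy data, and the 0.8 cutoff for proxies are illustrative assumptions (the text states r2 ≥ 0.8 only for tagging), and this is not the Tagger or HapMap implementation used in the study.

```python
def ld_r2(hap_a, hap_b):
    """Squared allelic correlation (r^2) between two biallelic SNPs.

    hap_a and hap_b are equal-length lists of 0/1 alleles, one entry per
    phased haplotype (hypothetical toy input, not DPP data).
    """
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype lists must be non-empty and of equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n                      # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    d = p_ab - p_a * p_b                      # LD coefficient D
    return d * d / denom


def pick_proxy(failed, candidates, threshold=0.8):
    """Return (name, r2) for the best candidate with r2 >= threshold, else None.

    The threshold is an assumed value for illustration only.
    """
    best = None
    for name, hap in candidates.items():
        r2 = ld_r2(failed, hap)
        if r2 >= threshold and (best is None or r2 > best[1]):
            best = (name, r2)
    return best
```

For example, `pick_proxy(failed, {"snp_x": hap_x, "snp_y": hap_y})` returns the name and r2 of the candidate most strongly correlated with the failed SNP, or `None` if no candidate clears the threshold.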
We tested the effect of each SNP on diabetes incidence under an additive genetic model using Cox proportional hazards models, with age, sex, ethnicity, and treatment arm as covariates and with treatment (metformin or lifestyle) × genotype interaction terms. In secondary analyses, we stratified participants by treatment arm; if the interaction P value was nominally significant, only stratified analyses were considered. We used the MACH software (21) and the HapMap CEU population to impute allelic calls at SNPs not directly genotyped in the DPP. Because of concerns regarding the accuracy of imputation methods in admixed populations, we restricted this procedure to individuals of self-described non-Hispanic white ethnicity. Genotype-phenotype correlations on imputed data were considered confirmatory of prior associations, as well as an initial fine-mapping exploration. Using the program STRUCTURE (22), we applied the ancestry-informative markers, trained on the HapMap populations, to assign a proportion of global European ancestry to each DPP participant.
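To make the model structure concrete, the sketch below shows how an additive genotype term and the two genotype × treatment interaction terms could be coded for one participant, with placebo as the reference arm. This is an illustration under stated assumptions: the function and field names are hypothetical, the age, sex, and ethnicity covariates are omitted for brevity, and fitting the Cox model itself would require a survival analysis package that is not shown here.

```python
import math

ARMS = ("placebo", "metformin", "lifestyle")


def additive_dosage(genotype, minor_allele):
    """Count copies (0, 1, or 2) of the minor allele in a two-letter genotype."""
    if len(genotype) != 2:
        raise ValueError("expected a diploid genotype such as 'AG'")
    return sum(1 for allele in genotype if allele == minor_allele)


def design_row(genotype, minor_allele, arm):
    """Genotype, treatment, and interaction terms for one participant.

    Placebo is the reference arm, so both treatment indicators are 0 there.
    """
    if arm not in ARMS:
        raise ValueError("unknown treatment arm: %r" % arm)
    g = additive_dosage(genotype, minor_allele)
    met = 1 if arm == "metformin" else 0
    life = 1 if arm == "lifestyle" else 0
    return {
        "genotype": g,
        "metformin": met,
        "lifestyle": life,
        "genotype_x_metformin": g * met,
        "genotype_x_lifestyle": g * life,
    }


def per_allele_hr(beta_genotype, beta_interaction=0.0):
    """Per-allele hazard ratio within an arm, given log-hazard coefficients.

    In the placebo arm the interaction coefficient does not apply (pass 0.0);
    in a treated arm the genotype and interaction coefficients add on the
    log scale.
    """
    return math.exp(beta_genotype + beta_interaction)
```

Under this coding, a nonzero interaction coefficient means that the per-allele hazard ratio differs between the treated arm and placebo, which is the quantity tested by the interaction P value.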
We considered two sequential approaches to correct for multiple hypothesis testing based on the number of SNPs examined (23). We first ran 1,000 permutations in which diabetes outcome was randomly assigned to an individual's genotype within each ethnicity and treatment group (keeping sex and age together with genotype, and BMI with diabetes outcome). The P value for the overall null hypothesis is the fraction of permutations (n/1,000) for which the scalar statistic is at least as extreme as that observed for the data (24). To estimate the expected proportion of type I errors among the rejected hypotheses, we also computed false discovery rates (FDRs) as in Benjamini and Hochberg (25).
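Both procedures can be sketched in a few lines. The example below is a minimal illustration, assuming that outcomes are shuffled only within strata (for instance, ethnicity by treatment group) and that the test statistic is supplied by the caller. The function names, the seed, and the choice of statistic are hypothetical and do not reproduce the study's analysis pipeline.

```python
import random


def permutation_p_value(outcomes, strata, statistic, n_perm=1000, seed=0):
    """Fraction of within-stratum permutations whose statistic is at least
    as large as the observed one.

    outcomes  : list of outcome values (e.g., 0/1 diabetes status)
    strata    : list of stratum labels; outcomes are shuffled only among
                participants sharing a label
    statistic : callable mapping an outcome list to a scalar
    """
    rng = random.Random(seed)
    observed = statistic(outcomes)
    groups = {}
    for i, label in enumerate(strata):
        groups.setdefault(label, []).append(i)
    hits = 0
    for _ in range(n_perm):
        permuted = list(outcomes)
        for idx in groups.values():
            values = [outcomes[i] for i in idx]
            rng.shuffle(values)
            for i, v in zip(idx, values):
                permuted[i] = v
        if statistic(permuted) >= observed:
            hits += 1
    return hits / n_perm


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values (q values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q
```

A q value of 0.33 for the top-ranked SNP would correspond to the statement that the lowest P values are consistent with a 33% FDR: roughly one in three findings declared at that threshold would be expected to be false positives.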
RESULTS
Supplementary Table 1 (available in an online appendix at http://diabetes.diabetesjournals.org/cgi/content/full/db10-0543/DC1) shows that we achieved adequate coverage of all 40 genes in the two targeted populations, with 37 genes reaching at least 80% of common variants captured at r2 ≥ 0.8 in Europeans and all 40 reaching at least 70% of common variants captured at that level (comparable numbers were obtained in Africans). The average proportion of European ancestry among the DPP self-described white participants, as determined by ancestry-informative markers, was 98.9%, and the average proportion of West-African ancestry among DPP self-described African American participants was 89.3%. Given these results, we used self-described ethnicity as a covariate for these analyses. The full set of results is available in supplementary Table 2.
Table 3 shows the candidate gene regions harboring variants nominally associated with diabetes incidence in the treatment-adjusted models for the full study (i.e., there was no evidence for interaction with either intervention); only the top SNP within each gene region (out of 85 nominal associations) is given. The most significant associations occurred at SNPs in the AMP kinase (AMPK) subunit gene PRKAG2 (hazard ratio [HR] 1.24, 95% CI 1.09–1.40, P = 7.0 × 10−4 for the top SNP rs5017427, which is consistent with an experiment-wide 34% FDR). Twelve other PRKAG2 SNPs were nominally associated with diabetes (five in the top ten). Although most of them are in moderate to high LD with the index SNP (r2 ranging from 0.49 to 1.0 in HapMap CEU), at least two of them (rs954482 and rs2727537) are only weakly correlated with rs5017427 (r2 0.07 and 0.05, respectively). Nevertheless, the consistency of the association signal in this region provides reassurance with regard to the absence of genotyping artifacts in our dataset. Of SNPs previously associated with type 2 diabetes in the 100K Amish, Framingham, or Pima GWASs, three (rs1422930 in ODZ2, rs1859441 near COL2A1 and SENP1, and rs385909 near SH3YL1) had consistent nominal associations with diabetes incidence in the DPP, and two had nominally significant associations (rs10520926 and rs3136279) in the opposite direction. On the other hand, none of the six SNPs selected from the DIAGRAM meta-analysis (original odds ratio [OR] ranging from 1.05 to 1.15) were nominally significant in the DPP. Fifteen SNPs in genes that cause either maturity-onset diabetes of the young or neonatal diabetes were nominally associated with diabetes incidence; one of them, rs11868513 in HNF1B (not in LD with the previously type 2 diabetes–associated SNP rs757210), was strongly associated with diabetes incidence in the placebo arm (HR 1.69, 95% CI 1.36–2.10, P = 2 × 10−6). 
Finally, 14 SNPs in genes that encode metformin transporters (SLC22A1, SLC22A2, and SLC47A1) were nominally associated with diabetes incidence. Of the 85 nominal associations with diabetes incidence in DPP, only two SNPs (rs651164 in SLC22A1 and rs3736265 in PPARGC1A) were nominally associated with type 2 diabetes in DIAGRAM in a consistent direction (OR 1.08, 95% CI 1.02–1.16, P = 0.01, and OR 1.15, 95% CI 1.01–1.31, P = 0.04, respectively), with 60 other SNPs not being nominally significantly associated in DIAGRAM and 23 SNPs not captured in that dataset.
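Readers who wish to check that a reported hazard ratio, confidence interval, and P value are mutually consistent can do so with a Wald approximation on the log scale. The sketch below assumes a symmetric 95% CI for the log hazard ratio and a normal test statistic. The function name is hypothetical, and because the published estimates are rounded, the recovered P value is only approximate.

```python
import math


def p_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate z statistic and two-sided P value from an HR and its 95% CI.

    The standard error of log(HR) is recovered from the CI width, assuming
    the interval is log(HR) +/- z_crit * SE.
    """
    if not (0 < lower < upper):
        raise ValueError("CI bounds must satisfy 0 < lower < upper")
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal tail area
    return z, p


# Estimates quoted in the text above.
z_prkag2, p_prkag2 = p_from_hr_ci(1.24, 1.09, 1.40)   # rs5017427 in PRKAG2
z_hnf1b, p_hnf1b = p_from_hr_ci(1.69, 1.36, 2.10)     # rs11868513 in HNF1B
```

Applied to the PRKAG2 and HNF1B estimates, this approximation gives P values of the same order of magnitude as those reported, which is the agreement one would expect from rounded inputs.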
Table 4 shows the candidate gene regions harboring variants that have a nominally significant genotype × metformin interaction; only the top SNP within each gene region is given (out of 91 nominal associations). The best result was consistent with a study-wide 33% FDR. At rs8065082 in SLC47A1, there was a nominal interaction with metformin (P = 0.006), with the minor allele associated with lower diabetes incidence in the metformin arm (HR 0.78, 95% CI 0.64–0.96, P = 0.02) but not in the placebo arm (1.15, 0.97–1.37, P = 0.11). At this locus, major allele homozygotes did not benefit from metformin with regard to diabetes prevention (HR 1.07, 95% CI 0.77–1.50, vs. placebo, P = 0.68), whereas minor allele carriers did (0.58, 0.46–0.73, vs. placebo, P < 0.001; Fig. 1). We also noted a nominally significant interaction of a missense SNP in SLC22A1 (rs683369, encoding L160F) with metformin, with the major allele protecting from diabetes in the metformin arm (HR 0.69, 95% CI 0.53–0.89, P = 0.004) but not the placebo arm (1.01, 0.79–1.30, P = 0.91); the major allele is therefore associated with a 31% reduction in the risk of diabetes, but only under the action of metformin. In this arm, the likelihood of developing diabetes depended on the number of phenylalanine alleles (HR 0.72, 95% CI 0.59–0.88, vs. placebo for LL homozygotes; 0.92, 0.66–1.28, for heterozygotes; and 1.44, 0.56–3.67, for FF homozygotes). There were five nominally significant interactions at SNPs in genes encoding putative drug targets for metformin: the gene encoding the AMPK kinase STK11 and the AMPK subunit genes PRKAA1, PRKAA2, and PRKAB2. A total of 22 SNPs in the ABCC8-KCNJ11 region also had nominally significant interactions with metformin, including rs5215, which is tightly linked to the widely replicated type 2 diabetes–associated missense SNP rs5219 (E23K) in KCNJ11.
Table 5 shows the candidate gene regions harboring variants that have a nominally significant interaction with the lifestyle intervention; only the top SNP within each gene region is given (out of 69 nominal associations). The best result was consistent with an experiment-wide 84% FDR. Twelve of the top findings were in four AMPK subunit genes (PRKAA2, PRKAB2, PRKAG1, and PRKAG2), and 11 SNPs clustered around the genes encoding the peroxisome proliferator–activated receptor γ coactivators 1α and 1β (PPARGC1A and PPARGC1B, respectively).
Review of 1,609 SNPs imputed in non-Hispanic white DPP participants (supplementary Table 3) revealed the nominal association of other PRKAG2 SNPs with diabetes incidence (best P = 5 × 10−5). Imputed SNPs in the PRKAA1, PRKAA2, and ABCC8-KCNJ11 regions also had nominally significant interactions with metformin.
DISCUSSION
We conducted a large-scale genotyping study in the DPP, with the aim of testing whether common variants in candidate genes involved in major spheres of human physiology predict diabetes incidence or response to preventive interventions in a multiethnic at-risk population. Our secondary purpose was to characterize the mechanism of action of previously associated variants. We provide evidence supporting a previously reported association of variants in the metformin transporter gene SLC47A1 with weaker metformin response, here defined as the reduced ability of metformin to lower diabetes incidence (5). We identified a number of nominal associations with diabetes incidence or metformin response in several compelling candidate genes; however, none withstands strict statistical correction for multiple hypothesis testing by FDRs.
Correction for multiple tests requires careful consideration in genetic association studies (26). When large numbers of SNPs are tested, methods that are valid in the presence of correlations due to LD, such as permutation methods or evaluation of FDR, are preferred over those that assume independence of SNPs. The scope of the present analysis is guided by technological convenience, and it might be argued that the number of distinct scientific hypotheses formulated, rather than the physical size of the genotyping array, is most relevant to the interpretation of results. However, what constitutes a single hypothesis (e.g., an SNP, a gene, an entire pathway, or a constellation of phenotypes) is subjective. On the other hand, correcting for the equivalent of the universe of independent common variants in the human genome (empirically estimated at ∼1 million) is gaining increasing favor among genetic statisticians. In this context, the novel findings reported here should be viewed as hypothesis-generating.
We previously quantified the power of the DPP to detect modest genetic effects on diabetes incidence (28). Assuming there are no gene-treatment interactions, these calculations show that the overall DPP cohort has 83% power to detect a previously reported effect size of ∼20% for an SNP of 10% frequency at an α level of 0.05, while the placebo, lifestyle modification, and metformin arms have 53, 34, and 44% power, respectively. The DPP has inadequate power for detecting an effect size of <10%. Thus, it is not surprising that the DPP does not replicate all GWAS-derived findings in that range or that it fails to reach genome-wide significance in discovery efforts. Our null results on diabetes incidence for truly associated variants may be due to the high-risk population at baseline, the short time of follow-up (3.2 years on average), and/or the use of interventions effective in reducing diabetes incidence. On the other hand, considering the number of variants likely to influence the phenotypes under study, even submaximal power is likely to provide a number of true positive associations. In this context, genotyped and imputed SNPs in the gene encoding the AMPK γ2 subunit (PRKAG2) merit further consideration. While the association of SNPs in genes that encode metformin transporters with type 2 diabetes in the entire DPP cohort (if real) requires explanation, this could be due to a sufficiently strong effect in the metformin arm alone. Alternatively, SNPs in this region could be capturing variants in other nearby genes: for instance, immediately upstream of SLC22A1 and SLC22A2 in chromosome 6 lies the gene encoding the insulin-like growth factor 2 receptor (IGF2R), an excellent biological candidate.
This study constitutes the first large-scale prospective pharmacogenetic evaluation of metformin action in a controlled clinical trial. The UK Prospective Diabetes Study (29) and A Diabetes Outcome Progression Trial (ADOPT) (30) investigators independently showed that a substantial proportion of patients with type 2 diabetes eventually fail metformin therapy, defined by a need for additional pharmacotherapy to control hyperglycemia. Given the higher prior probability afforded by the known biological role of SLC47A1 in disposing of metformin and the previously reported genetic association of the major allele at SNP rs2289669 with poorer metformin response (5), validation in the DPP can be convincing without achieving the levels of statistical significance required for novel findings. Our index SNP (rs8065082) is in tight LD with rs2289669 (r2 ∼0.8 in HapMap CEU) and the direction of effect is consistent in DPP, a cohort nearly 10-fold larger than the one documented in the original report from Rotterdam (5). Thus, our findings confirm those of Becker et al. (5) and suggest that major allele homozygotes at this locus (∼30% of the European population) may experience suboptimal responses to metformin treatment.
Our findings on the SLC22A1 locus and metformin response are less robust. While our noted association with a missense SNP appears compelling, it is not among the most functional human variants described by Shu et al. (3), and it is in weak LD with rs622342 (r2 ∼0.14 in HapMap CEU), a SLC22A1 SNP associated with metformin response in another report from Rotterdam (31). SNP rs622342 was included among our tag SNPs but showed no evidence of an interaction with metformin (nominal P = 0.69) or an effect on diabetes incidence in any arm, raising the possibility that the original finding may have been spurious. Similarly, the SLC22A2 missense SNP rs316019 (A270S), reported to influence metformin renal excretion and affect its plasma concentrations (32), did not significantly interact with metformin in the DPP (nominal P = 0.35). Our novel findings in the putative metformin drug targets STK11 and AMPK require confirmation, as do those in MEF2A and MEF2D, themselves regulated by AMPK (33). One of the most significant interactions with metformin occurred at an SNP in HNF4A; given its role in hepatic gluconeogenesis (34), this intriguing result deserves further exploration. In contrast, the multiple interactions noted in the ABCC8-KCNJ11 locus reported previously (35) do not offer a clear mechanism of action. Finally, nominal associations with response to lifestyle modification should be replicated in cohorts that underwent a similar intervention.
In summary, we have conducted a large-scale genetic association study in the DPP and replicated the association of a polymorphism in a metformin transporter with metformin response. Other hypothesis-generating results require more detailed characterization in the DPP and follow-up in independent samples. A focus on likely functional variants may uncover loci with stronger effects.
The National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) of the National Institutes of Health provided funding to the clinical centers and the coordinating center for the design and conduct of the study and collection, management, analysis, and interpretation of the data. The Southwestern American Indian Centers were supported directly by the NIDDK and the Indian Health Service. The General Clinical Research Center Program, National Center for Research Resources, and the Department of Veterans Affairs supported data collection at many of the clinical centers. Funding for data collection and participant support was also provided by the Office of Research on Minority Health, the National Institute of Child Health and Human Development, the National Institute on Aging, the Office of Research on Women's Health, the Centers for Disease Control and Prevention, and the American Diabetes Association. Bristol-Myers Squibb and Parke-Davis provided medication. This research was also supported in part by the intramural research program of the NIDDK. LifeScan, Health O Meter, Hoechst Marion Roussel, Merck-Medco Managed Care, Merck and Company, Nike Sports Marketing, Slim Fast Foods, and Quaker Oats donated materials, equipment, or medicines for concomitant conditions. McKesson BioServices, Matthews Media Group, and the Henry M. Jackson Foundation provided support services under subcontract with the coordinating center. A complete list of centers, investigators, and staff can be found in the online appendix.
This work was funded by R01 DK072041 to K.A.J., T.I.P., A.R.S., D.A., and J.C.F. J.C.F. is also supported by the Massachusetts General Hospital and a Clinical Scientist Development Award by the Doris Duke Charitable Foundation. This work was partially supported by a Doris Duke Charitable Foundation Distinguished Scientist Clinical Award to D.A. P.W.F. was supported in part by grants from Novo Nordisk, the Swedish Heart-Lung Foundation, the Swedish Diabetes Association, Påhlssons Foundation, the Swedish Research Council, and a Career Development Award from Umeå University. J.C.F. received consulting honoraria from Publicis Healthcare, Merck, bioStrategies, XOMA, and Daiichi-Sankyo and has been a paid invited speaker at internal scientific seminars hosted by Pfizer and Alnylam Pharmaceuticals.
No other potential conflicts of interest relevant to this article were reported.
R.L.H. and W.C.K. contributed to the participant recruitment, interventions, and outcomes assessment. P.W.F., T.I.P., R.L.H., R.S., A.R.S., W.C.K., D.A., and J.C.F. compiled the list of candidate genes. J.B.M. and P.I.W.d.B. performed the tagging procedure in European and African populations. P.W.F., T.I.P., R.L.H., R.S., A.R.S., and J.C.F. selected SNPs within those genes. J.B.M. directed the genotyping with supervision from J.C.F. K.A.J., P.I.W.d.B., T.I.P., R.L.H., and J.C.F. constructed the analytical pipeline. K.A.J. conducted all statistical analyses, with input from P.I.W.d.B., T.I.P., R.L.H., S.F., W.C.K., and D.A. K.A.J., R.L.H., and P.I.W.d.B. implemented the permutation procedure, with input from T.I.P., D.A., and J.C.F. J.B.M. and P.I.W.d.B. derived individual estimates of global ancestry and carried out SNP imputation. J.C.F. wrote the manuscript. All authors contributed to the discussion and reviewed and edited the manuscript.
The investigators gratefully acknowledge the commitment and dedication of the participants of the DPP.
*A list of the Diabetes Prevention Program Research Group investigators is provided in the online appendix, available at http://diabetes.diabetesjournals.org/cgi/content/full/db10-0543/DC1.
The opinions expressed in this article are those of the investigators and do not necessarily reflect the views of the Indian Health Service or other funding agencies.
- Received April 18, 2010.
- Accepted July 18, 2010.
- © 2010 by the American Diabetes Association.
Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. See http://creativecommons.org/licenses/by-nc-nd/3.0/ for details.