Diabetes 56:3063-3074, 2007 DOI: 10.2337/db07-0451 © 2007 by the American Diabetes Association
A 100K Genome-Wide Association Scan for Diabetes and Related Traits in the Framingham Heart Study: Replication and Integration With Other Genome-Wide Datasets
1 Center for Human Genetic Research and Diabetes Unit, Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts
Address correspondence and reprint requests to James B. Meigs, MD, MPH, General Medicine Division, Massachusetts General Hospital, 50 Staniford St., 9th Floor, Boston, MA 02114. E-mail: jmeigs@partners.org
Abbreviations:
DGI, Diabetes Genetics Initiative; FPG, fasting plasma glucose; FBAT, family-based association test; FHS, Framingham Heart Study; GEE, generalized estimating equations; GWA, genome-wide association; HOMA-IR, homeostasis model assessment of insulin resistance; ISI, insulin sensitivity index; MAF, minor allele frequency; mFPG, 28-year mean fasting plasma glucose; NIH, National Institutes of Health; SNP, single nucleotide polymorphism
OBJECTIVE: To use genome-wide fixed marker arrays and improved analytical tools to detect genetic associations with type 2 diabetes in a carefully phenotyped human sample.
RESEARCH DESIGN AND METHODS: A total of 1,087 Framingham Heart Study (FHS) family members were genotyped on the Affymetrix 100K single nucleotide polymorphism (SNP) array and examined for association with incident diabetes and six diabetes-related quantitative traits. Quality control filters yielded 66,543 SNPs for association testing. We used two complementary SNP selection strategies (a "lowest P value" strategy and a "multiple related trait" strategy) to prioritize 763 SNPs for replication. We genotyped a subset of 150 SNPs in a nonoverlapping sample of 1,465 FHS unrelated subjects and examined all 763 SNPs for in silico replication in three other 100K and one 500K genome-wide association (GWA) datasets.
RESULTS: We replicated associations of 13 SNPs with one or more traits in the FHS unrelated sample (16 expected under the null); none of them showed convincing in silico replication in 100K scans. Seventy-eight SNPs were nominally associated with diabetes in one other 100K GWA scan, and two (rs2863389 and rs7935082) in more than one. Twenty-five SNPs showed promising associations with diabetes-related traits in 500K GWA data; one of them (rs952635) replicated in FHS. Five previously reported associations were confirmed in our initial dataset.
CONCLUSIONS: The FHS 100K GWA resource is useful for follow-up of genetic associations with diabetes-related quantitative traits. Discovery of new diabetes genes will require larger samples and a denser array combined with well-powered replication strategies.
The genetic architecture of type 2 diabetes appears to be composed of several genes, each of which has a modest impact on disease risk (1).
Despite significant advances in our understanding of the genetic determinants of the monogenic forms of diabetes (2), the definitive identification of genes that increase risk of common type 2 diabetes in the general population has been far more elusive. Candidate gene studies have led to the association of several common variants with type 2 diabetes (3). Besides a handful of widely reproduced associations, however, many previously reported associations have not been convincingly replicated despite well-powered attempts to do so. The type 2 diabetes genetics literature is plagued by extensive and often conflicting reports of association. In addition, current gene discovery strategies have frequently focused on coding regions, which overlook regulatory variants that can also influence disease (4,5). Thus, identification of novel type 2 diabetes genes requires complementary approaches that identify high-likelihood variants on the basis of empiric associations derived from well-phenotyped, well-powered cohorts. It is now possible to perform genome-wide association (GWA) studies, which are agnostic to biological plausibility and to the putative functional status of the assayed variants (6). The development of high-throughput genotyping platforms, the compilation of single nucleotide polymorphisms (SNPs) in public databases (7), the dissemination of new analytical tools and statistical methods (8–13), the assembly of large patient cohorts, and the availability of the HapMap (14,15) have all made it possible to scan the human genome for variants associated with disease, without imposing a priori assumptions that may bias the outcome of the scan. Several GWA studies for type 2 diabetes have been performed in recent months (16–20), making it possible to integrate data, replicate findings, extend them into other populations, and perform more detailed phenotypic characterizations. 
Here, we report results from the Framingham Heart Study (FHS) 100K SNP GWA scan for type 2 diabetes and related traits; replication of top 100K findings in an independent, unrelated FHS sample; initial integration of FHS 100K data with three other 100K (21–23) and one 500K (http://www.broad.mit.edu/diabetes/) type 2 diabetes GWA scans; and the use of the FHS 100K resource to confirm high-likelihood associations reported by others (16–20). This scan complements other large extant type 2 diabetes GWA studies in three major respects: It is population based (not diabetes proband based), its genetic information comprises two generations, and it is based on compiled data from decades of longitudinal standardized follow-up with detailed phenotyping of the offspring generation. A general and preliminary description of the full FHS 100K GWA resource has been published elsewhere (24).
The FHS. The FHS is a community-based, multigenerational, longitudinal study of cardiovascular disease and its risk factors, including diabetes. The FHS comprises the original cohort, offspring, and generation 3 studies. Subjects described in the present analysis include 1,087 individuals from the FHS offspring "family sample," composed of the 307 largest pedigrees previously selected for linkage analyses (25). These subjects, 560 of whom were women and whose mean age at last follow-up was 59 years, were genotyped on the Affymetrix 100K array (Table 1). The study was approved by Boston University's Institutional Review Board, and informed consent, including consent for genetic analyses, was obtained for all study participants.
Offspring subjects have been examined every 4 years since study onset, except for an 8-year interval between exams 1 and 2, with a standardized medical history and directed physical examination at each exam cycle and collection of an extensive array of diabetes-related quantitative traits and phenotypes (26). In this analysis, our principal diabetes-related quantitative traits come from exam 5 (1991–1994) in which data from a 75-g oral glucose tolerance test are available for all nondiabetic offspring. Diabetes-related quantitative traits include exam 5 fasting plasma glucose (FPG), glycated hemoglobin (A1C), fasting insulin, insulin resistance measured by homeostasis model assessment of insulin resistance (HOMA-IR) (27), Gutt's 0- to 120-min insulin sensitivity index (ISI_0–120) (28), the 28-year time-averaged FPG level obtained from exams 1–7 (mFPG), and incident categorical type 2 diabetes assessed over 28 years of follow-up. Laboratory methods for all quantitative traits have been described previously (26).
We used 2003 American Diabetes Association clinical criteria to define diabetes, in which a case was defined as new use of oral hypoglycemic or insulin therapy or a FPG …
As reported elsewhere (32), this sample size and analytical approach have 97% power to detect a variant that accounts for 2% of the variance in a quantitative trait and 63% power to detect a variant that accounts for 1% of the variance in such a trait.
Replication samples.
Genotyping. Follow-up genotyping was performed by allele-specific multiplex primer extension of PCR-amplified products with detection by matrix-assisted laser desorption ionization–time of flight mass spectroscopy using the Sequenom iPLEX platform (34). Genotyping call rates were 98.3%, and concordance between the Affymetrix and Sequenom platforms on 150 SNPs genotyped on 251 overlapping subjects reached 99.6%.
SNP prioritization in the FHS 100K scan.
Statistical analysis. For quantitative traits, we used additive GEE and FBAT models to test associations between alleles and age-, age²-, and sex-adjusted residual trait values. In subsidiary models, we also adjusted association results for BMI. The application of these methods to the FHS 100K dataset has been described in detail (32). GEE is a population-based test that takes into account familial correlation of the phenotype; it is prone to increased type 1 error for SNPs with low frequency and in the presence of population admixture, which is not a major concern in the FHS (A.K.M., J.D., L.A.C., unpublished observations). FBAT is a within-family test that controls for population admixture. It examines whether the transmission of an allele is associated with different levels of the quantitative trait; it is less powerful and more conservative than GEE. For incident type 2 diabetes, we tested association using two complementary approaches that used longitudinal information on age at onset of diabetes or age through end of follow-up without diabetes. First, we used Cox proportional hazard survival analysis with robust covariance estimates to test SNPs against the hazard of new cases of diabetes over all seven exams, with time failure at the exam when diabetes was diagnosed, or disease-free censoring at last follow-up without diabetes (35). We used Cox models to estimate the hazard ratio (HR) and 95% CIs associated with the risk allele. Second, we created Martingale residuals from a sex-adjusted model in which high negative values indicated young diabetes age of onset and high positive values indicated older age without diabetes at follow-up, and we analyzed residuals using FBAT (36).
To replicate 100K associations in the FHS unrelated replication sample, we used the same statistical methods, except that a general regression model was used to explore associations with quantitative traits, FBAT tests were not applied, and no robust covariance estimate was needed for the Cox survival analysis because the sample consists of unrelated participants. Comparison with other datasets was restricted to a test of whether any SNPs selected from the FHS 100K array were associated with diabetes as a categorical trait in the second dataset at a nominal P < 0.05. For the 500K replication analysis, we also tested whether the association of any of our selected SNPs with FPG or HOMA-IR was replicated in DGI at a nominal P < 0.05. This serial replication strategy yields power equivalent to that of a joint analysis when <1% of SNPs are promoted to the second stage (13). To obtain the null expectation of the number of SNPs chosen for replication, we performed a constrained permutation test that both retained the correlation between the traits and attempted to maintain the trait heritability observed in our sample. We permuted the traits together by matching the rank of a phenotype derived from a principal components analysis of the six traits to the rank of a heritable simulated phenotype, thereby maintaining some of the correlation between individuals in the same family (37). The null distribution from 100 replications showed that the overall selection strategy would yield 152 "associated" nonredundant SNPs on average, with a SD of 16, if there were no true positive association to be found anywhere on the genome. The null expectation for each of the various steps in our analysis is shown in Fig. 1.
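The way a set of permutation replicates is turned into a null expectation can be sketched as follows. This is a minimal illustration, not the study's code: the function name `summarize_null` and the replicate counts are hypothetical, and the constrained permutation itself (rank matching to a heritable simulated phenotype) is not reproduced here.

```python
from statistics import mean, stdev

def summarize_null(null_counts, observed):
    """Summarize permutation replicates of the number of SNPs selected under the null.

    Returns the null mean, the null SD, and the fraction of replicates in which
    at least `observed` SNPs were selected (an empirical tail probability).
    """
    at_least = sum(1 for count in null_counts if count >= observed)
    return mean(null_counts), stdev(null_counts), at_least / len(null_counts)

# Hypothetical replicate counts, made up for illustration (not the study's data)
null_counts = [131, 140, 148, 152, 155, 158, 163, 169]
null_mean, null_sd, tail_prob = summarize_null(null_counts, observed=155)
```

With real replicates, `tail_prob` plays the role of the permutation-based probability of selecting at least the observed number of SNPs when no true association exists.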
100K genotyping. Of 116,204 SNPs on the 100K Affymetrix fixed array, 66,543 SNPs passed quality control filters, including genotyping call rate, Hardy-Weinberg equilibrium, and MAF thresholds (Fig. 2A). We noted that the GEE P value distribution deviated from the null expectation for any single quantitative trait: up to 28% more P values were <0.001 than expected if no SNPs were associated. The deviation was more extreme for smaller significance levels. Nevertheless, this deviation did not change significantly when analyses were restricted to increasingly stringent call rate cutoffs, suggesting that it was independent from call rate and not due to nonrandom missing data. Such deviation was not present for the FBAT analyses (Fig. 2B).
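A per-SNP quality control filter of the kind described here can be sketched as below. The three criteria (call rate, Hardy-Weinberg equilibrium, MAF) come from the text; the cutoff values, the function names, and the use of a 1-df chi-square test for Hardy-Weinberg equilibrium are assumptions made for illustration, since the section does not state them.

```python
import math

def hwe_p_value(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test of Hardy-Weinberg equilibrium from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square variable with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2))

def passes_qc(n_aa, n_ab, n_bb, n_samples,
              min_call_rate=0.90, min_hwe_p=0.001, min_maf=0.10):
    """Apply call rate, MAF, and HWE filters. All three thresholds are hypothetical."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / n_samples
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    maf = min(freq_a, 1 - freq_a)
    return (call_rate >= min_call_rate
            and maf >= min_maf
            and hwe_p_value(n_aa, n_ab, n_bb) >= min_hwe_p)

# Hypothetical genotype counts in a sample of 1,087 genotyped subjects
common_ok = passes_qc(250, 500, 250, 1087)    # common SNP in exact HWE
rare_snp = passes_qc(980, 20, 0, 1087)        # fails the MAF filter
hwe_outlier = passes_qc(400, 200, 400, 1087)  # strong heterozygote deficit
```

Large studies often prefer an exact Hardy-Weinberg test; the chi-square version is used here only because it needs nothing beyond the standard library.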
SNP selection. The "pure P value" strategy yielded 683 SNPs associated with any of six primary quantitative traits or diabetes in either GEE or FBAT at P < 0.001. No result achieved conventional genome-wide significance (P < 5 × 10⁻⁸) (14,15). The "multiple related trait" strategy yielded 191 SNPs, 51 of which also showed P < 0.01 for incident diabetes, and 111 of which had P < 0.001 for at least one trait (thus overlapping with the first set). We used linkage disequilibrium between SNPs (pairwise r2 > 0.8) to select a nonredundant subset of 155 SNPs for further replication (of which 41 also showed P < 0.01 for incident diabetes and 85 had P < 0.001 for at least one trait). The probability of selecting 155 or more nonredundant SNPs if there were no true association to be detected anywhere on the genome was estimated to be 50% by permutation. Hence, the number of SNPs chosen by our selection strategy does not differ substantially from the 152 SNPs expected under the null hypothesis (Fig. 1). The combination of these two approaches yielded 763 unique SNPs with evidence for association with diabetes or related traits (Supplementary Table 1).
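One common way to reduce a SNP list to a nonredundant subset using a pairwise r2 cutoff is a greedy pass over the SNPs ordered by P value. The sketch below assumes that approach; the text gives the cutoff (r2 > 0.8) but not the algorithm, and the function name `prune_by_ld`, the SNP identifiers, and all values are hypothetical.

```python
def prune_by_ld(p_values, r2, threshold=0.8):
    """Greedy selection of a nonredundant SNP subset.

    p_values: dict mapping SNP id -> smallest association P value
    r2: dict mapping frozenset({snp_a, snp_b}) -> pairwise r2
        (pairs that are absent are treated as r2 = 0)
    A SNP is kept only if its r2 with every SNP already kept is <= threshold.
    """
    kept = []
    for snp in sorted(p_values, key=p_values.get):  # most significant first
        if all(r2.get(frozenset((snp, other)), 0.0) <= threshold for other in kept):
            kept.append(snp)
    return kept

# Hypothetical SNP ids, P values, and LD values, for illustration only
p_values = {"snpA": 0.0002, "snpB": 0.0009, "snpC": 0.004, "snpD": 0.0005}
r2 = {
    frozenset(("snpA", "snpB")): 0.95,  # snpB is redundant with snpA
    frozenset(("snpC", "snpD")): 0.30,  # weak LD, both are retained
}
kept = prune_by_ld(p_values, r2)
```

Processing SNPs in order of significance means that, within each cluster of correlated SNPs, the one with the smallest P value is the one carried forward.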
Follow-up genotyping.
In silico replication. Our 100K Type 2 Diabetes Consortium collaborators tested all 763 FHS-associated SNPs for association with type 2 diabetes in their respective datasets. Of the 13 SNPs obtained from the multiple related trait strategy and replicated in the follow-up FHS unrelated sample (Table 2), none showed a nominal P value <0.05 consistent with the expected direction of effect (Table 4). Six of these 13 SNPs were also present in the Affymetrix 500K array used by the DGI, 2 more had perfect proxies (pairwise r2 = 1.0), and an adequate proxy (r2 ≥ 0.6) could be obtained for an additional 4 SNPs based on Phase II HapMap CEU genotypes; none of them showed association with type 2 diabetes in the DGI, although 2 of them (rs6664618 and rs17281232, see below) did show a suggestive association with insulin resistance, and rs6664618 also had a nominal association with FPG (Table 4).
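For readers less familiar with proxy selection, the pairwise r2 used to judge a proxy can be computed from phased haplotype frequencies as r2 = D²/(pA(1 − pA)pB(1 − pB)), where D = pAB − pA·pB. The sketch below applies that standard definition; the function names, the 0.6 default cutoff taken from the passage above, and the example frequencies are illustrative assumptions.

```python
def ld_r2(p_ab, p_a, p_b):
    """Pairwise r2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

def is_adequate_proxy(p_ab, p_a, p_b, min_r2=0.6):
    """True when the second SNP tags the first at or above the chosen r2 cutoff."""
    return ld_r2(p_ab, p_a, p_b) >= min_r2

# Hypothetical frequencies, for illustration only
perfect = ld_r2(0.30, 0.30, 0.30)      # alleles always travel together
independent = ld_r2(0.09, 0.30, 0.30)  # haplotype frequency equals the product
```

A perfect proxy gives r2 = 1.0 and two independent SNPs give r2 = 0, which is why weakly tagged SNPs can fail to show an association even when the tagged variant is real.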
These data do not offer consistent evidence for association with any one SNP across all datasets. For instance, nominal P values for the association of the minor C allele at rs10500679 with higher insulin resistance measures and lower A1C in FHS are mutually inconsistent, as is the association of its major G allele with diabetes in the Mexican-American dataset, whereas the minor T allele of SNP rs17281232 (which is in strong linkage disequilibrium with rs10500679 in Europeans, r2 = 0.92) is associated with insulin resistance by HOMA-IR in the 500K DGI scan (P = 0.004). In an analogous manner, the nominal P value of 0.052 obtained for rs2806739 in the Pima Indian case-control dataset indicates that the T allele would be protective for diabetes (odds ratio [OR] 0.75), whereas this same allele is associated with higher FPG in FHS in the initial 100K dataset and with higher incidence of diabetes on replication (HR 1.51 [95% CI 1.2–1.9], Cox P = 0.001). Taken together, these conflicting nominal results caution that the suggestive associations found here could be statistical fluctuations rather than indicating true genetic risk for diabetes. Several of the 13 SNPs showed some consistent trends in the replication samples, albeit not nominally significant. The G allele of rs1489100 was associated with protection from diabetes both in the initial FHS 100K scan (HR 0.75 [95% CI 0.54–1.0], Cox P = 0.089, FBAT P = 0.001) and in the FHS replication sample (0.72 [0.57–0.91], Cox P = 0.007); consistent with this effect, the G allele was associated with lower glucose levels (as measured by all three glucose-related traits) in the initial scan. The association of rs1489100 with diabetes trended in the same direction in the Pima Indian case-control dataset (OR 0.80, P = 0.09). 
However, a 500K array SNP in perfect linkage disequilibrium with rs1489100 in Europeans (rs7620001, r2 = 1.0) was not associated with diabetes (OR 0.92 [95% CI 0.81–1.04], P = 0.53), FPG (P = 0.55), or HOMA-IR (P = 0.67) in the DGI dataset (Table 4). The G allele at rs729511 was associated with diabetes incidence in the initial 100K FHS scan (HR 1.43 [95% CI 1.0–2.0], Cox P = 0.03, FBAT P = 0.008) and with insulin resistance as measured by all three insulin traits (P = 0.002–0.004); fasting insulin and HOMA-IR also showed nominal association in the FHS replication sample (P = 0.02 for both). The direction of effect for this SNP was consistent in the Pima Indian case-control sample (OR 1.24, P = 0.09), but there was no association with diabetes (OR 0.96 for the A allele [95% CI 0.85–1.09], P = 0.79) or HOMA-IR (P = 0.47) for the same SNP in the DGI (Table 4). The minor G allele at rs952635 was associated with lower diabetes incidence in the initial 100K FHS dataset (HR 0.56 [95% CI 0.40–0.79], Cox P = 0.0007); it was also associated with lower glucose levels and greater insulin sensitivity in both the initial and follow-up FHS genotyping. Interestingly, the DGI 500K SNP which had the strongest linkage disequilibrium with rs952635 in Europeans (rs6664618, r2 = 0.60) showed nominally significant lower FPG (P < 0.04) and a trend toward greater insulin sensitivity for the tagging allele (P = 0.057). Of all 763 FHS-associated SNPs, 78 showed nominal association with type 2 diabetes in one other dataset (57 expected under the null), and 2 (rs2863389 and rs7935082) showed nominal association in more than one (1.4 expected under the null); all results are presented in Supplementary Table 1. 
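The section does not say how the null expectations quoted above (57 and 1.4) were derived. One simple calculation that gives figures of this size treats the three other 100K datasets as independent and requires a nominal two-sided P < 0.05 in the expected direction, that is, a chance of 0.025 per SNP per dataset. Both of those assumptions are ours, not the authors'.

```python
from math import comb

n_snps = 763        # SNPs carried forward (from the text)
n_datasets = 3      # other 100K datasets (from the text)
p_hit = 0.05 / 2    # assumption: nominal P < 0.05 with a consistent direction

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k, n + 1))

# Expected number of SNP-by-dataset nominal hits under the null
expected_hits = n_snps * n_datasets * p_hit

# Expected number of SNPs that hit in two or more datasets under the null
expected_multi = n_snps * prob_at_least(2, n_datasets, p_hit)
```

Under these assumptions `expected_hits` is about 57 and `expected_multi` is about 1.4, so the observed 78 and 2 are modest excesses rather than strong departures from chance.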
The T allele at SNP rs2863389 was protective against diabetes (HR 0.41 [95% CI 0.25–0.69], Cox P = 0.0006), whereas the alternate C allele was associated with higher FPG and mFPG in the FHS sample (P = 0.005 and 0.0005, respectively); the T allele also showed consistent protection from type 2 diabetes in Mexican Americans (OR 0.43, nominal P = 0.03) and in the Amish (OR 0.71, nominal P = 0.04) with similar trends in the Pima Indians. At SNP rs7935082, the C allele was associated with higher FPG in FHS (FBAT P = 0.0006), whereas the alternate T allele was nominally protective from diabetes in the Mexican Americans (OR 0.53, P = 0.049) and in the Pima Indians (OR 0.58, P = 0.009). Of the others, three SNPs revealed suggestive trends: Two SNPs in perfect linkage disequilibrium with each other (rs2378199 and rs6059961, r2 = 1.0) showed nominally significant association with type 2 diabetes in both tests of association used by the Pima Indian investigators for their overlapping (i.e., nonindependent) case-control and sibship samples; this association followed the same direction as that seen in FHS and was consistent with the expected changes in quantitative traits resulting from altered glycemic pathophysiology. Similarly, one other SNP (rs6058115) that was associated with all three insulin traits in FHS showed nominal association with diabetes in the Pima Indian sibs (P = 0.042) and neared nominal association in the overlapping Pima Indian case-control sample (P = 0.054).
We further examined our 763 SNPs for replication in the public 500K DGI resource. Of the 763 SNPs, 206 (27.0%) were present in both Affymetrix genotyping arrays, an adequate proxy (r2 …
Positive controls. The 100K FHS resource also serves as a resource in which to pursue phenotypic characterization and further validation of putative diabetes risk SNPs reported in other datasets. We therefore sought to replicate the widely reproduced TCF7L2 association, and the top findings reported in five recent high-density GWA scans (16–20). The 100K SNP rs7100927 was in moderate linkage disequilibrium (r2 = 0.50) with the diabetes-associated TCF7L2 SNP rs7903146 and was associated with risk of diabetes (HR 1.56 [95% CI 1.1–2.1], Cox P = 0.007) and with mFPG (GEE P = 0.03) in the FHS 100K dataset. We confirmed this association by directly genotyping rs7903146 in both the family and unrelated samples, obtaining association with diabetes incidence (1.28 [1.08–1.52], Cox P = 0.005). Interestingly, the risk T allele at rs7903146 was directly associated with mFPG and inversely associated with insulin sensitivity adjusted for β-cell function as measured by the ISI_0–120 (nominal GEE P = 0.03 for both), an effect that persisted after adjustment for BMI. The TCF7L2 100K SNP rs7100927 is also in strong linkage disequilibrium (r2 = 0.93 in HapMap CEU) with SNPs rs7924080 and rs10885406, which tag a putative obesity-associated haplotype (HapA) in Caucasians; we found no statistically significant association between rs7100927 and BMI. Among other SNPs reported to be highly associated with diabetes in the recently published GWA scans, we noted moderate linkage disequilibrium with SNPs present in our 100K array (Table 6). The two HHEX SNPs were in moderate linkage disequilibrium with 100K FHS SNP rs10509645 (r2 = 0.57 and 0.70, respectively), but we found no nominal associations with diabetes incidence or related traits in the FHS. Similarly, the CDKAL1 SNP rs7754840 was in weak linkage disequilibrium with 100K FHS SNP rs2328545 (r2 = 0.35), and no nominal associations with diabetes incidence or related traits were found in the FHS.
On the other hand, the FHS SNP rs1995222 was in weak linkage disequilibrium with the original SNP in SLC30A8 (r2 = 0.20), and yet it showed nominal associations with diabetes incidence (FBAT P = 0.01), FPG (FBAT P = 0.006), and mFPG (FBAT P = 0.008); the FHS SNP rs10501278 weakly tagged LOC387761 SNP rs7480010 (r2 = 0.28) and showed nominal associations with fasting insulin (FBAT P = 0.008), HOMA-IR (FBAT P = 0.01), and ISI_0–120 (FBAT P = 0.047); the risk alleles at the three EXT2 SNPs (rs1113132, rs11037909, and rs3740878) were captured by the T allele of rs962848 in the 100K array (r2 = 0.47), which was associated with higher FPG and mFPG (FBAT P = 0.002 and 0.007, respectively) and lower insulin sensitivity (nominal P = 0.049) in FHS; and the FHS SNP rs10513800 showed modest linkage disequilibrium with two IGF2BP2 SNPs (r2 = 0.33), and was nominally associated with mFPG in FHS (GEE P = 0.03).
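As a consistency check on how the hazard ratios, confidence intervals, and Cox P values in this section relate to one another, a Wald statistic can be recovered from a reported HR and its 95% CI. The sketch below assumes the interval is symmetric on the log scale and was built with a normal critical value of 1.96; the function name `wald_from_ci` is hypothetical, and the rounded published figures limit the precision of the result.

```python
import math

def wald_from_ci(hr, lower, upper, z_crit=1.96):
    """Recover the log-HR standard error, Wald z, and two-sided P from an HR and its CI."""
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return se, z, p

# Directly genotyped rs7903146, diabetes incidence: HR 1.28 [95% CI 1.08-1.52]
se, z, p = wald_from_ci(1.28, 1.08, 1.52)
```

For these inputs the recovered P value is close to the reported Cox P = 0.005, which suggests the three numbers are internally consistent.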
We present initial associations with type 2 diabetes and related quantitative traits using the FHS 100K GWA resource, with replication and integration of initial associations within FHS and in silico with external GWA datasets. We did not find any single variant to be associated with diabetes or related traits in the FHS 100K sample and all replication samples, but we found a number of consistent associations worthy of follow-up. We were also able to replicate association with the confirmed diabetes risk SNP in TCF7L2 and with SNPs recently identified in high-density GWA scans (16–20). These results demonstrate the contribution that a community-based sample rich with diabetes-related quantitative trait data can make to type 2 diabetes gene discovery.
GWA scans provide a powerful tool with which to query the genome for common variants that confer modest effects on polygenic traits (6). Because of the many statistical tests involved and the high likelihood of obtaining a large number of false-positive results, it is crucial to perform rigorous genotyping quality control and set stringent statistical thresholds. Thus, unless risk variants are very common and/or have a relatively large effect on the trait under study, true results can only be detected with large sample sizes. In instances in which sample size is limiting, a replication strategy with other similarly conducted datasets is essential. This can take the form of a staged approach or a joint analysis of several stages, which requires statistical integration of disparate datasets (13). It is estimated that …
To prioritize SNPs from the 100K array results and to maximize the likelihood of selecting true positive associations, we developed a method that harnesses the wealth of phenotypic data in FHS while recognizing the limited statistical power of this modestly sized sample. In addition to choosing SNPs based solely on small P values, we selected SNPs that showed consistent nominal associations with multiple related traits. We reasoned that such a SNP is less likely to be a spurious finding and more likely to represent a real association with hyperglycemia/insulin resistance, at least in the FHS. We tested this latter strategy by seeking replication in a nonoverlapping cohort of unrelated FHS participants and both approaches by in silico comparisons with three 100K and one 500K datasets.
None of the primary FHS results achieved convincing replication across multiple datasets, although the two SNPs rs2863389 (not near a known gene) and rs7935082 (in intron 4 of the ubiquitous membrane-spanning 4-domain subfamily A member 7, MS4A7) showed consistent associations in two other populations (Supplementary Table 4). This low yield could be due to either initial false-positive associations or false-negative follow-up testing. In regard to the former, we note that our set of positive results did not depart significantly from the null expectation. A fraction of false-positive results may have been introduced by systematic enrichment of low P values in FHS; although this might have affected the multiple related trait selection strategy, theoretically, it should not have distorted the P value ranks used in our pure P value approach. Alternatively, true positives may have been missed because of low power. Given the emerging notion that a ceiling for the combination of effect size/allele frequency in type 2 diabetes seems to hover around that of TCF7L2 rs7903146 (16) and that diabetes-related polymorphisms may only explain a small fraction of the variance in quantitative glycemic traits, it is not surprising that our initial sample of …
In regard to the absence of replication, differences in ancestry among cohorts and the relatively small sample sizes of the other 100K datasets may have also precluded us from obtaining significant P values in replication, even among true positive findings. A planned joint meta-analysis of all four datasets where all test statistics are combined may help prioritize the few true positive results that remain consistent across populations. Nevertheless, the strength of the FHS resource lies in its quantitative trait database rather than in diabetes incidence; thus, such integration may be more fruitful when limited to such phenotypes.
The larger DGI 500K dataset, which contains publicly available diabetes and glycemic trait statistics for a European population similar to FHS, provides another convenient replication venue. Here, we have tested our top results and obtained a …
The worst-case scenario would dictate that fundamental flaws in the 100K genotyping process, in the genotype-calling algorithm, in our phenotypic characterization, or in our statistical procedures prevented us from making striking discoveries; if that were the case, we would not expect to be able to detect any real associations. The convincing results we have obtained for SNPs in TCF7L2 and other genes reported by others (16–19) indicate that FHS is a viable sample in which to replicate real results of adequate magnitude and characterize the phenotypic effects of such variants on glycemic traits and their change over time. The particular utility of the population characteristics of the FHS cohort is illustrated by our attempt to clarify the effects of TCF7L2 variants on diabetes while accounting for obesity. The association of TCF7L2 rs7903146 with type 2 diabetes is incontrovertible, having reached a P value <10⁻⁸⁰ after meta-analysis of nearly 50,000 samples (38). This variant appears to confer risk of diabetes by causing an impairment in insulin secretion (39–41). Recently, DECODE investigators have suggested that a haplotype largely defined on the basis of the alternate C allele at rs7903146 (HapA) is associated with obesity, when case and control subjects are analyzed separately (42). However, that strategy also imposes constraints in ascertainment: Control subjects who carry the diabetes risk allele must be protected from diabetes by other factors, including a lower BMI (thus resulting in an apparent association of the C allele with BMI), whereas case subjects who carry the protective C allele must have diabetes on the basis of other components of risk, including BMI (thus resulting in the same apparent association).
Therefore, population samples free of diabetes ascertainment criteria such as FHS are needed to verify whether these associations are real. We did not observe a significant association of the 100K SNP rs7100927 (which is in strong linkage disequilibrium with the variants that tag HapA) with BMI, and we found that rs7903146 was nominally associated with higher rather than lower insulin resistance. These results are consistent with those reported in other large population samples (43,44) and indicate that ascertainment on diabetes case-control status may introduce spurious associations when neither phenotypic traits nor haplotype variants are independent.
Some findings presented here appear promising and merit further exploration, including the 13 SNPs that replicated across the two FHS samples, the 5 SNPs that stood out within the Type 2 Diabetes 100K Consortium in silico replication effort, and the 25 SNPs that show suggestive replication in the DGI. None of these 41 unique SNPs lies in genes that would be considered high-likelihood biological candidates, and none represents a coding change. The precedent afforded by TCF7L2 reassures us that nonbiased genetic screens can uncover novel biology. The upcoming high-density GWA scan in …
J.C.F. has received National Institutes of Health (NIH) Research Career Award K23 DK-65978-04. J.B.M. has received an American Diabetes Association Career Development Award. This study has been supported by the National Heart, Lung, and Blood Institute FHS (contract no. N01-HC-25195) and the Boston University Linux Cluster for Genetic Analysis funded by the NIH National Center for Research Resources Shared Instrumentation Grant (1S10RR163736-01A1). The Broad Institute Center for Genotyping and Analysis is supported by grant U54 RR020278-01 from the National Center for Research Resources. We thank our co-investigators of the 100K Type 2 Diabetes Consortium at the University of Maryland, the University of Chicago, the University of Texas-Houston, and National Institute of Diabetes and Digestive and Kidney Diseases, Phoenix, for their generous, trusting, and industrious collaboration and our colleagues at the FHS, the Broad Institute, and Massachusetts General Hospital for many helpful and constructive discussions framing the conduct and implications of type 2 diabetes GWA scans.
Published ahead of print at http://diabetes.diabetesjournals.org on 11 September 2007. DOI: 10.2337/db07-0451. J.B.M. has received research grants from GlaxoSmithKline and Wyeth and serves on safety or advisory boards for GlaxoSmithKline and Lilly. Additional information for this article can be found in an online appendix at http://dx.doi.org/10.2337/db07-0451. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. Received for publication April 2, 2007 and accepted in revised form September 5, 2007