Combined Analyses of 20 Common Obesity Susceptibility Variants
- Camilla Helene Sandholt1,2,
- Thomas Sparsø1,
- Niels Grarup1,
- Anders Albrechtsen3,
- Katrine Almind2,
- Lars Hansen4,
- Ulla Toft5,
- Torben Jørgensen5,6,
- Torben Hansen1,7 and
- Oluf Pedersen1,8,9
- 1Hagedorn Research Institute, Gentofte, Denmark;
- 2Medical and Science Department, Development Projects, Novo Nordisk A/S, Bagsværd, Denmark;
- 3Department of Biostatistics, University of Copenhagen, Copenhagen, Denmark;
- 4Discovery Medicine and Clinical Pharmacology Department, Cardiovascular and Metabolic Diseases, Bristol-Myers Squibb, Princeton, New Jersey;
- 5Research Centre for Prevention and Health, Glostrup University Hospital, Glostrup, Denmark;
- 6Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark;
- 7Faculty of Health Sciences, University of Southern Denmark, Odense, Denmark;
- 8Faculty of Health Sciences, Aarhus University, Aarhus, Denmark;
- 9Institute of Biomedical Sciences, University of Copenhagen, Copenhagen, Denmark.
- Corresponding author: Camilla Helene Sandholt.
OBJECTIVE Genome-wide association studies and linkage studies have identified 20 validated genetic variants associated with obesity and/or related phenotypes. The variants are common, and they individually exhibit small-to-modest effect sizes.
RESEARCH DESIGN AND METHODS In this study we investigate the combined effect of these variants and their ability to discriminate between normal weight and overweight/obese individuals. We applied receiver operating characteristics (ROC) curves, and estimated the area under the ROC curve (AUC) as a measure of the discriminatory ability. The analyses were performed cross-sectionally in the population-based Inter99 cohort where 1,725 normal weight, 1,519 overweight, and 681 obese individuals were successfully genotyped for all 20 variants.
RESULTS When combining all variants, the 10% of the study participants who carried 22 or more risk-alleles showed a significantly increased probability of being both overweight, with an odds ratio (OR) of 2.00 (1.47–2.72), P = 4.0 × 10−5, and obese, with an OR of 2.62 (1.76–3.92), P = 6.4 × 10−7, compared with the 10% of the study participants who carried 14 or fewer risk-alleles. Discrimination ability for overweight and obesity, using the 20 single nucleotide polymorphisms (SNPs), yielded AUCs of 0.53 and 0.58, respectively. When combining SNP data with conventional nongenetic risk factors of obesity, the discrimination ability increased to 0.64 for overweight and 0.69 for obesity. The latter is significantly higher (P < 0.001) than for the nongenetic factors alone (AUC = 0.67).
CONCLUSIONS The discriminative value of the 20 validated common obesity variants is at present too weak for clinical utility; however, the variants do add to the ability of conventional nongenetic risk factors to discriminate obesity.
The prevalence of obesity is increasing rapidly in all parts of the world. The primary cause of the current epidemic development is likely an unhealthy lifestyle, especially high calorie intake and insufficient physical activity. However, studies have established that the pathogenesis of obesity also includes a genetic component predisposing some individuals to gain more weight from a sedentary lifestyle (1–3). Until 2007 none of the suggested susceptibility variants for common obesity were convincingly validated. Genome-wide association studies (GWAS), with their agnostic approach, have since improved the success of identifying common single nucleotide polymorphisms (SNPs) that modify the risk of common complex diseases, including obesity. FTO was the first well-replicated obesity susceptibility locus to be identified through GWAS (4) and has also been identified in other independent studies (5–7).
Subsequently, variants downstream of MC4R (8,9) were identified in meta-analyses of GWAS, and linkage peaks in PCSK1 were re-sequenced, identifying two coding variants reaching the genome-wide significance threshold (10). A second wave of GWAS of obesity has recently identified 15 additional loci: TMEM18, SH2B1, KCTD15, NEGR1 (11,12), SEC16B, SFRS10, BDNF, FAIM2, BAT2 (11), GNPDA2, MTCH2 (12), NCP1, MAF, PTER, and PRL (13).
Individually, all of these common variants exert small-to-modest effect sizes; however, whether the combined effect of the 20 variants increases in an additive manner has not been elucidated. So far the combined ability of nine obesity variants to discriminate obese individuals from lean individuals has been reported (14). In the present study, we estimate the discriminative value of 20 SNPs in the 18 newly identified obesity loci in a Danish population-based cohort both separately and in combination with conventional nongenetic risk factors of obesity. We also examine whether the 20 obesity-related variants exhibit additive combined effects or whether synergistic effects caused by gene-gene interactions are present.
RESEARCH DESIGN AND METHODS
The 20 obesity susceptibility variants were genotyped in 6,514 individuals from the Inter99 cohort, which is a population-based, randomized, nonpharmacological intervention study of middle-aged Danes for the prevention of ischemic heart disease conducted at the Research Centre for Prevention and Health in Glostrup, Copenhagen (clinical trial registry no. NCT00289237) (15).
Body weight and height were measured in lightweight indoor clothes and without shoes. BMI was defined as weight in kilograms divided by height in meters squared (kg/m2). Overweight and obesity were defined as 25 kg/m2 ≤ BMI <30 kg/m2 and BMI ≥30 kg/m2, respectively. A total of 6,510 individuals had available information about BMI, and 2,831 were normal weight, 2,543 were overweight, and 1,136 were obese. All study participants were Danes by self-report, and only individuals with Caucasian ethnicity were included in the genetic analyses. Informed written consent was obtained from all subjects before participation. The study was approved by the Ethical Committee of Copenhagen County and was in accordance with the principles of the Helsinki Declaration.
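The BMI definition and cut-offs above can be written as a short Python sketch. The function names are hypothetical, and treating every BMI below 25 kg/m2 as normal weight is an assumption, since the text does not state a lower bound for that category.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def weight_category(bmi_value: float) -> str:
    """Map a BMI value to the three categories used in the text.

    Assumption: anything below 25 kg/m2 counts as normal weight,
    because the text gives no lower bound for that group.
    """
    if bmi_value >= 30:
        return "obese"
    if bmi_value >= 25:
        return "overweight"
    return "normal weight"
```

For example, weight_category(bmi(80, 1.80)) classifies a person of 80 kg and 1.80 m, whose BMI is just below 25 kg/m2, as normal weight.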
Nongenetic risk factors in the population-based Inter99 cohort.
All participants completed a comprehensive questionnaire including questions about education level, employment, dietary intake, medication, smoking habits, and physical activity.
Educational level and employment were combined into a social status score with five levels: 1) no education and unemployed, 2) 1 year of education or more and unemployed, 3) no education and employed, 4) 1 to 3 years of education and employed, and 5) 4 years or more of education and employed. Smoking habits were divided into four classes: 1) never smoked, 2) former smoker, 3) occasional smoker, and 4) daily smoker (16). A three-point dietary score was developed based on a food frequency questionnaire (17) and the method was validated in a thorough 28-day diet history and biomarker analysis (n = 264) (18). In short, the dietary score was based on questions regarding the intake of fruits, raw and boiled vegetables, vegetarian dishes, fish, and fat (both spread on bread and for preparation) to get a rough index of the overall quality of dietary habits, which were divided into three categories: 1) healthy, 2) moderate, and 3) unhealthy. In this study, information about anti-obesity drugs was included as yes or no to the question, "Do you use anti-obesity drugs?" Physical activity was assessed using a score combining time spent commuting and in leisure-time activity, grouped into four categories: 1) 0–113 min/week, 2) 143–225 min/week, 3) 255–340 min/week, and 4) 450–720 min/week (19).
The 10 variants in TMEM18, SH2B1, KCTD15, NEGR1, SEC16B, SFRS10, BDNF, FAIM2, and BAT2 were genotyped using the Centaurus platform (deCODE Genetics, Iceland) (11). The remaining 10 variants in FTO, MC4R, PCSK1, GNPDA2, MTCH2, NCP1, MAF, PTER, and PRL were genotyped using Taqman allelic discrimination or KasPAR SNP Genotyping (KBioscience, U.K.). After adjustment for the multiple tests performed, all SNPs obeyed Hardy-Weinberg equilibrium (P > 0.05). All 20 SNPs passed quality control with an average mismatch rate of 0.17% (max. 0.97%) and an average success rate of 97.2% (min. 96.1%).
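A Hardy-Weinberg check of this kind can be sketched as follows. The text does not say which test was used, so the one-degree-of-freedom chi-square goodness-of-fit test below is an assumption, as are the function name and the genotype counts in the usage note. A P value above the chosen threshold means there is no evidence of departure from equilibrium.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Chi-square test (1 df) of Hardy-Weinberg proportions.

    Takes observed counts of the three genotypes (AA, AB, BB) and returns
    (chi-square statistic, P value). Assumes both alleles are present,
    otherwise an expected count would be zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    if p == 0.0 or q == 0.0:
        raise ValueError("both alleles must be observed")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

With the hypothetical counts hwe_chi_square(100, 200, 100), the observed genotypes match the expected proportions exactly, so the statistic is zero and the P value is one.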
Logistic regression adjusted for sex and age was applied to examine the effect of each variant on overweight and obesity. Gene-gene interaction analyses were performed using logistic regression comparing one model including only the main effect from each variant with an alternative model including an interaction parameter besides the main effect. Each SNP was included as a covariate coded as number of risk-alleles and the pairwise interaction as the product of the pairs of SNPs, i.e., multiplicative interaction.
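The two models compared in the interaction analysis can be written out for a pair of SNPs i and j. This is a sketch under the coding described above, with g_i and g_j denoting the number of risk-alleles (0, 1, or 2) and Y the case status. The symbols are introduced here for illustration, and any further covariates in the pairwise models are omitted because the text does not specify them.

```latex
\begin{align}
M_0:\quad \operatorname{logit} P(Y = 1) &= \beta_0 + \beta_i g_i + \beta_j g_j \\
M_1:\quad \operatorname{logit} P(Y = 1) &= \beta_0 + \beta_i g_i + \beta_j g_j + \beta_{ij}\, g_i g_j
\end{align}
```

Under M_0 the per-allele effects are additive on the log-odds scale. The interaction parameter \beta_{ij} in M_1 measures departure from that additivity, so the null hypothesis of no multiplicative interaction is \beta_{ij} = 0.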
The risk-alleles were defined as alleles associated with increased risk of overweight/obesity or with increased BMI in previous studies (4–13). The effect of extreme risk-profiles was evaluated using Fisher exact test on groups defined by the number of risk-alleles carried, a count that assumes an equal effect of all variants. Discriminatory value between normal weight, overweight, and obese individuals for the 20 variants and conventional nongenetic risk factors separately and in combination was estimated using receiver operating characteristic (ROC) curves. The models were trained on 1,000 bootstrap samples, drawn with replacement, to cross-validate the results. A ROC curve of disease status assessment was generated using the out-of-bag samples by taking the mean of all bootstrap models at each 1-specificity point. This enabled us to correct for the overfitting of the apparent (or optimal) ROC curve estimated in the entire dataset. ROC curve performance was compared using the integrated discrimination improvement (IDI) score, which differs from zero if one model performs better than the other. An asymptotic test was used to test for significance as described earlier (20). The ROC curves were also evaluated using area under the curve (AUC) and Brier score. AUC will be one if the test is perfect, and should be >0.80 to be of clinical value. The Brier score will be zero if the test is perfect and will be 0.25 if the assigned probability for an event (here normal weight or overweight/obese) based on the parameters in the model is set to 50% (i.e., the test is useless), which corresponds to an AUC of 0.5. The explained variance was estimated using the generalized R2 (21). All analyses were performed using RGui version 2.8.0 (http://www.r-project.org).
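The two summary measures described above can be illustrated with a minimal Python sketch. This is not the analysis code of the study, which was run in R. The function names are hypothetical, and the AUC is computed through its rank interpretation (the probability that a randomly chosen case receives a higher score than a randomly chosen control, ties counting one half).

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve from pairwise case-control comparisons."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for d in controls:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5  # ties count as half a win
    return wins / (len(cases) * len(controls))


def brier_score(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)
```

Assigning every individual a probability of 50% reproduces the uninformative reference values named in the text: brier_score gives 0.25 and auc gives 0.5, whereas probabilities that match the outcomes exactly give 0 and 1.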
Of the 6,510 individuals with available BMI information, genotyping data for all 20 variants was available for 3,925 individuals where 1,725 were normal weight, 1,519 were overweight, and 681 were obese. Associations with overweight and obesity were analyzed individually for all variants applying a multiplicative model introducing sex and age as confounders (Fig. 1). Association with both overweight and obesity was observed for FTO rs9939609, MC4R rs12970134, BDNF rs4923461, and BDNF rs925946, and the minor alleles were the risk-allele for all variants except BDNF rs4923461. The minor allele of PCSK1 rs6232 associated with overweight but not obesity, whereas the minor alleles of PCSK1 rs6235, SEC16B rs10913469, FAIM2 rs7138803, GNPDA2 rs10938397, and MAF rs1424233 and the major allele of TMEM18 rs7561317 associated with obesity but not overweight. No associations with either overweight or obesity were observed for SH2B1 rs7498665, KCTD15 rs29941, NEGR1 rs2568958, SFRS10 rs7647305, MTCH2 rs10838738, BAT2 rs2260000, NCP1 rs1805081, PTER rs10508503, and PRL rs4712625 in the present study of Danes (Fig. 1). Allelic OR (95% CI) for the associated variants ranged from 1.12 (1.00–1.21) (GNPDA2 rs10938397) to 1.25 (1.13–1.37) (BDNF rs4923461) for overweight and from 1.12 (1.00–1.25) (PCSK1 rs6235) to 1.44 (1.25–1.66) (TMEM18 rs7561317) for obesity. Some of these individual variant analyses have been published previously (22,23).
The combined effect of the 20 variants was estimated calculating the percentage of normal weight individuals and overweight/obese individuals stratified according to the number of risk-alleles (Fig. 2A and B). The distribution of risk-alleles followed a normal distribution in both case and control subjects, but a shift toward higher number of risk-alleles was observed among both overweight and obese individuals.
Comparing a low risk profile, defined as the lowest 10th percentile (≤14 risk-alleles), with a high risk profile, defined as the highest 10th percentile (≥22 risk-alleles), we found a significantly increased probability of being both overweight, with an OR of 2.00 (1.47–2.72), P = 4.0 × 10−5, and obese, with an OR of 2.62 (1.76–3.92), P = 6.4 × 10−7.
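The risk-allele count and the comparison of extreme profiles can be sketched in Python as below. The thresholds of 14 and 22 risk-alleles come from the text. The function names are hypothetical, and the Woolf (log-scale) confidence interval is a swapped-in approximation, since the text reports Fisher exact test and does not state how the intervals were obtained.

```python
import math
from typing import Sequence, Tuple

LOW_MAX = 14   # low risk profile: 14 or fewer risk-alleles (from the text)
HIGH_MIN = 22  # high risk profile: 22 or more risk-alleles (from the text)


def risk_allele_count(genotypes: Sequence[int]) -> int:
    """Unweighted sum of risk-alleles (0, 1, or 2 per SNP) across all SNPs."""
    if any(g not in (0, 1, 2) for g in genotypes):
        raise ValueError("each genotype must be coded 0, 1, or 2")
    return sum(genotypes)


def risk_profile(count: int) -> str:
    """Assign the extreme profiles compared in the text."""
    if count <= LOW_MAX:
        return "low"
    if count >= HIGH_MIN:
        return "high"
    return "intermediate"


def odds_ratio_woolf(cases_high: int, controls_high: int,
                     cases_low: int, controls_low: int,
                     z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio for high vs. low profile with a Woolf 95% CI (assumed method)."""
    odds_ratio = (cases_high * controls_low) / (controls_high * cases_low)
    se_log = math.sqrt(1 / cases_high + 1 / controls_high
                       + 1 / cases_low + 1 / controls_low)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper
```

Because every risk-allele adds one to the count regardless of locus, the score treats all 20 variants as having the same effect size.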
We estimated the combined discriminatory ability of a genetic test based on the 20 reported obesity susceptibility variants by determining AUC and Brier scores of ROC curves. AUC was estimated using cross-validation of 1,000 bootstrap samples. The discriminatory value of the 20 SNPs was higher for obesity than for overweight measured by both AUC and Brier score (Table 1) consistent with the fact that more SNPs associated with obesity than with overweight (Fig. 1).
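The resampling step behind the cross-validated AUC can be sketched as follows. Only the split into in-bag and out-of-bag individuals is shown. Fitting the logistic model on each in-bag sample and averaging the out-of-bag ROC curves are omitted, and the function names and the seed are hypothetical.

```python
import random
from typing import List, Tuple


def bootstrap_split(n: int, rng: random.Random) -> Tuple[List[int], List[int]]:
    """Draw n indices with replacement; return (in-bag, out-of-bag) indices."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n) if i not in drawn]
    return in_bag, out_of_bag


def oob_splits(n: int, n_bootstrap: int = 1000,
               seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Generate the index splits for all bootstrap replicates."""
    rng = random.Random(seed)
    return [bootstrap_split(n, rng) for _ in range(n_bootstrap)]
```

In each replicate roughly 37% of individuals are left out of the in-bag sample on average, and the model is evaluated only on those individuals, which is what guards against the optimism of the apparent ROC curve.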
Discrimination ability of overweight and obesity using only SNP data resulted in an AUC of 0.53 and 0.58, respectively. When information regarding diet, physical activity, smoking, education, employment, and use of anti-obesity drugs was included together with sex and age as nongenetic factors, discriminative value resulted in AUC of 0.65 and 0.67 for overweight and obesity, respectively. When information regarding SNPs and nongenetic risk factors was combined, the AUC increased slightly for obesity to 0.69, but decreased to 0.64 for overweight (Table 1).
The slight increase in AUC gained when including SNP information to conventional nongenetic risk factors of obesity was estimated using IDI score, which will differ from zero if one model assesses disease status better than the other. The comparison resulted in an IDI score of 0.1 (P < 0.001), and hence the additional discriminatory value from SNPs is statistically significant. The decrease in AUC observed for overweight when including SNP data to conventional obesity risk factors was likewise significant with an IDI score of 0.07 (P < 0.001).
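How an IDI score compares two models can be shown with a short Python sketch. It assumes the usual definition of the IDI as the difference in discrimination slopes, that is, the gain in mean predicted probability among cases minus the gain among controls. The function name and inputs are hypothetical, and the asymptotic significance test is not included.

```python
from typing import Sequence


def idi(labels: Sequence[int],
        probs_old: Sequence[float],
        probs_new: Sequence[float]) -> float:
    """IDI of a new model over an old one (difference in discrimination slopes)."""

    def mean(values: Sequence[float]) -> float:
        return sum(values) / len(values)

    case_old = [p for y, p in zip(labels, probs_old) if y == 1]
    case_new = [p for y, p in zip(labels, probs_new) if y == 1]
    ctrl_old = [p for y, p in zip(labels, probs_old) if y == 0]
    ctrl_new = [p for y, p in zip(labels, probs_new) if y == 0]
    if not case_old or not ctrl_old:
        raise ValueError("need at least one case and one control")
    gain_in_cases = mean(case_new) - mean(case_old)
    gain_in_controls = mean(ctrl_new) - mean(ctrl_old)
    return gain_in_cases - gain_in_controls
```

A positive value means the new model moves predicted probabilities up for cases, down for controls, or both. Two identical models give an IDI of zero.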
We observed no pairwise interactions among the SNPs that remained significant after correction for multiple testing (supplementary Fig. 1, available in an online appendix at http://diabetes.diabetesjournals.org/cgi/content/full/db09-1042/DC1).
The same conclusion is supported by the fact that models allowing for interactions among SNPs performed worse than the original models assuming an additive effect, as indicated by a decreased AUC (supplementary Table 1).
The explained variance of obesity status increased from 11.9 to 16.8% in the analyzed study population by the inclusion of SNP data to conventional risk factors, whereas data of the 20 SNPs alone showed poor explanation of obesity status with only 4.5% in the analyzed study population (Table 1).
Four of 20 validated obesity/BMI SNPs associated with both overweight and obesity, one variant associated with overweight but not obesity, and six variants associated with obesity but not overweight in the population-based Inter99 cohort of middle-aged Danes. These different association patterns could be due to modest gene-environment interactions, where the BMI level determines the impact of the genetic variant. For example, the BMI-increasing effect of a variant might emerge only when the individual's BMI approaches 30 kg/m2, so that only an association with obesity is observed.
Combining all the variants and assuming an additive effect, high risk profile carriers had a 2.0- and 2.6-fold increased probability of being overweight and obese, respectively, compared with low risk profile carriers. Discrimination ability of overweight and obesity was not sufficient to be of clinical utility when using either SNPs or conventional nongenetic risk factors of obesity. The ROC AUC for obesity using SNP data separately was estimated to 0.58 and to 0.67 for nongenetic risk factors. The difference in the performance between two such tests can be quantified by the false positive rate (1-specificity) for a given true positive rate (sensitivity). As an example, if we wanted to detect 80% of the obese individuals in the population-based Inter99 cohort, this would result in a misclassification of ∼70% of the normal weight individuals using only SNP information (Fig. 3A) whereas it would be ∼60% when using information about lifestyle (Fig. 3B).
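Reading the false positive rate off a ROC curve at a fixed sensitivity can be sketched in Python as below. The function name and the scores in the usage note are hypothetical. The sketch works on raw scores rather than on the bootstrap-averaged curves shown in the figures.

```python
from typing import Sequence


def fpr_at_sensitivity(labels: Sequence[int],
                       scores: Sequence[float],
                       target_sensitivity: float) -> float:
    """Lowest false positive rate at which the target sensitivity is reached.

    Individuals with a score at or above the threshold are called positive.
    """
    n_cases = sum(1 for y in labels if y == 1)
    n_controls = sum(1 for y in labels if y == 0)
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")
    # Lower the threshold step by step until enough cases are detected.
    for threshold in sorted(set(scores), reverse=True):
        true_pos = sum(1 for y, s in zip(labels, scores)
                       if y == 1 and s >= threshold)
        if true_pos / n_cases >= target_sensitivity:
            false_pos = sum(1 for y, s in zip(labels, scores)
                            if y == 0 and s >= threshold)
            return false_pos / n_controls
    return 1.0  # only reached if the target exceeds 100%
```

With five hypothetical cases scored 0.9, 0.8, 0.7, 0.4, and 0.3 and five controls scored 0.6, 0.5, 0.2, 0.1, and 0.05, detecting 80% of the cases requires a threshold of 0.4, at which two of the five controls are misclassified.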
Generally, the ability to discriminate between case and control subjects was higher for obesity than for overweight. An explanation for the SNP data could be that more variants associated with obesity than with overweight; an explanation for the nongenetic factors could be that there is a better correlation between unhealthy lifestyle and obesity, i.e., you are more likely to become obese than overweight if you have an unhealthy lifestyle.
The highest ROC AUC of 0.69 was achieved when combining SNPs and nongenetic factors. Despite the rather small increase in AUC, the contribution of the information gained by adding SNPs to the nongenetic factors was statistically significant, estimated by the IDI score. The false positive rate also decreased using this model with ∼57% normal weight individuals being misclassified to detect 80% of the obese individuals (Fig. 3C). We also observed a significant decrease in the discriminative ability between normal weight and overweight individuals when adding SNP data to nongenetic data (AUC 0.65 vs. 0.64), which also could be explained by lower correlation between SNPs and overweight.
A low discriminative ability between case and control subjects for a common heterogeneous disease such as obesity is in accordance with attempts to perform risk assessment for other complex diseases. ROC analyses of type 2 diabetes, also in the population-based Inter99 cohort, though including additional cases from other study groups and using 19 validated type 2 diabetes variants, also showed limited success with an ROC AUC of 0.60. Sex, age, and BMI, on the other hand, were shown to be strongly correlated with type 2 diabetes, hence providing a high discrimination ability with an AUC of 0.92, which increased to 0.93 when including SNPs (24). We also constructed ROC curves using only information regarding sex and age. This resulted in an AUC of 0.61 and 0.64 for overweight and obesity, respectively, changing to 0.62 and 0.63, respectively, when including SNP data (data not shown). Hence, the effect of sex and age is stronger when analyzing risk of type 2 diabetes compared with obesity, where other lifestyle factors such as diet and physical activity based on our analyses also play crucial roles.
Former attempts have been made to assess risk of obesity using SNPs representing nine of the 18 BMI/obesity loci included in this study (FTO, MC4R, TMEM18, GNPDA2, SH2B1, KCTD15, MTCH2, NEGR1, and PCSK1). Here ROC analyses resulted in an AUC of 0.58 when including all nine SNPs in the model (14). This AUC represents the apparent discriminatory power of the nine SNPs and is lower than the apparent AUC observed in this study for the 20 gene variants for obesity, which was 0.61 (data not shown). Hence the 11 additional SNPs included in this study contribute to an increase in the apparent discrimination ability but in a magnitude that could indicate that the main part of the discriminative ability is carried by the highly obesity–associated SNPs in FTO and TMEM18 (Fig. 1). ROC analyses assessing obesity—excluding the two variants in FTO and TMEM18—resulted in a mean AUC of 0.53 (data not shown), whereas analyses including only the two variants in FTO and TMEM18 resulted in a mean AUC of 0.57. Both findings indicated that in our study sample, the predominant contribution to the discriminative ability comes from these two SNPs.
In the present study, we applied an additive model of inheritance for all variants. In theory one could try to examine the fit of different models to the data to improve discriminatory power; however, the ability to draw inferences from a single study's data is limited. Future meta-analyses of many studies would allow for a stronger exploration of how the available data fit different models of inheritance and a test of whether such models would change the present result.
This study was performed in a population-based cohort, and it is therefore possible to estimate the explained variance, represented by the generalized R2. The explained variance was, like the AUC, highest for obesity. The explained variance in obesity status was estimated to be 4.5% for the 20 BMI/obesity variants, whereas nongenetic risk factors explained 11.9%. The highest explained variance was, however, achieved when combining gene variants and nongenetic risk factors, where 16.8% of the variation in obesity status was explained for the analyzed study population. This indicates that gene variants contain predictive information, especially in combination with other obesity risk factors.
The fact that some genetic variants could enhance each other's effect synergistically, or contrarily mask each other's effects through gene-gene interactions, plays an important role in the theoretical understanding of complex heterogeneous diseases such as obesity. We analyzed the possible pairwise interaction between all combinations of SNP pairs but failed to detect any that would be significant after correction for multiple testing (supplementary Fig. 1). The suggestion that the effect of the genetic variants on obesity mainly can be explained by log-additive effects among the 20 examined genetic variants is supported by ROC curves based on logistic regression models including two-way genetic interaction terms. Here the discriminative value decreases for all models, both including SNP information exclusively, as well as when combining SNPs and lifestyle factors (supplementary Table 1). This analysis only indicates that two-way gene-gene interactions do not exist between the 20 genetic variants included in this study; however, such interactions between gene variants that so far have been missed in GWAS might still be important in the etiology of obesity.
Even though the identification of these 20 common obesity SNPs in 18 different loci is an advance in the understanding of the genetic predisposition to obesity, their isolated information is still too limited to be of any predictive and preventive value. But we have shown in this study that these variants add significantly to the discriminative ability of the conventional nongenetic lifestyle risk factors of obesity. The fact that common variants with low penetrance show limited predictive value is in agreement with a study where 40 genotypes were simulated in 1 million individuals under different scenarios. Here the variants should all be common and exert effect sizes with an OR of 1.50–2.00 to have discriminative abilities of clinical value (25).
It has, moreover, been proposed that accumulation of low-frequency variants with large effect sizes contributes to the pathogenesis of common complex diseases (26). Therefore future technologies resulting in the re-sequencing of large parts of the human genome may reveal more rare variants, which in combination with the known common variants identified by GWAS, might improve the discriminative value of genetic factors, increasing it to levels that are useful in the identification of individuals at high risk of becoming obese.
Summarizing the number of risk-alleles of 20 validated common obesity SNPs showed an increased prevalence of both overweight and obesity in carriers of 22 or more risk-alleles. The discriminative value of the 20 SNPs on obesity is low compared with conventional risk factors but contributes significantly when combined with conventional risk factors. No gene-gene interactions were shown between the examined variants.
The study was supported by the Lundbeck Foundation Centre for Applied Medical Genomics in Personalized Disease Prediction, Prevention and Care (LuCAMP), and by grants from the Danish Health Research Council, the FOOD Study Group/the Danish Ministry of Food, Agriculture and Fisheries and the Ministry of Family and Consumer Affairs (Grant 2101-05-0044), Novo Nordisk A/S Research & Development Corporate Research Affairs, the Danish Ministry of Science Technology and Innovation, the Faculty of Health Sciences of Aarhus University, the Danish Clinical Intervention Research Academy, and the Danish Diabetes Association.
This study is also a part of the project “Hepatic and adipose tissue and functions in the metabolic syndrome” (HEPADIP; www.hepadip.org), which is supported by the European Commission as an integrated project under the 6th Framework Programme (LSHM-CT-2005-018734) and received support from The Danish Obesity Research Center (DanORC; www.danorc.dk), which is supported by The Danish Council for Strategic Research (Grant No 2101–06–0005).
The Inter99 project was initiated by T. Jørgensen, K. Borch-Johnsen, H. Ibsen, and T. F. Thomsen. The steering committee comprises the former two and C. Pisinger. The Inter99 project was financially supported by research grants from the Danish Research Council, The Danish Centre for Health Technology Assessment, Novo Nordisk, Research Foundation of Copenhagen County, Ministry of Internal Affairs and Health, The Danish Heart Foundation, The Danish Pharmaceutical Association, The Augustinus Foundation, The Ib Henriksen Foundation, and the Becket Foundation.
No potential conflicts of interest relevant to this article were reported.
The authors thank A. Forman, I.-L. Wantzin, and M. Stendal for technical assistance; G. Lademann for secretarial support; A. L. Nielsen for database management; and M.M.H. Kristensen for grant management.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
- Received July 16, 2009.
- Accepted January 20, 2010.
- © 2010 by the American Diabetes Association.
Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. See http://creativecommons.org/licenses/by-nc-nd/3.0/ for details.