Biomarkers for Type 2 Diabetes and Impaired Fasting Glucose Using a Nontargeted Metabolomics Approach

  1. Tim D. Spector1
  1. Department of Twin Research and Genetic Epidemiology, King’s College London, London, U.K.
  2. Computational Sciences Center of Emphasis, Pfizer Worldwide Research and Development, Cambridge, Massachusetts
  3. Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, U.K.
  4. Genetics of Complex Traits, Exeter Medical School, University of Exeter, Devon, U.K.
  5. Center for Statistical Genetics, Department of Biostatistics, University of Michigan, Ann Arbor, Michigan
  6. Institute of Bioinformatics and Systems Biology, Helmholtz Zentrum München, Neuherberg, Germany
  7. Human Genetics, Wellcome Trust Sanger Institute, Hinxton, U.K.
  8. MRC Centre for Causal Analyses in Translational Epidemiology, School of Social and Community Medicine, University of Bristol, Bristol, U.K.
  9. Institute of Genetic Epidemiology, Helmholtz Zentrum München, Neuherberg, Germany
  10. Clinical Research Statistics, Pfizer Worldwide Research and Development, Groton, Connecticut
  11. Metabolon Inc., Raleigh-Durham, North Carolina
  12. Biomedical Research Institute, University of Dundee, Ninewells Hospital and Medical School, Dundee, U.K.
  13. Cardiovascular and Metabolic Diseases, Pfizer Worldwide Research and Development, Cambridge, Massachusetts
  14. Department of Physiology and Biophysics, Weill Cornell Medical College in Qatar, Qatar Foundation, Doha, Qatar
  1. Corresponding authors: Tim D. Spector, tim.spector{at}, and Nicole Soranzo, ns6{at}
  1. C.M. and E.F. contributed equally to this study.

  2. M.J.B., K.S., N.S., and T.D.S. contributed equally to this study.


Using a nontargeted metabolomics approach of 447 fasting plasma metabolites, we searched for novel molecular markers that arise before and after hyperglycemia in a large population-based cohort of 2,204 females (115 type 2 diabetic [T2D] case subjects, 192 individuals with impaired fasting glucose [IFG], and 1,897 control subjects) from TwinsUK. Forty-two metabolites from three major fuel sources (carbohydrates, lipids, and proteins) were found to significantly correlate with T2D after adjusting for multiple testing; of these, 22 were previously reported as associated with T2D or insulin resistance. Fourteen metabolites were found to be associated with IFG. Among the metabolites identified, the branched-chain keto-acid metabolite 3-methyl-2-oxovalerate was the strongest predictive biomarker for IFG after glucose (odds ratio [OR] 1.65 [95% CI 1.39–1.95], P = 8.46 × 10⁻⁹) and was moderately heritable (h² = 0.20). The association was replicated in an independent population (n = 720, OR 1.68 [1.34–2.11], P = 6.52 × 10⁻⁶) and validated in 189 twins with urine metabolomics taken at the same time as plasma (OR 1.87 [1.27–2.75], P = 1 × 10⁻³). Results confirm an important role for catabolism of branched-chain amino acids in T2D and IFG. In conclusion, this T2D-IFG biomarker study has surveyed the broadest panel of nontargeted metabolites to date, revealing both novel and known associated metabolites and providing potential novel targets for clinical prediction and a deeper understanding of causal mechanisms.

Currently, stratification of individuals at risk for type 2 diabetes (T2D) within the general population is based on well-established factors such as age, BMI, and fasting glucose (1). Although these factors contribute considerably to disease risk, they may not identify at-risk individuals before the disease process is well under way.

Recently, a number of studies have found several metabolites to be correlated with insulin resistance and T2D (2–6), and T2D-associated metabolic profiles have been identified 10–15 years before the diagnosis/onset of the disease (7–9). To help preventive strategies, and maximize the potential for existing effective interventions, it is important to characterize the molecular changes that take place in the development of T2D.

We aim to understand other biochemical changes, in addition to hyperglycemia, that take place at the onset of T2D using the largest metabolomic screening approach to date. We assessed >400 metabolites to determine which metabolomic profiles are correlated with T2D and impaired fasting glucose (IFG) in a large cohort of females from TwinsUK with independent replication.


We analyzed data from 2,204 females from TwinsUK for whom nontargeted plasma metabolomic profiling was available along with glucose/diabetic information (10). Subjects were classified into three groups based on fasting glucose levels at time of initial sampling and at subsequent visits (on average 2.08 [1.21] visits): T2D case subjects (fasting glucose ≥7 mmol/L or physician’s letter confirming diagnosis), individuals with IFG (5.6 mmol/L < fasting glucose < 7 mmol/L), and T2D control subjects (3.9 mmol/L < fasting glucose < 5 mmol/L).

Metabolomics measurements (on plasma and urine).

Nontargeted metabolite detection and quantification were conducted by the metabolomics provider Metabolon, Inc. (Durham, NC) on TwinsUK fasting plasma samples as described previously (11) and on 187 spot urine samples taken at the same time as plasma.

Replication cohort for 3-methyl-2-oxovalerate.

We included 536 individuals with IFG and 184 control subjects identified via fasting glucose from the follow-up study KORA F4 (Cooperative Health Research in the Region of Augsburg) (12) with fasting metabolomic profiles for 3-methyl-2-oxovalerate.

Statistical analysis.

We inverse normalized the data as the metabolite concentrations were not normally distributed. To avoid spurious false-positive associations due to small sample size, we excluded metabolic traits with >20% missing values.
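The preprocessing described above can be sketched as follows. The text says only that the data were "inverse normalized"; the rank-based transform with a Blom offset (c = 0.375) shown here is one common way to do that and is an assumption, as are the function names `missing_fraction` and `inverse_normal_transform`.

```python
from statistics import NormalDist


def missing_fraction(values):
    """Fraction of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def inverse_normal_transform(values, c=0.375):
    """Rank-based inverse normal transform (assumed method, Blom offset c).

    Missing entries (None) stay None; tied values share their average rank.
    """
    observed = sorted((v, i) for i, v in enumerate(values) if v is not None)
    n = len(observed)
    out = [None] * len(values)
    pos = 0
    while pos < n:
        # Find the end of the run of tied values starting at pos.
        end = pos
        while end + 1 < n and observed[end + 1][0] == observed[pos][0]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # 1-based average rank of the run
        z = NormalDist().inv_cdf((avg_rank - c) / (n - 2 * c + 1))
        for _, i in observed[pos:end + 1]:
            out[i] = z
        pos = end + 1
    return out


# Hypothetical, skewed metabolite levels with one missing value.
levels = [3.2, 0.5, None, 10.0, 1.1, 250.0]
keep = missing_fraction(levels) <= 0.20  # the >20% missing exclusion rule
scores = inverse_normal_transform(levels)
```

The transform keeps the ordering of the observed values while mapping them onto standard normal quantiles, so an extreme value such as 250.0 no longer dominates the scale.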

For each T2D-control and IFG-control contrast, we ran random intercept logistic regressions adjusting for age and BMI at the time of sampling, metabolite batch, and family relatedness. We used a conservative Bonferroni correction to account for multiple testing, thus giving a significance threshold of 1 × 10⁻⁴ (0.05/447).
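As a small worked example of the multiple-testing correction, the sketch below computes the Bonferroni threshold for 447 tests and checks two P values quoted in this article against it. The helper name `bonferroni_threshold` is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


threshold = bonferroni_threshold(0.05, 447)  # about 1.12e-4

# P values quoted in the text for 3-methyl-2-oxovalerate and IFG.
p_plasma_discovery = 8.46e-9
p_urine_validation = 1e-3

passes_discovery = p_plasma_discovery < threshold
passes_urine = p_urine_validation < threshold
```

The exact quotient is slightly above 1 × 10⁻⁴, so the rounded cutoff quoted in the text is marginally stricter. The urine P value does not pass this cutoff, but it comes from a single validation test rather than from the 447-metabolite screen.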

Taking advantage of the twin design of our study, for each metabolite significantly associated with one or more contrasts, we estimated heritability using structural equation modeling. For contrasts between each disease class and controls, we ran a stepwise linear regression including all the significant metabolites to look for metabolites independently associated with T2D and IFG, respectively.

We further investigated the role of 3-methyl-2-oxovalerate, the strongest predictive biomarker after glucose, by 1) replicating the result in an independent population (KORA), 2) validating the result in urine (TwinsUK), 3) investigating the underlying genetic influences using genome-wide association study (GWAS) data, and 4) assessing causality of the metabolite-IFG association by Mendelian randomization.

As 3-methyl-2-oxovalerate was associated with single nucleotide polymorphism (SNP) rs1440581 (S.-Y.S., E.F., A.-K.P., et al., unpublished observations), we tested this SNP for association with T2D status using case and control subjects from the Diabetes Genetics Replication and Meta-analysis Consortium (DIAGRAM) and by genotyping rs1440581 in 4,961 T2D case subjects and 5,948 control subjects from GoDARTS (KASPar System; KBiosciences; genotyping success rate >95%, Hardy-Weinberg Equilibrium P > 0.05).


Metabolites associated with T2D and IFG.

Levels of 447 fasting plasma metabolites (281 known and 176 unknown) were obtained for 115 T2D case subjects, 192 individuals with IFG, and 1,897 normoglycemic control subjects. The demographic characteristics are presented in Table 1. After adjusting for age, BMI, metabolite batch, and family relatedness, 42 of the 447 metabolites tested showed significant differences between T2D case and control subjects with a Bonferroni-corrected cutoff of 1 × 10⁻⁴ (=0.05/447). As depicted in Fig. 1A, the 42 metabolites fall into three principal classes: 12 are lipids (primarily medium and long-chain free fatty acids), 7 are carbohydrates, 9 are branched-chain amino acids (BCAAs) or derivatives, and 14 are unknown. Besides glucose, a one standard deviation change in metabolite level resulted in T2D effect sizes ranging from odds ratio (OR) 1.05 to 3.36 for adrenate (22:4n6) and mannose, respectively (Table 2).


TABLE 1.

Demographic characteristics of the study populations

FIG. 1.

Metabolites associated with T2D case-control status (A) and with IFG-control status (B). Each metabolite super-pathway is represented in a different color.


TABLE 2.

List of significant metabolites in one or more comparisons

We repeated the analysis for the IFG group contrasting with control subjects. This revealed 14 significantly associated metabolites, 8 of which were also identified for T2D (Table 2). Six of the 14 metabolites are related to BCAA catabolism, three are carbohydrates, and two are lipids (Fig. 1B). In the stepwise regression including these 14 metabolites, two were independently associated with IFG: glucose and 3-methyl-2-oxovalerate. Using 1,297 monozygotic and 1,200 dizygotic twin pairs, we estimated heritability for each metabolite identified in one or more contrasts. The calculated heritabilities ranged from 0 to 65%.

Effect sizes, association statistics, heritability estimates, and literature references for both contrasts are shown in Table 2.

Investigating the role of 3-methyl-2-oxovalerate in IFG.

3-Methyl-2-oxovalerate is the branched-chain keto-acid (BCKA) derivative of isoleucine, one of three BCAAs. We found it to be significantly associated with IFG in 536 individuals with IFG and 184 normoglycemic control subjects from the KORA population (OR 1.68 [95% CI 1.34–2.11], P = 6.52 × 10⁻⁶) and in the inverse-variance fixed-effect meta-analysis of the results (1.66 [1.45–1.90], P = 2.62 × 10⁻¹³), thus replicating our result.
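The inverse-variance fixed-effect meta-analysis mentioned above can be illustrated with the two published estimates. This is a minimal sketch: it recovers each standard error from the rounded 95% CI, which is an assumption (the original analysis presumably pooled the regression coefficients directly), and the function names are hypothetical.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal


def log_or_and_se(odds_ratio, lower, upper):
    """Log OR and its standard error, recovered from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z95)
    return math.log(odds_ratio), se


def fixed_effect_meta(studies):
    """Inverse-variance weighted pooling of (log OR, SE) pairs."""
    weights = [1 / se ** 2 for _, se in studies]
    pooled = sum(w * b for w, (b, _) in zip(weights, studies)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return pooled, pooled_se, p_value


twinsuk = log_or_and_se(1.65, 1.39, 1.95)  # discovery estimate
kora = log_or_and_se(1.68, 1.34, 2.11)     # replication estimate

pooled, pooled_se, p_value = fixed_effect_meta([twinsuk, kora])
pooled_or = math.exp(pooled)
pooled_ci = (math.exp(pooled - Z95 * pooled_se),
             math.exp(pooled + Z95 * pooled_se))
```

With these inputs the pooled OR and interval round to 1.66 [1.45–1.90], matching the values reported in the text, and the P value is of the order of 10⁻¹³. Small differences from the published P value are expected because the inputs here are rounded.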

We next studied 94 individuals with IFG and 95 control subjects from TwinsUK with urine metabolomic profiles available at the same time as plasma sampling. 3-Methyl-2-oxovalerate correlated significantly with IFG (OR 1.87 [95% CI 1.27–2.75], P = 1 × 10⁻³), thus suggesting that urine could also be used to test for elevated 3-methyl-2-oxovalerate.

Genetics of 3-methyl-2-oxovalerate and GWAS.

3-Methyl-2-oxovalerate has a heritability h² = 0.20 (95% CI 0.08–0.33) (Table 2). Our companion metabolite GWAS (S.-Y.S., E.F., A.-K.P., et al., unpublished observations) revealed that 3-methyl-2-oxovalerate is strongly associated with SNPs upstream of the PPM1K gene on chromosome 4 (top hit SNP rs1440581, beta = −0.014 [0.017], P = 1.21 × 10⁻¹⁶).

We assessed whether the association between 3-methyl-2-oxovalerate and IFG is consistent with a causal hypothesis. Given the magnitude of effect between 3-methyl-2-oxovalerate and T2D and between rs1440581 and 3-methyl-2-oxovalerate, we estimated by Mendelian randomization that, if the association were causal, the biomarker-raising allele C would be associated with increased risk of T2D (OR 1.10 [95% CI 1.03–1.18]). In the actual data, we obtained a meta-analyzed OR of 1.03 ([1.00–1.05], P = 0.08) after analyzing rs1440581 in 17,132 T2D case subjects and 62,810 control subjects (DIAGRAM consortium [13] plus replication in GoDARTS [14]).
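The logic of comparing an expected with an observed SNP effect can be sketched as below. The product-of-coefficients form is a standard way to derive the expected effect under causality, but the text does not spell out the exact calculation, so treat it as an assumption. The inputs to `expected_snp_or` are hypothetical placeholders, not the study's values. Only the two confidence intervals are taken from the text.

```python
import math


def expected_snp_or(beta_snp_metabolite, or_metabolite_disease):
    """Expected per-allele OR for disease if the metabolite were causal.

    beta_snp_metabolite: SD change in metabolite per allele.
    or_metabolite_disease: OR for disease per SD of metabolite.
    """
    return math.exp(beta_snp_metabolite * math.log(or_metabolite_disease))


def intervals_overlap(ci_a, ci_b):
    """True if two (lower, upper) confidence intervals share any value."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


# Hypothetical inputs, for illustration only.
example_or = expected_snp_or(beta_snp_metabolite=0.2,
                             or_metabolite_disease=1.5)

# Intervals reported in the text for rs1440581 and T2D.
expected_ci = (1.03, 1.18)  # predicted under causality
observed_ci = (1.00, 1.05)  # meta-analyzed DIAGRAM plus GoDARTS result

compatible_with_causality = intervals_overlap(expected_ci, observed_ci)
compatible_with_no_effect = observed_ci[0] <= 1.0 <= observed_ci[1]
```

Both flags evaluate to true for the reported intervals: the observed interval overlaps the one expected under causality and also includes an OR of 1.00, so the data are compatible with either reading.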


Using the largest biochemical screening approach to date (447 metabolites), we searched for molecular markers that arise before and after hyperglycemia in a large cross-sectional population of women. We identified 42 metabolites with high statistical significance associated with T2D and 14 metabolites associated with IFG. Although diabetes is considered to be primarily a disorder of glucose, we find other dimensions, apart from carbohydrates, in the metabolic space that associate with T2D and IFG, namely lipids and amino acids.

Although many metabolites identified have previously been associated with T2D or insulin resistance (Table 2), we are the first to report their associations with IFG. Moreover, as IFG presents itself before T2D in prospective studies, this could improve disease prediction and early intervention. Also, this is the first study on IFG using a wide untargeted platform such as Metabolon (a previous IFG study [15] used a different platform with little overlap). We also report the novel association of the BCKA 3-methyl-2-oxovalerate with IFG both in plasma and in urine.


Carbohydrates.

As expected, glucose itself showed the strongest association with both T2D and IFG, followed by mannose, which is consistent with previous findings (3,5,16–19) and emphasizes the importance of other glucose and nonglucose pathways. In particular, dimethylarginine (SDMA and ADMA) has been more associated with the micro- and macrovascular complications than with the pathogenesis of diabetes itself, whereas the association of malate and arabinose with T2D has not previously been reported.


Lipids.

T2D patients often present with elevated lipid profiles, and within this study, lipids (primarily the free fatty acids) make up the second largest group of T2D/IFG-associated metabolites.

Lipids with the longest chains (adrenate [22:4n6] and arachidonate [20:4n6]) are elevated in IFG patients compared with control subjects. Similarly, lipids with shorter chains (5-dodecenoate [12:1n7], heptanoate [7:0], and pelargonate [9:0]) are depleted in T2D patients relative to control subjects.

In contrast, the fatty acid chains found in triglyceride molecules in diabetes seem to act differently. Rhee et al. (9) found that triglycerides containing longer-chain fatty acids were associated with a decreased risk of diabetes, whereas triglycerides containing shorter chains were associated with an increased risk. This contrasting pattern of association may reflect alterations in triglyceride lipolysis, which could either contribute to or be a result of the dysregulation of glucose metabolism.

Among the lipids identified, the novel associations include the fatty acid 15-methylpalmitate and the medium-chain fatty acid 5-dodecenoate (12:1n7).

Amino acids.

The third major group of metabolites are amino acids. Within this group, the BCAAs valine, isoleucine, and leucine and their BCKAs 3-methyl-2-oxovalerate, 4-methyl-2-oxopentanoate, and 3-methyl-2-oxobutyrate are significantly elevated in both individuals with IFG and subjects with T2D compared with control subjects.

Elevated BCAA levels have previously been associated with increased risk of incident T2D (3,5,7,20) and independently predict future T2D onset (7). Breakdown products of BCAAs (propionylcarnitine, α-methylbutyrylcarnitine, and isovalerylcarnitine) were also found to be elevated (21). However, whereas previous targeted panels did not include BCKAs, the nontargeted approach used here highlighted specific effects on these important intermediates in BCAA catabolism (Fig. 2). These findings suggest that it may be the breakdown of BCAAs that is associated with diabetes and not specifically the elevated levels of BCAAs themselves. Consistent with this idea, a knockout of the mouse BCAT2 gene, which blocks the first step in BCAA metabolism, results in greatly elevated plasma levels of BCAAs, and yet these animals have improved glucose control, insulin sensitivity, and resistance to diet-induced obesity (22).

FIG. 2.

BCAA catabolism. The three BCAAs are first converted to BCKAs and eventually lead to the production of C3 and C5 acylcarnitines.


Among the metabolites identified, the BCKA 3-methyl-2-oxovalerate is the strongest predictor of IFG after and independently of glucose. BCAA catabolism occurs primarily in the mitochondria, proceeding through BCAA transaminase, and then through the branched-chain α-keto-acid dehydrogenase (BCKD), a complex of three separate gene products. In our companion GWAS (S.-Y.S., E.F., A.-K.P., et al., unpublished observations), SNP rs1440581 had the strongest associations with all BCAAs, all BCKAs, and the C3-acylcarnitine propionylcarnitine. This SNP is upstream of PPM1K mitochondrial phosphatase, which dephosphorylates and thereby activates the BCKD, clearly highlighting the importance of mitochondrial function for plasma levels of BCAAs and BCKAs (23). The centrality of mitochondrial function to BCAA catabolism and metabolic disease has been noted before (24). BCAA dysregulation could be a cause and/or consequence of mitochondrial dysfunction. Increased BCAA catabolism, resulting in increased BCAA catabolic intermediates, may impair mitochondrial oxidation of glucose and lipids, potentially resulting in mitochondrial stress and impaired insulin secretion and action. Reduced mitochondrial function in T2D and IFG may reduce the capacity of the mitochondria to break down BCAAs, resulting in elevated levels of BCAAs and BCKAs.

The current study has several strengths. It used a nontargeted metabolomic approach that identifies a wide range of biochemicals besides lipids. TwinsUK has phenotypic longitudinal data available that allowed us to accurately classify subjects as case, IFG, and control subjects. The availability of urine metabolites, genetic data, and twin design enabled us to explore the biological implication of 3-methyl-2-oxovalerate further. Finally, the robustness of our results is highlighted by the fact that we confirm many previous findings, and our main association reported is clearly replicated in an independent cohort and validated in urine.

Our study has some limitations. Our discovery sample consisted of women only, and some metabolites might be influenced by sex-specific hormones. Unknown metabolites might not really be new but merely not yet identified. Finally, our Mendelian randomization analysis was unable to firmly support or reject causality for the association of 3-methyl-2-oxovalerate and IFG. To explore this further, additional 3-methyl-2-oxovalerate–associated variants need to be identified and tested, boosting power and reducing the impact of potential unwanted pleiotropic confounding.

Here we find evidence that multiple metabolites from three major fuel sources (carbohydrates, lipids, and proteins) are robust risk factors for the development of both IFG and T2D. Further work is encouraged by these data, including understanding the role of diet and microbiota on the free fatty acid relationships with T2D.


This study was funded by the Wellcome Trust and the European Community’s Seventh Framework Programme (FP7/2007-2013). The study also receives support from the National Institute for Health Research (NIHR) Clinical Research Facility at Guy’s and St. Thomas’ National Health Service (NHS) Foundation Trust and NIHR Biomedical Research Centre based at Guy’s and St. Thomas’ NHS Foundation Trust and King’s College London. T.D.S. is an NIHR senior investigator and holds an ERC Advanced Principal Investigator award. J.R.B.P. is supported by the Wellcome Trust as a Sir Henry Wellcome Postdoctoral Research Fellow (092447/Z/10/Z). S.-Y.S. is supported by a postdoctoral research fellowship from the Oak Foundation. N.S.’s team is supported by the Wellcome Trust (grants WT098051 and WT091310) and the European Community’s Seventh Framework Programme (EPIGENESYS Grant 257082 and BLUEPRINT Grant HEALTH-F5-2011-282510).

Part of this work was funded by Pfizer Worldwide Research and Development. M.M. and R.P.M. are employees of Metabolon, Inc. E.F., C.H., J.T., and M.J.B. are full-time employees and shareholders of Pfizer. No other potential conflicts of interest relevant to this article were reported.

C.M. analyzed the data and wrote the manuscript. E.F. and M.P. wrote the manuscript. I.E. and C.H. analyzed the data. J.R.B.P. contributed reagents/materials/analysis tools and wrote the manuscript. G.K. analyzed the replication dataset. S.-Y.S., A.-K.P., and W.Y. contributed reagents/materials/analysis tools. K.J.W. and J.T.B. reviewed and edited the manuscript. M.M. performed the experiments. C.N.A.P. performed the experiments and contributed reagents/materials/analysis tools. T.M.F. contributed reagents/materials/analysis tools and reviewed and edited the manuscript. J.T. and C.G. conceived and designed the experiments. R.P.M. performed the experiments and reviewed and edited the manuscript. M.J.B. and T.D.S. conceived and designed the experiments and wrote the manuscript. K.S. and N.S. conceived and designed the experiments and reviewed and edited the manuscript. T.D.S. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

The authors thank Sally John, Carol Hicks, Li Xi, and Vinicius Bonato (Pfizer Worldwide Research and Development) for their contributions. The authors thank Gabriela Surdulescu and Dylan Hodgkiss (King’s College London) for sample selection, sample handling, and shipment. The authors are grateful to the DIAGRAM consortium for sharing data. Finally, the authors wish to express their appreciation to all study participants of the TwinsUK and KORA studies for donating their blood and time.

  • Received April 12, 2013.
  • Accepted July 15, 2013.

Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered.


  1. Diabetes vol. 62 no. 12 4270-4276