Prediction of Type 2 Diabetes Using Simple Measures of Insulin Resistance
Combined Results From the San Antonio Heart Study, the Mexico City Diabetes Study, and the Insulin Resistance Atherosclerosis Study
To determine and formally compare the ability of simple indexes of insulin resistance (IR) to predict type 2 diabetes, we used combined prospective data from the San Antonio Heart Study, the Mexico City Diabetes Study, and the Insulin Resistance Atherosclerosis Study, which include well-characterized cohorts of non-Hispanic white, African-American, Hispanic American, and Mexican subjects with 5–8 years of follow-up. Poisson regression was used to assess the ability of each candidate index to predict incident diabetes at the follow-up examination (343 of 3,574 subjects developed diabetes). The areas under the receiver operator characteristic (AROC) curves for each index were calculated and statistically compared. In pooled analysis, Gutt et al.’s insulin sensitivity index at 0 and 120 min (ISI0,120) displayed the largest AROC (78.5%). This index was significantly more predictive (P < 0.0001) than a large group of indexes (including those by Belfiore, Avignon, Katz, and Stumvoll) that had AROC curves between 66 and 74%. These findings were essentially similar both after adjustment for covariates and when analyses were conducted separately by glucose tolerance status and ethnicity/study subgroups. In conclusion, we found substantial differences between published IR indexes in the prediction of diabetes, with ISI0,120 consistently showing the strongest prediction. This index may reflect other aspects of diabetes pathogenesis in addition to IR, which might explain its strong predictive abilities despite its moderate correlation with direct measures of IR.
Insulin resistance (IR) is a central feature in the natural history of type 2 diabetes (1,2). In addition, there is increasing evidence suggesting that IR or related pathophysiological mechanisms may be involved in the etiology of cardiovascular disease (3,4). The ability to accurately measure IR is therefore of substantial importance for chronic disease researchers. IR can be quantified using detailed physiological protocols, such as the euglycemic-hyperinsulinemic clamp technique and the frequently sampled intravenous glucose tolerance test (FSIGT) (1). These methods, however, are excessively complex, invasive, and costly for use in large observational epidemiological studies. Consequently, a number of simple (surrogate) indexes have been proposed for projects that require the estimation of IR in large numbers of subjects (5–16). These indexes use either transformations or weighted combinations of insulin or insulin and glucose concentrations in the fasted state and at various times during the oral glucose tolerance test (OGTT). Some have been developed using mathematical modeling of the direct measures mentioned above. Over the past few years, several new indexes of IR have been proposed, including a limited number that use additional metabolic or demographic variables, such as age, body mass, and triglyceride concentration (17–21).
As pointed out by Hanson et al. (13), the usefulness of these simple indexes for epidemiological studies depends on the strength of their correlation with criterion measures and the degree to which they predict type 2 diabetes in prospective analyses. Although several previous studies have reported on the criterion validity of simple measures in various populations (12,13,22), only Hanson et al. (13) have presented prospective results in a study of Pima Indian subjects. It would be of interest to extend these findings to include multiple ethnic groups and more recently proposed indexes, and to use formal statistical comparisons of predictive abilities of the indexes. The objective of the present study, therefore, was to determine and formally compare the ability of simple indexes of IR to predict type 2 diabetes using prospective data from the San Antonio Heart Study (SAHS), the Mexico City Diabetes Study (MCDS), and the Insulin Resistance Atherosclerosis Study (IRAS). These studies include cohorts of non-Hispanic white, African-American, Hispanic American, and Mexican subjects that are well characterized in terms of the anthropometric and metabolic risk factors for type 2 diabetes.
RESEARCH DESIGN AND METHODS
Data sources and variables.
For the present study, we used combined data from the SAHS, the MCDS, and the IRAS. The methodological details of these studies have been presented in previous publications (23–25). In each study, an OGTT was administered for the diagnosis of impaired glucose tolerance (IGT) and diabetes at both the baseline and follow-up examinations (26). In addition, IRAS subjects received a modified FSIGT, with insulin sensitivity (SI) estimated using the minimal model (25). The analyses for the present article included 3,574 subjects who were free of diabetes at baseline and for whom information was available on follow-up glucose tolerance status (second follow-up for MCDS) as well as metabolic and anthropometric variables of interest (Table 1).
Simple indexes of insulin resistance.
We conducted a literature review using the Medline database to identify published articles that described simple indexes of IR (Table 2). Additional articles were identified from the reference lists of published articles. Although certain proposed simple indexes use glucose and insulin values from multiple sampling time points during the OGTT (e.g., 0, 30, 60, 90, and 120 min), we evaluated indexes that use only fasting and/or 2-h glucose and insulin because these time points were available from all three studies. Furthermore, intermediate sampling time points (30, 60, and 90 min) are not commonly used in large epidemiological studies. We also evaluated recently proposed indexes that include age, body mass, and triglyceride concentration (17–21).
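As an illustration of what such fasting-only indexes look like in practice, the following minimal Python sketch computes three of them. The formulas are the commonly cited published definitions of HOMA-IR, FIRI, and QUICKI and are an assumption here, because this article does not restate them. The function names and the example values are hypothetical.

```python
import math


def homa_ir(glucose_mmol_l, insulin_uU_ml):
    """HOMA-IR: fasting glucose (mmol/l) x fasting insulin (uU/ml) / 22.5."""
    return glucose_mmol_l * insulin_uU_ml / 22.5


def firi(glucose_mmol_l, insulin_uU_ml):
    """FIRI: fasting glucose (mmol/l) x fasting insulin (uU/ml) / 25."""
    return glucose_mmol_l * insulin_uU_ml / 25.0


def quicki(glucose_mg_dl, insulin_uU_ml):
    """QUICKI: 1 / (log10 fasting insulin (uU/ml) + log10 fasting glucose (mg/dl))."""
    return 1.0 / (math.log10(insulin_uU_ml) + math.log10(glucose_mg_dl))


# Hypothetical subject: fasting glucose 5.0 mmol/l (about 90 mg/dl),
# fasting insulin 10 uU/ml.
example_homa = homa_ir(5.0, 10.0)
example_firi = firi(5.0, 10.0)
example_quicki = quicki(90.0, 10.0)
print(example_homa, example_firi, example_quicki)
```

HOMA-IR and FIRI rise with increasing resistance, whereas QUICKI is a sensitivity index and falls, so the direction of each index has to be aligned before indexes are compared.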
SAS version 8.0 (SAS Institute, Cary, NC) was used for all statistical analyses. Data from each study were pooled into a single dataset with an additional variable identifying the study, and variable names and measurement units were standardized. Hispanic American and non-Hispanic white IRAS participants from the San Antonio field center were also participants in the SAHS, and thus were excluded from the prospective analysis described below to avoid the use of duplicate measures on some subjects (they were, however, included for the correlation analysis described below). Therefore, Hispanic American IRAS participants in the prospective analysis are drawn from the San Luis Valley Diabetes Study site exclusively.
Means and SDs or proportions (as appropriate) were calculated for baseline and follow-up variables of interest stratified by study and ethnic group (Table 1). Using data on the subset of subjects who were baseline IRAS participants, the criterion validity of candidate simple indexes of IR was assessed by examining their Spearman correlation coefficients against SI, a direct measure of insulin sensitivity from the FSIGT (Table 2). Poisson regression analysis (using PROC GENMOD) was used to assess the ability of each candidate index of IR to predict incident diabetes at the follow-up examination. Each model was offset by the natural log of the length of each subject’s incidence period to control for differing observation times in each study. The magnitude of the association of each index with risk of diabetes was calculated by comparing the risk among those in the top (most resistant) 10% versus those in the bottom 90%. In addition, area under the receiver operator characteristic (AROC) curves for each model were calculated. The AROC curve is a measure of how well a continuous variable is able to predict the outcome of interest. The results of these individual models were ranked in terms of the magnitude of the association with incident diabetes and the AROC curve. Furthermore, the AROC curves for each model were formally compared using the DeLong algorithm (27). Additional analyses included comparisons of risk among those in the 2nd or 3rd tertile versus those in the 1st tertile as well as risk among those at or above the median compared with those below (data not shown). Analyses were conducted comparing individual IR measures in unadjusted and adjusted (for age, sex, systolic blood pressure, HDL cholesterol, and BMI) models with all subjects pooled (Table 3) as well as separately by glucose tolerance status (normal glucose tolerance [NGT] versus IGT), study, and ethnic group (Tables 4–6, adjusted subgroup results not shown). 
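The two summary measures described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SAS code used in the study: the AROC is computed as the Mann-Whitney probability that a subject who developed diabetes has a higher index value than one who did not, and the top 10% versus bottom 90% comparison is computed as a crude incidence rate ratio. For a single binary indicator, the crude ratio equals the estimate from an unadjusted Poisson model with a log person-time offset. The function names and the toy data are hypothetical, and index values are assumed to be oriented so that higher means more insulin resistant.

```python
def auc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison (ties count 1/2)."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    noncases = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not noncases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for d in noncases:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(cases) * len(noncases))


def top_fraction_rate_ratio(scores, outcomes, years, top_fraction=0.10):
    """Incidence rate in the top fraction divided by the rate in the rest.

    Rates are events per person-year, so differing follow-up times are
    handled the same way a log person-time offset handles them.
    Raises ZeroDivisionError if the lower group has no events.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_top = max(1, round(top_fraction * len(scores)))
    top = set(order[:n_top])
    rest = [i for i in range(len(scores)) if i not in top]
    rate_top = sum(outcomes[i] for i in top) / sum(years[i] for i in top)
    rate_rest = sum(outcomes[i] for i in rest) / sum(years[i] for i in rest)
    return rate_top / rate_rest


# Hypothetical toy cohort of 10 subjects (not study data).
scores = [9.1, 7.4, 6.8, 5.5, 5.0, 4.2, 3.9, 3.1, 2.6, 1.8]
outcomes = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]  # 1 = developed diabetes
years = [5.0, 6.0, 7.0, 5.0, 8.0, 7.0, 6.0, 8.0, 7.0, 5.0]

print(auc(scores, outcomes))
print(top_fraction_rate_ratio(scores, outcomes, years))
```

The pairwise AROC is quadratic in the number of subjects, which is acceptable for a sketch; a rank-based formula gives the same value more efficiently on cohorts of several thousand.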
We tested for effect modification across these subgroups in the AROC and 10 vs. 90% models by including interaction terms in our Poisson regression models (both unadjusted and adjusted). We first conducted a global test of interactions across subgroups. For those interactions that were significant, we examined the individual subgroup parameters and P values. Given the large number of statistical comparisons used in this interaction analysis, P < 0.01 was considered statistically significant.
Study features and baseline and follow-up characteristics of cohort members who were nondiabetic at the baseline examination are presented in Table 1, by study and ethnic group. Consistent with the design of IRAS (25), ethnic subgroups from this study contained a higher proportion of subjects with baseline IGT (31.5–33.5%) compared with those from the other studies (8.4–15.6%). Within the SAHS, Hispanic American subjects tended to have higher levels of metabolic syndrome variables compared with non-Hispanic white subjects, including glucose, insulin, and measures of adiposity. This difference between Hispanic American and non-Hispanic white subjects was not as apparent in IRAS. African-American subjects in IRAS had higher BMI and glucose concentrations but lower triglyceride concentrations compared with Hispanic American and non-Hispanic white subjects. The mean follow-up time across studies ranged from 5.2 to 7.6 years, and the cumulative incidence of diabetes ranged from 4.8 to 16.5%, with the highest conversion proportion in the IRAS subgroups. In total, 343 of 3,574 subjects developed diabetes over the follow-up period.
The criterion validity of simple indexes of IR was examined against SI in the subgroup of the pooled cohort who were IRAS participants (Table 2). Correlation coefficients were generally similar among indexes that use only fasting glucose and insulin, with the insulin-to-glucose ratio displaying a slightly weaker coefficient (|0.65| vs. |0.67–0.68|). Indexes that included body weight or triglyceride concentration (14–18) did not display stronger correlations with SI compared with those using only fasting glucose and insulin. Coefficients were slightly stronger and spread over a wider range among indexes that use fasting and 2-h glucose and insulin concentrations, with the 2-h insulin–to–2-h glucose ratio showing the weakest correlation (|0.59|) and Avignon’s insulin sensitivity index (SiM) the strongest (|0.77|). Three other indexes had correlation coefficients >|0.70|, including ISIgly_a (15) and Stumvoll (20) (both with and without demographics). These findings were essentially similar when the analyses were conducted separately by ethnicity (Table 2).
The results of Poisson regression analyses of the ability of each candidate index of IR to predict incident diabetes at the follow-up examination are presented in Table 3. The table shows pooled results from the three studies sorted by decreasing unadjusted AROC curves. A test of the null hypothesis that the AROCs of all the simple indexes were equal was rejected (P < 0.0001). The insulin sensitivity index at 0 and 120 minutes (ISI0,120) displayed the largest AROC (78.5%) and was significantly more predictive (P < 0.0001) than the other candidate indexes. A group of six indexes also had AROCs >70%, including ISIgly_a, SiM, ISIgly_b, Quantitative Insulin Sensitivity Check Index (QUICKI), and Stumvoll index without demographic variables (Stum_nodem). Fasting insulin and the insulin-to-glucose ratio (using either fasting or 2-h glucose concentrations) had AROCs <60% (P < 0.0001 compared with homeostasis model assessment [HOMA], AROC = 62.8%). The ranking of the indexes based on the magnitude of the risk for the top 10% versus the bottom 90% was generally similar, although the ranking of ISI, HOMA, and Duncan’s fasting insulin resistance index (FIRI) slightly improved, whereas the ranking of the log of fasting insulin slightly declined, compared with the AROC analysis (Table 3). Furthermore, the results were consistent using ranking of the indexes based on risk among those in the 3rd tertile versus those in the 1st tertile as well as risk among those at or above the median compared with those below (data not shown).
The findings were essentially unchanged after the models were adjusted for age, sex, systolic blood pressure, HDL cholesterol, and BMI, although the relative ranking of ISI (both fasting and 2 h) improved in AROC analysis. In addition, the relative rankings of FIRI and HOMA improved under models comparing the risk in the top 10% versus the bottom 90%, whereas the relative ranking of ISIgly_a and Stum_nodem declined (Table 3).
The predictive abilities of the indexes were generally similar in unadjusted analyses conducted separately by glucose tolerance, ethnicity, and study subgroups (Tables 4–6). Among subjects with NGT, the results were very similar to those from the pooled analysis (Table 4). The five indexes that were the most predictive in the pooled analysis were also the most predictive among NGT subjects. Among subjects with IGT, the results were also similar to those from the pooled analysis, with four of the five most predictive indexes from the pooled analysis also among the five most predictive in the IGT subjects. There were some minor differences in the IGT group, however, including notably lower AROC values and top 10% vs. bottom 90% RRs for ISIgly_a and the indexes of Stumvoll and colleagues (20).
ISI0,120 had the highest or second highest AROC ranking in five of the six ethnicity/study subgroups. Four of the top five predictors in terms of AROC ranking in the pooled analysis (ISI0,120, ISIgly_a, SiM, and QUICKI) tended to be within or near the top five predictors in each of the subgroups. The IRAS Hispanic American subgroup was an exception to this pattern, with results that were notably inconsistent with those from the pooled analysis as well as other individual subgroups, although the range of AROC curve values was narrower in this subgroup compared with the others (Table 5). Most high-ranking indexes from the pooled analysis were not highly ranked (ranks 10–18) in this subgroup, particularly for the comparison in risk for the top 10% versus the bottom 90% (Table 6). For the other subgroups, however, ISI0,120 (and to a lesser degree ISIgly_a and SiM) consistently demonstrated moderate-to-high rankings for the 10% versus bottom 90% comparison (Table 6). These subgroup patterns were very similar when the analyses were adjusted for age, sex, systolic blood pressure, HDL cholesterol, and BMI (data not shown).
In unadjusted analyses, a number of the global interaction terms for both AROC and top 10% vs. bottom 90% RRs were significant at the P < 0.01 level, including fasting insulin, FIRI, HOMA, IGT, and Raynaud (data not shown). However, after models were adjusted for age, sex, systolic blood pressure, HDL cholesterol, and BMI, the vast majority of these interaction terms were no longer statistically significant.
Insulin resistance (IR), a central feature in the natural history of type 2 diabetes (1,2), is emerging as a possibly important metabolic disorder in the etiology of a number of other highly prevalent chronic diseases, including coronary heart disease and stroke (3,4). Given the cost and complexity of direct measures of IR, there has been considerable interest for many years in simple indexes of IR for use in large clinical or epidemiological studies. The last few years, in particular, have witnessed the publication of several new indexes (10,11,14,15,17–21). The criterion validity of these new indexes has usually been demonstrated against direct measures in small groups of subjects. The criterion validity of a spectrum of these indexes in large numbers of subjects has been presented infrequently (12,13,22), and only once has the predictive ability of these indexes been examined (13). In the present study, we found that Gutt et al.’s (19) ISI0,120 demonstrated the best overall ability to predict diabetes in a large multiethnic cohort. This was the case in both pooled and subgroup analysis. Other indexes, including ISIgly_a, SiM, and QUICKI, also displayed strong and consistent prediction of diabetes, although there were some subgroup differences, especially among IRAS Hispanic American subjects. An explanation for the differences in index ranking among IRAS Hispanic American subjects is not immediately apparent, although this is the smallest of the subgroups (n = 149), and the findings might therefore be the result of random variation. In addition, the range of AROC curve values was substantially narrower in this subgroup when compared with the others. Finally, it is conceivable that the finding is the result of genuine variation in the etiology of diabetes between the ethnic groups, with the possibility of a relative preponderance of IR in Hispanic American subjects.
Only one previous study has examined the predictive ability of simple indexes of IR. Hanson et al. (13) considered a number of indexes in the prediction of diabetes among Pima Indians, but their article appeared before the publication of several new indexes over the past 2 years. We have extended this work by including the use of receiver operator characteristic curves and formal statistical comparisons, and by including a number of ethnic groups and a full spectrum of indexes. Hanson et al. (13) reported that fasting insulin and ISI (both fasting and 2 h) most strongly predicted the incidence of diabetes among the IR indexes tested. We found that these particular indexes showed moderate abilities to predict diabetes compared with more recently proposed indexes (Table 3).
Although several previous studies have examined the criterion validity of simple indexes of IR against gold standard measures, the majority compared a newly proposed index to a few others in small datasets. A limited number of studies have compared several indexes using larger datasets (12,13,22). It is difficult, however, to make comparisons because ours is the first study to have evaluated the full spectrum of indexes available to date, including the newly proposed indexes. In the present study, we found that SiM, ISIgly_a, ISI2h, and the Stumvoll index demonstrated the strongest correlation with SI.
The differences in the findings between the criterion validity and prediction analyses are of interest. Although ISI0,120 was most predictive of diabetes, it showed only moderate correlation with SI (r = 0.68), with other indexes being more strongly correlated (see Table 2). It is possible that this difference reflects the fact that ISI0,120 may more completely capture other important domains of diabetes pathogenesis, including β-cell dysfunction and increased hepatic glucose production, whereas other indexes are more customized measures of IR itself. In this respect it is important to recall that IR and diabetes are not equivalent end points, and that IR and β-cell dysfunction independently predict diabetes (2). In addition, many of the simple IR indexes considered here were developed using mathematical modeling of criterion (direct) measures of IR, rather than “diabetes.”
The findings of this analysis yielded no strikingly apparent patterns regarding the benefits of using indexes that use both fasting and 2-h measures of insulin and glucose, versus those that use fasting measures alone. Although a number of fasting and 2-h indexes ranked among the top five in the AROC and top 10% vs. bottom 90% analyses, two indexes that use only fasting glucose and insulin concentrations (QUICKI and ISIgly_b) also ranked within the top five. In a recent article examining the repeatability of simple indexes of IR, Mather et al. (28) found that indexes that used natural logarithmic transformations of insulin and glucose concentrations had superior test characteristics, which helped explain their stronger correlation with direct measures. However, the use of log transformations does not appear to explain the current findings. Although ISI0,120 does use the log transformation of the mean of fasting and 2-h insulin, other indexes that were consistently predictive of diabetes do not (e.g., ISIgly_a, SiM, ISIgly_b).
The HOMA-IR and the FIRI indexes are linear transformations of each other and should therefore produce identical findings. Although this was generally the case, the two indexes differed in ranking in a few instances. The results differ because HOMA-IR cannot be calculated when fasting glucose is <3.5 mmol/l (7), whereas there are no such instructions for calculation of FIRI (8). We confirmed this by rerunning our analysis, excluding subjects with fasting glucose <3.5 mmol/l in the calculation of FIRI, and we found that the HOMA-IR and FIRI results were identical.
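The proportionality between the two indexes can be written out explicitly. The definitions below are the commonly cited ones and are an assumption here, because the article does not restate them; G denotes fasting glucose in mmol/l and I denotes fasting insulin in µU/ml.

```latex
\[
\text{HOMA-IR} = \frac{G \times I}{22.5},
\qquad
\text{FIRI} = \frac{G \times I}{25}
\quad\Longrightarrow\quad
\text{FIRI} = \frac{22.5}{25}\,\text{HOMA-IR} = 0.9 \times \text{HOMA-IR}.
\]
```

Multiplying by a positive constant leaves the rank order of subjects unchanged, so the AROC and any percentile-based grouping (such as the top 10% versus the bottom 90%) must agree whenever the two indexes are computed on the same set of subjects.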
In conclusion, we found substantial differences between published IR indexes, with ISI0,120 consistently showing greater association with the incidence of diabetes. This index may reflect other aspects of diabetes pathogenesis in addition to IR, which might explain its strong predictive abilities despite its moderate correlation with direct measures of IR.
This study was supported by National Heart, Lung, and Blood Institute contracts U01-HL47887, U01-HL47889, U01-HL47892, U01-HL47902, DK-29867, RO1 58329, RO1 HL24799, and RO1 HL36820. A.J.G.H. was supported by a postdoctoral fellowship from the Canadian Institutes of Health Research.
Address correspondence and reprint requests to Dr. Steven Haffner, Division of Clinical Epidemiology, University of Texas Health Science Center at San Antonio, Mail Code 7873, 7703 Floyd Curl Dr., San Antonio, TX, 78229-3900.
Received for publication 18 September 2002 and accepted in revised form 29 October 2002.
AROC, area under the receiver operator characteristic; FIRI, Duncan’s fasting insulin resistance index; FSIGT, frequently sampled intravenous glucose tolerance test; HOMA, homeostasis model assessment; IGT, impaired glucose tolerance; IR, insulin resistance; IRAS, Insulin Resistance Atherosclerosis Study; ISI, insulin sensitivity index; ISI0,120, insulin sensitivity index at 0 and 120 min; MCDS, Mexico City Diabetes Study; NGT, normal glucose tolerance; OGTT, oral glucose tolerance test; QUICKI, Quantitative Insulin Sensitivity Check Index; SAHS, San Antonio Heart Study; SI, insulin sensitivity; SiM, Avignon’s insulin sensitivity index; Stum_nodem, Stumvoll index without demographic variables.