A Quantitative Trait Locus Influencing Type 2 Diabetes Susceptibility Maps to a Region on 5q in an Extended French Family

  1. Lisa J. Martin (1,2),
  2. Anthony G. Comuzzie (2),
  3. Sophie Dupont (3),
  4. Nathalie Vionnet (4),
  5. Christian Dina (3),
  6. Sophie Gallina (3),
  7. Mouna Houari (3),
  8. John Blangero (2) and
  9. Philippe Froguel (3,5)
  1. Center for Epidemiology and Biostatistics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio
  2. Department of Genetics, Southwest Foundation for Biomedical Research, San Antonio, Texas
  3. Institute of Biology, Institute Pasteur of Lille, Lille, France
  4. Centre National de Génotypage, Evry Cedex, France
  5. Barts & The London Genome Centre, Queen Mary & Smithfield College, London, U.K.


    Type 2 diabetes is a heterogeneous disorder of glucose metabolism characterized by insulin resistance, β-cell dysfunction, and increased glucose production by the liver. Given the high degree of genetic heterogeneity, multiple genes with small to moderate effects may influence susceptibility to diabetes. To circumvent this limitation, we searched for quantitative trait loci (QTLs) that explain the variation in susceptibility to type 2 diabetes in a single extended family, as these individuals are likely to share polymorphisms. We collected genotypic and phenotypic data on 152 individuals ascertained through a multimedia campaign in France to find diabetes-prone families for genetic studies. The effects of genes and covariates (age and sex) on diabetes status were estimated using a threshold model and a maximum likelihood variance component approach. We obtained suggestive evidence of linkage (logarithm of odds [LOD] = 2.4) for diabetes status on chromosome 5q. Within the 1-LOD unit support interval, there are two strong candidates: PCSK1 and CAST. Furthermore, we have obtained a replication (LOD = 1.6) for a QTL for type 2 diabetes on chromosome 11 detected by Hanson and colleagues (1998).

    Type 2 diabetes is a heterogeneous disorder of glucose metabolism characterized by insulin resistance, β-cell dysfunction, and increased glucose production by the liver. It is well established that type 2 diabetes has a significant genetic component (1). For example, mutations in the genes for glucokinase (2,3) and hepatocyte nuclear factor-1α (4) have been identified in autosomal-dominant maturity-onset diabetes of the young. Nonetheless, identification of common polymorphisms associated with type 2 diabetes has been difficult.

    A major cause of this difficulty is disease heterogeneity. Given the complexity of glycemic control mechanisms, this heterogeneity is not surprising. For example, the insulin receptor signals by phosphorylating at least six intracellular substrates (5). Thus, a mutation affecting production of any of these substrates could result in an increased susceptibility to diabetes.

    Indeed, significant linkages have been identified on several chromosomes, including 2q37 (6), 15q21 (7), and 3p (8) in Mexican-Americans; 3q27 (9) in French Caucasians; 10q26 (10), 12q24 (11), and 1q21-q23 (9,12,13) in Caucasians and Pima Indians; 20q (9,14) in Caucasian whites; and 9q21 (15) in Chinese. These results underscore the potential genetic heterogeneity of type 2 diabetes across diverse ethnic groups and the difficulties for replication of linkage.

    Given the high degree of genetic heterogeneity, multiple genes with small to moderate effects may influence susceptibility to diabetes. One strategy to circumvent this limitation would be to focus on individuals who are likely to share the same polymorphism, such as the members of an extended family. Although members of a single family may share household effects, the use of all possible relative pairs in an extended family minimizes this bias because a large proportion of relative pairs do not share a common household. The purpose of this paper is to search for quantitative trait loci (QTLs) for susceptibility to type 2 diabetes in a single extended family from France.



    Participants were ascertained through a multimedia campaign in France to identify diabetes-prone families for genetic studies (9,16). Among the 550 families recruited, a single family with 480 members was identified, with 152 of these individuals from three generations participating in the study (Fig. 1).


    Phenotypes were determined in light of the clinical report and results of the latest measurements for fasting and/or oral glucose tolerance test according to the American Diabetes Association criteria for type 2 diabetes (17) as diabetic, glucose intolerant, or normoglycemic. Individuals were classified as unknown when data were unavailable or type 1 diabetes was present. Accordingly, of the 159 individuals, 22 were diabetic, 15 were glucose intolerant, 115 were normoglycemic, and 7 were unknown.

    Affected individuals were either diabetic or glucose intolerant. Although individuals with impaired glucose tolerance (IGT) have not developed diabetes, these individuals have an increased risk of type 2 diabetes (18). Indeed, in a Dutch population (the Hoorn study), 64.5% of the participants who had both impaired fasting and impaired postload glucose levels at baseline progressed to diabetes during a 6-year follow-up (19). Yet, a substantial number of IGT individuals never progress to diabetes. Nonetheless, as these individuals are from a single family, there may be a common genetic mechanism underlying the diabetic and IGT states.


    Genomic DNA was isolated from leukocytes using the Puregene DNA isolation kit, according to the manufacturer’s protocol (Gentra Systems). Genotyping was performed using a fluorescently labeled human linkage mapping set (PE-LMSV2) comprising 400 highly informative microsatellite markers, with an average intermarker spacing of 9.7 cM. Multiplex PCR conditions were set up to amplify the markers in 87 PCRs. PCR (95°C for 12 min, followed by 30 cycles of 94°C for 15 s, 55°C for 15 s, and 72°C for 30 s, and a final step of 72°C for 10 min) was performed on a GeneAmp PCR system 9700 (Perkin Elmer) (10-ml reactions; 40 ng of genomic DNA, 2.5 mmol/l MgCl2, and 0.25 mmol/l each dNTP; Pharmacia), with a variable amount (0.2–1.5 pmol) of 5′ and 3′ primers and 0.4 units AmpliTaq Gold DNA polymerase (Perkin Elmer) in 1× PCR Buffer II (Perkin Elmer) (multiplex PCR conditions are available on request).

    Pooled amplification products were electrophoresed through 5% polyacrylamide gels (Long Ranger Singel pack; Perkin Elmer) for 1.5 h at 2,000 V on 24-cm plates on an ABI 377 DNA sequencer. An automated 96-channel pipettor Multimek 96 (Beckman) was used for all pipetting steps. Semiautomated fragment sizing was performed using GENESCAN 3.0 software (ABI), followed by allele calling with GENOTYPER 2.1 software (ABI). To confirm the accuracy of allele calling, two individuals reviewed each genotype independently. Average heterozygosity was 0.79. Incompatibilities were identified with PED-CHECK 1.1 (20), and inconsistencies were resolved by blanking (220 of 62,957 genotypes).

    Variance components linkage analysis.

    Our variance components approach is an extension of the strategy developed by Amos (21), which is based on specifying expected genetic covariances between relatives as a function of the identity by descent (IBD) relationship at the marker locus. We tested the null hypothesis that the additive genetic variance due to a QTL (σ_q²) equals 0 (no linkage) by comparing the likelihood of this restricted model with that of a model in which σ_q² is estimated (22). The difference between the two log₁₀ likelihoods produces a logarithm of odds (LOD) score that is equivalent to the classical LOD score of linkage analysis. Twice the difference in logₑ likelihoods of these models yields a test statistic that is asymptotically distributed as a 1/2:1/2 mixture of a χ² variable and a point mass at 0 (23). This method has been implemented in the program package SOLAR (24), which determines whether genetic variation at a specific chromosomal location can explain the variation in the phenotype (21,22,25).
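    The quantities described above can be illustrated with a short Python sketch. It is not the SOLAR implementation; the function names and numeric inputs are hypothetical, and two standard assumptions are made that the text does not spell out: the covariance for a pair of relatives takes the usual variance components form (IBD proportion times the QTL variance, plus twice the kinship coefficient times the residual polygenic variance), and the χ² variable in the mixture has 1 degree of freedom.

```python
import math


def pair_covariance(pi_ibd, kinship, var_qtl, var_polygenic):
    """Expected phenotypic covariance between two different relatives.

    pi_ibd        -- proportion of alleles shared IBD at the tested locus
    kinship       -- kinship coefficient of the pair (0.25 for full sibs)
    var_qtl       -- additive variance due to the QTL
    var_polygenic -- residual additive polygenic variance (assumed term)
    """
    return pi_ibd * var_qtl + 2.0 * kinship * var_polygenic


def lod_score(lnl_linkage, lnl_null):
    """LOD score: difference of natural-log likelihoods, converted to log10."""
    return (lnl_linkage - lnl_null) / math.log(10.0)


def p_value_from_lod(lod):
    """P value under a 1/2:1/2 mixture of chi-square (1 df, assumed) and 0."""
    if lod <= 0.0:
        return 1.0
    chi2 = 2.0 * math.log(10.0) * lod  # twice the natural-log difference
    # Survival function of a 1-df chi-square is erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical sib pair sharing half of their alleles IBD at the locus.
print(pair_covariance(pi_ibd=0.5, kinship=0.25, var_qtl=0.3, var_polygenic=0.4))

# P value corresponding to a LOD score of 2.4 under the mixture distribution.
print(p_value_from_lod(2.4))
```

    Halving the χ² tail probability reflects the one-sided nature of the test: a variance cannot be negative, so half of the null distribution sits at 0.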

    A pairwise maximum likelihood-based procedure was used to estimate IBD probabilities (24). To permit multipoint analysis for QTL mapping, an extension (23) of the technique of Fulker et al. (26) was used. Estimates of the IBD probabilities were generated at any point on a chromosome using a constrained linear function of observed IBD probabilities of markers at known locations within the region. Exact multipoint IBDs were also calculated using SIMWALK; because the results did not differ from those based on the SOLAR-generated IBDs, results are reported for the SOLAR IBDs. A LOD-score evaluation was performed at 1-cM intervals along each chromosome, with the distances between markers obtained from the sex-averaged maps compiled by Généthon.

    Extension to dichotomous traits.

    We extended the basic variance components method to dichotomous traits by assuming that an individual belongs to a specific affection status if an underlying genetically determined risk (i.e., liability) exceeds a certain threshold (27). Although disease status is usually considered qualitative, with individuals either scored as affected or unaffected, it is generally assumed that there is an underlying quantitative liability determining affection status. If an individual’s liability score exceeds a specified threshold (determined using age- and sex-specific parameters, to produce the appropriate population prevalence), disease ensues. In contrast, if an individual’s score is below the threshold, the individual remains unaffected. An integral over the appropriate region of the curve is used to estimate each person’s liability value, and the latent liability is assumed to have an underlying normal distribution.
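    The threshold idea can be sketched in Python as follows. This is a simplified illustration rather than the model fitted in the study: it uses a single prevalence value instead of age- and sex-specific parameters, it assumes a standard normal liability, and the function names are hypothetical. The proportion 37/152 is the proportion of affected family members reported in this paper, used here only as a convenient input.

```python
from statistics import NormalDist

# Liability is assumed to follow a standard normal distribution.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def liability_threshold(prevalence):
    """Threshold on the liability scale that leaves `prevalence` above it."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    return STANDARD_NORMAL.inv_cdf(1.0 - prevalence)


def probability_affected(threshold, liability_mean=0.0):
    """Probability that liability exceeds the threshold.

    `liability_mean` is a hypothetical shift in mean liability, for example
    from covariates such as age and sex.
    """
    return 1.0 - STANDARD_NORMAL.cdf(threshold - liability_mean)


threshold = liability_threshold(37 / 152)
print(f"threshold on the liability scale: {threshold:.3f}")
print(f"probability affected at mean liability 0: {probability_affected(threshold):.3f}")
print(f"probability affected at mean liability 0.5: {probability_affected(threshold, 0.5):.3f}")
```

    Shifting the mean liability upward, as an older age would in this kind of model, increases the area above the fixed threshold and therefore the probability of being affected.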


    Of 152 individuals with phenotypic information, 37 were classified as affected (prevalence 24%). Prevalence of affected was greater in men (24 of 73 individuals, 33%) than in women (13 of 79 individuals, 16%), although mean age for men and women was not significantly different (52.7 and 53.8 years for men and women, respectively). This prevalence is greater than the general French Caucasian prevalence; therefore, this family is “enriched” for affected individuals.

    Before quantitative genetic analysis, effects of age and sex were removed. Increasing age was a risk factor for being affected. Additionally, men had a greater risk of being affected. These covariates explained 15% of the total phenotypic variation in affection status. No evidence of age-by-sex interactions was found; thus, these were not included. BMI was also included as a covariate, and results were similar to those without BMI. Because BMI is a risk factor for type 2 diabetes, we report results of analyses without BMI.

    Figure 2 displays results from the linkage analysis for diabetes status by chromosome. We detected one chromosome with suggestive evidence of linkage (LOD score >1.9) (28). A maximum LOD score of 2.4 (P = 0.00044) near marker D5S428 was obtained on chromosome 5 (99 cM) and was confirmed by simulation analyses. The 1-LOD unit support interval surrounding the peak LOD score on chromosome 5 extends from 87 to 114 cM from the p-terminus (Fig. 3). Three other chromosomes yielded LOD scores ≥1.0: chromosome 10 at 89 cM (LOD = 1.0), chromosome 11 at 147 cM (LOD = 1.6), and chromosome 13 at 4 cM (LOD = 1.6). However, these failed to reach the level for suggestive evidence of linkage (28).
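    A 1-LOD unit support interval is the contiguous region around the peak in which the LOD score stays within 1 unit of the maximum. The Python sketch below shows one way to read such an interval off a LOD curve evaluated on a grid of map positions. The function name is hypothetical, and the curve is a made-up parabola used only for illustration; it is not the chromosome 5 curve from this study.

```python
def one_lod_support_interval(positions, lods):
    """Return (peak position, left bound, right bound) of the 1-LOD interval.

    positions -- map positions in cM, in increasing order
    lods      -- LOD score evaluated at each position
    """
    if not positions or len(positions) != len(lods):
        raise ValueError("positions and lods must be non-empty and equal in length")

    peak_index = max(range(len(lods)), key=lods.__getitem__)
    cutoff = lods[peak_index] - 1.0

    # Walk outward from the peak while the curve stays at or above the cutoff.
    left = peak_index
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    right = peak_index
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1

    return positions[peak_index], positions[left], positions[right]


# Toy LOD curve on a 1-cM grid: a parabola peaking at 50 cM with LOD 3.0.
toy_positions = list(range(0, 101))
toy_lods = [3.0 - ((cm - 50) / 10) ** 2 for cm in toy_positions]

print(one_lod_support_interval(toy_positions, toy_lods))  # (50, 40, 60)
```

    Because the interval is read from a grid, its bounds are only as precise as the grid spacing, which is 1 cM here.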


    Type 2 diabetes is a genetically heterogeneous disease involving insulin resistance, β-cell dysfunction, and increased glucose production by the liver. Identification of multiple QTLs suggests that there is a high degree of genetic heterogeneity of type 2 diabetes. To circumvent this limitation, we focused on a single large family, because those individuals likely share the same polymorphisms, thus minimizing genetic heterogeneity. Environmental factors, such as physical activity and diet, also influence the development of type 2 diabetes and may interact with genetic effects. Because these measures were not collected, future studies should examine these effects further.

    Our results suggest that a gene located near D5S428 is responsible for variability in affection status. As we focused on a single family, the applicability of these findings to the general population is a concern. However, Lindsay et al. (29) identified tentative evidence of linkage (LOD = 1.5) ∼15 cM from our peak when conditioning on maternal inheritance. Additionally, Hager et al. (30) identified a QTL on chromosome 5 for serum leptin levels. Our signal for diabetes susceptibility localizes within the 1-LOD unit support interval of this QTL. As leptin is an indicator of total adiposity and obesity is a risk factor for diabetes, the same mechanism may be causing obesity and diabetes in these populations. Therefore, our findings may be generalizable.

    Within the 1-LOD unit support interval, there are several mapped genes with an effect on insulin action. A potential candidate in this region is the gene for prohormone convertase 1 (PC1/PC3, SUBTILISIN/KEXIN-TYPE 1, PCSK1), which is implicated in the processing of several prohormones, such as proinsulin and proopiomelanocortin (POMC), into mature hormones. The gene products of CPE, PC1, and PC2 cooperate in prohormone processing. By in situ hybridization, Seidah et al. (31) mapped PC1 to human 5q15-q21. Glucose regulates insulin and PC1 expression, supporting the important role of PC1 in regulating proinsulin processing. Although type 2 diabetes is associated with increased secretion of proinsulin and proinsulin-like molecules, screening for mutations of the entire coding region of the PC1 gene in Japanese type 2 diabetes subjects found no mutation associated with type 2 diabetes (32). However, in a woman with extreme childhood obesity, abnormal glucose homeostasis, and clinical manifestations suggestive of defective prohormone processing, Jackson et al. (33) identified compound heterozygosity for two mutations of the PC1 gene. Three of her four clinically unaffected children had the gly483-to-arg missense mutation, while the fourth child had the splice site mutation.

    Another positional candidate gene encodes calpastatin (CAST), which is an endogenous modulator of intracellular calpain activity and is localized within the same cytoplasmic region as calpains. Calpain is a calcium-dependent cysteine protease probably involved in muscle protein degradation in living tissue. CAST has been mapped to 5q15-q21 by in situ hybridization and spot-blot analysis of sorted chromosomes (34). Although the physiological role of calpains remains unclear, Horikawa et al. (35) described an association between type 2 diabetes and a genetic variant identified in the gene encoding calpain-10. Calpains are ubiquitous nonlysosomal cysteine proteases, which have been implicated in a wide variety of cellular functions, such as cell proliferation and differentiation, intracellular signaling, and apoptosis. Calpastatin also downregulates genes encoding several membrane-associated proteins or nuclear proteins and upregulates genes of collagen α2. Hence, in addition to their proteolytic activities on cytoskeletal proteins and other cellular regulatory proteins, calpain-calpastatin systems can affect expression levels of genes encoding structural or regulatory proteins.

    Although no other regions reached the level of suggestive linkage, the signal on chromosome 11 may be a replication of the QTL for diabetes on chromosome 11 originally detected in Pima Indians (13). We obtained an LOD of 1.6 on chromosome 11 near marker D11S1320. Based on Marshfield map distances (http://research.marshfieldclinic.org/genetics/), this signal is <20 cM from the QTL for BMI and diabetes in Pima Indians (D11S4464).

    It is important to note that we grouped diabetic and IGT subjects in the affected category. However, not all IGT individuals progress to diabetes; therefore, by using IGT instead of diabetes status, we may be misclassifying individuals. Although it would be interesting to examine the effect of this misclassification (including IGTs who never progress to diabetes status) on our linkage signal, the examination of such an effect would require that we have some predictive measure of who is most likely to convert to type 2 diabetes. As we do not have such a measure, estimation of the effect of misclassification would be random and therefore not informative. Moreover, the identification of genes that are involved in both type 2 diabetes and IGT would be clinically relevant because early intervention for high-risk individuals may reduce the number of complications associated with type 2 diabetes. Biologically, IGT is associated with insulin resistance, while diabetes is associated with insulin resistance and dysregulation of insulin secretion. Therefore, our approach would likely identify genes involved in insulin resistance. However, this approach would likely lessen our ability to detect a genetic signal associated with genes influencing insulin secretion.

    In summary, we obtained suggestive evidence of linkage (LOD = 2.4) for type 2 diabetes susceptibility on chromosome 5q in a single extended family. This signal is in the same region as a QTL for diabetes when conditioning on maternal inheritance (29) and as a QTL for leptin levels (30). Within the 1-LOD unit support interval, there are two strong candidates: PCSK1 and CAST. Furthermore, we have replicated localization of a QTL for diabetes status on chromosome 11 detected by Hanson et al. (13). These findings suggest that both novel and previously identified QTLs may be influencing the development of diabetes.

    FIG. 1.

    The pedigree of the single extended family from France.

    FIG. 2.

    Maximum LOD scores by chromosome for the genome screen of diabetes status.

    FIG. 3.

    Estimated LOD functions obtained from multipoint quantitative trait linkage analysis of diabetes status for chromosome 5.


    This research was supported in part by Eli Lilly, Région Nord-Pas de Calais (to S.D.), and National Institutes of Health Grant MH59490.

    We are indebted to the family members who participated in this study.


    • Address correspondence and reprint requests to Lisa Martin, Center for Epidemiology and Biostatistics, Cincinnati Children’s Hospital Medical Center, 2800 Winslow, Room 2110, Mail Code 5041, Cincinnati, OH 45229. E-mail: lisa.martin@cchmc.org.

      Received for publication 18 April 2002 and accepted in revised form 16 September 2002.

      IBD, identity by descent; IGT, impaired glucose tolerance; LOD, logarithm of odds; QTL, quantitative trait locus.

