Diabetes 51:S308-S315, 2002 © 2002 by the American Diabetes Association, Inc.
Searching for Type 2 Diabetes Genes on Chromosome 20
From the Washington University School of Medicine, Division of Endocrinology, Diabetes and Metabolism, St. Louis, Missouri
Genome scans in families with type 2 diabetes identified a putative locus on chromosome 20q. For this study, linkage disequilibrium mapping was used in an effort to narrow a 7.3-Mb region in an Ashkenazi type 2 diabetic population. The region encompassed a 1-logarithm of odds (LOD) interval around the microsatellite marker D20S107, which gave a LOD score of >3 in linkage analysis of a combined Caucasian population. This 7.3-Mb region contained 25 known and 99 predicted genes. Predicted single nucleotide polymorphisms (SNPs) were chosen from public databases and validated. Two SNPs were unique to the Ashkenazi population. Here, 91 SNPs with a minor allele frequency of ≥10% were genotyped in pooled DNA from 150 case subjects and 150 control subjects of Ashkenazi Jewish descent. The SNP association study showed that SNP rs2664537 in the TIX1 gene had a nominally significant P value of 0.035, but the finding did not replicate in an additional case pool. In addition, HNF4a and Mybl2 were screened for mutations and new polymorphisms. No mutations were identified, but a new nonsynonymous SNP (R687C in exon 14 of Mybl2) was found. The limits to this type of association study are discussed.
Type 2 diabetes is a complex metabolic disorder characterized by abnormal hepatic glucose output, insulin resistance, and impaired insulin production (1). Multiple environmental and genetic factors contribute, and genome scans in families with multiple affected individuals from several racial/ethnic groups have been undertaken (Table 1). A number of potential loci have been identified, but in general, the evidence for linkage has not been strong, and the regions identified have been quite broad. A major question remaining is how to proceed with the search for complex disease genes, knowing that a single gene is neither necessary nor sufficient and that recombinant mapping in families will not suffice. The next phase of the search for diabetes susceptibility genes will likely require new strategies.
A genome scan with microsatellite markers at an average distance of 9.5 cM was completed in type 2 diabetic sibling pairs (n = 472) of Ashkenazi Jewish descent (2). The Ashkenazi population is relatively young and homogeneous, having undergone several constrictions and expansions resulting in reduced genetic heterogeneity in comparison to that of most Western Caucasian subjects. Studies of DNA polymorphisms have suggested that present-day Ashkenazi Jews descended from a small founder population, numbering perhaps as few as 10,000 individuals who existed in Eastern Europe at about 1500 AD (3). Today, there are about 10 million, representing a 1,000-fold expansion in roughly 20 generations. In the genome scan, five regions on four chromosomes exhibited nominal evidence for linkage (P < 0.05) (2). A maximal signal of Z = 2.05 was observed on chromosome 20 near D20S195. Because four other groups had previously reported evidence for linkage in the same region of chromosome 20q in Caucasians (see summary in Table 1), this region was further considered. Subsequent to the Ashkenazi type 2 diabetes genome scan, several investigators contributed linkage data for chromosome 20 to the International Type 2 Diabetes Linkage Analysis Consortium (http://www.sfbr.org/external/diabetes), a collaborative effort organized principally by the National Institute of Diabetes and Digestive and Kidney Diseases to map genes for the disease. Common markers on chromosome 20 were genotyped, and results were combined for a total of 1,852 families. The initial results of this analysis suggested a peak of linkage at D20S107 with a logarithm of odds (LOD) of >3. We therefore decided to target this region in our search in Ashkenazi patients with type 2 diabetes. Gene mapping by linkage disequilibrium analysis in related families identifies broad conserved regions of chromosomal DNA. 
In contrast, "unrelated" affected individuals share smaller conserved chromosomal regions because there are many more meioses, resulting in greater recombination around the region harboring the disease gene. Theoretically, polymorphic markers in linkage disequilibrium with the disease locus can be used to find associations of regions containing the gene mutation. Linkage disequilibrium mapping has been shown to be an important tool for fine-mapping of monogenic diseases (10–12), and recently there has been success in identifying a gene involved in a complex disease, inflammatory bowel disease (13). However, there is little doubt that linkage disequilibrium mapping for most other complex diseases will be more difficult (14). The distance over which disequilibrium extends between markers and disease loci is not well understood, nor is the degree of genetic risk contributed by any particular locus, suggesting that genotyping closely spaced markers in many case and control subjects would be required. Single nucleotide polymorphisms (SNPs), although biallelic, have been preferred over simple sequence repeat polymorphisms for this type of analysis because SNPs are more abundant in the genome (15). As a general rule, the extent of linkage disequilibrium or association between a marker and a disease locus depends on the genetic distance between the two and the number of generations that have occurred since the mutation originated (16). In isolated populations such as the Ashkenazi Jews, linkage disequilibrium has been shown to extend over a broad region (3). We therefore undertook a search for diabetes susceptibility gene(s) on chromosome 20q in unrelated Ashkenazi Jewish patients with diabetes. Here we report the initial results of our association studies with 91 validated SNPs spanning a 1-LOD interval (7.3 Mb) around the microsatellite marker D20S107 candidate region on chromosome 20. 
DNA from case and control subjects was genotyped in pools by a recently described method involving pyrosequencing technology (17). No significant associations have been found to date.
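The dependence of linkage disequilibrium on genetic distance and on the number of generations since the founding event can be illustrated with the standard textbook decay model, in which the fraction of the initial disequilibrium that remains after t generations is (1 - θ)^t for a recombination fraction θ. The sketch below is not the authors' analysis. It assumes this simple model, a constant growth rate, and the founder figures quoted above (about 10,000 founders, about 10 million descendants, roughly 20 generations); the function names are hypothetical.

```python
# Illustrative sketch only: textbook LD decay and constant-rate population growth.
# Function names are hypothetical and not taken from the paper.

def per_generation_growth(founders: float, present: float, generations: int) -> float:
    """Constant per-generation growth factor implied by an overall expansion."""
    if founders <= 0 or present <= 0 or generations <= 0:
        raise ValueError("founders, present and generations must be positive")
    return (present / founders) ** (1.0 / generations)


def ld_remaining(theta: float, generations: int) -> float:
    """Fraction of initial LD expected to remain: (1 - theta) ** generations.

    theta is the recombination fraction between marker and disease locus
    (between 0 and 0.5).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must be between 0 and 0.5")
    return (1.0 - theta) ** generations


if __name__ == "__main__":
    # Figures quoted in the text: ~10,000 founders, ~10 million today, ~20 generations.
    growth = per_generation_growth(10_000, 10_000_000, 20)
    print(f"implied growth per generation: {growth:.2f}-fold")

    # Assumed, illustrative recombination fractions.
    for theta in (0.001, 0.01, 0.05):
        print(f"theta={theta}: {ld_remaining(theta, 20):.3f} of initial LD remains")
```

Under this model, closely linked markers retain most of their initial disequilibrium after 20 generations, which is the rationale for expecting relatively broad conserved segments in a young founder population.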
Patient population. Type 2 diabetic case subjects and control subjects were of Ashkenazi Jewish descent as described (2). As shown in Table 2, the control subjects were significantly older than the case subjects by design because control subjects were selected for old age and absence of diabetes.
DNA isolation, quantification, and construction of pools. The DNA samples were isolated from whole blood using the Puregene kit as described (Gentra Systems, Minneapolis, MN). DNA was quantified with the TKO 100 Mini-Fluorometer and Hoechst dye method as described (Hoefer Scientific Instruments, San Francisco, CA). For the purposes of creating DNA pools, efforts to accurately determine DNA concentrations for each sample are critical because errors will skew the proportion of each genotype in the pool. Spectrophotometric analysis was avoided because substances such as protein and salts may give spurious results (18). The DNA samples were gently mixed on a rocking platform to ensure homogeneity before pipetting. Equal volumes of each sample were delivered to a sterile 55-ml polypropylene solution basin (Labcor Products, Frederick, MD) using an accurately calibrated multichannel pipette. After mixing gently and thoroughly overnight, the pooled DNA was placed into 1.0-ml aliquots in sterile 1.5-ml polypropylene microtubes and stored at 4°C in the dedicated refrigerator. As a quality control, the uniformity of the mixing procedure was verified by genotyping replicate aliquots of the pools for several SNPs.
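The sensitivity of a pool to quantification errors can be shown with a small calculation. The following is a minimal sketch, assuming that equal volumes are pooled and that each sample contributes DNA in proportion to its true concentration; the function name and the example values are hypothetical and are not taken from the study.

```python
# Illustrative sketch: how unequal DNA concentrations skew a pooled allele frequency.
# The function name and all example values are hypothetical.

def pooled_allele_frequency(allele_counts, concentrations, volume=1.0):
    """Expected frequency of one allele in a pool built from equal volumes.

    allele_counts:  copies of the allele of interest per subject (0, 1, or 2).
    concentrations: true DNA concentration of each sample (any consistent unit).
    volume:         volume pipetted from every sample (same unit for all).
    """
    if len(allele_counts) != len(concentrations):
        raise ValueError("one concentration is needed per subject")
    if any(a not in (0, 1, 2) for a in allele_counts):
        raise ValueError("allele counts must be 0, 1, or 2")

    total_dna = sum(c * volume for c in concentrations)
    if total_dna <= 0:
        raise ValueError("pool contains no DNA")

    # Each subject contributes DNA in proportion to concentration * volume;
    # a / 2 is the fraction of that subject's chromosomes carrying the allele.
    allele_dna = sum((a / 2.0) * c * volume
                     for a, c in zip(allele_counts, concentrations))
    return allele_dna / total_dna


if __name__ == "__main__":
    genotypes = [2, 0, 0, 0]  # one homozygous carrier among four subjects
    print(pooled_allele_frequency(genotypes, [1.0, 1.0, 1.0, 1.0]))
    # Same subjects, but the carrier's sample is twice as concentrated.
    print(pooled_allele_frequency(genotypes, [2.0, 1.0, 1.0, 1.0]))
```

With equal concentrations the pool reflects the true allele frequency; a single over-concentrated sample shifts the estimate toward that subject's genotype, which is why accurate quantification matters before pooling.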
PCR.
PCR plate setup, template preparation, and pyrosequencing.
Allele quantification software.
Denaturing HPLC.
SNP validation.
Statistical analysis.
The microsatellite markers bordering the 1-LOD interval around D20S107 include RPN2 and D20S911 at 35.7 and 43 Mb, respectively. This 7.3-Mb region contains a total of 25 known and 99 predicted genes (NCBI Human Genome Map, Build 28) (http://www.ncbi.nlm.nih.gov). All of the known genes in this region are shown in Table 3, along with the indication of those genes with expressed sequence tags (ESTs) expressed in pancreatic islets found in UniGene (http://www.ncbi.nlm.nih.gov/UniGene). As can be seen, seven of the known genes and two predicted genes were found with islet ESTs, and one gene (MAFB) showed relatively high expression in islets.
Because it would be difficult to directly sequence all 124 known and predicted genes within the putative at-risk region for a significant number of patients with diabetes, linkage disequilibrium mapping was used to evaluate the entire 7.3-Mb region. SNPs were identified in public databases and validated in the Ashkenazi population through a collaborative arrangement with a member of the SNP Consortium (P. Kwok, Washington University Medical School). Direct sequencing of pooled DNA from Caucasian, Asian, and African-American individuals was conducted. SNPs with minor allele frequencies greater than 10% in the Caucasian subjects were then tested in pooled samples of DNA from Ashkenazi subjects. Interestingly, only 50% of the SNPs submitted for validation were found to have a minor allele frequency of ≥10%. The validation data were entered into the public SNP database 1 week after testing (http://www.ncbi.nih.gov/SNP). The results for the validated SNPs for the four racial/ethnic groups are shown in Table A1 (in the APPENDIX). The SNPs rs736823 and rs932440, which were monomorphic in the Caucasian subjects, had minor allele frequencies of 42.2 and 11.1%, respectively, in the Ashkenazi subjects.
The allele frequencies of SNPs were compared between Ashkenazi case subjects and control subjects (n = 300 each). As shown in Table A2, a total of 91 SNPs were examined. Of these, 65 SNPs were located in known or predicted genes, and 26 were intergenic. Statistical analyses (Fig. 1) showed that the allele frequency for one SNP at TIX1 appeared to differ (13.2 vs. 20.8%T for control and case subjects, respectively; P = 0.035), but on replication no difference was found (17.8%T in case pool 2, P = 0.22 vs. control) (Table A2 and Fig. 1).
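A comparison of allele frequencies between two pools can be sketched as a two-proportion z-test with an optional term for pool measurement error. The statistical method actually used is not described in this text, so the sketch below is an assumption rather than the authors' procedure, and it is not expected to reproduce the reported P values. It treats "n = 300" as the number of alleles per pool, which is also an assumption; the function names and the `measurement_sd` parameter are hypothetical.

```python
# Illustrative sketch: two-proportion z-test for pooled allele frequencies.
# Not the authors' method; function and parameter names are hypothetical.
import math


def two_pool_z_test(p_case, p_control, n_case, n_control, measurement_sd=0.0):
    """Return (z, two-sided P) for a difference in allele frequency between pools.

    p_case, p_control: estimated allele frequencies (0..1) in each pool.
    n_case, n_control: number of alleles (chromosomes) represented in each pool.
    measurement_sd:    assumed SD of the pooled frequency measurement; its
                       variance is added once per pool to the sampling variance.
    """
    p_bar = (p_case * n_case + p_control * n_control) / (n_case + n_control)
    sampling_var = p_bar * (1.0 - p_bar) * (1.0 / n_case + 1.0 / n_control)
    total_var = sampling_var + 2.0 * measurement_sd ** 2
    if total_var <= 0:
        raise ValueError("variance must be positive")
    z = (p_case - p_control) / math.sqrt(total_var)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p_value


def expected_nominal_hits(n_tests, alpha=0.05):
    """Expected number of tests below alpha by chance if all nulls are true."""
    return n_tests * alpha


if __name__ == "__main__":
    # TIX1 frequencies quoted in the text; n is assumed to be alleles per pool.
    z, p = two_pool_z_test(0.208, 0.132, 300, 300)
    print(f"sampling error only: z={z:.2f}, P={p:.4f}")

    # Allowing for an assumed measurement SD makes the test more conservative.
    z, p = two_pool_z_test(0.208, 0.132, 300, 300, measurement_sd=0.02)
    print(f"with measurement error: z={z:.2f}, P={p:.4f}")

    print("expected chance hits among 91 tests:", expected_nominal_hits(91))
```

The second helper makes the multiple-testing context explicit: with 91 SNPs tested at a nominal threshold of 0.05, several nominally significant results would be expected by chance alone, which is consistent with the failure of the single nominal association to replicate.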
Two genes in the region, Mybl2 and HNF4a, were screened by denaturing HPLC for exonic mutations and new polymorphisms. Mybl2 was screened because of the marginally significant P value of 0.057 for rs419842. HNF4a has been described in maturity-onset diabetes of the young (MODY)-1 (22) and as a possible type 2 diabetes candidate gene but had not yet been examined in Ashkenazi subjects. No coding or obvious splicing mutations were identified in either gene. However, in Mybl2, a previously unreported nonsynonymous SNP, R687C in exon 14, was identified (nt2186 C to T, accession number NM_002466) (data not shown).
In addition, allele frequencies for reported type 2 diabetes candidate gene SNPs were determined for the islet ATP-sensitive K+ channel (KIR6.2 and SUR), peroxisome proliferator-activated receptor (PPAR)-
This study involved SNP association in pooled DNA for a 7.3-Mb chromosomal region on 20q containing a total of 124 known and predicted genes. Allele frequencies were assessed in DNA pools rather than by individual genotyping to expedite the study and decrease costs. With pyrosequencing technology, allele frequencies measured in the pooled DNAs were within 2% of those determined from individual genotypes (17).
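The marker density implied by these figures follows from simple arithmetic: 91 SNPs across 7.3 Mb corresponds to an average spacing of roughly 80 kb. The sketch below makes the calculation reusable; it assumes evenly spaced markers, which the actual SNP set need not satisfy, and the function names are hypothetical. Any target spacing passed to the second function is an illustrative input rather than a recommendation.

```python
# Illustrative sketch: average marker spacing and marker counts for a region.
# Assumes evenly spaced markers; function names are hypothetical.
import math


def mean_spacing_kb(region_kb, n_markers):
    """Average distance between markers (kb) if spread evenly over the region."""
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    return region_kb / n_markers


def markers_needed(region_kb, spacing_kb):
    """Number of evenly spaced markers needed to reach a target spacing (kb)."""
    if spacing_kb <= 0:
        raise ValueError("spacing_kb must be positive")
    return math.ceil(region_kb / spacing_kb)


if __name__ == "__main__":
    region_kb = 7300  # the 7.3-Mb candidate region
    print(f"mean spacing with 91 SNPs: {mean_spacing_kb(region_kb, 91):.1f} kb")
    # Example target spacing, chosen only for illustration.
    print("markers for 10-kb spacing:", markers_needed(region_kb, 10))
```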
One SNP (TIX1) out of 91 tested showed marginal significance in case subjects versus control subjects, but this result was not replicated in a second pool of type 2 diabetic case subjects. Despite the lack of association with a specific SNP at this preliminary stage in the study, the occurrence of significant LOD scores along chromosome 20q in studies of four racial/ethnic groups supports the hypothesis that a genetic element contributing to type 2 diabetes is present. However, there are several limiting factors. First, each of the chromosome 20q peaks is fairly broad and encompass
A major question remains as to what SNP density is required for accurate analysis. Does this mean typing one SNP every 10 kb requiring
The results for the validated SNPs for the four racial/ethnic groups (Table A1) and all known genes on chromosome 20q (Table A2) are presented on the following pages.
This work was supported in part by National Institutes of Health Grants DK16746, DK07120, and DK49583 (to M.A.P.).
Address correspondence and reprint requests to M.A. Permutt, 660 S. Euclid Ave., Campus Box 8127, Saint Louis, MO 63110. E-mail: apermutt@im.wustl.edu.
Received for publication 20 March 2002 and accepted in revised form 15 May 2002.
LOD, logarithm of odds; PPAR, peroxisome proliferator-activated receptor; SNP, single nucleotide polymorphism.
The symposium and the publication of this article have been made possible by an unrestricted educational grant from Servier, Paris.