Identification of HKDC1 and BACE2 as Genes Influencing Glycemic Traits During Pregnancy Through Genome-Wide Association Studies

  1. for the HAPO Study Cooperative Research Group
  1. Division of Endocrinology, Metabolism, and Molecular Medicine, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois
  2. Department of Medicine, Division of Endocrinology, Université de Sherbrooke, Sherbrooke, Quebec, Canada
  3. General Medicine Division, Massachusetts General Hospital, Boston, Massachusetts
  4. Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, Illinois
  5. Department of Biostatistics & Bioinformatics, Duke University Medical Center, Durham, North Carolina
  6. Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois
  7. Department of Biostatistics, University of Washington, Seattle, Washington
  8. Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Quebec, Canada
  9. ECOGENE-21 and Lipid Clinic, Chicoutimi Hospital, Saguenay, Quebec, Canada
  10. The Broad Institute, Cambridge, Massachusetts
  11. Center for Inherited Disease Research, Institute of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland
  12. Institute for Genome Sciences & Policy, Duke University, Durham, North Carolina
  13. Jesse Brown Veterans Affairs Medical Center, Chicago, Illinois
  1. Corresponding author: M. Geoffrey Hayes, ghayes{at}


Maternal metabolism during pregnancy impacts the developing fetus, affecting offspring birth weight and adiposity. This has important implications for metabolic health later in life (e.g., offspring of mothers with pre-existing or gestational diabetes mellitus have an increased risk of metabolic disorders in childhood). To identify genetic loci associated with measures of maternal metabolism obtained during an oral glucose tolerance test at ∼28 weeks’ gestation, we performed a genome-wide association study of 4,437 pregnant mothers of European (n = 1,367), Thai (n = 1,178), Afro-Caribbean (n = 1,075), and Hispanic (n = 817) ancestry, along with replication of top signals in three additional European ancestry cohorts. In addition to identifying associations with genes previously implicated in measures of glucose metabolism in nonpregnant populations, we identified two novel genome-wide significant associations: 2-h plasma glucose and HKDC1, and fasting C-peptide and BACE2. These results suggest that the genetic architecture underlying glucose metabolism may differ, in part, in pregnancy.

The intrauterine milieu of the developing fetus, as determined largely by maternal metabolism, impacts both fetal and later health outcomes. Offspring of mothers with pre-existing or gestational diabetes mellitus (GDM) have an increased risk of metabolic disorders in childhood, including obesity, impaired glucose tolerance, and higher lipid levels (1–3). Maternal glucose levels less than those diagnostic of GDM are also associated with greater offspring birth weight and adiposity and may impose similar risks later in childhood and adulthood (4–6). The mechanisms underlying these risks are not known, but maternal metabolism is important given the impact of the mother’s metabolic profile on the intrauterine milieu of the developing fetus.

Maternal glucose metabolism during pregnancy differs from the nongravid state because the mother must meet both her own and the growing fetus’s energy needs (7). Fasting glucose decreases progressively throughout gestation, but insulin resistance increases from the end of the first through the third trimester. As insulin resistance increases, basal and stimulated insulin secretion, postprandial glucose levels, and hepatic glucose production increase compared with the nongravid state.

Maternal metabolism is determined by genetic and environmental factors. Given the unique aspects of glucose metabolism in pregnancy, we examined whether genetic variation associated with glycemic traits during pregnancy differs from that known to be important in the nongravid state. This was accomplished using DNA and phenotype data collected by the Hyperglycemia and Adverse Pregnancy Outcomes (HAPO) Study, a multicenter, international study that collected high-quality phenotypic data related to fetal growth and maternal glucose metabolism from ∼25,000 pregnant women of varied geographic, ethnic, and sociodemographic backgrounds. Standardized protocols that were uniform across centers were used to test for associations of maternal glycemia less severe than overt diabetes with risks of adverse pregnancy outcomes (6,8). Genetic loci important for maternal metabolism during pregnancy were identified by genome-wide mapping and replication of single nucleotide polymorphisms (SNPs) demonstrating association.


Samples and DNA source

HAPO cohort.

All pregnant women at less than 32 weeks of gestation were eligible for enrollment in HAPO unless they met one of several exclusion criteria. All participants gave written informed consent, and an external data monitoring committee provided oversight. Study phenotype collection methods and inclusion and exclusion criteria have been published elsewhere (6,8).

Participants underwent a 75-g oral glucose tolerance test (OGTT) at ∼28 weeks’ gestation. Maternal DNA was taken from blood collected into an EDTA tube at 2 h during the OGTT, when phenotypes of interest were measured, including glucose, blood pressure, weight, and height. Glucose and C-peptide were measured in a central laboratory (6,8), and DNA was prepared using the automated Autopure LS from Gentra Systems.

Submitted for genotyping were 9,814 mother and offspring HAPO samples (2,581 Afro-Caribbean [AC], 3,152 European ancestry [EU], 1,615 Hispanic [HI], and 2,466 Thai [TH]), along with 126 HapMap control samples, of which 9,008 (2,278 AC, 2,797 EU, 1,498 HI, and 2,435 TH) survived quality control (QC). Demographic and phenotypic descriptions of the mothers whose samples survived QC are summarized in Supplementary Table 1, and the sampling locations of each cohort are listed in Supplementary Table 2.

Sherbrooke cohort.

Women planning to deliver at the Centre Hospitalier Universitaire de Sherbrooke (CHUS) were recruited between 6 and 13 weeks of pregnancy. Exclusion criteria were age <18 or >40 years, multiple pregnancy, pregestational diabetes (type 1 or 2) or diabetes discovered at the first trimester (defined as glycemia >10.3 mmol/L at 1 h after 50-g glucose ingestion), drugs and/or alcohol abuse, uncontrolled endocrine disease, renal failure, or other major medical conditions that would affect glucose regulation. The project was approved by the CHUS Ethical Review Board, and written informed consent was obtained from all women before their inclusion in the study.

Demographics and baseline characteristics collected at the first trimester included maternal age, gestational weeks, medications, and personal and family medical history. Height and weight were measured using standardized procedures, and BMI was calculated. Systolic and diastolic blood pressures were measured in the sitting position after 5 min of rest; the average of 3 measurements was used for analyses.

During the second trimester (between 24 and 28 weeks of gestation), medical history and weight were again ascertained. Each participant had a 75-g OGTT in the fasting state (>8 h). Blood samples were maintained at 4°C and centrifuged; plasma was collected, aliquoted, and stored at −80°C until measurements. Plasma glucose was measured by the glucose hexokinase method (Roche Diagnostics, Indianapolis, IN). C-peptide levels were measured by ELISA (Luminex technology; Millipore Corp., Billerica, MA).

Chicoutimi cohort (ECOGENE-21).

Women with a singleton pregnancy were recruited at their first trimester of pregnancy from a founder population of French-Canadian origin (Saguenay area, Canada). Women older than 40 years, those with pregestational diabetes or other disorders known to affect glucose metabolism, and those with a positive history of alcohol and/or drug abuse during the current pregnancy were excluded. The Chicoutimi Hospital Ethics Committee approved the project. All women provided written informed consent before their inclusion in the study.

BMI was measured using standardized procedures. Glucose tolerance was assessed using a 75-g OGTT performed at 24–28 weeks’ gestation after a 12-h fast. Blood glucose and C-peptide measurements were made on fresh serum samples. Glucose was evaluated using a Beckman analyzer (model CX7; Fullerton, CA), and C-peptide levels were measured using a commercially available ELISA kit (ALPCO Diagnostics, Salem, NH).


AC DNA samples were genotyped using the Illumina Human1M-Duov3 B SNP array, EU DNA samples were genotyped using the Illumina Human 610 Quad v1 B SNP array, and the HI DNA samples were genotyped using the Illumina Human1M-Duov3 B SNP array at the Broad Institute. TH DNA samples were genotyped using the Illumina HumanOmni1-Quad v1-0 B SNP array at the Center for Inherited Disease Research, following agreed upon protocols of the Gene-Environment Association Studies (GENEVA) consortium (9).

Genome-wide association study QC.

Genotype data that passed initial QC at the genotyping centers were released to the GENEVA Coordinating Center (CC), National Center for Biotechnology Information database of Genotypes and Phenotypes (dbGaP), and HAPO study teams, who collectively performed QC using procedures previously described by the GENEVA consortium (9). Poorly performing samples or SNPs were removed based on misspecified sex, chromosomal anomalies, unintended sample duplicates, sample relatedness, low call rate, high number of Mendelian errors, departures from Hardy-Weinberg equilibrium, duplicate discordance, sex differences in heterozygosity, and low minor allele frequencies, as detailed in Supplementary Tables 3 and 4. Complete QC reports are available through dbGaP,


Population structure was determined using principal components analysis (PCA), essentially as described by Price et al. (10). All unduplicated HAPO study samples were analyzed separately in each of the four HAPO populations, along with HapMap (Utah residents with ancestry from northern and western Europe; Han Chinese in Beijing, China; Japanese in Tokyo, Japan; and Yoruba in Ibadan, Nigeria) samples genotyped with the HAPO study subjects. From the autosomal SNPs with missing call rate <5% and minor allele frequency >5%, we selected a subset through two rounds of linkage disequilibrium (LD) pruning (short- and long-range), as described previously (9). Outliers (those ≥5 SDs from the mean first and second principal component values for the HAPO cohort) were removed. After exclusion, PCA was performed again without HapMap samples. The first two eigenvectors from these analyses were used as covariates in the association tests to adjust for possible population structure among the mothers (Supplementary Figs. 1–4).
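The outlier step described above can be illustrated with a minimal Python sketch. It assumes a samples-by-SNPs matrix of allele counts that has already been filtered and LD pruned. The function names `top_pcs` and `flag_outliers`, the simulated genotypes, and the use of plain mean-centering (rather than the per-SNP normalization of Price et al.) are illustrative assumptions and do not reproduce the study pipeline.

```python
import numpy as np

def top_pcs(genotypes, n_components=2):
    """Principal component scores from a samples x SNPs matrix of allele counts."""
    G = np.asarray(genotypes, dtype=float)
    G = G - G.mean(axis=0)                      # center each SNP (simplification)
    U, S, _ = np.linalg.svd(G, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def flag_outliers(pcs, n_sd=5.0):
    """True for samples >= n_sd SDs from the mean on any supplied component."""
    pcs = np.asarray(pcs, dtype=float)
    z = (pcs - pcs.mean(axis=0)) / pcs.std(axis=0, ddof=1)
    return (np.abs(z) >= n_sd).any(axis=1)

# Simulated allele counts (hypothetical, not study data).
rng = np.random.default_rng(0)
G = rng.binomial(2, 0.3, size=(200, 500))
pcs = top_pcs(G)
keep = ~flag_outliers(pcs)
print(f"{keep.sum()} of {len(keep)} samples retained")
```

In this sketch the mean and SD are computed over every row passed in, so restricting the reference distribution to cohort samples (as opposed to HapMap controls) would require subsetting before the call.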


Imputation was performed separately in each of the four QC cleaned and filtered genotyping sets using BEAGLE (11) and a HapMap 3 reference panel (12). We used a combined reference panel of unrelated individuals from multiple HapMap Phase 3 populations for imputation, based on the PCA described above (Supplementary Table 5). We first used the strand-checking utility of BEAGLE to ensure consistent strand assignments between the reference dataset and the QC cleaned and filtered datasets, and we subsequently corrected strand and/or removed SNPs where strandedness could not be resolved. Next, we conducted imputation runs in the mothers and offspring separately within each of the four HAPO cohorts. We used a conservative allelic r2 threshold of 0.9 to remove questionable imputed SNPs.

Association tests.

The genotype call probabilities from the filtered BEAGLE output were used in a linear regression model between each of the phenotypes and the genotype probabilities under an additive model, adjusting for the set of model-specific covariates. Trait values were transformed as follows: fasting C-peptide (FCP) and plasma glucose (FPG): log10 (trait), and 1- and 2-h plasma glucose (1HPG and 2HPG): square root (trait). We used the frequentist approach in SNPTEST v2.2.0 (13) to estimate the betas and SEs for each regression model and assess significance of the association between the SNP and the phenotype of interest.
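As a rough illustration of an additive-model test on imputed genotypes, the sketch below converts genotype probabilities to expected allele dosages and fits an ordinary least-squares regression of a transformed trait on dosage plus covariates. This is a simplified stand-in and not SNPTEST itself, whose handling of genotype uncertainty is more elaborate. The function names, the single covariate, and the simulated data are all hypothetical.

```python
import numpy as np

def additive_dosage(probs):
    """Expected count of the coded allele from rows of (P(AA), P(AB), P(BB))."""
    probs = np.asarray(probs, dtype=float)
    return probs[:, 1] + 2.0 * probs[:, 2]

def snp_association(trait, probs, covariates):
    """OLS of trait on dosage plus covariates; returns (beta, SE) for the SNP."""
    y = np.asarray(trait, dtype=float)
    g = additive_dosage(probs)
    X = np.column_stack([np.ones_like(g), g, np.asarray(covariates, dtype=float)])
    coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]                   # residual degrees of freedom
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[1], np.sqrt(cov[1, 1])

# Simulated example (hypothetical values, not study data).
rng = np.random.default_rng(1)
n = 500
g = rng.binomial(2, 0.3, size=n)
probs = np.eye(3)[g]                            # hard calls written as probabilities
age = rng.normal(30, 5, size=n)
fcp = 10 ** (0.2 + 0.02 * g + 0.001 * age + rng.normal(0, 0.05, size=n))
beta, se = snp_association(np.log10(fcp), probs, age)   # log10 transform, as for FCP
print(f"beta = {beta:.4f}, SE = {se:.4f}")
```

The returned beta is on the scale of the transformed trait per copy of the coded allele, which is the form in which effect sizes are reported in the Results.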

We adjusted for confounders in two successive models: model 1 included mother’s age, gestational age at OGTT, parity, field center, and ancestry; and model 2 added maternal BMI, height, and mean arterial pressure measured at the OGTT, and maternal smoking and drinking status (yes/no). For the association tests between the maternal genotype and baby phenotypes (birth weight, fat mass, and sum of skinfolds), we adjusted for confounders in three successive models: model 1 included field center, ancestry using PCA, newborn sex, gestational age at delivery, parity, and maternal age at OGTT; model 2 included the covariates from model 1 plus maternal BMI, height, and mean arterial pressure at OGTT, and maternal smoking and drinking status (yes/no); and model 3 included the covariates from model 2 plus maternal FPG and FCP during the OGTT.


The betas and SEs were combined across the four cohorts using meta-analysis under a fixed-effects model weighting each stratum by sample size. METAL (14) calculates a z statistic that summarizes the magnitude and direction of effect for the association of a reference allele selected at each marker. After aligning the SNPTEST output from each of the four cohorts to the same reference allele, a weighted sum of individual cohort results was used to calculate an overall z statistic and P value. The square root of the cohort-specific sample size was used as the proportional weight, and these squared weights sum to 1.
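The weighting scheme described above can be sketched in a few lines of Python. The sketch assumes that each cohort's z statistic is taken as beta/SE after alignment to a common reference allele; METAL's sample-size scheme derives z from the P value and direction of effect, so this is an approximation of that input step. The function name and the input values are hypothetical.

```python
import math

def sample_size_weighted_meta(betas, ses, ns):
    """Combine per-cohort results into an overall z statistic and two-sided P value."""
    # Signed z per cohort: direction of effect for the shared reference allele.
    zs = [b / s for b, s in zip(betas, ses)]
    # Weights proportional to sqrt(N), scaled so that the squared weights sum to 1.
    total_n = sum(ns)
    ws = [math.sqrt(n / total_n) for n in ns]
    z = sum(w * zi for w, zi in zip(ws, zs))
    # Two-sided normal P value; erfc keeps precision for very small P.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical per-cohort estimates for one SNP (not study data).
betas = [0.20, 0.18, 0.22, 0.17]
ses = [0.05, 0.06, 0.07, 0.05]
ns = [1000, 1200, 900, 800]
z, p = sample_size_weighted_meta(betas, ses, ns)
print(f"z = {z:.2f}, P = {p:.2e}")
```

Because the weights depend only on sample size, cohorts with opposite directions of effect partly cancel, which is why alignment to the same reference allele has to precede the sum.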


Replication was initiated after analysis of the HI, AC, and EU genome-wide association study (GWAS) populations before availability of the TH GWAS data. Top associations (those with P < 1 × 10−5 in the HI-AC-EU meta-analysis or EU cohort itself, and trimmed for LD [r2 < 0.5]) were replicated in a second set of 2,192 EU HAPO mothers using a custom Illumina 384 SNP bead array consisting of 127 SNPs selected for replication of the traits described herein, 157 SNPs selected for replication of association signals from related projects, and 100 ancestry informative markers, which were the top 50 SNPs associated with each of the first two principal components in the EU group in the GWAS discovery phase. Genotyping was performed at the Broad Institute.
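One way to read the selection rule above (P < 1 × 10−5, trimmed so that retained SNPs have pairwise r2 < 0.5) is as a greedy procedure that keeps the strongest signal first. The text does not specify the exact trimming algorithm, so the following Python sketch is an assumption; the function name, the SNP labels, and the P and r2 values are hypothetical.

```python
def select_for_replication(pvalues, r2, p_max=1e-5, r2_max=0.5):
    """Greedy LD trimming of SNPs passing a P value threshold.

    pvalues: dict mapping SNP id -> P value
    r2:      dict mapping frozenset({snp_a, snp_b}) -> pairwise r2
             (pairs missing from the dict are treated as unlinked)
    """
    # Strongest associations first.
    candidates = sorted((p, snp) for snp, p in pvalues.items() if p < p_max)
    kept = []
    for _, snp in candidates:
        # Keep the SNP only if it is in low LD with every SNP already kept.
        if all(r2.get(frozenset((snp, k)), 0.0) < r2_max for k in kept):
            kept.append(snp)
    return kept

# Hypothetical inputs (not study data).
pvalues = {"snpA": 1e-8, "snpB": 1e-7, "snpC": 1e-6, "snpD": 1e-3}
r2 = {
    frozenset(("snpA", "snpB")): 0.9,
    frozenset(("snpA", "snpC")): 0.1,
    frozenset(("snpB", "snpC")): 0.8,
}
selected = select_for_replication(pvalues, r2)
print(selected)
```

Here snpB is dropped because it is in high LD with the stronger snpA, and snpD never enters because its P value exceeds the threshold.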

For the external Sherbrooke and Chicoutimi replication cohorts, DNA was purified from whole-blood samples with Gentra Puregene Cell Kit (Qiagen, Valencia, CA). Selected gene polymorphisms (top 30 associations after replication phase 1) were genotyped using a quantitative RT-PCR assay (model 7500Fast, Applied Biosystems) with Applied Biosystems TaqMan probes and primers (sequences can be obtained upon request), following the manufacturers’ recommendations (Life Technologies Inc., Burlington, ON, Canada).


We performed a discovery GWAS in a large subset of HAPO mothers from four different ancestry populations using the Illumina 610, 1M, and Omni1 platforms, with 4,528 participants (Supplementary Table 1) surviving genotyping QC. Cohort-specific and meta-analyses of genome-wide SNP data were conducted to identify common genetic variants associated with maternal FPG, 1HPG, and 2HPG levels as well as FCP levels measured at ∼28 weeks of gestation. Associations were assessed with linear regressions under an additive genetic model adjusting for confounders in two successive models using genotyped and imputed SNPs (see research design and methods for a description of the models). Results from cohorts were combined through meta-analyses weighting each stratum by sample size. Associations meeting a significance of P < 1 × 10−5 were replicated in a second cohort of EU HAPO mothers, and the top 30 signals were replicated in two independent cohorts of pregnant women of EU ancestry.

Several genes/SNPs associated with glycemic traits in nongravid populations (15,16) demonstrated genome-wide significant association (P < 5 × 10−8) in pregnant women (Table 1; Supplementary Table 6; Supplementary Figs. 5–8). Specifically, we found associations between FPG and SNPs in glucokinase regulator (GCKR), glucose-6-phosphatase 2 (G6PC2), proprotein convertase subtilisin/kexin type 1 (PCSK1), protein phosphatase 1, regulatory subunit 3B (PPP1R3B), and melatonin receptor 1B (MTNR1B); between 1HPG and SNPs in MTNR1B; and between FCP and SNPs in PPP1R3B and GCKR. The top association in the vicinity of GCKR was rs1260326, which reached P = 6.08 × 10−13 with FPG and P = 5.73 × 10−11 with FCP in the seven-group meta-analysis, with betas ranging from −0.030 to −0.015 log10(µg/L) and from −0.0066 to 0.00086 √(mmol/L) per T allele for FCP and FPG, respectively, in the four GWAS ancestry groups. Similarly, the top G6PC2 association was rs560887 with FPG (P = 2.08 × 10−16), with betas ranging from −0.0082 to −0.0026 √(mmol/L) per T allele. The PCSK1 SNP most strongly associated with FPG was rs6235 (P = 4.96 × 10−15), with betas ranging from −0.0069 to −0.00054 √(mmol/L) per G allele. The top PPP1R3B SNP was rs4841132 at P = 4.55 × 10−15 for FCP (betas ranging from −0.061 to −0.017 log10[µg/L] per G allele) and P = 2.88 × 10−13 for FPG (betas ranging from −0.014 to −0.0038 √[mmol/L] per G allele). The SNP rs7936247 was the SNP in MTNR1B most strongly associated with FPG (P = 2.11 × 10−12; betas ranging from 0.0000055 to 0.0073 √[mmol/L] per T allele) and 1HPG (P = 3.44 × 10−16; betas ranging from 0.013 to 0.29 √[mmol/L] per T allele). Although the strength of association varied across cohorts, these data demonstrated evidence of association when combined through meta-analysis.


Genome-wide significant associations of glucose metabolism in gravid populations that overlap with those identified in nongravid populations

We also found strong (but not genome-wide significant) associations for SNPs previously found to be associated with glucose or insulin levels or type 2 diabetes in large meta-analyses of nongravid populations (Supplementary Table 7). Genes with SNPs reaching P < 0.001, which are not discussed above, include HNF1A (rs7957197 with 2HPG, Pmeta = 5.44 × 10−5), CDKAL1 (rs9368222 with 1HPG, Pmeta = 1.01 × 10−4), VPS26A (rs1802295 with 2HPG, Pmeta = 6.42 × 10−5), and ARAP1 (rs11603334 with 2HPG, Pmeta = 8.63 × 10−4).

Several of these most strongly associated SNPs were not the SNPs previously reported to be associated with the phenotype of interest in nongravid populations. A query of those SNPs (Table 1) also showed strong evidence for association at or near genome-wide significance and, importantly, all in the expected direction based on nongravid populations.

The locus with strongest association in the GWAS was 10q22.1 with 2HPG. This locus, which showed a relatively narrow region of association, is found in a segment of high LD upstream from the first intron of HKDC1 (hexokinase domain containing 1; no MIM number), a recently identified member of the hexokinase family (17). The LD structure in each of the four ancestry populations shows that this association locus spans a 400-kb region with D′ >0.5 from the most strongly associated SNP and includes the following genes in addition to HKDC1: SUPV3L1, SRGN, VPS26A, and HK1 (Fig. 1A). In the GWAS, the best SNPs at this locus demonstrated evidence for association with 2HPG in three of the four ancestry groups with P values ranging from 1.52 × 10−1 in AC mothers to 7.03 × 10−6 in Northern EU mothers (Table 2). The SNP with strongest association in the GWAS, rs4746822, reached genome-wide significant association in a meta-analysis that combined the four ancestry groups (P = 8.26 × 10−13; β range 0.167–0.229 √[mmol/L] per T allele). The proportion of phenotypic variation explained by this SNP ranged from 1.2% in EU to 2.7% in HI. This association was replicated in a cohort of 2,192 additional EU HAPO mothers and two smaller (n = 228 and 606) independent EU cohorts from Quebec, Canada, yielding a P value of 1.02 × 10−22 in a meta-analysis that combined the seven GWAS and replication cohorts.

FIG. 1.

A: LocusZoom plot of association results and LD boundaries around HKDC1. The top panel reflects the meta-analysis results of the four GWAS cohorts. Each of the four middle panels contains the population-specific (AC, EU, HI, TH) association results and estimates of LD (D′) from the SNP with the strongest evidence for association in the meta-analysis. The LD estimates are color coded as a heat map from purple (D′ ≥0.3 to <0.4) to red (D′ ≥0.9 to ≤1.0), whereas gray indicates D′ <0.3. These coincide with the recombination hotspots indicated by the blue lines (recombination rate in genetic distance between markers [cM]/physical distance [Mb] from HapMap [12]). The bottom panel shows the genes and their directions in this region of chromosome 10. B: HKDC1 mRNA in human tissues as determined by RT-PCR: (1) adipose tissue, (2) bladder, (3) brain, (4) cervix, (5) colon, (6) esophagus, (7) heart, (8) kidney, (9) liver, (10) lung, (11) ovary, (12) placenta, (13) prostate, (14) skeletal muscle, (15) small intestine, (16) spleen, (17) testes, (18) thymus, (19) thyroid, and (20) trachea. C: Aligned genes, SNPs, active enhancer marks, OC regions, and gene expression profiles of the 2HPG-associated HKDC1 region on chromosome 10. SNPs upstream and within HKDC1 align with peaks representing regions enriched for active histone marks and OC regions in cell types representing 16 different tissues. HKDC1 is highly expressed in colon, lung, liver, and cervical carcinomas.


Genome-wide significant associations of glucose metabolism unique to gravid populations

HKDC1 mRNA was present in multiple human tissues, with highest levels in colon, small intestine, trachea, thymus, kidney, and endocrine tissues (Fig. 1B). Hepatic expression was also evident. Examination of the association locus using ENCODE and other databases demonstrated 2HPG-associated variants proximal to HKDC1 in regions of open chromatin (OC) and histones H3K27ac and H3K27me, all of which are indicative of active regulatory elements (18). For example, rs4746822 overlaps OC, H3K27ac, and H3K27me in the first intron of HKDC1 in HepG2 liver carcinoma cells and liver stellate cells (Fig. 1C), whereas rs5030937 is proximal to OC in liver stellate cells. The variants may therefore affect the function of these regulatory elements and alter liver HKDC1 levels.

A second novel finding was association of the rs6517656 G allele in BACE2 (β-site amyloid polypeptide cleaving enzyme 2; MIM 605668) with higher FCP. This locus showed moderate association in each of the four GWAS cohorts (P = 1.26 × 10−2 to 1.74 × 10−3) and approached genome-wide significance when combined across all four groups through meta-analysis (P = 3.06 × 10−7). The proportion of FCP phenotypic variation explained by this SNP ranged from 0.2% in Thais to 1.0% in AC. Strong association was present in the HAPO EU replication cohort (P = 3.89 × 10−11) and a meta-analysis combining the four discovery GWAS and three replication cohorts (0.018–0.054 log10[µg/L] per G allele; P = 6.30 × 10−16; Table 2). The LD structure in the four ancestry populations shows this association locus spans a 200-kb region and includes, in addition to BACE2, PLAC4, C21orf130, and FAM3B (Fig. 2A).

FIG. 2.

A: LocusZoom plot of association results and LD around BACE2. See Fig. 1 legend for details. B: Aligned genes, SNPs, active enhancer marks, OC regions, and gene expression profiles of the FCP-associated BACE2 region on chromosome 21. SNPs within BACE2 overlap with peaks representing regions enriched for active histone marks and OC regions in cell types representing eight different tissues. BACE2 is expressed in epidermal skin, breast and colon cancers, skeletal muscle, adipose tissue, and mammary epithelial cells. cM, genetic distance between markers; Mb, physical distance.

OC near the BACE2 transcription start site is common to many tissue types (18), suggesting that the gene is poised for expression (Fig. 2B). Several regions of islet-specific OC are located in the first intron of BACE2, and these regions may be important for BACE2 expression in islets. The FCP-associated tag and imputed variants are located within the islet-specific OC region, suggesting a possible role in regulating β-cell–specific BACE2 expression in islets.

For each of the associations attaining genome-wide significance between the maternal alleles and measures of maternal glucose metabolism (Tables 1 and 2), we tested for associations between these maternal alleles and neonatal anthropometric outcomes such as birth weight, sum of skinfolds, and fat mass (Supplementary Table 8). Although we observed evidence of nominally significant associations at some of these variants, none remained significant after correcting for multiple testing.


Maternal glucose metabolism during pregnancy differs from the nongravid state to meet the needs of the growing fetus and compensate for pregnancy-induced insulin resistance (7). Multiple GWAS and other studies in nongravid cohorts have demonstrated the contribution of variation in multiple genes to glucose and insulin levels (15,16,19). However, given the pregnancy-induced changes in glucose metabolism, the question arises whether the genetic architecture of glucose metabolism is similar during pregnancy and in the nongravid state. We now report the first GWAS of maternal metabolic traits during pregnancy and demonstrate both similarities and differences between the gravid and nongravid states. We also report evidence for association of many loci in multiple ancestry groups.

Five loci that exhibited genome-wide significant association with maternal metabolic traits have been identified previously in nongravid cohorts, primarily of EU ancestry. These include four loci that have demonstrated association with metabolic traits in multiple studies, including G6PC2, MTNR1B, GCKR, and PPP1R3B (15,16,19). MTNR1B demonstrated association with maternal FPG and 1HPG and has been previously associated with fasting glucose, impaired FPG, and type 2 diabetes as well as with altered β-cell function, including decreased insulin release after oral and intravenous glucose (20–24). G6PC2, among the first genetic loci shown to be associated with FPG (25), was also associated with FPG in pregnant women. PPP1R3B, which demonstrated association with FPG and FCP, is important in glycogen metabolism (26) and has been previously associated with various lipid-related phenotypes and fasting glucose (27–31). GCKR has been shown to be associated with many lipid phenotypes in addition to fasting glucose and insulin levels (32,33). In pregnant women, GCKR was associated with FPG and FCP levels.

Finally, we observed association of PCSK1 with maternal FPG. PCSK1 encodes proprotein convertase subtilisin/kexin type 1, an endoprotease involved in proteolytic activation of several precursor proteins, including proinsulin, proglucagon-like peptide 1, and pro-opiomelanocortin (34). The association of PCSK1 with metabolic traits has not been consistently observed across studies, but association with obesity-related traits, fasting glucose, 2-h glucose, and fasting and postglucose proinsulin levels has been reported (35–38). We also observed several other strong associations with additional previously identified genes involved in glucose metabolism that did not reach genome-wide significance in our study of >6,000 pregnant women.

The above findings suggest some similarities between the genetic architecture of glucose metabolism in pregnant and nonpregnant populations. The absence of association between other previously identified glucose genes and glucose levels during pregnancy may have been due to partial differences in the genetic architecture of glucose metabolism in pregnant and nonpregnant populations and/or reduced power in the current study; determining this awaits the availability of additional large cohorts of pregnant women.

We also found evidence for genome-wide significant association of loci with maternal metabolic traits that have not been previously reported in nongravid populations. HKDC1 has not been associated with metabolic traits in GWAS performed in nongravid populations, although a recent large meta-analysis using a gene-based approach reported modest association of HKDC1 (P = 1.24 × 10−4) with 2HPG in 42,854 nongravid EU individuals (19). The lead SNP in that study, rs9645500, showed much stronger evidence for association with 2HPG in our GWAS of only 1,351 pregnant HAPO EU mothers (P = 7.52 × 10−6; Supplementary Table 6). Our top HKDC1 SNP, rs4746822, which is located in the 5′-flanking region of HKDC1, is in high LD with rs9645500 and demonstrated strong association in an additional 2,192 pregnant EU HAPO mothers during replication (P = 9.68 × 10−8), as well as in the two smaller external EU replication cohorts of 228 (P = 0.011) and 606 women (P = 9.25 × 10−4) for a combined Pmeta-All = 1.022 × 10−22. These data demonstrate association of HKDC1 in pregnant women from multiple ancestry groups and suggest that HKDC1 may play a more important role in glucose metabolism during pregnancy than in nongravid states. HKDC1, a recently identified member of the hexokinase family, is adjacent to hexokinase 1 on chromosome 10 in a head-to-tail arrangement, suggesting that HKDC1 and HK1 are products of a gene duplication event (17). HKDC1 is conserved across multiple species, including mammals, birds, fish, and amphibians, and has both a glucose-binding domain and ATP-binding site in its COOH-terminal domain, suggesting that it has hexokinase activity (17). The biological role of HKDC1 is unknown, but as shown in the current study, HKDC1 mRNA is present in a wide distribution of human tissues. This top HKDC1 variant (rs4746822) showed a nominally significant association (P = 0.01 in the five HAPO population meta-analysis) with sum of skinfolds in the neonates. Although this does not remain significant after correcting for multiple testing, these data, which need confirmation in larger meta-analysis, suggest that maternal HKDC1 variants impact neonatal outcomes modulated through maternal glucose metabolism.

A second locus, BACE2, which was associated with FCP in the current study, has not been previously associated with metabolic traits in nongravid populations. A recent large meta-analysis of nongravid EU cohorts consisting of 108,557 individuals did not report association of SNPs within the BACE2 locus with fasting insulin levels or other metabolic traits (Supplementary Table 9) (39). Moreover, results from 38,238 individuals of EU ancestry in the Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC) demonstrated no evidence for association of the lead SNP in the BACE2 locus, rs6517656, with fasting insulin levels (Supplementary Table 9). BACE2 is capable of processing amyloid precursor protein (40) and is expressed in multiple tissues (41). In islets, BACE2 expression is limited to β-cells, where it is located in endocytic vesicles (42,43). It is not thought to contribute to amyloid deposition in pancreatic islets, but it has been shown to both augment and inhibit insulin secretion and/or production in human islets (42,43). Thus, BACE2 either represents a second locus that is uniquely associated in pregnancy or is a newly identified locus specifically associated with C-peptide as opposed to insulin levels, although a role for BACE2 in proinsulin processing has not been reported.

Prior studies of the genetics of maternal metabolism during pregnancy have been largely candidate gene studies focused on GDM (44). This includes studies of candidate genes based on biological plausibility or, more recently, type 2 diabetes susceptibility genes identified through GWAS (44–46). The latter studies in European and, to a large degree, Asian populations have demonstrated association of a number of type 2 diabetes susceptibility genes with GDM, including TCF7L2, MTNR1B, IGF2BP2, KCNJ11, CDKAL1, KCNQ1, CDKN2A-CDKN2B, SLC30A8, HHEX, and GCK. More recently, a GWAS for GDM performed in a Korean cohort demonstrated genome-wide significant association of CDKAL1 and MTNR1B and marginal association of IGF2BP2 with GDM (47). With the exception of MTNR1B and CDKAL1, none of the loci identified in the current study were reported as demonstrating marginal evidence for association with GDM in that GWAS. Studies in nonpregnant populations have demonstrated both similarities and differences between the genetic architecture of metabolic traits and type 2 diabetes (19,48,49). We have previously demonstrated association of SNPs in TCF7L2 and GCK with fasting, 1-h and/or 2-h glucose levels in women of EU or TH ancestry (50). Thus, the results of the current study, together with the results of the previous studies described above, suggest that the genetic architectures of GDM and maternal metabolism, as in the nongravid state, exhibit both similarities and differences.

This is the first GWAS of glycemic traits during pregnancy and is strengthened further by the inclusion of non-EU populations. We demonstrated that genes important to the genetic architecture of glycemic traits in largely EU nonpregnant populations are also important in pregnancy, suggesting similarities in the underlying genetic architecture of glycemic traits in gravid and nongravid populations that extend across ancestry groups. However, our data also suggest differences between the gravid and nongravid states. Two loci with the strongest evidence for association demonstrated either no or weak association with glycemic traits in nonpregnant populations. Together with the results of earlier studies, our findings suggest that the roles of HKDC1 in glucose metabolism and BACE2 in insulin secretion are more important during pregnancy than in the nongravid state. Defining the underlying genetic architecture of maternal glycemia during pregnancy may assist in future efforts to identify women at risk for hyperglycemia during pregnancy.


This study was supported by National Institutes of Health (NIH) grants HD-34242, HD-34243, HG-004415, and CA-141688, by the Canadian Institutes of Health Research–INMD (Funding Reference Number 110791), and by the American Diabetes Association. B.T.L. is supported by the U.S. Department of Veterans Affairs, Veterans Health Administration, Office of Research and Development (Career Development Grant 51K2BX001587-02).

No potential conflicts of interest relevant to this article were reported.

M.G.H., M.U., M.-F.H., T.E.R., N.J.C., and W.L.L. conceived and designed the study. M.G.H., M.-F.H., L.L.A., J.M., C.G., D.A.S., A.P., D.M.L., C.P.M., C.M.A., B.T.L., D.M., K.F.D., M.V.L., and R.N.L.-H. performed experiments and statistical analyses. L.P.L., L.B., D.B., A.R.D., and B.E.M. recruited study subjects and measured or analyzed phenotypic data. M.G.H., M.-F.H., T.E.R., and W.L.L. wrote the manuscript. All authors critically reviewed and approved the manuscript. M.G.H. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

The authors are indebted to the participants of the HAPO Study at the following centers: Newcastle and Brisbane, Australia; Bridgetown, Barbados; Toronto, Ontario, Canada; Hong Kong; Bangkok, Thailand; Belfast and Manchester, U.K.; Bellflower, California; Chicago, Illinois; Cleveland, Ohio; and Providence, Rhode Island.

Parts of this study were presented at the 72nd Scientific Sessions of the American Diabetes Association, Philadelphia, Pennsylvania, 8–12 June 2012.

  • Received December 6, 2012.
  • Accepted May 5, 2013.

Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered.


  1. Diabetes vol. 62 no. 9 3282-3291