ST6GALNAC3 — Glycosylation Shapes How Much Vitamin D Your Body Carries

Vitamin D does not travel through the bloodstream on its own. Around 85–90% of circulating vitamin D is bound to a carrier protein called vitamin D-binding protein (VDBP). VDBP, also known as GC-globulin, is a glycoprotein encoded by the GC gene on chromosome 4. It carries most of the 25(OH)D and 1,25(OH)2D in the blood and determines how long vitamin D stays in circulation. The rs12144344 variant in the ST6GALNAC3 gene represents an emerging link between the biology of protein glycosylation and vitamin D status, through a pathway that influences how VDBP is processed and how much of it circulates in your blood.

The Mechanism

ST6GALNAC3 encodes a sialyltransferase, an enzyme that attaches sialic acid residues to the ends of sugar chains on glycoproteins and glycolipids, modifying their structure, stability, and biological activity. This particular enzyme adds sialic acid residues to O-linked glycan chains on glycoproteins. VDBP itself is a glycoprotein: the Gc1 isoforms (the most common forms) carry an O-linked trisaccharide at threonine 418 that includes a terminal sialic acid residue. The terminal sialic acid is added by a sialyltransferase, and ST6GalNAc family enzymes are strong candidates for this modification. Variants in ST6GALNAC3 may alter the enzyme's activity, changing the sialylation state of VDBP. Different sialylation patterns affect VDBP's isoelectric properties, half-life in circulation, and possibly its overall concentration. The rs12144344 variant lies in an intron and most likely influences gene expression or splicing rather than the enzyme's active site directly.

The Evidence

In a genome-wide association study of 1,380 Finnish men from the Alpha-Tocopherol, Beta-Carotene Cancer Prevention (ATBC) cohort (Moy KA et al. Genome-wide association study of circulating vitamin D-binding protein. Am J Clin Nutr, 2014), rs12144344 was the third strongest genetic signal for circulating VDBP levels. Each copy of the T allele was associated with an increase of approximately 396 nmol/L in serum VDBP (SE = 80.21, P = 5.9 × 10⁻⁷). Across genotype groups, mean VDBP concentrations were 5,408 nmol/L (CC), 5,825 nmol/L (CT), and 6,200 nmol/L (TT), a step-wise increase of roughly 7% per allele and about 15% from CC to TT. This was the first genome-wide scan for VDBP as a distinct biochemical endpoint and the first study to implicate ST6GALNAC3 in vitamin D biology.
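The percentage differences between genotype groups follow directly from the three means quoted above. A minimal Python sketch of that arithmetic (the helper name `percent_change` is illustrative and not taken from the study):

```python
# Mean serum VDBP (nmol/L) by genotype, as quoted in the text.
vdbp_means = {"CC": 5408.0, "CT": 5825.0, "TT": 6200.0}

def percent_change(baseline: float, value: float) -> float:
    """Relative difference of `value` from `baseline`, in percent."""
    return (value - baseline) / baseline * 100.0

cc_to_ct = percent_change(vdbp_means["CC"], vdbp_means["CT"])
ct_to_tt = percent_change(vdbp_means["CT"], vdbp_means["TT"])
cc_to_tt = percent_change(vdbp_means["CC"], vdbp_means["TT"])

print(f"CC -> CT: {cc_to_ct:.1f}%")
print(f"CT -> TT: {ct_to_tt:.1f}%")
print(f"CC -> TT: {cc_to_tt:.1f}%")
```

Each additional T allele corresponds to a step of roughly 6 to 8%, and the full CC-to-TT difference is close to 15%.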

The association was genome-wide suggestive (P < 5 × 10⁻⁶) in the European cohort but was not replicated in an African-ancestry GWAS of VDBP (Wang et al. 2023, GWAS of circulating vitamin D outcomes among individuals of African ancestry, n = 9,536, PMC10196601; β = −0.06, SE = 0.04, P = 0.17), suggesting the effect may be population-specific or that the original signal was driven by European-ancestry linkage disequilibrium. The variant has not been assessed in large GWAS that use vitamin D itself as the endpoint.

Because this variant affects VDBP concentration (total carrier protein) rather than the vitamin D activation or receptor pathway, it operates differently from the major GC variants (rs7041, rs4588). Higher VDBP means more protein-bound vitamin D in circulation but does not necessarily mean more biologically active (free) vitamin D at the cellular level.

Practical Implications

TT homozygotes in the European cohort had VDBP concentrations approximately 15% above CC homozygotes. Higher VDBP generally means higher total 25(OH)D on standard blood tests, but free (biologically active) vitamin D may not increase proportionally — more carrier protein can sequester more vitamin D without increasing delivery to tissues. This raises the possibility that TT carriers with seemingly adequate total 25(OH)D may have less freely available vitamin D at the cellular level.

The effect size is modest compared to the primary GC variants (rs7041 and rs4588 each account for much larger differences in VDBP), and the lack of replication in non-European populations means this variant should be interpreted cautiously. Where available, testing free 25(OH)D alongside total provides a more complete picture of functional vitamin D status.

Interactions

The primary determinants of serum VDBP and vitamin D transport are rs7041 and rs4588 in the GC gene itself. rs12144344 appears to modulate VDBP levels through a separate glycosylation pathway, making it biologically additive to (rather than redundant with) the GC variants. If an individual carries both higher-VDBP GC alleles and the rs12144344-T allele, the combined effect on total VDBP concentration — and the gap between total and free vitamin D — could be greater than either variant alone. This interaction has not been formally studied.

When ApoB Is Cut Short — The APOB Stop Codon at Position 2085

Apolipoprotein B-100 (ApoB-100) is among the largest proteins in the human body. It is the structural backbone of every LDL and VLDL particle: a single 4,536-amino-acid protein that must be assembled intact before each particle can leave the liver. The rs121918386 variant introduces a premature stop codon at amino acid 2085, producing a truncated protein roughly 46% the length of normal ApoB-100. In the lipoprotein assembly pathway, this is enough to prevent full LDL particle formation, driving LDL cholesterol levels far below population norms in heterozygous carriers and, in the extremely rare biallelic state, causing serious fat-soluble vitamin malabsorption.

The Mechanism

In the liver, each nascent VLDL particle requires one intact ApoB-100 molecule as its structural scaffold. The rs121918386 A allele changes codon 2085 from arginine (CGA) to a stop signal (TGA) at the coding level. (The plus-strand allele A corresponds to c.6253C>T on the minus-strand APOB gene; the plus-strand G is the reference and A is the pathogenic alternate.) The resulting truncated fragment, sometimes called ApoB-46 (representing 46% of the full-length protein), lacks the C-terminal domains required for stable lipoprotein particle completion. Most truncated fragments this short are degraded intracellularly before secretion.
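The coordinates in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes standard coding-DNA numbering (position 1 is the first base of codon 1) and uses the protein length quoted in the text. It asks which codon c.6253 falls in, which arginine codons become a stop after a first-base C>T change, and what fraction of the full-length protein remains.

```python
# Arginine codons in the standard genetic code (DNA alphabet).
ARGININE_CODONS = ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"]
STOP_CODONS = {"TAA", "TAG", "TGA"}

FULL_LENGTH_AA = 4536   # ApoB-100 length as quoted in the text
STOP_POSITION = 2085    # codon replaced by the premature stop
CDNA_POSITION = 6253    # c.6253C>T

def codon_of(cdna_pos: int) -> tuple[int, int]:
    """Return (codon number, 1-based offset within the codon) for a cDNA position."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1

def first_base_c_to_t(codon: str) -> str:
    """Apply a C>T substitution at the first base, if that base is C."""
    return "T" + codon[1:] if codon[0] == "C" else codon

codon_number, offset = codon_of(CDNA_POSITION)
to_stop = [c for c in ARGININE_CODONS if first_base_c_to_t(c) in STOP_CODONS]
fraction = STOP_POSITION / FULL_LENGTH_AA * 100.0

print(codon_number, offset)   # codon hit by c.6253 and the base within it
print(to_stop)                # arginine codons that a first-base C>T turns into a stop
print(f"{fraction:.1f}% of full length")
```

Position c.6253 is the first base of codon 2085, and the only arginine codon that a first-base C>T converts to a stop is CGA (giving TGA), which is why the truncated product is about 46% of full length.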

The truncation at position 2085 is notable relative to another key landmark: the ApoB-48 editing site at codon 2153. Full-length intestinal ApoB-48 (required for chylomicron assembly) is encoded by the first ~48% of the APOB mRNA. The 2085 truncation falls just upstream of this boundary, meaning heterozygous carriers retain full intestinal ApoB-48 production from their intact allele — fat absorption and chylomicron formation are normal at the heterozygous stage.

An unexpected consequence of this mutation is that apoB-100 secretion drops to roughly 25% of normal, not the 50% predicted from losing one functional allele (Schonfeld 2003: truncated apoB fragments suppress ApoB-100 secretion from the intact allele through a dominant-negative effect on endoplasmic reticulum lipidation). This amplified reduction explains the deeply suppressed LDL cholesterol seen in heterozygous carriers.

The Evidence

The cardiovascular implications of APOB truncating variants were quantified by Peloso et al. (2019), who analysed rare protein-truncating variants in APOB across studies involving approximately 58,000 individuals. Carriers showed 72% lower odds of coronary heart disease (OR 0.28), one of the largest protective effects documented for any single-gene variant against cardiovascular disease.

Farese et al. (1992) documented that heterozygous APOB truncation carriers consistently show LDL cholesterol one-quarter to one-third that of unaffected family members, a striking deviation on routine lipid panels that often triggers clinical investigation for secondary causes before the genetic basis is identified.

Welty (2020) synthesised data from 12 case-control studies and confirmed the hepatic risk in this condition: fatty liver, cirrhosis, and hepatocellular carcinoma have been reported in familial hypobetalipoproteinemia, primarily in biallelic disease, because lipid that cannot be packaged into VLDL accumulates in hepatocytes. Heterozygotes have a roughly 5–10% risk of nonalcoholic hepatic steatosis.

This variant is extraordinarily rare: the A allele appears in approximately 23 of 1.4 million alleles in gnomAD v4 exomes, present predominantly in European-ancestry populations at an estimated frequency of ~0.0000172. Homozygosity is essentially absent in population data.
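To put these counts in perspective, the sketch below converts the allele count into a frequency and an approximate carrier rate. It uses the rounded denominator from the text, so it lands near (not exactly on) the quoted ~0.0000172, and it assumes Hardy-Weinberg proportions, which is an idealisation for a variant this rare.

```python
allele_count = 23
allele_number = 1_400_000   # "approximately 1.4 million alleles", rounded in the text

frequency = allele_count / allele_number

# Under Hardy-Weinberg proportions, heterozygotes occur at 2q(1 - q)
# and homozygotes at q^2.
het_frequency = 2 * frequency * (1 - frequency)
hom_frequency = frequency ** 2

print(f"allele frequency ~ {frequency:.2e}")
print(f"about 1 heterozygous carrier per {1 / het_frequency:,.0f} people")
print(f"about 1 homozygote per {1 / hom_frequency:,.0f} people")
```

On these assumptions, heterozygous carriers are expected at roughly 1 in 30,000 people, while homozygotes would be expected far less often than once per billion, which is in line with their absence from population data.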

Practical Actions

For heterozygous carriers (AG genotype), the typical clinical picture is unexpectedly low LDL on routine testing, which often triggers workup for secondary hypocholesterolaemia before the genetic cause is found. Once identified, the low-LDL profile is strongly cardioprotective. The key concerns are hepatic steatosis (in a minority of carriers) and the need to flag the genotype to any prescribing physician considering LDL-lowering therapy, since baseline LDL is already dramatically suppressed.

Published surveillance guidelines (Burnett, Hooper, and Hegele, GeneReviews 2021) recommend fasting lipid panels and liver function tests every 1–2 years for heterozygous carriers, with hepatic ultrasound every 3 years if transaminases are persistently elevated.

Interactions

rs121918386 shares its disease category, familial hypobetalipoproteinemia type 1, with other APOB truncating variants, most notably rs121918384 (Val1856fs, a frameshift upstream of this stop codon). Compound heterozygotes carrying two different APOB loss-of-function alleles develop biallelic-equivalent disease with severe fat malabsorption, as each allele independently destroys ApoB-100 function. Within the broader lipid pathway, co-occurrence with PCSK9 gain-of-function variants (which reduce LDL receptor levels and so slow LDL clearance) or LDLR loss-of-function alleles could partially offset the hypocholesterolaemic phenotype, producing a paradoxically normal LDL despite carrying a hypobetalipoproteinemia allele.

FOXO3's Inflammatory Rheostat — A Variant With Documented Functional Consequences

Most FOXO3 longevity variants are statistical associations — interesting signals in large cohort studies without a known molecular mechanism. rs12212067 is different. This intronic variant has been mechanistically characterized: the minor G allele creates a binding site for myeloid zinc finger 1 (MZF1), a transcription factor expressed primarily in myeloid cells (monocytes, macrophages, neutrophils), that increases FOXO3 expression in these cells. The downstream consequence is a documented shift in the cytokine balance of monocytes — less TNFα, IL-1β, IL-6, and IL-8; more anti-inflammatory IL-10.

Lee et al. 2013 (Human SNP links differential outcomes in inflammatory and infectious disease to a FOXO3-regulated pathway. Cell. 2013) identified this SNP through a cross-disease genomic analysis, showing that the same minor allele that mildly reduces inflammatory disease severity, through reduced monocyte pro-inflammatory output, paradoxically increases the risk of severe malaria. This evolutionary trade-off (dampened inflammation protects against autoimmune damage but impairs pathogen clearance) is the hallmark of immune-regulatory polymorphisms, and rs12212067 is now one of the most thoroughly characterized examples in the human genome.

The minor G allele has a frequency of roughly 13% in Europeans and 18% in Africans, making it the less common allele in both groups. At the European frequency, the common TT genotype (approximately 76% of people) has the higher-inflammation profile, while TG heterozygotes (~23%) and the rare GG homozygotes (~1–2%) carry the anti-inflammatory variant.
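The genotype percentages can be approximated from the allele frequency alone. The sketch below assumes Hardy-Weinberg proportions and the ~13% European G-allele frequency quoted above; the function name `hardy_weinberg` is illustrative.

```python
def hardy_weinberg(minor_allele_freq: float) -> dict[str, float]:
    """Expected genotype frequencies for a biallelic T/G site under Hardy-Weinberg."""
    q = minor_allele_freq          # G allele
    p = 1.0 - q                    # T allele
    return {"TT": p * p, "TG": 2 * p * q, "GG": q * q}

european = hardy_weinberg(0.13)
for genotype, freq in european.items():
    print(f"{genotype}: {freq * 100:.1f}%")
```

With q = 0.13 this gives about 75.7% TT, 22.6% TG, and 1.7% GG. Substituting 0.18 gives the corresponding expectation for the African frequency quoted in the text.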

The Mechanism

rs12212067 sits in intron 2 of FOXO3, within the same 101,625 base-pair noncoding region that harbors the major longevity haplotype variants. The G allele creates a binding site for myeloid zinc finger 1 (MZF1), which in myeloid cells drives increased FOXO3 transcription. Elevated FOXO3 then acts through a TGFβ1-dependent pathway to suppress nuclear translocation of inflammatory transcription factors, reducing output of the major monocyte-derived cytokines TNFα, IL-1β, IL-6, and IL-8, while promoting IL-10 production.

The key point is that this is a myeloid-specific effect. MZF1 is primarily expressed in monocytes, macrophages, and neutrophils, not in lymphocytes or non-immune tissues. This means rs12212067 specifically tunes the innate immune response, not adaptive immunity. The 2016 inflammatory polyarthritis study (Viatte S et al. Association Between Genetic Variation in FOXO3 and Reductions in Inflammation and Disease Activity in Inflammatory Polyarthritis. Arthritis Rheumatol. 2016) supported this mechanism clinically, showing that G-allele carriers had lower serum CRP, IL-6, and anticollagen antibody titers, and better disease activity scores (DAS28), swollen joint counts, and Health Assessment Questionnaire scores over time.

rs12212067 is in partial — but not strong — linkage disequilibrium with the major longevity-associated variant rs2802292. The two variants are in the same longevity haplotype block, but rs12212067 has a substantially lower minor allele frequency (~0.13 vs ~0.44 for rs2802292 G allele). This means rs12212067 may represent a more recently derived variant that overlaps with the longevity haplotype while having its own independent functional effect through a distinct mechanism (MZF1 vs HSF1 binding).

The Evidence

The mechanistic case for rs12212067 rests on four independent lines of evidence:

1. Inflammatory disease prognosis. The founding Cell 2013 paper showed the G allele was associated with milder Crohn's disease and rheumatoid arthritis course in multiple European cohorts. Importantly, this was a prognosis signal, not a susceptibility signal — the G allele did not change who got the disease, only how severe it became.

2. Cellular cytokine phenotype. Monocytes isolated from GG homozygotes produced measurably less TNF, IL-1β, IL-6, and IL-8, and more IL-10, than TT monocytes. This is a direct ex vivo functional demonstration — not just an association signal.

3. Mortality resilience proteomics. Donlon et al. 2023 (Proteomic basis of mortality resilience mediated by FOXO3 longevity genotype. GeroScience. 2023) analyzed 4,500 serum proteins in 975 older men and found that G-allele carriers had 44 stress-protein mortality pathways significantly attenuated. In other words, high levels of danger proteins such as GDF15 (a well-established aging biomarker) that predicted mortality in TT individuals were largely non-lethal in G-allele carriers. This mortality resilience operated specifically through innate immunity, bone morphogenetic protein signaling, leukocyte migration, and growth factor response pathways.

4. Infectious disease paradox. Allard et al. 2014 (FOXO3A regulatory polymorphism and susceptibility to severe malaria in Gabonese children. Immunogenetics. 2014) showed the G allele increased the odds of severe malaria by 54% (OR 1.54, P=0.0028) in African children. This is consistent with the cellular phenotype: dampened innate inflammatory responses protect against immunopathology in chronic inflammatory diseases but impair acute pathogen clearance in severe infections requiring robust inflammatory defenses.

Practical Implications

For individuals with the common TT genotype, the relevant insight is that their monocytes operate with a higher-inflammation set point — elevated baseline capacity for TNFα, IL-6, and IL-1β production. This is not inherently harmful and in fact may provide better protection against acute infections. However, this higher inflammatory tone becomes a liability in the context of chronic inflammatory diseases and age-related inflammatory accumulation ("inflammaging").

The actionable evidence centers on lifestyle and dietary strategies that suppress chronic low-grade inflammation independently of genotype: omega-3 fatty acids directly reduce monocyte-derived TNFα and IL-6 production; time-restricted eating reduces inflammatory cytokine levels; regular moderate exercise shifts monocyte phenotype toward anti-inflammatory profiles. For TT individuals with diagnosed inflammatory conditions (RA, IBD, cardiovascular disease), this genetic background is relevant context for clinicians regarding inflammatory disease severity expectations.

For G-allele carriers, the anti-inflammatory phenotype generally confers benefit in inflammatory disease contexts but warrants awareness when traveling to or living in regions with high malaria transmission — the same pathway that limits inflammatory tissue damage also reduces capacity to generate effective innate immune responses against acute intracellular pathogens.

Interactions

rs12212067 is in partial LD with the FOXO3 longevity haplotype anchored by rs2802292, but the functional mechanisms are distinct: the rs2802292 G allele creates an HSF1 binding site activated by oxidative stress and heat shock, while the rs12212067 G allele creates an MZF1 binding site that specifically operates in myeloid cells. Individuals carrying longevity-associated alleles at both loci would have FOXO3 upregulated through two independent transcription factor pathways in two distinct cellular contexts: stress-response cells (HSF1) and innate immune cells (MZF1).

The rs12206094 variant provides a third FOXO3 regulatory layer (CTCF-binding enhancer activity reversible by IGF-1), and rs4946935 provides a fourth (SRF binding site). These overlapping regulatory mechanisms suggest that the FOXO3 locus has evolved multiple independent control points, each tunable by different cellular signals.

rs12272669

UNKNOWN

Emerging Risk Factor

An Uncharted B12 Locus on Chromosome 11

Vitamin B12 (cobalamin) levels in circulation are controlled by a cascade of transport, absorption, and processing steps — each influenced by genetic variation at multiple points. From gut absorption (FUT2) to cellular delivery (TCN2) to intracellular activation (MMACHC), genome-wide association studies have now mapped over a dozen loci influencing serum B12.

The rs12272669 variant sits on chromosome 11 at position 71,681,563 (GRCh38), in a non-coding intergenic region near the pseudogene ALG1L9P. Despite its location in a functionally sparse region of the genome, a large sequencing-based GWAS identified the A allele as significantly associated with higher circulating vitamin B12 concentrations. What makes this locus unusual is the gap between its statistical significance and its mechanistic explanation: no protein-coding gene at this position has an obvious connection to cobalamin metabolism.

The Evidence

The association was discovered in the Grarup et al. 2013 GWAS (Genetic Architecture of Vitamin B12 and Folate Levels Uncovered Applying Deeply Sequenced Large Datasets. PLoS Genet. 2013;9(6):e1003530), which analyzed ~22.9 million sequence variants across up to 45,576 Icelandic and Danish individuals. The study identified six novel B12 loci and confirmed seven previously known associations. At rs12272669, the A allele was associated with higher serum B12 (p = 3.0 × 10⁻⁹) with a beta coefficient of +0.51 on the quantile-normalized scale, a substantial per-allele effect that reached genome-wide significance. The signal was detected in the Icelandic cohort (n=37,283 with B12 measurements), where the A allele had a minor allele frequency of only 0.22%.

The frequency discrepancy between the Icelandic cohort (0.22%) and global population databases (~9–18% depending on ancestry group) is notable. Modern population data from gnomAD and 1000 Genomes show the A allele at roughly 9–11% in European, East Asian, and African populations and 18% in South Asians. The extreme rarity in Iceland suggests either a population-specific frequency difference, or that the rsid has undergone reassignment in dbSNP between the 2013 study and current databases — a known issue with older GWAS rsid cataloging.
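The size of this discrepancy is easier to judge as expected allele counts. The sketch below is plain arithmetic on the figures quoted above, using the lower end of the database range as the comparison point; the variable and function names are illustrative.

```python
n_individuals = 37_283          # Icelandic participants with B12 measurements
maf_iceland = 0.0022            # 0.22% as reported for the cohort
maf_reference_low = 0.09        # lower end of the ~9-18% range in population databases

def expected_allele_copies(n: int, maf: float) -> float:
    """Expected minor-allele copies among n diploid individuals."""
    return 2 * n * maf

copies_iceland = expected_allele_copies(n_individuals, maf_iceland)
copies_reference = expected_allele_copies(n_individuals, maf_reference_low)
fold = maf_reference_low / maf_iceland

print(f"expected A copies at 0.22%: {copies_iceland:.0f}")
print(f"expected A copies at 9%:    {copies_reference:.0f}")
print(f"fold difference in frequency: {fold:.0f}x")
```

A gap of about 40-fold corresponds to a few hundred allele copies in the cohort versus several thousand, which is larger than ordinary between-population variation usually produces and is why an rsid reassignment is raised as a possible explanation.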

The mechanism connecting chromosome 11q13.4 to circulating B12 is not established. Possibilities include cis-regulatory effects on a nearby expressed transcript that influences B12 transport or storage, tagging of a causal variant elsewhere through linkage disequilibrium, or an artifact of imputation in the original study. The evidence level is therefore classified as emerging — the association reached statistical significance in a large well-powered study, but without replication or mechanistic characterization.

For context, well-characterized B12 loci already in the GeneOps database illustrate what a biologically understood signal looks like. FUT2 (rs601338) affects B12 absorption by controlling mucin fucosylation in the intestinal epithelium. TCN2 (rs1801198) controls how efficiently B12 is delivered to cells via holotranscobalamin. FUT6 (rs78060698) alters haptocorrin-bound B12 levels through fucosylation. Each of these has a clear pathway connection. The chromosome 11 locus at rs12272669 lacks this mechanistic anchor.

What the A Allele Does

The A allele is the less common variant in most populations and the one associated with higher serum B12. Because the effect direction is toward higher B12 (not lower), the G allele — the reference and most common allele globally — is associated with typical or baseline B12 levels by comparison.

For individuals with the GG genotype, B12 levels are typical for the population; the genetic contribution from this locus is neutral. Carriers of the A allele (AG or AA) may have modestly higher circulating B12 on average, which likely confers a slight buffer against functional B12 insufficiency.

Practical Context

This locus does not yet meet the threshold for targeted clinical action: the mechanism is unknown, replication in independent cohorts has not been published, and the effect size, while statistically significant, is modest in absolute terms. The primary value of documenting it here is completeness within the B12-pathway genetic map and transparency about what is and is not known.

B12 status overall is best assessed by measuring holotranscobalamin (holoTC, the "active B12" fraction) or methylmalonic acid (MMA), regardless of any single variant. These functional markers reflect actual cofactor availability at the cellular level and are informative regardless of which B12-pathway variants you carry.

Interactions

This variant exists in the context of a well-mapped genetic pathway. The established B12 loci — FUT2 absorption, FUT6 haptocorrin binding, TCN2 delivery — combine additively with rare and common variants across the B12 pathway. If you carry risk variants in TCN2 (GG genotype at rs1801198) or are a FUT2 non-secretor (AA at rs601338), the chromosome 11 A allele may partially offset those risks by nudging B12 levels slightly upward, but no compound genotype data exists for this specific combination.

rs1238574

SULT1E1 SULT1E1 intronic variant

Emerging Risk Factor

SULT1E1 rs1238574 — Estrogen Clearance, Local Hormone Balance, and Cancer Susceptibility

Estrogens do not simply circulate and act. They are continuously inactivated and reactivated in tissues through a biochemical toggle controlled by sulfotransferases and sulfatases. SULT1E1 (estrogen sulfotransferase) is a phase II metabolizing enzyme that adds a sulfate group to estradiol and estrone, converting them to biologically inactive sulfate conjugates that cannot bind the estrogen receptor. It is the primary enzyme responsible for this inactivation step, with particularly high activity in endometrial, hepatic, and placental tissue. When SULT1E1 activity falls, whether through genetic variation, epigenetic silencing, or tissue-specific dysregulation, local estrogen concentrations rise in ways that circulating hormone assays may not capture.

rs1238574 is an intronic variant located deep within SULT1E1 (c.772+856 on the transcript, chr4:69843305 GRCh38). It does not change the SULT1E1 protein sequence, but its position within an intron raises the possibility of effects on pre-mRNA processing, intronic regulatory elements, or splicing efficiency. The C allele — the minor allele globally at roughly 5–6% in European populations but substantially more common in East Asians (~36%) — has been associated with worse cancer outcomes and is the focus of emerging research interest.

The Mechanism

SULT1E1 sulfates estradiol (E2) and estrone (E1) at the 3-hydroxyl position, producing estrogen sulfates that are hydrophilic, receptor-inactive, and destined for excretion. This sulfation reaction is the counterpart to steroid sulfatase (STS), which cleaves the sulfate back off — the balance between SULT1E1 and STS sets local estrogen tone in tissues independently of the ovarian cycle.

Studies of endometriosis tissue directly document this imbalance: SULT1E1 mRNA and protein expression are significantly reduced in ovarian endometriotic lesions (Hevir et al. Mol Cell Endocrinol, 2013), creating a state of local estrogen excess driven not by increased synthesis but by impaired inactivation. A 2024 study using Mendelian randomization (Zou et al. Biology (Basel), 2024) identified SULT1E1 as a likely causal susceptibility gene for ovarian endometriosis through cross-tissue regulatory network analysis, suggesting genetically determined variation in SULT1E1 expression can propagate into disease risk.

The Evidence

The most direct evidence linking rs1238574 to clinical outcomes comes from a Chinese case-control study of estrogen metabolic pathway genes and colorectal cancer (Li et al. Archives of Toxicology, 2018). Among SULT1E1 variants studied, rs1238574 was associated with significantly worse progression-free survival (HR = 1.24, 95% CI 1.02–1.50, P = 0.028) and overall survival (HR = 1.51, 95% CI 1.16–1.97, P = 0.002) in colorectal cancer patients. The companion variant rs3822172 showed similar associations (HR = 1.30–1.53). These findings are biologically coherent: estrogens have been proposed to reduce colorectal cancer risk partly through anti-inflammatory and cell-cycle effects, and impaired estrogen inactivation in colonic tissue could paradoxically alter the quality of local estrogenic signaling.
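A hazard ratio, its confidence interval, and its p-value are linked, so the reported figures can be sanity-checked against each other. The sketch below is an approximation, not the study's own method: it assumes the 95% interval is a Wald interval that is symmetric on the log scale, and the function name `p_from_hr_ci` is illustrative.

```python
import math

def p_from_hr_ci(hr: float, lower: float, upper: float) -> float:
    """Approximate two-sided p-value from a hazard ratio and its 95% CI.

    Assumes the interval is symmetric on the log scale (Wald interval),
    so SE = (ln(upper) - ln(lower)) / (2 * 1.96).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * 1.959964)
    z = math.log(hr) / se
    # For a standard normal z, the two-sided tail probability is erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

p_pfs = p_from_hr_ci(1.24, 1.02, 1.50)   # progression-free survival
p_os = p_from_hr_ci(1.51, 1.16, 1.97)    # overall survival

print(f"PFS: p ~ {p_pfs:.3f}")
print(f"OS:  p ~ {p_os:.3f}")
```

Both approximations come out close to the reported values (about 0.03 and 0.002), so the hazard ratios, intervals, and p-values in the text are mutually consistent.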

At the gene level, SULT1E1 variants have been connected to endometrial cancer risk in a population-based case-control study of 502 cases and 1,326 controls (Rebbeck et al. J Natl Cancer Inst, 2006): the SULT1E1 promoter G→A variant carried an odds ratio of 1.45 (95% CI 1.06–1.99) for endometrial cancer, particularly when combined with hormone replacement therapy use. A separate Korean study (Choi et al. Cancer Epidemiol Biomarkers Prev, 2005) found SULT1E1 haplotypes were associated with both breast cancer risk and recurrence (hazard ratio >3 for certain haplotype combinations).

The evidence for rs1238574 specifically is limited to a single study in a Chinese cohort and is classified as emerging. Replication in independent cohorts and functional characterization of this intronic variant's effect on SULT1E1 expression are needed.

Practical Actions

For individuals carrying the C allele, particularly in the context of estrogen-sensitive conditions or cancer risk assessment, the relevant clinical consideration is whether SULT1E1-mediated estrogen inactivation is functioning optimally. Phytoestrogens from soy, specifically isoflavones, have been shown to elevate SULT1E1 enzyme levels in endometriotic stromal cells (Tarumi et al. Gynecol Endocrinol, 2025), suggesting dietary phytoestrogen intake may partially compensate for reduced SULT1E1 activity. Cruciferous vegetables support phase II metabolism broadly.

Monitoring for estrogen-driven conditions — endometriosis, uterine fibroids, and estrogen-sensitive cancers — is a reasonable precaution for C allele carriers given the biological plausibility of the SULT1E1 pathway and the emerging cancer survival data.

Interactions

rs3822172 (SULT1E1): This second SULT1E1 intronic variant was studied in the same Chinese colorectal cancer cohort as rs1238574 and showed nearly identical survival associations (HR = 1.30 for PFS, HR = 1.53 for OS). Whether these two variants are in linkage disequilibrium or tag independent functional effects within SULT1E1 is not yet established. Both should be considered together when assessing SULT1E1 function.

Estrogen metabolism pathway: SULT1E1 operates in concert with CYP1B1 (which activates catechol estrogens), CYP1A1 (estrogen hydroxylation), STS (sulfatase reactivation), and UGT enzymes (glucuronidation). Combined variation across these enzymes, particularly CYP1A1 and SULT1A1, has been associated with amplified endometrial cancer risk in case-control studies, with a documented combined OR of 4.58 (Hirata et al. 2008, PMID 18318428). Co-occurrence of rs1238574 with a CYP1A1 risk allele is therefore a candidate interaction worth considering.

rs12722

COL5A1 C/T 3'UTR

Moderate Risk Factor

COL5A1 — The Collagen Blueprint Behind Tendon Strength

Your tendons and ligaments are built from collagen fibrils, rope-like protein structures that give connective tissue its strength and elasticity. Type V collagen, encoded by the COL5A1 gene, acts as a master regulator of this construction process. It controls how thick individual collagen fibrils (microscopic protein cables that bundle together to form tendons, ligaments, and other connective tissues) grow by embedding within larger type I collagen fibrils and limiting their lateral expansion. The rs12722 variant sits in the 3'UTR of COL5A1 (the 3' untranslated region of mRNA, which regulates how stable the mRNA molecule is and how much protein gets made from it), where it alters how much type V collagen your cells produce.

The Mechanism

The C-to-T change at rs12722 affects mRNA stability rather than protein structure. Functional studies (Laguette et al. Sequence variants within the 3'-UTR of the COL5A1 gene alters mRNA stability. Matrix Biology, 2011) demonstrated that the T allele produces more stable mRNA transcripts, which leads to increased production of the type V collagen alpha-1 chain. While more collagen might sound beneficial, an excess of type V collagen actually disrupts the normal fibril assembly process. The resulting fibrils may have altered diameter and spacing, changing the mechanical properties of tendons and ligaments and making them stiffer and potentially more prone to injury under repetitive loading.

This mechanism also helps explain why the variant influences range of motion (Brown et al. The COL5A1 genotype is associated with range of motion measurements. Scand J Med Sci Sports, 2011): individuals with the CC genotype tend to have greater joint flexibility, while TT carriers have stiffer connective tissue and reduced range of motion.

The Evidence

The association between rs12722 and soft tissue injuries has been examined in multiple studies and meta-analyses:

  • The original discovery (Mokone et al. The COL5A1 gene and Achilles tendon pathology. Scand J Med Sci Sports, 2006) found that the C allele (here called A2) was significantly more common in healthy controls than in patients with chronic Achilles tendinopathy (29.8% vs 18.0%, OR 1.9).
  • September et al. (Variants within the COL5A1 gene are associated with Achilles tendinopathy in two populations. Br J Sports Med, 2009) replicated the finding in a second independent population, strengthening the evidence.
  • A 2018 meta-analysis of 9 studies (Lv et al. Association between polymorphism rs12722 in COL5A1 and musculoskeletal soft tissue injuries: a systematic review and meta-analysis. Oncotarget, 2018), covering 1,140 cases and 1,410 controls, found TT carriers had 58% higher risk of soft tissue injuries compared to CT/CC carriers (OR 1.58, 95% CI 1.33-1.89). When broken down by injury type: tennis elbow OR 2.06, ACL injuries OR 1.53, Achilles tendon pathology OR 1.48.
  • The largest meta-analysis to date (Guo et al. Association of COL5A1 gene polymorphisms and musculoskeletal soft tissue injuries: a meta-analysis based on 21 observational studies. J Orthop Surg Res, 2022), covering 2,164 cases and 5,079 controls, confirmed the association with an allelic OR of 1.14 and homozygous (TT vs CC) OR of 1.33, driven primarily by ligament injuries in Caucasian populations.
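
As a rough check on the first bullet, the reported odds ratio can be approximated from the two quoted frequencies. This is a minimal sketch that assumes 29.8% and 18.0% are C-allele frequencies in controls and cases, and that the OR is a simple ratio of allele odds; the original paper may have used a different model, and the function names are illustrative.

```python
def odds(p: float) -> float:
    """Convert a frequency (0 < p < 1) into odds."""
    return p / (1.0 - p)


def allelic_odds_ratio(freq_a: float, freq_b: float) -> float:
    """Odds of carrying the allele in group A relative to group B."""
    return odds(freq_a) / odds(freq_b)


# Frequencies quoted in the text (assumed to be C-allele frequencies).
controls_c = 0.298
cases_c = 0.180

or_controls_vs_cases = allelic_odds_ratio(controls_c, cases_c)
print(f"OR (controls vs cases) = {or_controls_vs_cases:.2f}")
```

The result lands close to the reported OR of 1.9, which suggests the figure describes how much more common the C allele is among healthy controls than among patients.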

Importantly, the association appears strongest in Caucasian populations and for ligament injuries specifically. Studies in East Asian populations have generally not found significant associations, which may reflect both the lower T allele frequency in those populations (~17% vs ~58% in Europeans) and potential gene-environment differences.

Practical Implications

The TT genotype increases injury risk under a recessive model — carrying one T allele (CT) confers only modestly elevated risk, while two copies (TT) is where the meaningful increase begins. If you carry TT, this does not mean injury is inevitable. It means your connective tissue may be less resilient to repetitive strain, and proactive measures — adequate warm-up, progressive training loads, eccentric strengthening exercises, and collagen-supportive nutrition — become more important.

The C allele appears protective for flexibility and injury resistance. CC carriers tend to have greater range of motion and lower baseline risk for tendon and ligament injuries.

Interactions

The rs12722 variant interacts with other COL5A1 polymorphisms. The nearby rs13946 variant (also in the 3'UTR) has been studied alongside rs12722, and haplotype analysis suggests their combined effect may further modulate injury risk. The rs3196378 variant in the same gene has also shown independent associations with soft tissue injury susceptibility. Additionally, variants in other collagen genes (COL1A1, COL11A1, COL11A2) may compound the effect on connective tissue properties.

The 5q31 PCOS Locus — Immune Signaling, Androgens, and Polycystic Ovary Risk

On chromosome 5, within a gene-dense region at 5q31.1, lies rs13164856, a regulatory variant in the vicinity of two notable genes: IRF1 (Interferon Regulatory Factor 1, a transcription factor that modulates immune signaling and gene expression) and RAD50 (a DNA double-strand break repair protein that is part of the MRN complex critical for genome integrity). This variant was identified in one of the first large-scale genome-wide association studies of polycystic ovary syndrome (PCOS) in women of European ancestry, and has been distinguished from other PCOS loci by its specific association with circulating testosterone levels.

The Mechanism

rs13164856 is a non-coding tag SNP: it does not change a protein sequence but instead tags a haplotype block that likely affects the regulation of one or more nearby genes. The 5q31 region contains a cluster of immunologically active genes (IL4, IL13, IL5, IRF1, RAD50), and colocalization analyses using single-cell eQTL (expression quantitative trait locus, a variant that influences how much a nearby gene is expressed) data have identified IRF1 as the most plausible candidate causal gene at this locus.

IRF1 is a transcription factor with broad roles in innate immunity and cellular stress responses. It has also been identified as a regulator of androgen receptor (AR) expression through the IL-6/TLR4 signalling axis, providing a plausible molecular link between immune activation and androgen excess — a hallmark of PCOS. RAD50, as part of the MRN (MRE11-RAD50-NBS1) complex, participates in DNA double-strand break repair, a process that is critical for oocyte viability and primordial follicle maintenance throughout a woman's reproductive lifespan.

The T allele at rs13164856 is the common allele (approximately 71% in Europeans) and is the risk-associated allele, meaning the majority of women carry at least one copy. The effect is additive: each additional T allele modestly increases PCOS risk. Among the confirmed PCOS susceptibility loci, the IRF1/RAD50 locus stands out for its association specifically with testosterone levels rather than LH/FSH ratios, ovarian morphology, or other PCOS endophenotypes.
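
The claim that most women carry at least one T allele follows from the allele frequency. The sketch below assumes Hardy-Weinberg equilibrium (random mating, no selection), which is a simplification, and uses the approximate 71% European T-allele frequency quoted above; the function name is illustrative.

```python
def hardy_weinberg(freq_t: float) -> dict[str, float]:
    """Expected genotype frequencies for a biallelic T/C site under HWE."""
    freq_c = 1.0 - freq_t
    return {
        "TT": freq_t ** 2,
        "CT": 2.0 * freq_t * freq_c,
        "CC": freq_c ** 2,
    }


# T-allele frequency of ~0.71 is taken from the text.
genotypes = hardy_weinberg(0.71)
t_carriers = genotypes["TT"] + genotypes["CT"]

for genotype, freq in genotypes.items():
    print(f"{genotype}: {freq:.1%}")
print(f"At least one T allele: {t_carriers:.1%}")
```

Under these assumptions roughly half of women are TT, about 41% are CT, and about 8% are CC, so around nine in ten carry at least one T allele.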

The Evidence

Day et al. 2015 (Causal mechanisms and balancing selection inferred from genetic associations with polycystic ovary syndrome. Nature Communications 6:8464) conducted a GWAS of PCOS in up to 5,184 self-reported European-ancestry cases and 82,759 controls, followed by replication in approximately 2,000 clinically validated cases and 100,000 controls. The study identified six genome-wide significant loci including the RAD50/IRF1 5q31 locus (P=3.5×10⁻⁹). Mendelian randomisation analyses in the same paper supported causal roles for elevated BMI, insulin resistance, and reduced SHBG in PCOS aetiology.

A 2018 large-scale meta-analysis (Day et al. Large-scale GWAS meta-analysis of PCOS in 10,074 cases and 103,164 controls. PLOS Genetics, 2018) confirmed rs13164856 at genome-wide significance (P=1.45×10⁻¹⁰) with an odds ratio of 1.13 (95% CI 1.09–1.18). Crucially, in phenotypic analyses of PCOS endophenotypes, the IRF1/RAD50 locus was the only confirmed PCOS locus uniquely associated with testosterone levels, distinguishing it from loci that primarily affect gonadotropin ratios, ovulatory function, or ovarian morphology.

A replication attempt in Han Chinese (Peng et al. ERBB4 confers risk for PCOS in Han Chinese. Scientific Reports 2017) found that allele frequencies of rs13164856 were not significantly different between PCOS cases and controls in 1,500 Han Chinese cases and 1,220 controls, suggesting the association is European-ancestry specific or that the effect size is considerably smaller in East Asian populations.

The variant has no entry in ClinVar and is not listed in OMIM as pathogenic — it represents a common susceptibility allele with a modest but replicated effect in the expected direction.

Practical Implications

The T allele's association with testosterone levels positions this variant as relevant to both PCOS diagnosis workup and reproductive decision-making. Women who carry one or two copies of the T allele — the majority of European-ancestry women — may have a modestly elevated androgen set-point compared to CC carriers. In practice, this means that borderline elevated testosterone or free androgen index measurements may, in part, reflect genetic background rather than a pathological process.

From a monitoring perspective, women with the TT genotype who have irregular cycles, unwanted hair growth, or fertility difficulties should consider having total and free testosterone measured alongside anti-Müllerian hormone (AMH), as this genotype is specifically linked to the androgen-excess dimension of PCOS. The CC genotype (approximately 8% of European-ancestry women) does not provide protection from PCOS through other pathways, but does suggest that androgen excess specifically driven by this locus is less likely.

For offspring analysis, males who carry the T allele do not express the ovarian PCOS phenotype, but may pass the allele to daughters. The T allele does not have documented effects in males in the published GWAS literature.

Interactions

FSHB rs10553397: The FSHB locus (11p14.1) is another of the six Day et al. 2015 PCOS loci and is strongly associated with elevated LH and a high LH/FSH ratio — the hallmark neuroendocrine PCOS phenotype. A woman carrying both the rs13164856 T allele (elevated testosterone via the IRF1/RAD50 pathway) and a FSHB risk variant (elevated LH, suppressed FSH via pituitary dysregulation) would carry susceptibility through two distinct biological mechanisms: androgen-excess and gonadotropin imbalance. These represent the two most clinically prominent PCOS dimensions, and their combination may identify women with a more complete PCOS phenotype requiring both anti-androgen and gonadotropin-normalizing considerations in management.

FSHR rs6166: FSHR lies on chromosome 2, so it is a separate locus from rs13164856 on chromosome 5 near IRF1/RAD50. Women who carry the rs13164856 T allele together with the FSHR rs6166 GG (Ser/Ser) genotype face a compound challenge: elevated androgen levels (from rs13164856) combined with reduced FSH receptor sensitivity (from rs6166 GG). In an IVF context, this combination may predict both a PCOS-like ovarian phenotype and a relatively poor response to standard gonadotropin stimulation, warranting a tailored approach that addresses both dimensions.

ANRIL and the 9p21 Aging Accelerator — Senescence at the Heart of Cardiovascular Risk

A single region on chromosome 9, the 9p21.3 locus, carries the most robustly replicated genetic association with coronary artery disease ever discovered. The rs1333049 variant sits within CDKN2B-AS1, the gene encoding ANRIL (Antisense Non-coding RNA in the INK4 Locus, a long non-coding RNA of 3,834 bp spanning 126 kb of the genome), positioned directly adjacent to the cell-cycle-inhibitor genes CDKN2A (p16-INK4a, p14-ARF) and CDKN2B (p15-INK4b). This cluster is not merely a cardiovascular locus; it is the master switch for cellular senescence in vascular tissue.

The C allele at rs1333049 is present in roughly 47% of people of European, South Asian, and East Asian descent, but only 26% of those of African ancestry. This extraordinary prevalence makes 9p21 among the most impactful polygenic cardiovascular risk factors in the human genome.

The Mechanism

ANRIL maintains the proliferative, non-senescent state of vascular smooth muscle cells and macrophages by recruiting the Polycomb repressive complexes PRC1 and PRC2 to the CDKN2A/B loci, epigenetically suppressing p16-INK4a and p15-INK4b: PRC2 deposits H3K27me3 marks that silence p16 and p15, and PRC1 locks in this silencing. When this suppression is lifted, as happens in aging and atherosclerosis, cells exit the cell cycle and enter senescence, losing proliferative repair capacity and secreting pro-inflammatory factors that promote plaque vulnerability.

The risk C allele at rs1333049 alters ANRIL expression and splicing. The 9p21 risk haplotype disrupts enhancer elements within ANRIL's regulatory architecture, impairing the ANRIL–polycomb axis and allowing inappropriate early de-repression of p16 and p15 in vascular tissue. The consequence is accelerated vascular smooth muscle cell senescence and impaired repair capacity (studies in VSMCs from C-allele homozygotes showed lowest p16 and p15 expression alongside highest plaque burden), facilitating atherosclerotic plaque formation. Paradoxically, risk allele carriers show both dysregulated senescence induction and increased vascular inflammation, operating through a secondary interferon-γ pathway (Harismendy et al. 2011, Nature, showed 9p21 risk SNPs impair STAT1 binding at an IFN-γ-responsive enhancer).

The C allele is also associated with earlier coronary disease onset and higher cholesterol and triglyceride levels (CC carriers develop coronary disease 2–5 years earlier; the C allele increases total cholesterol, LDL-C, and triglycerides independently of lifestyle), effects that are independent of traditional cardiovascular risk factors.

The Evidence

The Samani et al. GWAS (NEJM 2007), a joint analysis of the Wellcome Trust Case Control Consortium and the German MI Family Study with 2,801 cases and 4,582 controls across the two cohorts, reported rs1333049 as the top hit for CAD genome-wide (combined p = 2.91×10⁻¹⁹), with risk increased by 36% per copy of the C allele (OR 1.36 per allele; OR ~1.90 for CC vs GG). This has been replicated in dozens of populations across four continents.
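
The per-allele and homozygote odds ratios are roughly consistent with each other. The sketch below assumes a multiplicative (log-additive) model in which each C allele multiplies the odds by the same factor; that model is an assumption made here for illustration, not necessarily the one used in the paper.

```python
def genotype_odds_ratios(per_allele_or: float) -> dict[str, float]:
    """Odds ratios relative to GG under a multiplicative per-allele model."""
    return {
        "GG": 1.0,                  # reference genotype, no risk alleles
        "CG": per_allele_or,        # one risk allele
        "CC": per_allele_or ** 2,   # two risk alleles
    }


# Per-allele OR of 1.36 is taken from the text.
ors = genotype_odds_ratios(1.36)
print(f"CC vs GG under a multiplicative model: {ors['CC']:.2f}")
```

Squaring 1.36 gives about 1.85, close to the directly estimated value of roughly 1.90 for CC versus GG.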

A 2011 INTERHEART and FINRISK analysis (8,114 INTERHEART participants from 52 countries plus 19,129 FINRISK participants), measured at the LD proxy rs2383206, found that the 9p21 effect on MI risk was strongest in individuals with the lowest prudent-diet scores (OR 1.32, p < 0.001), was attenuated at medium intake (OR 1.17), and was statistically eliminated in individuals consuming the highest amounts of raw vegetables, fruits, and berries (OR 1.02, p = 0.68). This is one of the most striking gene-diet interactions documented for a common cardiovascular variant.

Conversely, a study in a Hispanic cohort (n = 3,311; 1,560 MI cases and 1,751 controls from Costa Rica) found that high sugar-sweetened beverage intake exacerbated the 9p21 risk for myocardial infarction (measured at LD proxy rs4977574).

The locus also predicts disease severity: in the GRACE Genetics Study (3,247 ACS patients followed prospectively), C allele carriers had HR 1.48 for recurrent MI and cardiac death within 6 months of an acute coronary syndrome.

From the longevity angle, a study of Spanish centenarians (225 centenarians vs. 293 CAD-free healthy controls and 148 CAD controls) found a non-significant trend (p=0.088) toward lower C-allele frequency in the centenarians; this was not replicated in a Japanese cohort.

Practical Implications

The most actionable finding for C-allele carriers is the strong diet interaction. In the INTERHEART study, high raw vegetable and fruit intake dose-dependently attenuated and ultimately eliminated the genetic MI risk, suggesting that specific food choices, not merely broad lifestyle patterns, interact with the 9p21 locus biology. The operative component is not vague "healthy eating" but specifically a high load of raw, unprocessed plant foods (raw vegetables preserve heat-labile phytonutrients that may directly counter the ANRIL-mediated inflammatory pathway). Eliminating sugar-sweetened beverages is separately supported as an action to avoid amplifying genetic risk.

For CC homozygotes in particular, measuring fasting insulin levels may guide treatment decisions: in the Swedish Obese Subjects study, rs1333049 CC/CG carriers with elevated fasting insulin derived MI-preventing benefit from bariatric surgery (HR 0.72). Carriers with elevated fasting insulin may benefit especially from interventions that reduce hyperinsulinemia, including dietary carbohydrate restriction or, in severe obesity, bariatric surgery.

Given that 9p21 accelerates plaque formation rather than disease progression once plaque exists, coronary artery calcium (CAC) scoring is a rational targeted screening tool for C-allele carriers to detect subclinical atherosclerosis years before symptoms develop.

Interactions

rs1333049 is in near-perfect linkage disequilibrium with rs10757278 (r² = 1.0 in CEU) and high LD with rs4977574 (r² ≈ 0.89); these SNPs are inherited together as a haplotype block and measure the same biological effect. Testing any one of these SNPs captures essentially the same 9p21 risk signal.

The 9p21 locus shows documented interaction with shorter telomere length on coronary artery disease prognosis: among CAD patients, those combining the rs1333049 C allele with short telomeres have the worst cardiovascular outcomes (an additive interaction between 9p21 genotype and telomere length was found for CAD prognosis beyond either factor alone). Telomere length (rs12696304 in TERC) may therefore compound 9p21 risk for longevity, though advice on the combination requires results for both variants.

The 9p21 locus additionally interacts with insulin metabolism: fasting insulin levels modify who benefits from weight-loss intervention, and dietary prudence modifies the genetic risk itself. This bidirectional nutrient-gene-metabolic interaction makes this locus unusually amenable to targeted dietary management.

MCM6 -14010G>C — The African Lactase Persistence Switch

Lactase persistence — the ability to digest milk sugar into adulthood — is one of the most dramatic examples of recent human evolution. While the European variant rs4988235 is the most studied, at least five independent mutations arose on separate continents to produce the same outcome. The -14010G>C variant (rs145946881) is the East African solution to the same selective pressure, and it tells a story of convergent evolution every bit as compelling as the European lineage.

The Mechanism

The MCM6 gene sits immediately upstream of LCT on chromosome 2. Although MCM6 encodes a DNA replication protein (Minichromosome Maintenance Complex Component 6, part of the helicase that unwinds DNA before replication), the relevant function here is entirely regulatory: a region within MCM6 intron 13 acts as a long-range enhancer for the LCT gene roughly 13.9 kb away.

The -14010G>C SNP is located between binding sites for the transcription factors Oct-1 and HNF1α. The derived C allele creates stronger affinity for Oct-1, which in turn holds the enhancer in its active state and prevents the normal post-weaning silencing of LCT expression. Jensen et al. 2011 (The -14010*C variant associated with lactase persistence is located between an Oct-1 and HNF1alpha binding site and increases lactase promoter activity. Hum Genet. 2011 Oct;130(4):483-93) confirmed this through transfection experiments showing the C allele enhances LCT promoter activity compared to the ancestral G allele.

The Evidence

The discovery of this variant in African populations came from Tishkoff et al. 2007 (Convergent adaptation of human lactase persistence in Africa and Europe. Nat Genet. 2007 Jan;39(1):31-40), who genotyped 470 Tanzanians, Kenyans, and Sudanese. The C-14010 allele showed the strongest association of all 123 tested SNPs in Kenyan Nilo-Saharan and Tanzanian populations (p = 2.9 × 10⁻⁷, highly significant after Bonferroni correction). Extended haplotype homozygosity tracts of >2 Mb around this locus indicate strong recent positive selection over approximately 7,000 years, a timeframe consistent with the spread of pastoralism across East Africa.
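
To see why the association survives multiple-testing correction, the Bonferroni threshold can be worked out directly. This sketch assumes a conventional family-wise alpha of 0.05, which the text does not state, and corrects for the 123 tested SNPs; the variable names are illustrative.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


ALPHA = 0.05          # assumed family-wise error rate (not given in the text)
N_SNPS = 123          # number of SNPs tested, from the text
P_OBSERVED = 2.9e-7   # reported p-value for the C-14010 allele

threshold = bonferroni_threshold(ALPHA, N_SNPS)
survives = P_OBSERVED < threshold

print(f"Bonferroni threshold: {threshold:.2e}")
print(f"Association survives correction: {survives}")
```

The corrected threshold is about 4 × 10⁻⁴, so the observed p-value clears it by roughly three orders of magnitude.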

Ranciaro et al. 2014 (Genetic origins of lactase persistence and the spread of pastoralism in Africa. Am J Hum Genet. 2014 Apr;94(4):496-510) sequenced 819 Africans from 63 populations and confirmed the eastern African origin of the C-14010 haplotype, tracking its spread into southern Africa (present in !Xhosa at 14.3%) on the same haplotype background as Kenyan and Tanzanian carriers.

Frequencies vary enormously by ethnic group: Kenyan Yaaku (53.6%), Tanzanian Afroasiatic speakers (42.1%), Kenyan Nilo-Saharan groups (27.9%), Kenyan Boni (20.8%), !Xhosa (14.3%), Tanzanian Sandawe (14.5%). The variant is essentially absent outside Africa and is not carried by European, East Asian, South Asian, or Latino populations.

Practical Actions

This variant has the same practical implications as the European rs4988235: individuals with the GG genotype (ancestral non-persistence) in African ancestry populations are likely to be lactose non-persistent and may experience symptoms — bloating, gas, cramps, or diarrhea — when consuming lactose-containing foods. The severity varies; small amounts, fermented dairy, and hard cheeses are often tolerated even without lactase enzyme. Lactase enzyme supplements and lactose-free dairy products provide reliable management strategies.

Interactions

This variant is one of at least five independent lactase persistence mutations in humans. The companion variants rs41380347 (G-13915, East African), rs41525747 (C-13907, East African), and the European rs4988235 (T-13910) are all functional in the same MCM6 enhancer region. Individuals of East African ancestry may carry one or more of these variants. Since all act through the same Oct-1/HNF1α enhancer mechanism to maintain LCT expression, compound carriage of two persistence alleles would not add incremental benefit — each allele is sufficient on its own to maintain lactase production. For comprehensive lactase persistence testing in people of African descent, all five variants should be evaluated together.

CA1 rs1532423 — A Zinc-Sequestration Variant in the Carbonic Anhydrase Cluster

Most serum zinc tests measure what's circulating in your plasma, but your body's largest zinc reservoir sits inside your red blood cells. The CA1 gene encodes carbonic anhydrase 1, an enzyme whose catalytic core is built around a single tightly-bound zinc ion. This zinc-containing metalloenzyme is found at extraordinarily high concentrations inside red blood cells (approximately 0.5 mM, or nearly 50 mg of CA1 protein per 100 mL of packed red cells), making it one of the most abundant proteins in the human erythrocyte. This puts CA1 at the interface of zinc biology and red blood cell metabolism in a way that few other genes can.

The rs1532423 variant, located within an intron of CA1 on chromosome 8q21.2, was identified in the first genome-wide association study of blood trace elements as one of only three loci reaching genome-wide significance for circulating zinc, alongside a variant in PPCDC on chromosome 15 and one on the X chromosome. The P-value of 6.40 × 10⁻¹² from 5,477 Australian and UK participants makes a chance association very unlikely; the molecular pathway it reflects is the more interesting question.

The Mechanism

CA1 and its closely related paralog CA2 sit side-by-side on chromosome 8q21.2, with CA2 (carbonic anhydrase 2) located approximately 108 kilobases downstream of rs1532423. Both enzymes catalyze the reversible hydration of carbon dioxide into bicarbonate and a proton — a reaction critical for gas exchange, pH regulation, and bicarbonate reabsorption in the kidney proximal tubule.

The key zinc biology is straightforward: each molecule of CA1 coordinates one zinc ion at its active site, using it as a Lewis acid to polarize the water molecule that attacks CO₂. Red blood cells contain roughly 330 g/L of hemoglobin and approximately 15–25 mg of carbonic anhydrase per 100 mL (by mass, carbonic anhydrase constitutes 1–2% of total erythrocyte protein, making it the dominant zinc-chelating protein in whole blood). Because CA1 is expressed at this enormous concentration exclusively inside erythrocytes, any variant that alters CA1 expression or enzyme turnover directly changes how much zinc is sequestered in red blood cells versus circulating freely in plasma.

The rs1532423 variant is intronic and does not change any amino acid in CA1 or CA2. Intronic variants in this region can influence zinc-handling through at least two mechanisms: (1) altered transcription factor binding in regulatory elements embedded in the intron, changing CA1 expression level and thus erythrocyte zinc capacity; (2) modified splicing efficiency for adjacent exons, changing the proportion of active versus inactive CA1 isoforms. The precise mechanism has not been established in functional studies.

The Evidence

The foundational evidence comes from Evans DM et al. 2013 (Evans DM, Zhu G, Dy V, et al. Genome-wide association study identifies loci affecting blood copper, selenium and zinc. Hum Mol Genet, 2013), a study that combined two cohorts from the Queensland Institute of Medical Research (QIMR, 4,012 Australian twins and siblings) and the ALSPAC study (1,465 UK pregnant women) in a GWAS of trace element concentrations in whole blood. For zinc, three independent loci reached genome-wide significance. The chromosome 8 signal, marked by rs1532423, achieved P = 6.40 × 10⁻¹², the most significant of the three zinc loci. The authors specifically noted that "the chromosome 8 locus for Zn contains multiple genes for the Zn-containing enzyme carbonic anhydrase," identifying the erythrocyte carbonic anhydrase system as the probable biological explanation.

rs1532423 has since been used as a standard genetic instrument for zinc status in Mendelian randomization studies (a method that uses genetic variants as natural experiments to test whether a biomarker causally influences a health outcome, analogous to a randomized trial but using nature's randomization of alleles at conception). A 2018 two-sample MR analysis (Thun GA et al. Effects of copper and zinc on ischemic heart disease and myocardial infarction: a Mendelian randomization study. Am J Clin Nutr, 2018) found that genetically instrumented higher zinc was associated with slightly elevated ischemic heart disease risk (OR 1.06; 95% CI 1.02–1.11). This finding is consistent with the view that zinc follows a U-shaped curve: both deficiency and excess associate with adverse outcomes, and the goal is optimization rather than maximization.

In COVID-19 pandemic research, Li et al. 2022 (Genetically Predicted Circulating Concentrations of Micronutrients and COVID-19 Susceptibility and Severity. Front Nutr, 2022) included rs1532423 alongside rs2120019 as genetic instruments for zinc, finding limited evidence that genetically predicted zinc levels causally alter COVID-19 susceptibility or hospitalization (OR 1.06, 95% CI 0.81–1.39), though the study was underpowered for severe outcomes.
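
The two Mendelian randomization results share the same point estimate (OR 1.06) but lead to different conclusions because of their confidence intervals. The sketch below applies the usual rule of thumb that a 95% CI excluding 1.0 indicates nominal significance at the 5% level; the helper name is illustrative, and the rule ignores multiple-testing and instrument-validity concerns.

```python
def ci_excludes_null(lower: float, upper: float, null_value: float = 1.0) -> bool:
    """True if the confidence interval does not contain the null value."""
    return not (lower <= null_value <= upper)


# 95% confidence intervals quoted in the text for the two MR analyses.
results = {
    "ischemic heart disease": (1.02, 1.11),
    "COVID-19 outcomes": (0.81, 1.39),
}

significant = {
    outcome: ci_excludes_null(lower, upper)
    for outcome, (lower, upper) in results.items()
}

for outcome, flag in significant.items():
    print(f"{outcome}: CI excludes 1.0 -> {flag}")
```

The narrow heart-disease interval sits entirely above 1.0, while the wide COVID-19 interval spans it, which is why the latter is described as limited evidence.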

Practical Actions

The clinical significance of rs1532423 is that it shifts where in the blood zinc resides — and therefore what standard plasma zinc tests may or may not reveal. Because CA1 binds zinc inside erythrocytes, variants that reduce CA1 expression could lower whole-blood zinc without affecting plasma zinc (and vice versa). This is relevant for interpreting zinc blood tests: whole-blood zinc captures the erythrocyte pool; plasma zinc reflects extracellular status.

For dietary management, the fundamental actions are the same regardless of which allele drives the effect: ensure consistent dietary zinc adequacy, choose bioavailable forms, and monitor status periodically if diet is restrictive or if symptoms suggest deficiency (frequent infections, impaired wound healing, altered taste or smell). Supplementation should be calibrated to confirmed deficiency, not to genotype alone; Mendelian randomization data argue against pushing zinc above the population-normal range.

Interactions

rs1532423 is one of three established genome-wide significant zinc loci. The other two are rs2120019 (PPCDC, chromosome 15) and rs11638477 (chromosome X). All three are independent signals — they operate through different biological mechanisms and their effects on zinc status would compound. Individuals carrying zinc-lowering alleles at two or more of these loci have a larger genetically anchored downward shift in zinc than those carrying a single zinc variant.

CA2, located 108 kb downstream of rs1532423 on the same chromosome, encodes carbonic anhydrase 2 — an enzyme expressed in the kidney proximal tubule where it regulates bicarbonate reabsorption and mineral handling. Pathogenic CA2 mutations cause osteopetrosis with renal tubular acidosis (OMIM 259730), a rare recessive condition. rs1532423 has no known effect on this pathway, but the proximity means variants in strong LD with rs1532423 might also tag CA2 regulatory regions; this has not been specifically studied.