Footprint QTLs Show How Noncoding Variants Disrupt TF Binding, Drive Disease Risk in Liver


Researchers have uncovered new insights into how genetic variants in noncoding regions of the genome can contribute to disease risk by disrupting transcription factor (TF) binding. The approach they used, footprint quantitative trait locus (fpQTL) mapping, could become a powerful tool for identifying causal regulatory variants across tissues, not just in the liver.

In a recent collaboration between Children’s Hospital of Philadelphia (CHOP) and Penn Medicine, researchers identified 809 fpQTLs using a high-resolution method that combines ATAC-seq with deep learning to detect DNA-protein interaction changes at base-pair precision. Their study, “Characterization of non-coding variants associated with transcription factor binding through ATAC-seq-defined footprint QTLs in liver,” was published in the American Journal of Human Genetics.

Decoding the genome’s regulatory “dark matter”

While genome-wide association studies (GWAS) have linked thousands of single nucleotide polymorphisms (SNPs) to complex traits and diseases, more than 90% of these variants fall in noncoding regions. These areas, once thought to be genomic “dark matter,” are now known to house regulatory sequences critical for controlling gene expression.

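However, pinpointing which noncoding variants are functionally important has remained a challenge. Linkage disequilibrium means that neighboring variants tend to be inherited together, so an association signal rarely singles out one causal SNP, and direct functional data are often lacking.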

These noncoding variants are often enriched in regulatory regions containing transcription factor binding motifs. This pattern suggests that disrupted TF binding may be a core mechanism driving disease risk, the authors noted. In other words, if a variant alters the DNA sequence where a TF typically binds, it can dysregulate gene expression, sometimes in ways that may contribute to disease. Such variants may exert their effects not by altering proteins, but by rewiring the regulatory logic of gene expression.

Introducing footprint QTLs: a new mapping strategy

To explore this regulatory connection, the researchers applied an ATAC-seq-based method to reveal these less-understood regions of the genome. After ATAC-seq, the researchers used PRINT, a deep-learning-based approach that detects TF “footprints” by identifying regions of DNA where bound TFs protect against Tn5 transposase insertion. This footprint approach allows researchers to infer where and how strongly TFs bind without needing to know the TF identity in advance.
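The footprint idea can be illustrated with a much simpler calculation than PRINT itself. The sketch below is not the PRINT model; it is a classic flank-versus-center depletion score, and the function name, window sizes, pseudocount, and insertion counts are all hypothetical values chosen for illustration. A positive score means fewer Tn5 insertions inside the candidate site than in its flanks, which is the signature expected where a bound protein shields the DNA.

```python
import math


def footprint_score(insertions, center_start, center_end, flank=10, pseudocount=1.0):
    """Log2 ratio of mean flank insertions to mean center insertions.

    `insertions` is a list of per-base Tn5 insertion counts (hypothetical data).
    The candidate protected site is the half-open interval
    [center_start, center_end).
    """
    left = insertions[max(0, center_start - flank):center_start]
    right = insertions[center_end:center_end + flank]
    center = insertions[center_start:center_end]
    flanks = left + right
    if not center or not flanks:
        raise ValueError("center and flank windows must both be non-empty")
    flank_mean = sum(flanks) / len(flanks)
    center_mean = sum(center) / len(center)
    # The pseudocount keeps the ratio finite when the center has no insertions.
    return math.log2((flank_mean + pseudocount) / (center_mean + pseudocount))


# Made-up profile: open flanks surrounding an 8-bp protected stretch.
protected = [5, 6, 4, 5, 7, 6, 5, 4, 6, 5,
             0, 1, 0, 0, 1, 0, 0, 1,
             6, 5, 4, 6, 5, 7, 5, 4, 6, 5]
# Made-up profile with uniform accessibility and no protection.
flat = [3] * 28

protected_score = footprint_score(protected, 10, 18)
flat_score = footprint_score(flat, 10, 18)
print(f"protected: {protected_score:.2f}, flat: {flat_score:.2f}")
```

A per-sample score of this kind is the sort of quantitative trait that can then be tested against genotype, although the study's own scores come from the deep-learning model rather than a raw ratio.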

In 170 human liver samples, the team scanned open chromatin regions for associations between SNP genotypes and TF binding likelihood, ultimately identifying 809 significant fpQTLs. These fpQTLs were highly enriched near transcription start sites and known ChIP-seq peaks. Importantly, many fpQTLs overlapped with known eQTLs and GWAS loci for liver traits including total cholesterol and liver enzyme levels.
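To make the association step concrete, here is a minimal sketch of testing a single SNP against a single footprint. It assumes an additive genotype coding (0, 1, or 2 copies of the alternate allele), an ordinary least-squares slope, and a permutation p-value; the article does not specify the study's statistical model, covariates, or multiple-testing correction, so none of this should be read as the authors' pipeline. All sample values are invented.

```python
import random


def slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("genotypes must vary across samples")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def permutation_pvalue(genotypes, scores, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the genotype effect on footprint score."""
    rng = random.Random(seed)
    observed = abs(slope(genotypes, scores))
    shuffled = list(scores)
    hits = 0
    for _ in range(n_perm):
        # Shuffling scores breaks any real genotype-score relationship.
        rng.shuffle(shuffled)
        if abs(slope(genotypes, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Invented data: 12 samples, alternate-allele dosage and footprint scores.
genotypes = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
scores = [2.1, 1.9, 2.3, 2.0, 1.4, 1.6, 1.2, 1.5, 0.8, 0.6, 0.9, 0.7]

effect = slope(genotypes, scores)
p_value = permutation_pvalue(genotypes, scores)
print(f"effect per allele: {effect:.4f}, permutation p: {p_value:.4f}")
```

In this toy example each copy of the alternate allele lowers the footprint score, the pattern expected when a variant weakens TF binding. A genome-wide scan would repeat such a test across many SNP-footprint pairs and would need to correct for multiple testing.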

Compared to traditional QTL or GWAS approaches, fpQTLs offer the ability to fine-map regulatory variants with clinical relevance. Moreover, ATAC-seq footprinting has practical advantages over ChIP-seq, including lower cost, greater technical uniformity, and smaller sample input requirements.

Looking ahead

By extending this approach to additional organs and diseases, researchers may be able to build tissue-specific regulatory maps that connect noncoding variation to disease mechanisms and, ultimately, to therapeutic targets. As the biotech field continues to embrace functional genomics, fpQTLs, in combination with deep-learning methods, may represent a promising bridge between association studies and actionable biology.