Single-cell Regulatory Occupancy Archive in Dementia

scROAD - A database of single-cell cCRE transcription factor occupancy data

Database: Single-cell Regulatory Occupancy Archive in Dementia (scROAD)

How can scATAC-seq data be utilized to study gene regulation and chromatin accessibility in diseases?

scATAC-seq data enables us to explore chromatin accessibility at a single-cell level, allowing for the identification of regulatory elements that are active in specific cell types. This technique is particularly powerful for investigating transcription factor binding sites (TFBS) and understanding gene regulatory networks in the context of complex diseases like Alzheimer’s Disease (AD) and Pick’s Disease (PiD).

Our comprehensive approach integrates genomic data from various sources, including GWAS fine-mapping, single-nucleus ATAC-seq (snATAC-seq), and single-nucleus RNA-seq (snRNA-seq), to uncover the genetic underpinnings of neurodegenerative diseases. By mapping disease-associated loci to specific cell types, we can investigate the functional alterations caused by genetic variants in disease contexts such as AD and PiD.
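As a rough illustration of what mapping disease-associated loci to cell types can look like, the sketch below overlaps fine-mapped variants with cell-type-specific accessible peaks. It is a minimal sketch, not the pipeline used for scROAD: the variant list, the peak sets, the `variants_in_peaks` helper, and the `min_pip` cutoff are all hypothetical and chosen only for illustration.

```python
# Hypothetical fine-mapped variants: (chrom, position, posterior inclusion probability).
variants = [("chr1", 1200, 0.62), ("chr1", 5000, 0.91), ("chr2", 800, 0.15)]

# Hypothetical cell-type peak sets: cell type -> list of (chrom, start, end), half-open.
peaks = {
    "microglia": [("chr1", 1000, 1500), ("chr2", 700, 900)],
    "astrocyte": [("chr1", 4800, 5200)],
}


def variants_in_peaks(variants, peaks, min_pip=0.5):
    """Return, per cell type, the confident variants that fall inside a peak."""
    hits = {}
    for cell_type, intervals in peaks.items():
        hits[cell_type] = [
            (chrom, pos)
            for chrom, pos, pip in variants
            # Keep variants above the (assumed) PIP cutoff that overlap any peak.
            if pip >= min_pip
            and any(c == chrom and s <= pos < e for c, s, e in intervals)
        ]
    return hits


hits = variants_in_peaks(variants, peaks)
for cell_type, found in hits.items():
    print(cell_type, found)
```

A real analysis would use an interval index rather than this quadratic scan, but the logic is the same: a variant is assigned to a cell type when it lies in a peak that is accessible in that cell type.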

I developed a wrapper package, scROAD, that streamlines extracting information from single-cell data generated by CellRanger ATAC, Seurat, and Signac and passing it to Cicero and TOBIAS for comprehensive analysis. We performed single-cell co-accessibility analyses using Cicero to construct putative cis-regulatory enhancer-promoter links. Additionally, with the help of the TOBIAS package, we can explore transcription factor (TF) binding occupancy in ATAC-seq data. This analysis allows us to detect differences in TF binding between disease and control samples, providing insights into how regulatory mechanisms are altered in specific cell types. The scROAD interactive database integrates the TF binding data with the co-accessibility analyses, so users can easily explore TF binding activity and its implications in disease, providing a valuable resource for understanding gene regulation in neurodegeneration.
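The integration step described above can be pictured as a join between co-accessibility links and footprinting calls. The following is a minimal sketch under stated assumptions, not the scROAD implementation: the link tuples, the promoter annotation, the per-peak TF calls, the `tf_target_links` helper, and the `min_score` cutoff are all hypothetical.

```python
from collections import defaultdict

# Hypothetical Cicero-style links: (peak_a, peak_b, co-accessibility score).
links = [
    ("chr1:1000-1500", "chr1:9000-9500", 0.42),
    ("chr1:3000-3500", "chr1:9000-9500", 0.08),
    ("chr2:500-900", "chr2:7000-7400", 0.35),
]

# Hypothetical promoter annotation: promoter peak -> gene.
promoters = {"chr1:9000-9500": "GENE_A", "chr2:7000-7400": "GENE_B"}

# Hypothetical footprinting calls: peak -> TFs called as bound in that peak.
bound_tfs = {
    "chr1:1000-1500": {"TF_X"},
    "chr1:3000-3500": {"TF_Y"},
    "chr2:500-900": {"TF_X", "TF_Z"},
}


def tf_target_links(links, promoters, bound_tfs, min_score=0.2):
    """Map each TF to genes whose promoter is co-accessible with a peak it occupies."""
    targets = defaultdict(set)
    for peak_a, peak_b, score in links:
        if score < min_score:
            continue  # drop weak co-accessibility links (cutoff is an assumption)
        # Links are undirected, so try both orientations.
        for enhancer, promoter in ((peak_a, peak_b), (peak_b, peak_a)):
            gene = promoters.get(promoter)
            if gene is None:
                continue
            for tf in bound_tfs.get(enhancer, ()):
                targets[tf].add(gene)
    return {tf: sorted(genes) for tf, genes in targets.items()}


tf_targets = tf_target_links(links, promoters, bound_tfs)
print(tf_targets)
```

Here `TF_Y` drops out because its only link falls below the score cutoff, which shows how the co-accessibility filter and the occupancy calls constrain each other.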

About the Database

This database (Shi et al., 2024) offers comprehensive single-cell cCRE transcription factor occupancy data generated from snATAC-seq analysis of human postmortem prefrontal cortex (PFC) tissue. The data focus specifically on Alzheimer’s Disease and Pick’s Disease. For a more in-depth understanding of the database’s purpose and contents, please refer to the publication listed under References. If you have any further questions, feel free to contact the principal investigator.

Fig. 1 GWAS snATAC scCis-TF occupancy plot showcasing key transcription factor binding sites

Unveiling the Power of TF Footprinting: A New Frontier in Regulatory Network Analysis for Neurodegenerative Diseases

Introduction

Understanding transcription factor (TF) binding is essential for unraveling the complex regulatory networks underlying gene expression. Techniques like scATAC-seq offer unprecedented opportunities to study chromatin accessibility at the single-cell level, shedding light on gene regulation and its dysregulation in disease. However, existing methods for analyzing TF networks often fail to distinguish functional TF binding events from non-functional motifs. In this post, we explore a novel approach that leverages TF footprinting to overcome these limitations, offering new insights into neurodegenerative diseases like Alzheimer’s Disease (AD) and Pick’s Disease (PiD).

Challenges in Current Methods

Widely used methods such as SCENIC (Aibar et al., 2017) and SCENIC+ (Bravo González-Blas et al., 2023) rely on conventional motif enrichment analyses to construct gene regulatory networks (GRNs). While SCENIC uses promoter regions and co-expression patterns from scRNA-seq data, SCENIC+ extends this by integrating scATAC-seq and scRNA-seq data to link enhancers to target genes.

However, these methods have notable limitations:

  • They infer TF activity indirectly through overrepresented motifs, which can lead to false positives.

  • They struggle to distinguish functional enhancer-TF interactions from non-functional motifs.

For instance, a multi-omic study of AD (Mathys et al., 2024) demonstrated the utility of SCENIC in identifying cell-type-level TF regulators from AD snRNA-seq data. SCENIC+ builds on this by using pycistarget, which includes a wrapper for HOMER, for enhancer motif enrichment. Because both approaches infer binding from motif enrichment rather than from evidence of occupancy, they often fail to identify true TF binding events.
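To make the indirect nature of motif enrichment concrete, here is a minimal sketch of a generic over-representation test based on the hypergeometric distribution. It is not the ranking-based procedure SCENIC or SCENIC+ actually use; the function name `hypergeom_sf` and the counts are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n regions are drawn from N, of which K contain the motif."""
    upper = min(K, n)
    total = comb(N, n)
    # Exact integer arithmetic, converted to a float only at the end.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


# Hypothetical counts: 1000 background regions, 100 of them with a motif hit;
# 50 cell-type-specific regions, 12 of them with a motif hit (5 expected by chance).
p_value = hypergeom_sf(k=12, N=1000, K=100, n=50)
print(f"enrichment p-value: {p_value:.4g}")
```

A small p-value only says the motif is more frequent in the region set than expected by chance. It does not say which of the 12 motif instances, if any, is actually occupied by the TF, which is the gap that footprinting aims to close.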

Limitations of Open Chromatin as a Regulatory Marker

A common assumption in chromatin accessibility studies is that open chromatin regions correlate with active regulatory elements. However, recent findings challenge this notion.

Studies like Xiong et al., 2023 revealed that increased chromatin accessibility in neurodegenerative diseases often reflects chromatin relaxation rather than functional regulation. Similarly, Frost et al., 2014 observed chromatin relaxation and heterochromatin loss in tauopathies such as PiD and AD. These ‘relaxed’ chromatin regions may not indicate meaningful regulatory activity, as they often lack functional TF binding.

Further complicating the picture, Baek et al., 2017 reported that 80% of TF binding motifs do not show measurable footprints, suggesting that open chromatin alone is an unreliable indicator of active regulation. Methods like SCENIC and SCENIC+ may still infer TF activity in these regions, potentially resulting in false positives.

This phenomenon is illustrated in Fig. 2 below, where we depict two scenarios: (1) open chromatin regions with TF binding (‘functional regulation’) and (2) open chromatin regions without TF binding (‘chromatin relaxation’).

Fig. 2 shows two key scenarios of transcription factor binding in open chromatin regions. Scenario 1 highlights regions where open chromatin coincides with functional TF binding, while Scenario 2 depicts regions with open chromatin but without active TF binding, indicating chromatin relaxation rather than regulation.
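The two scenarios can be sketched with a toy footprint measure: a bound TF protects its motif from Tn5 cuts, so the cut signal dips at the motif relative to the flanks. This is a simplified illustration, not the TOBIAS scoring formula; the signal vectors, the `footprint_depth` and `classify` helpers, and both thresholds are hypothetical.

```python
from statistics import mean


def footprint_depth(cuts, motif_start, motif_end, flank=10):
    """Return (mean flank signal, flank mean minus motif-center mean)."""
    left = cuts[max(0, motif_start - flank):motif_start]
    right = cuts[motif_end:motif_end + flank]
    flank_mean = mean(left + right)
    center_mean = mean(cuts[motif_start:motif_end])
    return flank_mean, flank_mean - center_mean


def classify(cuts, motif_start, motif_end, min_open=2.0, min_depth=1.0):
    """Label a motif site using assumed accessibility and footprint-depth cutoffs."""
    flank_mean, depth = footprint_depth(cuts, motif_start, motif_end)
    if flank_mean < min_open:
        return "closed"
    return "functional regulation" if depth >= min_depth else "chromatin relaxation"


# Hypothetical per-base cut counts around an 8 bp motif at positions 10..17.
bound = [5] * 10 + [1] * 8 + [5] * 10    # open flanks, protected motif
relaxed = [5] * 28                        # open everywhere, no protection
closed = [0] * 28                         # no accessibility at all

for name, cuts in (("bound", bound), ("relaxed", relaxed), ("closed", closed)):
    print(name, "->", classify(cuts, 10, 18))
```

Both the bound and the relaxed site would count as "open" in an accessibility-only analysis; only the dip over the motif separates Scenario 1 from Scenario 2.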

To address these challenges, we reimplement a bulk ATAC-seq method for single-cell data, incorporating TF footprinting and motif-flanking accessibility through the TOBIAS package. This approach offers a more accurate and confident way to identify active TF binding events.

  • TOBIAS calculates TF occupancy across all accessible chromatin regions, allowing us to:

    • Distinguish TF-occupied enhancers from non-functional motifs.

    • Assess footprinting at binding sites and motif-flanking accessibility.

    • Compare TF occupancy between disease and control conditions.

  • Key Advantages of the TF Footprinting Approach for Disease-Related Regulatory Networks:

    • Direct Measurement of TF Occupancy: Unlike SCENIC and SCENIC+, which rely on motif enrichment, our method directly measures TF binding, reducing false positives.

    • Higher-Resolution Insights: By distinguishing functional enhancer-TF interactions from non-functional motifs, we provide a more accurate view of regulatory networks.

    • Enhanced Integration: Combining snATAC-seq and snRNA-seq data allows for comprehensive analyses of TF-mediated regulation in specific cell types.
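One way to picture the TOBIAS part of the workflow is the standard three-step command sequence (ATACorrect, ScoreBigwig, BINDetect) run on one BAM file per condition. The sketch below only assembles the command lines and does not run them. It assumes that single-cell reads have already been aggregated into pseudobulk BAM files per cell type and condition, which is an assumption of this example and not a description of the scROAD code; all file names and the `tobias_commands` helper are hypothetical.

```python
import shlex
from pathlib import Path


def tobias_commands(bams, genome, peaks, motifs, outdir="tobias_out", cores=4):
    """Build TOBIAS command lines for a dict of condition name -> pseudobulk BAM."""
    out = Path(outdir)
    commands, footprints = [], []
    for condition, bam in bams.items():
        cond_dir = out / condition
        # ATACorrect names its corrected track after the BAM file prefix.
        corrected = cond_dir / f"{Path(bam).stem}_corrected.bw"
        footprint = cond_dir / f"{condition}_footprints.bw"
        commands.append(["TOBIAS", "ATACorrect", "--bam", str(bam),
                         "--genome", genome, "--peaks", peaks,
                         "--outdir", str(cond_dir), "--cores", str(cores)])
        commands.append(["TOBIAS", "ScoreBigwig", "--signal", str(corrected),
                         "--regions", peaks, "--output", str(footprint),
                         "--cores", str(cores)])
        footprints.append(str(footprint))
    # BINDetect compares footprint scores between the listed conditions.
    commands.append(["TOBIAS", "BINDetect", "--motifs", motifs,
                     "--signals", *footprints, "--genome", genome,
                     "--peaks", peaks, "--outdir", str(out / "bindetect"),
                     "--cond_names", *bams.keys(), "--cores", str(cores)])
    return commands


# Hypothetical inputs for one cell type in disease and control samples.
commands = tobias_commands(
    {"AD": "AD_microglia.bam", "Control": "Control_microglia.bam"},
    genome="genome.fa", peaks="microglia_peaks.bed", motifs="motifs.jaspar",
)
for cmd in commands:
    print(shlex.join(cmd))
```

Each printed line could then be passed to `subprocess.run` once TOBIAS is installed and the input files exist. Keeping command construction separate from execution makes it easy to inspect the plan for every cell type before committing compute time.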

Implications for Neurodegenerative Disease Research

This method holds significant promise for advancing our understanding of transcriptional regulation in neurodegenerative diseases like AD and PiD. By uncovering true TF binding events and regulatory interactions, we can identify novel therapeutic targets and gain new insights into disease mechanisms.

For example, this approach can help clarify the role of chromatin accessibility changes in neurodegeneration, distinguishing functional regulation from non-functional chromatin relaxation. Additionally, it provides a valuable resource for exploring TF activity and its implications in disease contexts.

Conclusion

This functional TF footprinting approach represents a major step forward in regulatory network analysis, addressing the limitations of conventional methods like SCENIC and SCENIC+. By leveraging tools like TOBIAS to measure TF occupancy directly, we can distinguish functional regulatory events from non-functional motifs, providing deeper insights into gene regulation in neurodegenerative diseases. As we continue to refine these methods, we expect them to play an important role in genomic research and therapeutic discovery.

Running the code for Single-cell Regulatory Occupancy Analysis


Code will be updated here soon

References

2024

  1. bioRxiv
    Single-nucleus multi-omics identifies shared and distinct pathways in Pick’s and Alzheimer’s disease
    Zechuan Shi, Sudeshna Das, Samuel Morabito, and 12 more authors
    bioRxiv, 2024