Hello World!
Human disease, genomics, and AI-augmented multi-omic discovery.
I am ZeChuan Shi, Ph.D., a computational biologist and bioinformatics scientist working at the intersection of human genetics, single-cell multi-omics, statistical modeling, machine learning, and software development. My work uses large-scale genomics and transcriptomics data to understand disease mechanisms, prioritize therapeutic hypotheses, and build reproducible computational tools for functional genomics discovery.
I earned my Ph.D. in Bioinformatics from the University of California, Irvine, where I worked in Dr. Vivek Swarup's lab on computational genomics and multi-omics modeling for disease biology. My doctoral work integrated single-nucleus RNA-seq, single-nucleus ATAC-seq, spatial transcriptomics, and large disease cohort datasets to characterize cell-type-specific gene expression, chromatin accessibility, regulatory programs, and disease-associated molecular signatures. Across these projects, I helped architect workflows for datasets with more than 2 million nuclei and contributed to studies of Alzheimer's disease, Pick's disease, and related human diseases.
Background - From Wet-Lab Biology To Computational Genomics
My training began in biotechnology and experimental disease biology. I earned my M.S. in Biotechnology from Johns Hopkins University, with a focus on bioinformatics, molecular targets, drug discovery technologies, and oncology. After Hopkins, I worked as a preclinical research specialist at the University of Pennsylvania in Dr. Lewis Chodosh's lab, contributing to a pharma-academic partnership for recurrent breast cancer drug development.
That role connected experimental therapeutics with computational analysis. I helped design and execute large-scale preclinical studies involving HER2+ transgenic models, treatment-response experiments, tumor-infiltrating immune cell profiling, circulating tumor cell analysis, and RNA-seq studies of therapeutic response. It shaped how I think about computational biology: strong analysis should stay close to the biological system, the experimental design, and the therapeutic question.
During my Ph.D., I moved deeper into computational genomics, single-cell multi-omics, and network biology. I developed and applied statistical and machine-learning approaches to model gene regulatory structure, perturbation responses, and disease-associated cellular programs. My work contributed to 15+ peer-reviewed publications, including studies in Science Advances, Nature Genetics, Nature Neuroscience, Immunity, Molecular Neurodegeneration, and STAR Protocols.
Focus Today
My current work centers on a simple thesis: human disease biology becomes more actionable when we can connect genetic evidence, cell-type-specific regulatory programs, network structure, and perturbation response. I build computational approaches that move from descriptive multi-omic maps toward testable, mechanistic hypotheses.
Human Genetics And Disease Mechanisms
- Integrating GWAS, eQTL analysis, disease-associated regulatory variation, and single-cell profiles to connect genetic risk with cell-type-specific biology.
- Studying how gene expression, chromatin accessibility, transcription factor activity, and co-expression modules change across human disease states.
- Prioritizing disease mechanisms and therapeutic hypotheses from large-scale clinical, NGS, and multi-omic datasets.
Single-Cell And Multi-Omic Systems Biology
- Analyzing single-cell RNA-seq, single-nucleus ATAC-seq, spatial transcriptomics (including Xenium), bulk RNA-seq, proteomics, and multimodal datasets.
- Building reproducible workflows for large-scale disease cohorts and high-dimensional functional genomics data.
- Using network-based approaches to identify cell-state programs, regulatory modules, and disease-associated molecular signatures.
AI-Augmented Functional Genomics
- Applying machine learning, graph-based models, transformer-based models, variational autoencoders, diffusion models, and genomic foundation models to transcriptomic and regulatory genomics questions.
- Using AI-augmented workflows and LLM-assisted development to accelerate analysis, improve reproducibility, and explore perturbation effects.
- Modeling on-target efficacy and transcriptome-wide effects to support systematic target prioritization and therapeutic discovery.
How I Work
I like problems where computation has to stay biologically accountable: noisy human data, complex disease systems, large multi-omic datasets, and questions that require both statistical care and mechanistic imagination. I work in R, Python, Bash, and SQL, using Git, HPC environments, Nextflow-style workflow design, cloud-aware analysis, and reproducible software development practices.
Across projects, I try to build analyses that are interpretable, reusable, and close enough to the biology to generate useful next experiments.