Unveiling Hidden Architectures: Multiresolution Clustering and Mediation in Genomic Data

The three speakers develop principled statistical or deep learning methods to uncover latent, multilevel structures in high-dimensional genomic/omics data.
Dr. Fu introduces a deep graph attention autoencoder that detects hierarchical communities within gene regulatory networks, revealing organization across scales.
Dr. Wen presents a general probabilistic framework that takes any initial clustering result and systematically explores nested, multiresolution cluster structures, reconciling inconsistencies and recovering interpretable patterns in genetic and spatial transcriptomics data.
Dr. Hu addresses a complementary question: once groups or features are identified, how do we robustly test which ones act as mediators linking exposures to outcomes? Her symmetric mediation statistics provide powerful, FDR-controlled inference for high-dimensional omics mediators.
Together, the three talks move from detecting hierarchical network communities, to unifying and reconciling multiscale clusters, to pinpointing causal mediators among high-dimensional molecular features. This logical progression highlights how modern statistical learning can extract richer, more reliable biological insights from complex data.

Audrey Qiuyan Fu (Wayne State University School of Medicine)

Yijuan Hu (Peking University)

Xiaoquan William Wen (University of Michigan)

Rongling Wu (BIMSA, YMSC)

3rd July, 2026

Weekday	Time	Venue	Online	ID	Password
Friday	15:00 - 17:00	A3-4-301	ZOOM 08	787 662 9899	BIMSA

Time\Date	Jul 3 Fri
15:00-15:00	Rongling Wu
15:00-15:40	Audrey Qiuyan Fu
15:40-16:20	Xiaoquan William Wen
16:20-17:00	Yijuan Hu

*All times on this webpage are in Beijing Time (GMT+8).

3rd July, 2026

15:00-15:40 Audrey Qiuyan Fu

Deep learning-based hierarchical community detection for high-dimensional gene regulatory networks

Reconstructing genome-wide gene regulatory networks (GRNs) from genomic data is challenging due to high dimensionality and complexity. We propose a hierarchical model with three layers: individual genes at the bottom, gene communities in the middle, and communities of communities at the top, revealing patterns at different scales. We developed DeepHCD, a deep learning algorithm that uses a graph attention autoencoder to learn low-dimensional embeddings and infer community structures top-down. DeepHCD minimizes a multitask loss function encompassing graph reconstruction, attribute reconstruction, clustering, and modularity, and requires only rough upper bounds on the number of communities at each level. Simulations across diverse network types demonstrate DeepHCD's superior performance in detecting middle-layer communities, as measured by homogeneity and completeness. Applied to single-cell regulon activity data (243 regulons, 30,000+ cells), DeepHCD outperforms existing methods, producing clearer community structures with the highest intra-group correlations.
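
For readers unfamiliar with the modularity term mentioned in the abstract, the following is a minimal sketch of the standard Newman modularity score for a hard partition of an undirected, unweighted graph. It is not DeepHCD's implementation: the function name, the edge-list input, and the toy graph are illustrative assumptions, and a deep learning method would presumably use a differentiable (soft-assignment) variant inside its loss.

```python
def modularity(edges, labels):
    """Newman modularity Q for an undirected, unweighted graph.

    edges  : list of (u, v) pairs, each undirected edge listed once
    labels : dict mapping node -> community label (hard assignment)

    Q = sum over communities c of  L_c / m - (d_c / (2m))^2,
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the total degree of the nodes in c.
    """
    m = len(edges)
    degree = {}
    intra = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if labels[u] == labels[v]:
            intra[labels[u]] = intra.get(labels[u], 0) + 1

    # Total degree per community.
    deg_sum = {}
    for node, d in degree.items():
        deg_sum[labels[node]] = deg_sum.get(labels[node], 0) + d

    return sum(intra.get(c, 0) / m - (deg_sum[c] / (2 * m)) ** 2
               for c in deg_sum)


# Hypothetical toy graph: two triangles joined by a single bridge edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_labels = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q = modularity(toy_edges, toy_labels)  # 5/14, about 0.357 for this toy graph
```

Higher values indicate that more edges fall inside communities than expected by chance given the node degrees; placing all nodes in a single community gives a score of 0.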

15:00-15:00 Rongling Wu

Opening Remarks

15:40-16:20 Xiaoquan William Wen

Probabilistic multiresolution clustering

Cluster analysis is a widely used unsupervised learning technique in genomics, with applications ranging from inferring genetic population structure to identifying spatial domains in spatial transcriptomics (ST) data. However, existing clustering methods often yield inconsistent results and typically focus on identifying a single optimal partition, overlooking the intrinsic relationships among the inferred clusters. In this work, we introduce a computational framework for systematically exploring multiresolution clustering structures in scientific data, starting from an initial configuration generated by any existing clustering algorithm. The proposed framework provides a unified and principled approach for uncovering complex nested latent structures and reconciling discrepancies among clustering results. Through simulations and applications to large-scale, high-dimensional genetic and spatial transcriptomics data, we demonstrate the framework's ability to recover interpretable clustering patterns and reveal biologically meaningful multiresolution structures.

16:20-17:00 Yijuan Hu

SMS: Symmetric Mediation Statistics for Powerful High-Dimensional Mediation Analysis

Mediation analysis of high-dimensional features, particularly molecular-level omics features, provides important opportunities to uncover biological mechanisms underlying human health and disease. However, two central statistical challenges remain: testing the composite null hypothesis and maintaining power when the exposure–mediator and mediator–outcome associations differ substantially in statistical significance. Existing methods typically rely on accurate estimation of the proportions of the three null types or on the maximum of the two association p-values, and may not always control the FDR well and may have limited power under imbalanced significance. We propose SMS, a new statistical framework based on symmetric mediation statistics. By exploiting symmetry, SMS calibrates the rejection threshold for FDR control under the composite null as a whole. It also allows flexible combinations of the two p-values corresponding to the E–M and M–O associations, including the maximum, and then enables an omnibus test. Moreover, it permits direct use of effect size estimates, bypassing the need to compute p-values. SMS maintained accurate FDR control across a wide range of simulation scenarios while achieving a substantial power gain, approximately 20%, over existing methods including HDMT, DACT, and DEI-B. Applications to a metabolomics dataset and a DNA methylation dataset further corroborated these findings. Notably, SMS discovered five plausible mediators in the metabolomics dataset that were missed by all existing methods considered.
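
To make the baseline mentioned in the abstract concrete, here is a minimal sketch of the "maximum of the two association p-values" approach, followed by a plain Benjamini-Hochberg step-up procedure. This is not SMS and not any of the named competing methods; the function names and the example p-values are hypothetical and serve only to illustrate how a feature with imbalanced E–M and M–O significance can be missed.

```python
def max_p(p_em, p_mo):
    """Per-feature maximum of the exposure-mediator and mediator-outcome p-values."""
    return [max(a, b) for a, b in zip(p_em, p_mo)]


def benjamini_hochberg(pvals, alpha=0.05):
    """Return the sorted indices rejected by the Benjamini-Hochberg procedure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value falls under its step-up threshold
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


# Hypothetical p-values for four candidate mediators.
p_em = [0.001, 0.0005, 0.30, 0.002]   # exposure-mediator associations
p_mo = [0.002, 0.40, 0.0001, 0.04]    # mediator-outcome associations

combined = max_p(p_em, p_mo)                          # [0.002, 0.4, 0.3, 0.04]
rejected = benjamini_hochberg(combined, alpha=0.05)   # [0]
```

In this toy example only the first feature is rejected. The fourth feature, whose two p-values are both nominally small but differ by more than an order of magnitude, is not, because the maximum is driven entirely by the weaker association.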