Beijing Institute of Mathematical Sciences and Applications (BIMSA)

Statistical Genetics: Unveiling Unknowns in Life
Statistical genetics combines statistics and genetics, aiming to understand the genetic architecture of complex phenotypes and human health using powerful statistical tools. Its core is the development and application of statistical methods to analyze genetic and genomic data and to understand how genes influence phenotypic traits and diseases. The objectives of this workshop are to introduce state-of-the-art concepts and methodologies that can better disentangle the complexities of genetics and unveil the unknowns in this field. The speakers are active researchers at the frontiers of methodological development in statistical genetics.
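Organizers
董昂, 邬荣领, 吴双
Speakers
关永涛 (Beijing Institute of Mathematical Sciences and Applications)
汪作蘅 (Yale University)
吴琦 (Beijing Institute of Mathematical Sciences and Applications)
邬荣领 (Beijing Institute of Mathematical Sciences and Applications; Yau Mathematical Sciences Center, Tsinghua University)
杨登程 (Beijing Institute of Mathematical Sciences and Applications)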
Date
July 9, 2025
Location
Weekday   | Time          | Venue  | Online  | ID           | Password
Wednesday | 10:00 - 19:00 | A6-101 | ZOOM 09 | 230 432 7880 | BIMSA
Schedule
Time \ Date: 07-09 (Wednesday)
10:00-10:20 邬荣领
10:20-11:10 汪作蘅
11:10-12:00 关永涛
14:00-14:50 吴琦
14:50-15:40 杨登程
15:40-16:30 邬荣领

*All times on this page are Beijing time (GMT+8).

Agenda
    2025-07-09

    10:00-10:20 邬荣领

    Opening Remarks

    10:20-11:10 汪作蘅

    Deep learning integrating clinical and genetic data for disease risk prediction in Biobank data

    In the digital medicine era, electronic health records (EHRs) contain extensive patient data from diverse sources. Leveraging this readily available information is essential for personalized medicine and predictive healthcare. Although family health history is an important component to assess risk for common chronic diseases, research has so far adopted a limited view of family relations in healthcare research. We develop ALIGATEHR, which models inferred family relations in a graph attention network augmented with an attention-based medical ontology representation, thus accounting for the complex influence of genetics, shared environmental exposures, and disease dependencies. Furthermore, the integration of electronic health records and genetic data offers great potential to improve disease risk prediction by capturing both clinical and genetic risk factors. We further develop ALIGATEHR-Gen, a graph attention network that integrates multimodal patient data including diagnosis codes, demographics, and genetic information, along with external medical ontology knowledge. ALIGATEHR-Gen constructs unified patient representations by incorporating genetically inferred first-degree relationships and disease ontology embeddings. We evaluate the predictive performance of ALIGATEHR-Gen across 118 diseases in the UK Biobank and demonstrate that it outperforms state-of-the-art baseline models by an average of at least 6%. A case study on five primary fibrotic and closely related diseases reveals that ALIGATEHR-Gen effectively distinguishes patient subgroups based on clinical and genetic features. These findings illustrate the potential of ALIGATEHR-Gen to advance predictive and interpretable modeling in healthcare.

    11:10-12:00 关永涛

    Abundant Parent-of-origin Effect eQTL: The Framingham Heart Study

    Parent-of-origin effect (POE) is a phenomenon whereby an allele’s effect on a phenotype depends both on its allelic identity and on the parent from whom the allele is inherited, as exemplified by polar overdominance at the ovine callipyge locus and the human obesity DLK1 locus. Systematic studies of POE of expression quantitative trait loci (eQTL) are lacking. In this study, we use trios among participants in the Framingham Heart Study to examine to what extent POE exists for gene expression in whole blood, using whole-genome sequencing and RNA sequencing. For each gene and the SNPs in cis, we performed eQTL analysis using genotype, paternal, maternal, and joint models, where the genotype model enforces identical effect sizes on paternal and maternal alleles, and the joint model allows them to have different effect sizes. We compared models using Bayes factors to identify paternal, maternal, and opposing eQTL, where paternal and maternal effects have opposite directions. The resultant variants are collectively called POE eQTL. The highlights of our study include: 1) More than 2,000 genes harbor POE eQTL, and the majority of POE eQTL are not in the vicinity of known imprinted genes; 2) Among 180 genes harboring opposing eQTL, 99 harbor exclusively opposing eQTL, and 58 of the 99 are phosphoprotein-coding genes, reflecting significant enrichment; 3) Paternal eQTL are enriched with GWAS hits, and genes harboring paternal eQTL are enriched with drug targets. Our study demonstrates the abundance of POE in gene expression, illustrates the complexity of gene expression regulation, and provides a resource that is complementary to existing resources such as GTEx. We revisited two previous POE findings in light of our POE results. A SNP residing in KCNQ1 that is maternally associated with diabetes is a maternal eQTL of CDKN1C, not KCNQ1. A SNP residing in DLK1 that showed paternal polar overdominance for human obesity is a maternal eQTL of MEG3, offering an explanation for the baseline risk of homozygous samples through association between MEG3 expression and obesity. Finally, we advise caution when conducting Mendelian randomization using gene expression as the exposure.
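
The four models named in this abstract (genotype, paternal, maternal, joint) can be illustrated with a small sketch. Everything below is an assumption made for illustration: the data are simulated, the helper names `ols_rss` and `bic` are hypothetical, and a BIC difference stands in as a rough approximation to a log Bayes factor. It is not the Bayes factor computation used in the study.

```python
import math
import random

def ols_rss(columns, y):
    """Residual sum of squares for y regressed on an intercept plus columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(row[r] * row[c] for row in X) for c in range(p)]
         + [sum(row[r] * yi for row, yi in zip(X, y))]
         for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def bic(columns, y):
    """BIC of a Gaussian linear model, up to an additive constant."""
    n = len(y)
    k = len(columns) + 1  # slopes plus intercept
    return n * math.log(ols_rss(columns, y) / n) + k * math.log(n)

# Simulated data (assumption): only the paternal allele affects expression.
random.seed(0)
n = 500
pat = [random.randint(0, 1) for _ in range(n)]   # paternally inherited allele
mat = [random.randint(0, 1) for _ in range(n)]   # maternally inherited allele
expr = [1.0 * p + random.gauss(0.0, 1.0) for p in pat]
geno = [p + m for p, m in zip(pat, mat)]         # allele count, origin ignored

models = {
    "genotype": [geno],      # one shared effect for both parental alleles
    "paternal": [pat],       # effect of the paternal allele only
    "maternal": [mat],       # effect of the maternal allele only
    "joint": [pat, mat],     # separate paternal and maternal effects
}
scores = {name: bic(cols, expr) for name, cols in models.items()}

# Rough log Bayes factor (BIC approximation), paternal versus genotype model.
log_bf_paternal_vs_genotype = (scores["genotype"] - scores["paternal"]) / 2
```

In this toy setting the paternal model should score better (lower BIC) than the genotype and maternal models, because the simulated effect comes from the paternal allele alone. An opposing eQTL would correspond to a joint model whose two slopes have opposite signs.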

    14:00-14:50 吴琦

    Evaluating population genetic diversity via alignment-free approaches

    Alignment-free sequence analysis methods have been widely applied in phylogenomic and metagenomic studies, primarily through the construction of distance matrices followed by distance-based tree inference. In this work, we extended the application of alignment-free approaches to population genetic analyses, specifically for estimating genetic diversity. Two alignment-free methods, k-mer frequency profiling and the natural vector method, were employed to calculate pairwise sequence diversity (π) using yeast genomes and human mitochondrial datasets. The results demonstrate a strong correlation between diversity estimates derived from alignment-free methods and those from SNP-based approaches. Notably, because alignment-free methods incorporate comprehensive sequence variation information, they systematically yield higher diversity values than SNP-based calculations. However, direct comparison of absolute diversity values across methodologies requires careful interpretation in future work.
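
As a rough illustration of the k-mer idea, the sketch below builds k-mer frequency profiles and averages a profile distance over all sequence pairs, as an analogue of pairwise diversity. The abstract does not specify the distance measure or the value of k, so the Euclidean distance, `k=3`, the toy sequences, and all function names here are assumptions.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

def kmer_profile(seq, k):
    """Return the relative k-mer frequencies of seq as a dict."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()}

def profile_distance(p, q):
    """Euclidean distance between two k-mer frequency profiles."""
    keys = set(p) | set(q)
    return sqrt(sum((p.get(x, 0.0) - q.get(x, 0.0)) ** 2 for x in keys))

def mean_pairwise_distance(seqs, k=3):
    """Average profile distance over all pairs, an analogue of pi."""
    profiles = [kmer_profile(s, k) for s in seqs]
    pairs = list(combinations(profiles, 2))
    return sum(profile_distance(p, q) for p, q in pairs) / len(pairs)

# Toy sequences (assumption): two identical, one with a single substitution.
toy_sequences = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC"]
diversity = mean_pairwise_distance(toy_sequences, k=3)
```

No alignment is needed at any step, which is the point of the approach. The resulting value is on the scale of the chosen distance, not of per-site nucleotide differences, which is one reason absolute values are hard to compare with SNP-based π.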

    14:50-15:40 杨登程

    A Statistical Genetics Framework for Dissecting the Genetic Architecture of Plant Regeneration

    Here we introduce an integrated statistical genetics framework designed to uncover the genetic basis of a specific biological process. The framework combines genome-wide resequencing and time-series transcriptomic data, and includes several complementary components: a qualitative trait GWAS, which explicitly decomposes additive and dominance effects to identify key genetic variants and their modes of action influencing whether the biological process is initiated; functional mapping, used to capture dynamic genetic effects during post-initiation development; time-series differential expression and enrichment analyses, providing molecular validation of genetic signals; and a gene regulatory network model based on transcriptomics, used to reveal potential epistatic regulatory interactions. We applied this framework to study Populus euphratica regeneration, a classical model of plant developmental plasticity, and systematically dissected the genetic architecture underlying its key regenerative stages.
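
The additive and dominance decomposition mentioned for the qualitative trait GWAS can be sketched with the classical genotypic-value definitions: the additive effect is half the difference between the two homozygote means, and the dominance effect is the heterozygote's deviation from their midpoint. The abstract does not give the model actually used, so this is only the textbook decomposition applied to a 0/1 trait, with made-up data and a hypothetical function name.

```python
def additive_dominance(groups):
    """Estimate additive (a) and dominance (d) effects at one locus.

    groups maps the genotype labels "AA", "Aa", "aa" to lists of 0/1
    outcomes (1 = process initiated, 0 = not initiated).
    """
    mean = {g: sum(v) / len(v) for g, v in groups.items()}
    midpoint = (mean["AA"] + mean["aa"]) / 2
    a = (mean["AA"] - mean["aa"]) / 2   # half the homozygote difference
    d = mean["Aa"] - midpoint           # heterozygote deviation from midpoint
    return a, d

# Toy data (assumption): initiation outcomes for four individuals per genotype.
toy = {
    "AA": [1, 1, 1, 0],   # 3 of 4 initiated
    "Aa": [1, 1, 1, 0],   # 3 of 4 initiated
    "aa": [0, 0, 1, 0],   # 1 of 4 initiated
}
a, d = additive_dominance(toy)
```

Here `a` and `d` both come out as 0.25, so the heterozygote behaves like the AA homozygote, which is what complete dominance of allele A looks like under this coding. A real analysis would also need a significance test per variant, which this sketch omits.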

    15:40-16:30 邬荣领

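    Proposing a Theory of Topological Genetics for Multi-Gene Editing

    The next thirty years will be the era of gene editing. By then, many complex diseases may be cured through targeted gene editing, and many complex traits may be improved through multi-gene editing. Realizing this promising vision requires a thorough and comprehensive understanding of quantitative genetics. In this talk, I will present new quantitative genetic theory and methods that can provide sufficient theoretical support for multi-gene editing. If R.A. Fisher founded the classical quantitative genetic theory that advanced animal and plant breeding, then the theory of topological genetics proposed in this talk, a century later, will strongly advance the use of gene editing to change life.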


Journal

Data Analytics and Topology is a peer-reviewed open access journal owned by Beijing Institute of Mathematical Sciences and Applications, under the sponsorship of the International Press inaugurated by Prof. Shing-Tung Yau.

Data Analytics and Topology has been created to meet the needs of the growing mathematical data science community as its members create innovative concepts, theories, models, and tools that can best manipulate, interpret, and utilize big data. The aim of collecting big data is to uncover and extract the fundamental principles and rules underlying the scientific problems behind them. We anticipate that neither statistics nor even AI alone will suffice to achieve this goal in a sustainable way, given the heterogeneous, dynamic, interdependent, and high-dimensional nature of big data. Overcoming these challenges can be facilitated through the seamless integration of mathematics, particularly topology, with statistics.

The journal endeavors to publish papers exploring the intersection between mathematics and statistics, especially as it applies to large-scale data sets. It welcomes contributions presenting new theories, methods, and interpretations of data within the realms of topological statistics or statistical topology.

For more information, please refer to https://intlpress.com/journals/journalList?id=1879074441815207938.

We are currently seeking additional editors and associate editors for our journal. We need between 10 and 20 editors and approximately 20 associate editors. If you are interested in joining our team, please reach out to us at daatjournal@bimsa.cn.

We also invite you to submit your upcoming research papers or review articles to Data Analytics and Topology. We look forward to the opportunity of featuring your valuable work in our publication.

Beijing Institute of Mathematical Sciences and Applications
CONTACT

No. 544, Hefangkou Village, Huaibei Town, Huairou District, Beijing 101408


Tel. 010-60661855
Email. administration@bimsa.cn

Copyright © Beijing Institute of Mathematical Sciences and Applications
