| Weekday | Time | Venue | Online | ID | Password |
|---|---|---|---|---|---|
| Wednesday | 13:30 - 18:20 | A6-101 | ZOOM 06 | 537 192 5549 | BIMSA |
| Time \ Date | 03-11 Wed |
|---|---|
| 14:10-14:30 | 蔡云峰 |
| 14:30-14:50 | 邓攀 |
| 14:50-15:10 | 王忠 |
| 15:10-15:30 | 何源 |
| 15:50-16:10 | 崔跃华 |
| 16:10-16:30 | 鄞启进 |
| 16:30-16:50 | 杨登程 |
| 16:50-17:10 | 李秋熠 |
| 17:10-17:30 | 董昂 |
| 17:30-17:50 | 李承平 |
*All times on this page are Beijing time (GMT+8).
14:10-14:30 蔡云峰
Reasoning on Knowledge Graph
Probabilistic soft logic (PSL) is a framework for modeling probabilistic and relational domains, and it achieves state-of-the-art results in many areas, e.g. social-network analysis and recommender systems. However, PSL does not scale to large databases because of its $O(\sum_j \ell_j|\mathcal{E}|^{n_j})$ computational complexity, where $\ell_j$ and $n_j$ are the length and the number of entity variables of the $j$-th rule, respectively, and $|\mathcal{E}|$ is the number of entities. In this talk, we propose FastPSL, a fast logic reasoning method under the PSL framework, designed for large knowledge bases with long structured rules. FastPSL reduces the complexity to $O(\sum_j \ell_jn_j|\mathcal{E}|^3)$. Numerical experiments on both synthetic and real-world knowledge bases demonstrate that FastPSL achieves competitive results compared with other rule-based reasoning methods, while outperforming other knowledge graph completion baselines on reasoning over long and complex rules.
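To give a feel for the two complexity bounds quoted in the abstract, the following minimal sketch evaluates $\sum_j \ell_j|\mathcal{E}|^{n_j}$ and $\sum_j \ell_j n_j|\mathcal{E}|^3$ for a small rule set. The rule lengths, variable counts, and entity count are hypothetical values chosen for illustration, constant factors are ignored, and the function names are not part of FastPSL.

```python
def naive_grounding_cost(rules, num_entities):
    """Sum of l_j * |E|**n_j over rules given as (l_j, n_j) pairs."""
    return sum(length * num_entities ** n_vars for length, n_vars in rules)


def fastpsl_cost(rules, num_entities):
    """Sum of l_j * n_j * |E|**3 over the same (l_j, n_j) pairs."""
    return sum(length * n_vars * num_entities ** 3 for length, n_vars in rules)


# Hypothetical rule set: (rule length l_j, number of entity variables n_j).
example_rules = [(3, 4), (5, 6)]
example_entities = 1_000  # hypothetical |E|

naive = naive_grounding_cost(example_rules, example_entities)
fast = fastpsl_cost(example_rules, example_entities)
print(f"PSL bound:     {naive:.3e}")
print(f"FastPSL bound: {fast:.3e}")
print(f"ratio:         {naive / fast:.3e}")
```

In terms of these bounds alone, the gain comes from rules with more than three entity variables; for $n_j \le 3$ the cubic bound is no smaller than the original one.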
14:30-14:50 邓攀
Modeling and Navigating Cell State Transitions with Deep Learning
Cell fate decisions define development, regeneration, and disease. Yet we still lack a unified framework to systematically map, predict, and control transitions between cellular states. Building a computational system that can both model the global cell state landscape and rationally steer cells from one state to another represents one of the central challenges of modern biology. Leveraging advances in single-cell omics, deep learning, and the emerging concept of the virtual cell, we developed computational frameworks for modeling and manipulating cell states. We introduce CellNavi, which learns the cell state manifold and identifies genes driving state transitions, and GEMGen, a large language model–based framework for phenotype-oriented small molecule generation. Together, these methods uncover molecular determinants of state transitions during development and disease progression, and enable rational generation of compounds capable of inducing desired cell state changes. This work moves toward a programmable framework for understanding and steering cellular dynamics.
14:50-15:10 王忠
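AI Model-Driven Virtual Experiments: Modeling Methods and Applications
The team led by 王忠 focuses on the mechanisms of gene expression regulation, using multi-omics data and deep learning to systematically dissect the complex cooperative relationships among transcription factors, regulatory elements, and three-dimensional chromatin structure. Artificial intelligence and mathematical models play complementary core roles in this work: deep learning mines latent patterns from high-dimensional data and provides efficient computational methods for virtual experiments, while differential equations and complex dynamical models describe dynamic processes such as transcription initiation, pausing, and elongation, giving virtual cells precise mechanistic explanations. Combining AI with mathematical models not only helps reduce experimental costs but also accelerates research progress, and in particular improves the capacity for virtual experiments across species and at single-cell resolution. Interdisciplinary research allows biological questions to be addressed systematically within a framework of theory, experiment, and computation, and provides methodological support for the continued development of virtual experiments and virtual cells. Looking ahead, this direction is expected to drive end-to-end modeling from genome sequence to three-dimensional structure to expression level, forming a new paradigm in which computation and experiment are deeply integrated and opening new paths for precision medicine and synthetic biology.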
15:10-15:30 何源
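Dissecting Genetic Influences on Human Disease: From Population Studies to Large AI Models
Human genetic diversity is closely linked to health and encompasses genomic variants, such as single-nucleotide polymorphisms (SNPs) and copy number variations (CNVs), that affect the risk of many diseases. Numerous genome-wide association studies (GWAS) have identified genetic associations for thousands of traits, most of which lie in non-coding regions and whose mechanisms remain unclear. As an intermediate layer, gene expression profiles reveal how genetic diversity influences traits by regulating gene expression, and expression quantitative trait locus (eQTL) analysis confirms the role of genetic variants in expression. However, how eQTLs affect gene expression is still poorly understood. The speaker collected all publicly available ATAC-seq datasets, reconstructed genomic information from the fastq files, and identified chromatin accessibility QTLs (caQTLs), providing a potential theoretical basis for explaining the relationship between eQTLs and GWAS. In addition, the effect of genetic polymorphisms on gene expression appears not only as linear additive effects but also as synergistic effects, which traditional association studies cannot fully capture. Over the past decade, artificial intelligence has developed rapidly and shown strong performance across many fields. Analysis of the coding regions of the human genome can be mapped to natural language processing problems in AI. Large language models can therefore be developed to predict how combinations of polymorphisms affect molecular mechanisms, including gene expression and chromatin accessibility.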
15:50-16:10 崔跃华
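Statistical Methods for Causal Inference and Spatial Omics: From Genetic Instrumental Variables to Single-Cell and Spatial Transcriptomics Modeling
With the rapid development of high-throughput sequencing and spatial omics technologies, biomedical research is moving from association analysis toward mechanistic understanding and causal inference. However, complex confounding structures, high-dimensional feature spaces, and spatial dependence pose new challenges for traditional statistical methods. This talk covers two directions, causal inference and spatial omics data analysis, and presents our progress in statistical modeling and methodology. First, for causal inference, we introduce a statistical framework based on instrumental variable (IV) regression, focusing on causal inference methods that use genetic variants as instruments (such as Mendelian randomization). We briefly describe the identification assumptions of IV methods, statistical estimation strategies, and their extension to the genome-wide association study (GWAS) setting, explore how genetic variants can be used to identify causal relationships between exposures and disease outcomes, and discuss key statistical issues such as weak instruments and pleiotropy. Second, for single-cell sequencing and spatial transcriptomics data analysis, we introduce several statistical modeling and computational methods: kernel-based linear mixed models that characterize the similarity structure of expression between cells; deconvolution models for spatial transcriptomics data that estimate the cell-type composition of each spatial spot; methods for inferring cell-cell communication networks from ligand-receptor information; and deep-learning-based spatial domain identification methods that use graph-structured modeling to mine spatial expression patterns, enabling automatic partitioning and functional interpretation of tissue structure.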
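As a small illustration of the instrumental-variable idea behind Mendelian randomization, the sketch below simulates a single genetic instrument and compares the single-instrument Wald ratio, cov(Z, Y) / cov(Z, X), with a naive regression slope under unobserved confounding. It is not the speaker's method: the data-generating model, effect sizes, allele frequency, and function names are all assumptions made for the example, and it assumes a valid instrument with no pleiotropy.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (len(a) - 1)


def wald_ratio(z, x, y):
    """Single-instrument IV estimate: cov(Z, Y) / cov(Z, X)."""
    return covariance(z, y) / covariance(z, x)


def ols_slope(x, y):
    """Naive regression slope of Y on X, which ignores confounding."""
    return covariance(x, y) / covariance(x, x)


def simulate(n=50_000, true_effect=0.3, seed=0):
    """Hypothetical model: genotype Z -> exposure X -> outcome Y, confounder U."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        genotype = (rng.random() < 0.3) + (rng.random() < 0.3)  # 0, 1, or 2 alleles
        confounder = rng.gauss(0, 1)  # unobserved, affects both X and Y
        exposure = 0.5 * genotype + confounder + rng.gauss(0, 1)
        outcome = true_effect * exposure + confounder + rng.gauss(0, 1)
        z.append(genotype)
        x.append(exposure)
        y.append(outcome)
    return z, x, y


z, x, y = simulate()
iv_estimate = wald_ratio(z, x, y)
ols_estimate = ols_slope(x, y)
print(f"true effect 0.3, IV estimate {iv_estimate:.3f}, naive OLS {ols_estimate:.3f}")
```

Because the simulated genotype is independent of the confounder, the ratio estimate should land near the true effect while the naive slope is biased upward. Reducing the 0.5 genotype effect on the exposure would mimic the weak-instrument problem mentioned in the abstract.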
16:10-16:30 鄞启进
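Virtual Cell Modeling
As AI foundation models and high-throughput omics technologies become deeply integrated, the AI virtual cell (AIVC) is becoming a key driving force in the shift of life science from "experiment-driven" to "computation-driven" research, attracting broad attention in the field. Virtual cells offer a new perspective for life science research and can accelerate the understanding of complex biological processes through efficient data integration and analysis. This talk discusses frontier research trends in the virtual cell field, focusing on multi-omics modeling and the construction of dynamic cell models. Drawing on current technical advances, it describes how these methods can improve our understanding of cell behavior, developmental processes, and physiological states, presents related ongoing research, and discusses its potential in areas such as personalized medicine and drug development.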
16:30-16:50 杨登程
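Software Development and Implementation of a Nonlinear Mixed-Model Functional Mapping Method
Dynamic phenotypes are widespread in biological growth, development, and environmental response, but traditional genome-wide association analyses based on a single time point struggle to capture the genetic regulation of how traits change over time. Functional mapping, which combines growth curve models with genetic analysis, provides an important framework for dissecting dynamic complex traits, but traditional methods mainly target linkage-analysis populations and cannot easily incorporate population structure, covariates, and relatedness among individuals at the same time. To address this, we implemented a dynamic-trait GWAS method based on a nonlinear mixed model, in which, within the functional mapping framework, the curve parameters are modeled as functions of the SNP, population-structure covariates, and polygenic random effects. The model is linearized by a first-order Taylor expansion and converted into a linear mixed model for estimation, with a genomic relationship matrix and an SAD(1) covariance structure capturing genetic correlation and temporal correlation. Finally, a Wald test is used to test the joint effect of the SNP on the curve parameters. The method retains the biological interpretability of functional mapping while effectively controlling for population structure and relatedness, providing an efficient statistical framework for dissecting the genetic regulation of dynamic complex traits.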
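The first-order Taylor linearization step can be sketched as follows. The abstract does not say which growth curve is used, so this example assumes a three-parameter logistic curve $g(t) = a / (1 + b e^{-rt})$; the parameter values and function names are hypothetical. Around a reference point $\theta_0$, the curve is approximated by $g(t;\theta_0) + \nabla g(t;\theta_0)^\top(\theta - \theta_0)$, which is linear in $\theta$ and can therefore be handled with linear mixed model machinery.

```python
import math


def logistic_growth(t, a, b, r):
    """Assumed growth curve: g(t) = a / (1 + b * exp(-r * t))."""
    return a / (1.0 + b * math.exp(-r * t))


def logistic_gradient(t, a, b, r):
    """Analytic partial derivatives of g with respect to (a, b, r)."""
    e = math.exp(-r * t)
    d = 1.0 + b * e
    return (1.0 / d, -a * e / d**2, a * b * t * e / d**2)


def linearized_growth(t, theta, theta0):
    """First-order Taylor approximation of g(t; theta) around theta0."""
    g0 = logistic_growth(t, *theta0)
    grad = logistic_gradient(t, *theta0)
    return g0 + sum(g * (p - p0) for g, p, p0 in zip(grad, theta, theta0))


theta0 = (10.0, 5.0, 0.8)    # hypothetical reference parameters (a, b, r)
theta = (10.1, 5.05, 0.81)   # hypothetical nearby parameters
times = [1.0, 3.0, 6.0]      # hypothetical measurement times

approx_errors = [
    abs(logistic_growth(t, *theta) - linearized_growth(t, theta, theta0))
    for t in times
]
for t, err in zip(times, approx_errors):
    print(f"t = {t}: |exact - linearized| = {err:.2e}")
```

The approximation error is second order in the parameter shift, so it stays small only when the expansion point is close to the parameters of interest.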
16:50-17:10 李秋熠
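The Gener Series of Genomic Foundation Models
The talk shares some attempts at and reflections on DNA sequence modeling, and applications of large language model techniques to tasks such as DNA sequence design and genome annotation.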
17:10-17:30 董昂
Multi-task Learning of Complex Networks via Nonlinear Ordinary Differential Equations
Networks are fundamental to understanding complex systems, which are characterized by many underlying entities and their intricate interactions. We integrate evolutionary game theory and ecological niche theory into a unified framework to explain how the dynamic change of an entity is determined by its own strategy and the strategies of its interacting counterparts. We derive a system of nonlinear mixed ordinary differential equations (nMODEs) to quantify the contributions of these two types of strategies and encode them into informative, dynamic, omnidirectional, and personalized networks (idopNetworks). We apply multi-task learning (MTL) to the matrix representation of the linearized nMODEs to choose a subset of the most significant entities (acting as predictors) jointly for all entities, each viewed as a response. By integrating both group and elementwise sparsity, the model imposes a double sparsity constraint, on regulatory edges and on nonlinear features, yielding consistent edge selection and a compact, interpretable dynamical representation. Going beyond existing network modeling practice, idopNetworks can capture all-around interacting links, nonlinearities, and emergent properties, and so approximate more closely the intricate and multifaceted nature of complex systems. We apply our model to learn gene regulatory idopNetworks from transcriptional data, identifying previously unknown regulatory roles of several genes in mediating malaria infection. We perform computer simulations to validate the statistical relevance of the model. Our model offers a new machine learning approach to analyzing, modeling, and interpreting complex data in a non-Euclidean space.
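The abstract does not give the exact penalty, but one common way to combine group and elementwise sparsity is the sparse group lasso, whose proximal operator is elementwise soft-thresholding followed by group-wise shrinkage. The sketch below applies it to a small coefficient matrix in which each row is assumed to hold one candidate predictor's coefficients across all response tasks. The matrix values, penalty levels, and function names are hypothetical, and this is not necessarily the estimator used in the talk.

```python
import math


def soft_threshold(values, lam):
    """Elementwise sparsity: shrink each entry toward zero by lam."""
    out = []
    for v in values:
        if v > lam:
            out.append(v - lam)
        elif v < -lam:
            out.append(v + lam)
        else:
            out.append(0.0)
    return out


def group_shrink(values, lam):
    """Group sparsity: scale the whole group, or zero it if its norm <= lam."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm <= lam:
        return [0.0] * len(values)
    scale = 1.0 - lam / norm
    return [scale * v for v in values]


def sparse_group_prox(rows, lam_elem, lam_group):
    """Proximal step for lam_elem * L1 + lam_group * (sum of row L2 norms)."""
    return [group_shrink(soft_threshold(row, lam_elem), lam_group) for row in rows]


# Hypothetical coefficients: rows = predictors, columns = response tasks.
coefficients = [
    [3.0, -4.0, 0.5],   # strong predictor with one weak entry
    [0.3, -0.2, 0.1],   # weak predictor, expected to be removed entirely
]
shrunk = sparse_group_prox(coefficients, lam_elem=0.5, lam_group=1.0)
for row in shrunk:
    print(row)
```

In this toy input the weak row is zeroed as a whole, which corresponds to dropping a predictor for every task, while the strong row survives with its small entry removed.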
17:30-17:50 李承平
CYFE: Exploring Protein Evolution and Diversity Across All Domains of Life
Evolution underpins the diversity of life, and predicting ancestral proteins and future changes is critical for deciphering the molecular mechanisms that drive biological diversity. We present CYFE, a novel paired protein foundation model based on over 100,000 rooted phylogenetic trees that predicts protein evolution bidirectionally: reconstructing ancestral sequences, generating intermediate states, and predicting future variants. We validated CYFE across dozens of proteins, demonstrating its accuracy and generality. CYFE successfully bridged the evolutionary gap between IscB and Cas9, generating intermediate and novel family members and thereby expanding the known CRISPR-Cas9 subtype repertoire. These findings highlight CYFE’s role in advancing the understanding of the molecular events that drive protein evolution, offering new insights into evolutionary processes.