Few-Shot Learning in AI for Science
Speaker
Time
March 21, 2025, 15:00 to 16:30
Venue
A3-2a-302
Online
Zoom 637 734 0280
(BIMSA)
Abstract
In AI-assisted scientific research (AI for Science), particularly in drug discovery and biomedicine, labeled data are often scarce. Few-shot learning has become a key technique for addressing this challenge because it can learn and predict from only a handful of labeled examples. In this talk, I will introduce a series of machine learning algorithms developed to improve data efficiency and prediction accuracy in AI for Science under data scarcity. I will first discuss few-shot learning for molecular property prediction, reviewing existing methods and presenting our proposed Property-Aware Relation Networks (PAR) (NeurIPS 2021, TPAMI 2024) and the parameter-efficient graph neural network adapter PACIA (IJCAI 2024). PAR refines the relation representations between molecules through a property-aware molecular encoder and a query-dependent relation graph learning module, thereby improving prediction accuracy across a variety of chemical properties. PACIA, in turn, improves few-shot molecular property prediction by generating a small number of adaptive parameters that modulate message passing in graph neural networks. In addition, I will introduce KnowDDI (Communications Medicine 2024), which enriches drug representations using large biomedical knowledge graphs and explains predicted drug-drug interactions (DDIs) by learning a knowledge subgraph for each drug pair, which mitigates the scarcity of known interactions. KnowDDI improves both prediction performance and model interpretability, making the prediction process more transparent and trustworthy. Finally, I will share a vision for applying few-shot learning to a broader range of scientific research.
About the Speaker
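Dr. Yaqing Wang (王雅晴) is an associate research fellow at the Beijing Institute of Mathematical Sciences and Applications (BIMSA). She received her Ph.D. in 2019 from the Department of Computer Science and Engineering at the Hong Kong University of Science and Technology, where she was advised by Prof. Lionel M. Ni (倪明选) and Prof. James T. Kwok (郭天佑), with machine learning as her research area. From 2019 to 2024, she was a senior researcher at Baidu Research, working on cold-start recommendation with scarce labeled samples, search intent recognition, optimization of large models and agents, and AI4Science. Her research spans machine learning and artificial intelligence and centers on parsimonious learning, including few-shot learning, sparse learning, and low-rank learning, with the aim of solving practical problems in biomedicine, recommender systems, and natural language processing efficiently and at low cost. She has published papers in top international conferences and journals such as NeurIPS, ICML, ICLR, KDD, TheWebConf, SIGIR, EMNLP, TPAMI, JMLR, and TIP, and her papers have been cited more than 4,000 times. Her survey on few-shot learning is listed as the most cited paper in ACM Computing Surveys over the past five years and is an ESI Hot Paper (top 0.1%). As a key project member, she has worked on a Science and Technology Innovation 2030 Major Project of the Ministry of Science and Technology and a General Program project of the National Natural Science Foundation of China. She is a long-standing senior program committee member for IJCAI and AAAI and reviews for top conferences and journals including ICML, NeurIPS, ICLR, and TPAMI. In 2024, she was named to the World's Top 2% Scientists list.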