The Transformer Model and Some Related Research
Organizer
Speaker
Time
June 29, 2024, 10:30–12:30
Venue
A6-101
Online
Zoom 637 734 0280
(BIMSA)
Abstract
The Transformer architecture underpins many of today's popular large language models. This talk provides a detailed introduction to the key principles and core computational processes of the Transformer model. Through an in-depth analysis, we will explore why the Transformer has become so influential in the field of AI. We will also address some of the model's weaknesses by posing several open-ended questions aimed at improving the Transformer, particularly from a mathematical perspective.
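The core computation the abstract refers to is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V. As a minimal illustrative sketch (not material from the talk itself), here is that formula in NumPy; the matrix shapes and the random toy data are assumptions chosen for demonstration:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # similarity of each query to each key
    weights = softmax(scores, axis=-1)  # each row is a probability distribution
    return weights @ V                # weighted average of the value vectors

# Toy example (shapes are illustrative): 3 queries, 4 key/value pairs, d_k = 8.
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 8): one output vector per query position
```

The 1/√d_k scaling keeps the dot products from growing with the key dimension, which would otherwise push the softmax into regions with vanishing gradients.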
Speaker Introduction
Haihua Xie received his Ph.D. in Computer Science from Iowa State University in 2015. He then served as a senior researcher and head of the knowledge services group at the State Key Laboratory of Digital Publishing Technology at Peking University, and joined BIMSA full-time in October 2021. His research interests include natural language processing and knowledge services. He has published more than 20 papers, holds 7 invention patents, was selected for the Beijing High-Level Talent Program, and was named a Beijing Distinguished Expert.