The Transformer Model and Some Related Research
Organizer
Speaker
Haihua Xie (BIMSA)
Time
Saturday, June 29, 2024, 10:30 AM - 12:30 PM
Venue
A6-101
Online
Zoom 637 734 0280
Abstract
The Transformer architecture underpins many of today's popular large language models. This talk provides a detailed introduction to the key principles and core computational processes of the Transformer model. Through an in-depth analysis, we will explore why the Transformer has become so influential in the field of AI. We will also address some of the model's weaknesses by posing several open-ended questions aimed at improving the Transformer, particularly from a mathematical perspective.
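As a rough illustration of the core computation the abstract alludes to, the sketch below shows scaled dot-product attention, the central operation of the Transformer; all names, shapes, and values here are illustrative assumptions, not material from the talk itself.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V, the Transformer's core operation."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # numerically stable row-wise softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output row is a weighted average of value rows

# Toy example (hypothetical sizes): 3 tokens, head dimension 4.
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4): one output vector per token
```

The 1/sqrt(d_k) scaling keeps the dot products from growing with dimension, which would otherwise push the softmax into a near-one-hot regime.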
Speaker Intro
Dr. Haihua Xie received his Ph.D. in Computer Science from Iowa State University in 2015. Before joining BIMSA in October 2021, he worked at the State Key Lab of Digital Publishing Technology, Peking University, from 2015 to 2021. His research interests include Natural Language Processing and Knowledge Service. He has published more than 20 papers and holds 7 invention patents. In 2018, Dr. Xie was selected for the 13th batch of overseas high-level talents in Beijing and was honored as a "Beijing Distinguished Expert".