The Transformer Model and Some Related Research
Organizer
Speaker
Haihua Xie (BIMSA)
Time
Saturday, June 29, 2024, 10:30 AM - 12:30 PM
Venue
A6-101
Online
Zoom 637 734 0280
Abstract
The Transformer architecture underpins many of today's popular large language models. This talk provides a detailed introduction to the key principles and core computational processes of the Transformer model. Through an in-depth analysis, we will explore why the Transformer has become so influential in the field of AI. We will also address some of the model's weaknesses by posing several open-ended questions aimed at improving the Transformer, particularly from a mathematical perspective.
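As a rough illustration of the core computation the abstract alludes to, the sketch below shows scaled dot-product attention, the central operation of the Transformer; all names, shapes, and values here are illustrative assumptions, not material from the talk itself.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V, the Transformer's core operation."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # numerically stable row-wise softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output row is a weighted average of value rows

# Toy example (hypothetical sizes): 3 tokens, head dimension 4.
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4): one output vector per token
```

The 1/sqrt(d_k) scaling keeps the dot products from growing with dimension, which would otherwise push the softmax into a near-one-hot regime.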
Speaker Intro
Dr. Haihua Xie received his Ph.D. in Computer Science from Iowa State University in 2015. Before joining BIMSA in October 2021, he worked at the State Key Lab of Digital Publishing Technology, Peking University, from 2015 to 2021. His research interests include Natural Language Processing and Knowledge Service. He has published more than 20 papers and holds 7 invention patents. In 2018, Dr. Xie was selected for the 13th batch of overseas high-level talents in Beijing and was honored as a "Beijing Distinguished Expert".