Beijing Institute of Mathematical Sciences and Applications

Advances in Artificial Intelligence
A Comprehensive and Explainable Approach to Evaluating LLMs’ Defense Capabilities
Organizers
Ming Ming Sun, Ya Qing Wang
Speaker
Yue Feng
Time
Thursday, November 28, 2024 2:00 PM - 4:00 PM
Venue
A3-1-301
Online
Zoom 230 432 7880 (BIMSA)
Abstract
Given the importance of large language model (LLM) safety, evaluating LLMs’ defense capabilities against jailbreak attacks has become a key area of focus. However, current evaluation methods often fail to generalize to complex scenarios and lack transparency, leading to incomplete and inaccurate assessments. To address these limitations, we introduce JAILJUDGE, a comprehensive and explainable benchmark designed to assess LLMs’ defense capabilities. JAILJUDGE covers a wide array of risk scenarios, including synthetic, adversarial, in-the-wild, and multilingual prompts. It also offers detailed explanations to ensure transparent and reliable evaluations.
Speaker Intro
Yue Feng is an assistant professor at the University of Birmingham. She received her Ph.D. from University College London. Her research interests lie in natural language processing and information retrieval. She has published more than 30 papers at top conferences (e.g., ACL, SIGIR, EMNLP, and WSDM). She also won the Amazon Alexa Prize TaskBot Challenge and was awarded the Baidu Outstanding Research Intern Star.
CONTACT

No. 544, Hefangkou Village, Huaibei Town, Huairou District, Beijing 101408


Tel. 010-60661855
Email. administration@bimsa.cn
