Beijing Institute of Mathematical Sciences and Applications (BIMSA)

BIMSA Digital Economy Lab Seminar
Integrating managerial and investor textual data for financial distress prediction: A framework combining multi-source financial information fusion network with LLM
Organizers
高瑞泽, 韩立岩, 李振, 龙飞, 史冬波, 汤珂, 张琦
Speaker
Ruize Gao (高瑞泽)
Time
September 26, 2025, 15:00 to 16:00
Venue
A3-2-303
Online
Zoom 435 529 7909 (BIMSA)
Abstract
Leveraging multi-source data for financial distress prediction (FDP) has attracted growing attention. In this study, we propose a novel FDP framework that simultaneously integrates financial ratios, Management Discussion and Analysis (MD&A) text, and investor comments from social media. First, we develop a fine-grained feature extraction approach that uses both a large language model (LLM) and the BERT model to capture aspect-level sentiments and rich semantic information from MD&A data. Second, we use FinBERT to extract sentiment features from investor comments. These textual features are then combined with financial ratios and fed into a Multi-source Financial Information Fusion Network (MFIFN), which is trained with a focal loss function to address data imbalance in FDP. On a dataset of 24,429 firm-year samples from Chinese listed companies between 2014 and 2023 (including both distressed and non-distressed firms), experimental results show that incorporating social media and MD&A features improves predictive performance compared with financial ratios alone. In particular, the proposed MFIFN model achieves an AUC of 0.9541. Furthermore, the LLM-BERT based triplet extraction method improves feature quality, delivering consistent performance gains compared with traditional textual feature extraction methods.
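For readers unfamiliar with focal loss, the following is a minimal sketch of the standard binary form, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), which down-weights well-classified samples so that the rare class contributes more to training. It is not the speaker's implementation: the function name `binary_focal_loss` and the values `gamma=2.0` and `alpha=0.25` are illustrative assumptions, since the abstract does not state which variant or hyperparameters MFIFN uses.

```python
import math


def binary_focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-12):
    """Mean binary focal loss over a batch (illustrative sketch).

    probs  : predicted probabilities of the positive (distressed) class
    labels : 1 for distressed firms, 0 for non-distressed firms
    gamma  : focusing parameter; gamma = 0 gives alpha-weighted cross-entropy
    alpha  : weight for the positive class (1 - alpha for the negative class)
    """
    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability assigned to the true class.
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t)^gamma shrinks the loss of confidently correct samples.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


# A confidently correct prediction contributes far less than a badly wrong one.
easy = binary_focal_loss([0.9], [1])
hard = binary_focal_loss([0.1], [1])
print(easy < hard)
```

With `gamma=0.0` the modulating factor disappears and the function reduces to alpha-weighted cross-entropy, which makes it easy to compare the two losses on the same imbalanced data.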
About the Speaker
Ruize Gao is an assistant professor at the Beijing Institute of Mathematical Sciences and Applications. His research interests include the digital economy and data mining. He has published in leading journals such as Decision Support Systems, Information Sciences, Knowledge-Based Systems, Financial Innovation, Expert Systems with Applications, Technology in Society, and International Journal of Accounting Information Systems. He leads one project funded by the China Postdoctoral Science Foundation.
CONTACT

No. 544, Hefangkou Village, Huaibei Town, Huairou District, Beijing 101408

Tel. 010-60661855
Email. administration@bimsa.cn

Copyright © Beijing Institute of Mathematical Sciences and Applications

京ICP备2022029550号-1

京公网安备11011602001060