Bioinformatics Seminar
Asymmetric Natural Vector Method for Predicting Ambiguous Non-standard Base Codes and Research on Gene Regulatory Relationships
Organizer
Speaker
Time
November 4, 2024, 10:00 to 10:30
Location
Online
Abstract
In this talk, we introduce a novel approach based on the Asymmetric Natural Vector (ANV) method to address ambiguity in DNA sequences. We propose using ANV to predict the standard bases represented by non-standard codes in DNA sequences. The approach uses a deep learning framework to establish a correspondence between DNA sequences (in FASTA format) and natural vectors, which encode relevant sequence properties. By training on a large dataset, the model learns the distribution of ambiguous base codes within that dataset, which allows us to predict masked or ambiguous bases in genomic fragments accurately. The method is particularly applicable to datasets such as COVID-19 genome data, which contain numerous non-standard base codes such as R, Y, S, W, K, M, B, D, H, and V. Our algorithm estimates the corresponding standard base for each such code and assigns a confidence score to each prediction, helping to resolve sequencing uncertainties. In addition, we will introduce our research on gene regulatory relationships. The ultimate goal is, given a genome sequence, to determine (1) whether it is a regulatory factor; (2) if so, which genomes it has regulatory relationships with; and (3) whether each regulatory relationship is promotion or inhibition.
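To make the first part of the abstract more concrete, the following is a minimal Python sketch of two of its ingredients: the standard IUPAC meaning of the ambiguity codes listed above, and a natural vector computed from a sequence. It uses the classical 12-dimensional natural vector (per-base counts, mean positions, and normalized second central moments), which is an assumption on our part; it is not the ANV variant presented in the talk, and it includes no deep learning model or confidence scoring. The function names `natural_vector` and `candidate_vectors` are hypothetical.

```python
# IUPAC ambiguity codes named in the abstract and the standard bases they allow.
IUPAC = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
}


def natural_vector(seq):
    """Classical natural vector: counts, mean positions, scaled 2nd moments.

    Positions are 1-based. Characters other than A, C, G, T are ignored
    when collecting positions but still count toward the sequence length.
    """
    seq = seq.upper()
    n = len(seq)
    positions = {base: [] for base in "ACGT"}
    for i, ch in enumerate(seq, start=1):
        if ch in positions:
            positions[ch].append(i)

    counts, means, moments = [], [], []
    for base in "ACGT":
        pos = positions[base]
        n_k = len(pos)
        mu = sum(pos) / n_k if n_k else 0.0
        # Second central moment, normalized by n_k * n.
        d2 = sum((p - mu) ** 2 for p in pos) / (n_k * n) if n_k else 0.0
        counts.append(n_k)
        means.append(mu)
        moments.append(d2)
    return counts + means + moments


def candidate_vectors(seq, index):
    """Natural vector of seq for each base allowed at an ambiguous position."""
    code = seq[index].upper()
    options = IUPAC.get(code, code)  # a standard base maps to itself
    return {
        base: natural_vector(seq[:index] + base + seq[index + 1:])
        for base in options
    }


if __name__ == "__main__":
    # "R" at index 1 may be A or G; print the vector for each resolution.
    for base, vec in candidate_vectors("ARGT", 1).items():
        print(base, vec)
```

In a pipeline like the one described, vectors of this kind would presumably serve as inputs or targets for a trained model that ranks the candidate bases; how the talk's method does this is not specified here.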
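About the Speaker
After receiving a Ph.D., the speaker worked mainly in wireless communications, serving as a senior engineer at Lucent, Alcatel-Lucent, and Nokia, and has 23 years of knowledge and experience in the field. The speaker is currently a researcher at the Beijing Institute of Mathematical Sciences and Applications (BIMSA), working on mathematical biology, artificial intelligence, neural networks, machine learning, big data, and nonlinear filtering.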