Seminar on Bioinformatics
Asymmetric Natural Vector Method for Predicting Ambiguous Non-standard Base Codes and Research on Gene Regulatory Relationships
Organizer
Speaker
Time
Monday, November 4, 2024 10:00 AM - 10:30 AM
Venue
Online
Abstract
In this talk, we introduce a novel approach based on the Asymmetric Natural Vector (ANV) method to address ambiguity in DNA sequences. We propose using ANV to predict the standard bases represented by non-standard codes in DNA sequences. Our approach develops a deep learning framework that establishes a correspondence between DNA sequences (in FASTA format) and natural vectors, which encode relevant sequence properties. By training on a large dataset, the framework learns how ambiguous base codes are distributed within that dataset, which allows us to accurately predict masked or ambiguous bases in genomic fragments. The method is particularly applicable to datasets such as COVID-19 genome data, which contain numerous non-standard base codes like R, Y, S, W, K, M, B, D, H, and V. With our algorithm, we can estimate the corresponding standard bases and assign a confidence score to each prediction, which helps resolve sequencing uncertainties. In addition, we will introduce our research on gene regulatory relationships. Our ultimate goal is, given a genome sequence, to (1) determine whether it is a regulatory factor; (2) if so, identify which genomes it has regulatory relationships with; and (3) determine whether each such relationship is promotion or inhibition.
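To make the two ingredients of the abstract concrete, the following minimal Python sketch lists the standard IUPAC meaning of each ambiguity code and computes a classical 12-dimensional natural vector (per-base count, mean position, and normalized second central moment). It does not implement the ANV method or the deep learning framework from the talk, whose details are not given here. The names `IUPAC`, `candidate_bases`, and `natural_vector` are hypothetical, and skipping ambiguous positions when accumulating statistics is an assumption made only for illustration.

```python
# IUPAC ambiguity codes and the standard bases each one may represent.
IUPAC = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def candidate_bases(code):
    """Return the standard bases a symbol may stand for (hypothetical helper)."""
    code = code.upper()
    if code in "ACGT":
        return code
    return IUPAC[code]


def natural_vector(seq):
    """Classical 12-dimensional natural vector of a DNA sequence.

    For each base k in A, C, G, T this returns the count n_k, the mean
    position mu_k (1-based), and the normalized second central moment
    D2_k = sum((s_i - mu_k)^2) / (n_k * n), where n is the sequence length.
    Assumption: ambiguous symbols are skipped but still count toward n.
    """
    seq = seq.upper()
    n = len(seq)
    positions = {base: [] for base in "ACGT"}
    for i, symbol in enumerate(seq, start=1):
        if symbol in positions:
            positions[symbol].append(i)

    counts, means, moments = [], [], []
    for base in "ACGT":
        pos = positions[base]
        n_k = len(pos)
        mu_k = sum(pos) / n_k if n_k else 0.0
        d2_k = sum((p - mu_k) ** 2 for p in pos) / (n_k * n) if n_k else 0.0
        counts.append(n_k)
        means.append(mu_k)
        moments.append(d2_k)
    return counts + means + moments


if __name__ == "__main__":
    fragment = "ACGRTTAYG"  # R and Y are ambiguous positions
    print(natural_vector(fragment))
    print([candidate_bases(s) for s in fragment if s not in "ACGT"])
```

A predictor in the spirit of the abstract would score each candidate base for an ambiguous position and report the best one together with a confidence value; the sketch above only supplies the candidate sets and a sequence-level feature vector.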
Speaker Intro
After receiving his PhD, he worked mainly in the field of wireless communications, serving as a senior engineer at Lucent, Alcatel-Lucent, and Nokia. He has 23 years of experience in wireless communications. He is currently a research fellow at the Beijing Institute of Mathematical Sciences and Applications (BIMSA), where his research covers 4G/5G wireless communication, biomathematics, neural networks, artificial intelligence, big data, machine learning, and nonlinear filtering.