Grand biological universe paper revision
Organizer
丘成栋
Speaker
Time
May 20, 2024, 15:30 to 16:00
Venue
Science Building (理科楼) A-304
Abstract
Understanding how genome sequences differ across organisms is crucial for biological classification and for studying phylogenetic evolution. The k-mer natural vector method encodes each sequence as a numerical vector, turning sequence comparison in genome space into vector comparison in a high-dimensional Euclidean space. We downloaded all reliable sequences from seven datasets in NCBI and determined the Euclidean embedding dimension and the natural metric of the genome space. We propose the concept of a grand biological universe, in which the convex hulls formed by the seven datasets are mutually disjoint, and the convex hulls formed by different biological populations within each dataset are also mutually disjoint. This study offers a new perspective for molecular biology and enables accurate, real-time comparison of large-scale sequences. It also reveals how metrics differ across the universe and standardizes metrics that are otherwise unsuitable for comprehensive analysis.
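To make the encoding step concrete, the following is a minimal sketch of one common formulation of a k-mer natural vector. It is an illustration, not necessarily the exact definition used in the talk. For each k-mer it records the occurrence count, the mean position, and a second central moment normalized by the count and the number of k-mer windows. The function name `kmer_natural_vector`, the 1-based positions, the zero-filling of absent k-mers, and this particular normalization are all assumptions made for the example.

```python
from itertools import product
import math


def kmer_natural_vector(seq, k=2, alphabet="ACGT"):
    """Encode a sequence as [counts | mean positions | normalized 2nd moments].

    Hypothetical helper for illustration; the normalization is an assumption.
    The result has 3 * len(alphabet)**k components.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    positions = {kmer: [] for kmer in kmers}

    # Record the 1-based start position of every k-mer window.
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in positions:  # skip windows with ambiguous symbols such as N
            positions[word].append(i + 1)

    total = max(len(seq) - k + 1, 0)  # number of k-mer windows
    counts, means, moments = [], [], []
    for kmer in kmers:
        pos = positions[kmer]
        n = len(pos)
        if n == 0:
            # Absent k-mers contribute zeros (an assumption of this sketch).
            counts.append(0.0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu = sum(pos) / n
        counts.append(float(n))
        means.append(mu)
        moments.append(sum((p - mu) ** 2 for p in pos) / (n * total))
    return counts + means + moments


# Sequence comparison becomes a Euclidean distance between vectors.
u = kmer_natural_vector("ACGTACGTAC", k=2)
v = kmer_natural_vector("ACGTTCGTAC", k=2)
distance = math.dist(u, v)
```

Because every sequence maps to a vector of the same length regardless of its own length, pairwise comparison needs no alignment, which is what makes large-scale comparison cheap.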
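About the Speaker
Sun Nan (孙楠) is currently a postdoctoral fellow at the Beijing Institute of Mathematical Sciences and Applications (北京雁栖湖应用数学研究院). Her research interests include bioinformatics, machine learning, and applied mathematics. She has published papers in journals including The Innovation, Computational and Structural Biotechnology Journal, BMC Bioinformatics, Frontiers in Cellular and Infection Microbiology, Journal of Computational Biology, and Genes. She has participated in several projects funded by the National Natural Science Foundation of China and the Beijing Natural Science Foundation, and she is the principal investigator of a General Program grant from the 78th batch of the China Postdoctoral Science Foundation.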