The progress of the grand project
Organizer
Stephen S-T. Yau
Speaker
Nan Sun (孙楠)
Time
Wednesday, November 23, 2022 9:45 PM - 10:00 PM
Venue
Online
Abstract
The validation datasets for the genome universe were downloaded from the RefSeq database (https://ftp.ncbi.nlm.nih.gov/refseq/release/), the Assembly GCF database (https://ftp.ncbi.nlm.nih.gov/genomes/refseq/), and the Assembly GCA database (https://ftp.ncbi.nlm.nih.gov/genomes/genbank/). So far, we have collected 24719 nucleoid sequences belonging to 425 families for bacteria, 440 chromosome sequences belonging to 7 phyla for archaea, 2628 chromosome sequences belonging to 23 families for fungi, 400 chromosome sequences belonging to 22 families for plants, 1200 chromosome sequences belonging to 20 families for protozoa, and 390 chromosome sequences belonging to 72 families for vertebrates. The corresponding genome space and natural metric have been determined for each group: the genome spaces sit in 48, 48, 40, 24, 36, and 28, respectively, and the metrics are d9, D10, d9, d2, D9, and D2, respectively. Only the result for invertebrates is still missing. I will report on the progress of the project.
Speaker Intro
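Nan Sun (孙楠) is currently a postdoctoral fellow at the Beijing Institute of Mathematical Sciences and Applications (BIMSA). Her research interests include bioinformatics, machine learning, and applied mathematics. She has published multiple papers in journals including The Innovation, Computational and Structural Biotechnology Journal, BMC Bioinformatics, Frontiers in Cellular and Infection Microbiology, Journal of Computational Biology, and Genes. She has participated in several projects funded by the National Natural Science Foundation of China and the Beijing Natural Science Foundation, and she is the principal investigator of a General Program grant from the 78th batch of the China Postdoctoral Science Foundation.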