The progress of the grand project
Organizer
Stephen Shing-Toung Yau (丘成栋)
Speaker
Time
November 23, 2022, 21:45 to 22:00
Venue
Online
Abstract
The validation datasets of the genome universe were downloaded from the RefSeq database (https://ftp.ncbi.nlm.nih.gov/refseq/release/), the Assembly GCF database (https://ftp.ncbi.nlm.nih.gov/genomes/refseq/), and the Assembly GCA database (https://ftp.ncbi.nlm.nih.gov/genomes/genbank/). So far, we have collected 24719 nucleoid sequences belonging to 425 families for bacteria, 440 chromosome sequences belonging to 7 phyla for archaea, 2628 chromosome sequences belonging to 23 families for fungi, 400 chromosome sequences belonging to 22 families for plants, 1200 chromosome sequences belonging to 20 families for protozoa, and 390 chromosome sequences belonging to 72 families for vertebrates. The corresponding genome space and natural metric have been determined for each group: in the order listed above, the genome spaces sit in 48, 48, 40, 24, 36, and 28 dimensions, and the metrics are d9, D10, d9, d2, D9, and D2. Only the result for invertebrates is still missing. I will report on the progress of the project.
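About the Speaker
Nan Sun (孙楠) is currently a postdoctoral researcher at the Beijing Institute of Mathematical Sciences and Applications (BIMSA). Her research interests include bioinformatics, machine learning, and applied mathematics. She has published multiple papers in journals including The Innovation, Computational and Structural Biotechnology Journal, BMC Bioinformatics, Frontiers in Cellular and Infection Microbiology, Journal of Computational Biology, and Genes. She has participated in several projects funded by the National Natural Science Foundation of China and the Beijing Natural Science Foundation, and she is the principal investigator of a General Program grant from the 78th batch of the China Postdoctoral Science Foundation.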