{"doi":"10.1101/2023.09.08.556781","title":"Benchmarking Hayai-Annotation Plants: A Re-evaluation Using Standard Evaluation Metrics","abstract":"<jats:title>Abstract</jats:title>\n                <jats:p>\n                  The rapid growth of next-generation sequencing (NGS) technology has led to a surge in the determination of whole genome sequences in plants. This has created a need for functional annotation of newly predicted gene sequences in the assembled genomes. To address this, “Hayai-Annotation Plants” was developed as a gene functional annotation tool for plant species. In this report, we compared Hayai-Annotation Plants with Blast2GO and TRAPID, focusing on the three primary gene-ontology (GO) domains: Biological Process (BP), Molecular Function (MF), and Cellular Component (CC). Using the\n                  <jats:italic>Arabidopsis thaliana</jats:italic>\n                  GO annotation as a benchmark, we evaluated each tool using two approaches: the area under the precision-recall curve (AUC-PR) and the metrics used in the Critical Assessment of Functional Annotation (CAFA). In the latter case, CAFA-evaluator was used to determine the F-score, weighted F-score, and S-score for each domain. Hayai-Annotation Plants performed better than Blast2GO and TRAPID in all three GO domains. Our results thus reaffirm the effectiveness of Hayai-Annotation Plants for functional gene annotation in plant species. 
In this era of extensive whole genome sequencing, Hayai-Annotation Plants will serve as a valuable tool that facilitates simplified and accurate gene function annotation for numerous users, thereby making a significant contribution to plant research.\n                </jats:p>","journal":null,"year":null,"id":30936,"datarank":0.10397207708399181,"base_score":0.6931471805599453,"endowment":0.6931471805599453,"self_citation_contribution":0.10397207708399181,"citation_network_contribution":0.0,"self_endowment_contribution":0.10397207708399181,"citer_contribution":0.0,"corpus_percentile":null,"corpus_rank":null,"citation_count":1,"citer_count":0,"citers_with_citation_signal":0,"citers_with_endowment":0,"datacite_reuse_total":0,"is_dataset":false,"is_dataset_confidence":null,"is_oa":false,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":null,"fair_score":null,"fair_percentile":null,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[{"id":119794,"name":"Kenta Shirasawa","orcid":"0000-0001-7880-6221","position":1,"is_corresponding":false},{"id":119796,"name":"Sachiko Isobe","orcid":"0000-0002-9555-5054","position":2,"is_corresponding":false},{"id":119793,"name":"Andrea 
Ghelfi","orcid":"0000-0001-9617-3309","position":0,"is_corresponding":false}],"reference_count":0,"raw_metadata":{"has_enrichment":true,"base_score":0.6931471805599453,"endowment":0.6931471805599453,"datacite_reuse_total":0,"file_count":0,"downloads":0,"views":0,"has_version_chain":false,"is_dataset":false,"is_oa":false,"pmid":"24523987","pmcid":null,"openalex_id":"https://openalex.org/W4386637333","authors":[],"funders":[],"total_grants":0,"fwci":null,"citation_percentile":null,"influential_citations":0,"citation_trend":[{"year":2024,"count":1}],"oa_status":"green","license":"cc-by","oa_locations":[{"url":"https://www.biorxiv.org/content/biorxiv/early/2023/09/12/2023.09.08.556781.full.pdf","host_type":"repository"},{"url":"https://www.biorxiv.org/content/biorxiv/early/2023/09/12/2023.09.08.556781.full.pdf","host_type":"GREEN"},{"url":"https://www.biorxiv.org/content/biorxiv/early/2023/09/12/2023.09.08.556781.full.pdf","host_type":"repository"},{"url":"https://syndication.highwire.org/content/doi/10.1101/2023.09.08.556781","host_type":"publisher"},{"url":"http://dx.doi.org/10.1101/2023.09.08.556781","host_type":"repository"}],"fields_of_study":["Genomics and Phylogenetic Studies","Bioinformatics and Genomic Networks","Gene expression and cancer classification","Biology","Environmental Science","Computer Science"],"mesh_terms":[],"keywords":["Annotation","Benchmarking","Gene Annotation","Benchmark (surveying)","Genome","Computational biology","Function (biology)","DNA sequencing","Genome project","Precision and recall","Computer science","Gene ontology","Biology","Gene","Information retrieval","Artificial intelligence","Genetics","Geography","Cartography"],"sdg_mappings":[{"sdg_number":0,"sdg_label":"Life in 
Land"}],"linked_datasets":[],"clinical_trials":[],"software_tools":[],"database_accessions":[],"source":"live","citation_network_status":"fetched"},"created_at":"2026-06-09T05:57:59.359766Z","pmid":null,"pmcid":null,"fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":null,"license":null,"views":0,"total_file_size_bytes":0,"version_count":0,"fair_f":null,"fair_a":null,"fair_i":null,"fair_r":null,"fair_zscore":null,"fair_rationale":null,"fair_model":null,"fair_agent_version":null,"fair_fulltext_source":null,"fair_has_llm":null,"fair_computed_at":null,"clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}