{"doi":"10.1038/s41587-022-01369-0","title":"Standardized annotation of translated open reading frames","abstract":"Ribosome profiling (Ribo-seq) has extended our understanding of the translational ‘vocabulary’ of the human genome, uncovering thousands of open reading frames (ORFs) within long noncoding RNAs (lncRNAs) and presumed untranslated regions (UTRs) of protein-coding genes. However, reference gene annotation projects have been circumspect in their incorporation of these ORFs because of uncertainties about their experimental reproducibility and physiological roles. Yet, it is clear that certain ‘Ribo-seq ORFs’ make stable proteins, others mediate gene regulation, and many have medical implications. Ultimately, the absence of standardized ORF annotation has created a circular problem: while Ribo-seq ORFs remain unrecognized by reference annotation databases, this lack of recognition will thwart studies examining their roles. Here, we outline a community-led effort involving Ensembl/GENCODE, the HUGO Gene Nomenclature Committee (HGNC), UniProtKB, HUPO/HPP and PeptideAtlas to produce a standardized catalog of 7,264 human Ribo-seq ORFs; a path to bring protein-level evidence for Ribo-seq ORFs into reference annotation databases; and a roadmap to facilitate research in the global community.","journal":"Nature Biotechnology","year":2022,"id":11481,"datarank":3.5194962562157484,"base_score":5.44673737166631,"endowment":5.44673737166631,"self_citation_contribution":0.8170106057499466,"citation_network_contribution":2.702485650465802,"self_endowment_contribution":0.8170106057499466,"citer_contribution":2.702485650465802,"corpus_percentile":68.75508543531326,"corpus_rank":385,"citation_count":245,"citer_count":136,"citers_with_citation_signal":131,"citers_with_endowment":131,"datacite_reuse_total":0,"is_dataset":true,"is_dataset_confidence":0.8604,"is_oa":true,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":"2022-07-01","fair_score":37.1667,"fair_percentile":18.601583113456464,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[{"id":17867,"name":"Jorge Ruiz-Orera","orcid":"0000-0002-8317-0034","position":1,"is_corresponding":false},{"id":35299,"name":"John R. Prensner","orcid":"0000-0002-7024-636X","position":2,"is_corresponding":false},{"id":17796,"name":"Marie A. Brunet","orcid":"0000-0001-5973-3522","position":3,"is_corresponding":false},{"id":59138,"name":"Ferriol Calvet","orcid":"0000-0003-1841-9881","position":4,"is_corresponding":false},{"id":292,"name":"Irwin Jungreis","orcid":"0000-0002-3197-5367","position":5,"is_corresponding":false},{"id":37971,"name":"Jose Manuel Gonzalez","orcid":"0000-0001-5569-0705","position":6,"is_corresponding":false},{"id":92022,"name":"Michele Magrane","orcid":"0000-0003-3544-996X","position":7,"is_corresponding":false},{"id":17848,"name":"Thomas F. Martinez","orcid":"0000-0002-4011-8164","position":8,"is_corresponding":false},{"id":92023,"name":"Jana Felicitas Schulz","orcid":"0000-0002-8157-2224","position":9,"is_corresponding":false},{"id":59139,"name":"Yucheng T. Yang","orcid":"0000-0002-6873-5279","position":10,"is_corresponding":false},{"id":17786,"name":"M. Mar Albà","orcid":"0000-0002-7963-7375","position":11,"is_corresponding":false},{"id":17788,"name":"Julie L. Aspden","orcid":"0000-0002-8537-6204","position":12,"is_corresponding":false},{"id":17938,"name":"Pavel V. Baranov","orcid":"0000-0001-9017-0270","position":13,"is_corresponding":false},{"id":78835,"name":"Ariel Alejandro Bazzini","orcid":"0000-0002-2251-5174","position":14,"is_corresponding":false},{"id":78836,"name":"Elspeth A. Bruford","orcid":"0000-0002-8380-5247","position":15,"is_corresponding":false},{"id":72194,"name":"Maria Jesus Martin","orcid":null,"position":16,"is_corresponding":false},{"id":17799,"name":"Lorenzo Calviello","orcid":"0000-0002-5600-0988","position":17,"is_corresponding":false},{"id":17800,"name":"Anne-Ruxandra Carvunis","orcid":"0000-0002-6474-6413","position":18,"is_corresponding":false},{"id":92024,"name":"Jin Chen","orcid":"0000-0002-6634-4397","position":19,"is_corresponding":false},{"id":92025,"name":"Juan Pablo Couso","orcid":"0000-0002-8547-7312","position":20,"is_corresponding":false},{"id":6362,"name":"Eric W. Deutsch","orcid":"0000-0001-8732-0928","position":21,"is_corresponding":false},{"id":20075,"name":"David B. Jaffe","orcid":"0000-0001-8739-568X","position":22,"is_corresponding":false},{"id":29021,"name":"Michael G. FitzGerald","orcid":"0000-0002-0488-0530","position":23,"is_corresponding":false},{"id":42990,"name":"Grigorios Georgolopoulos","orcid":"0000-0002-9906-4797","position":24,"is_corresponding":false},{"id":17830,"name":"Norbert Hübner","orcid":"0000-0002-1218-6223","position":25,"is_corresponding":false},{"id":92026,"name":"Nicholas T. Ingolia","orcid":"0000-0002-3395-1545","position":26,"is_corresponding":false},{"id":14693,"name":"Sharon L. R. Kardia","orcid":"0000-0002-9853-3379","position":27,"is_corresponding":false},{"id":17853,"name":"Gerben Menschaert","orcid":"0000-0002-7575-2085","position":28,"is_corresponding":false},{"id":78845,"name":"Robert L. Moritz","orcid":"0000-0002-3216-9447","position":29,"is_corresponding":false},{"id":28491,"name":"Uwe Ohler","orcid":"0000-0002-0881-3116","position":30,"is_corresponding":false},{"id":17865,"name":"Xavier Roucou","orcid":"0000-0001-9370-5584","position":31,"is_corresponding":false},{"id":17868,"name":"Alan Saghatelian","orcid":"0000-0002-0427-563X","position":32,"is_corresponding":false},{"id":17890,"name":"Jonathan S. Weissman","orcid":"0000-0003-2445-670X","position":33,"is_corresponding":false},{"id":78847,"name":"Sebastiaan van Heesch","orcid":"0000-0001-9593-1980","position":34,"is_corresponding":false},{"id":57226,"name":"María Martin","orcid":"0000-0001-5454-2815","position":35,"is_corresponding":false},{"id":24517,"name":"Shinichi Morishita","orcid":"0000-0002-6201-8885","position":0,"is_corresponding":true}],"reference_count":30,"raw_metadata":{"citation_network_status":"fetched"},"created_at":"2026-03-01T18:20:47.508186Z","pmid":"35831657","pmcid":"PMC9757701","fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":"hybrid","license":"cc-by","views":0,"total_file_size_bytes":0,"version_count":0,"fair_f":69.0,"fair_a":53.0,"fair_i":10.0,"fair_r":16.6667,"fair_zscore":-0.7277,"fair_rationale":{"fair_score":37.17,"has_llm":true,"dimensions":{"F":{"name":"Findable","score":69.0,"criteria":[{"key":"f_has_doi","label":"Has a persistent DOI","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"DOI present","rationale":null},{"key":"f_repository_presence","label":"Indexed in repositories / literature DBs","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"datacite=0, pmcid=True, pmid=True","rationale":null},{"key":"f_persistent_ids","label":"Resolvable scholarly identifiers (OpenAlex)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no OpenAlex id","rationale":null},{"key":"f_metadata_richness","label":"Rich, machine-readable metadata","kind":"llm","weight":1.0,"fraction":0.25,"signal":null,"rationale":"The paper describes a community-led effort to produce a standardized catalog of ORFs but does not mention any machine-readable metadata or structured data formats for the catalog."}]},"A":{"name":"Accessible","score":53.0,"criteria":[{"key":"a_open_access","label":"Open Access / files deposited","kind":"deterministic","weight":1.5,"fraction":1.0,"signal":"Open Access","rationale":null},{"key":"a_retrievable","label":"Free full text retrievable","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"0 OA location(s)","rationale":null},{"key":"a_access_protocol","label":"Clear data/code access protocol","kind":"llm","weight":1.0,"fraction":0.25,"signal":null,"rationale":"The paper outlines a roadmap and involvement of databases but does not specify a clear protocol for accessing the underlying data or code, such as a repository or download link."}]},"I":{"name":"Interoperable","score":10.0,"criteria":[{"key":"i_linked_data","label":"Linked datasets / DataCite relations","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"linked_datasets=0, datacite=0","rationale":null},{"key":"i_standard_ids","label":"References data via standard accessions","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"accessions=0, trials=0","rationale":null},{"key":"i_standards","label":"Standard formats, vocabularies & identifiers","kind":"llm","weight":1.0,"fraction":0.5,"signal":null,"rationale":"The paper mentions involvement of standard annotation databases (Ensembl/GENCODE, UniProtKB) and nomenclature (HGNC), indicating use of some standard vocabularies, but does not specify standard data formats or identifiers for the ORF catalog."}]},"R":{"name":"Reusable","score":16.67,"criteria":[{"key":"r_license","label":"Clear, open reuse license","kind":"deterministic","weight":1.5,"fraction":0.0,"signal":"no license","rationale":null},{"key":"r_downloads","label":"Demonstrated reuse (downloads)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"downloads=0","rationale":null},{"key":"r_version","label":"Versioned / maintained","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no version chain","rationale":null},{"key":"r_dataset","label":"Classified as a data resource","kind":"deterministic","weight":0.5,"fraction":1.0,"signal":"is_dataset","rationale":null},{"key":"r_reusability","label":"Data-availability statement, license & reproducibility","kind":"llm","weight":2.0,"fraction":0.167,"signal":null,"rationale":"The paper lacks a data-availability statement, explicit license, or mention of reproducibility measures; it only describes a planned catalog without stating how it can be reused."}]}},"suggestions":["Provide the ORF catalog in a machine-readable format (e.g., JSON, XML) with structured metadata.","Include a clear data access protocol, such as a public repository URL or download instructions.","Specify use of standard identifiers (e.g., ORF IDs from HGNC or UniProt) and file formats (e.g., GFF3, BED).","Add a data-availability statement with a license (e.g., CC0 or CC-BY) and describe how to reproduce the catalog."],"model":"deepseek/deepseek-v4-flash","agent_version":"fair_agent_v2","fulltext_source":"abstract_only"},"fair_model":"deepseek/deepseek-v4-flash","fair_agent_version":"fair_agent_v2","fair_fulltext_source":"abstract_only","fair_has_llm":true,"fair_computed_at":"2026-06-18T00:36:22.746128Z","clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}