{"doi":"10.1038/s41587-024-02414-w","title":"A community effort to optimize sequence-based deep learning models of gene regulation","abstract":"A systematic evaluation of how model architectures and training strategies impact genomics model performance is needed. To address this gap, we held a DREAM Challenge where competitors trained models on a dataset of millions of random promoter DNA sequences and corresponding expression levels, experimentally determined in yeast. For a robust evaluation of the models, we designed a comprehensive suite of benchmarks encompassing various sequence types. All top-performing models used neural networks but diverged in architectures and training strategies. To dissect how architectural and training choices impact performance, we developed the Prix Fixe framework to divide models into modular building blocks. We tested all possible combinations for the top three models, further improving their performance. The DREAM Challenge models not only achieved state-of-the-art results on our comprehensive yeast dataset but also consistently surpassed existing benchmarks on Drosophila and human genomic datasets, demonstrating the progress that can be driven by gold-standard genomics datasets.","journal":"Nature 
Biotechnology","year":2024,"id":11122,"datarank":0.5930093961907373,"base_score":3.2188758248682006,"endowment":3.2188758248682006,"self_citation_contribution":0.48283137373023016,"citation_network_contribution":0.11017802246050715,"self_endowment_contribution":0.48283137373023016,"citer_contribution":0.11017802246050715,"corpus_percentile":52.888527257933276,"corpus_rank":580,"citation_count":29,"citer_count":20,"citers_with_citation_signal":10,"citers_with_endowment":10,"datacite_reuse_total":0,"is_dataset":true,"is_dataset_confidence":0.8215,"is_oa":true,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":"2024-10-11","fair_score":46.6667,"fair_percentile":43.733509234828496,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[{"id":89794,"name":"Daria Nogina","orcid":"0000-0002-0593-2726","position":1,"is_corresponding":false},{"id":77537,"name":"Dmitry Penzar","orcid":"0000-0001-7960-9385","position":2,"is_corresponding":false},{"id":89795,"name":"Dohoon Lee","orcid":"0000-0003-2163-5489","position":3,"is_corresponding":false},{"id":89796,"name":"Danyeong Lee","orcid":null,"position":4,"is_corresponding":false},{"id":89797,"name":"Nayeon Kim","orcid":null,"position":5,"is_corresponding":false},{"id":89798,"name":"Sangyeup Kim","orcid":null,"position":6,"is_corresponding":false},{"id":89799,"name":"Dohyeon Kim","orcid":"0000-0002-0116-7756","position":7,"is_corresponding":false},{"id":89800,"name":"Yeojin Shin","orcid":"0000-0001-8920-8964","position":8,"is_corresponding":false},{"id":89801,"name":"Il-Youp Kwak","orcid":"0000-0002-7117-7669","position":9,"is_corresponding":false},{"id":89802,"name":"Georgy Meshcheryakov","orcid":null,"position":10,"is_corresponding":false},{"id":89803,"name":"Andrey Lando","orcid":null,"position":11,"is_corresponding":false},{"id":89804,"name":"Arsenii Zinkevich","orcid":"0000-0001-9450-4629","position":12,"is_corresponding":false},{"id":89805,"name":"Byeong-Chan 
Kim","orcid":null,"position":13,"is_corresponding":false},{"id":89806,"name":"Juhyun Lee","orcid":"0000-0002-9609-4376","position":14,"is_corresponding":false},{"id":89807,"name":"Taein Kang","orcid":null,"position":15,"is_corresponding":false},{"id":2692,"name":"Eeshit Dhaval Vaishnav","orcid":"0000-0003-3720-8051","position":16,"is_corresponding":false},{"id":24052,"name":"Payman Yadollahpour","orcid":"0000-0003-1984-5014","position":17,"is_corresponding":false},{"id":89808,"name":"Random Promoter DREAM Challenge Consortium","orcid":null,"position":18,"is_corresponding":false},{"id":89809,"name":"Susanne Bornelöv","orcid":"0000-0001-9276-9981","position":19,"is_corresponding":false},{"id":89810,"name":"Fredrik Svensson","orcid":"0000-0002-5556-8133","position":20,"is_corresponding":false},{"id":89811,"name":"Maria-Anna Trapotsi","orcid":null,"position":21,"is_corresponding":false},{"id":12237,"name":"Tin Nguyen","orcid":"0000-0001-8001-9470","position":23,"is_corresponding":false},{"id":89813,"name":"Xinming Tu","orcid":"0009-0004-0833-6876","position":24,"is_corresponding":false},{"id":89814,"name":"Wuwei Zhang","orcid":null,"position":25,"is_corresponding":false},{"id":89815,"name":"Wei Qiu","orcid":"0000-0001-8093-3094","position":26,"is_corresponding":false},{"id":89816,"name":"Rohan Ghotra","orcid":null,"position":27,"is_corresponding":false},{"id":89817,"name":"Yiyang Yu","orcid":"0000-0001-7594-1401","position":28,"is_corresponding":false},{"id":89818,"name":"Ethan Labelson","orcid":null,"position":29,"is_corresponding":false},{"id":89819,"name":"Aayush Prakash","orcid":null,"position":30,"is_corresponding":false},{"id":89820,"name":"Ashwin Narayanan","orcid":"0000-0002-2609-8881","position":31,"is_corresponding":false},{"id":89821,"name":"Peter Koo","orcid":null,"position":32,"is_corresponding":false},{"id":89822,"name":"Xiaoting Chen","orcid":null,"position":33,"is_corresponding":false},{"id":89823,"name":"David T. 
Jones","orcid":"0000-0001-8626-3765","position":34,"is_corresponding":false},{"id":4926,"name":"Yuanfang Guan","orcid":"0000-0001-8275-2852","position":36,"is_corresponding":false},{"id":89825,"name":"Maolin Ding","orcid":"0009-0007-2837-9533","position":37,"is_corresponding":false},{"id":13086,"name":"Calvin Wing Yiu Chan","orcid":"0000-0002-3656-7709","position":38,"is_corresponding":false},{"id":89826,"name":"Yuedong Yang","orcid":null,"position":39,"is_corresponding":false},{"id":54470,"name":"Ke Ding","orcid":"0000-0001-9016-812X","position":40,"is_corresponding":false},{"id":89827,"name":"Gunjan Dixit","orcid":"0000-0003-4609-4316","position":41,"is_corresponding":false},{"id":41931,"name":"Jiayu Wen","orcid":"0000-0003-1249-6456","position":42,"is_corresponding":false},{"id":7295,"name":"Zhihan Zhou","orcid":"0000-0002-9475-465X","position":43,"is_corresponding":false},{"id":74322,"name":"Pratik Dutta","orcid":"0000-0002-1579-8946","position":44,"is_corresponding":false},{"id":89828,"name":"Rekha Sathian","orcid":null,"position":45,"is_corresponding":false},{"id":89829,"name":"Pallavi Surana","orcid":"0009-0004-4241-9181","position":46,"is_corresponding":false},{"id":89830,"name":"Yanrong Ji","orcid":"0000-0002-0134-664X","position":47,"is_corresponding":false},{"id":247,"name":"Han Liu","orcid":"0009-0003-8160-5780","position":48,"is_corresponding":false},{"id":89831,"name":"Ramana V. Davuluri","orcid":"0000-0002-7053-1064","position":49,"is_corresponding":false},{"id":89832,"name":"Yu Hiratsuka","orcid":null,"position":50,"is_corresponding":false},{"id":89833,"name":"Mao Takatsu","orcid":null,"position":51,"is_corresponding":false},{"id":89834,"name":"Tsai-Min Chen","orcid":null,"position":52,"is_corresponding":false},{"id":89835,"name":"Chih-Han Huang","orcid":"0000-0001-7339-1194","position":53,"is_corresponding":false},{"id":89836,"name":"Hsuan-Kai Wang","orcid":null,"position":54,"is_corresponding":false},{"id":89837,"name":"Edward S. C. 
Shih","orcid":null,"position":55,"is_corresponding":false},{"id":89838,"name":"Sz-Hau Chen","orcid":null,"position":56,"is_corresponding":false},{"id":89839,"name":"Chih-Hsun Wu","orcid":null,"position":57,"is_corresponding":false},{"id":89840,"name":"Jhih-Yu Chen","orcid":"0000-0003-1652-8566","position":58,"is_corresponding":false},{"id":89841,"name":"Kuei-Lin Huang","orcid":null,"position":59,"is_corresponding":false},{"id":89842,"name":"Ibrahim Alsaggaf","orcid":"0009-0007-7379-5915","position":60,"is_corresponding":false},{"id":89843,"name":"Patrick Greaves","orcid":null,"position":61,"is_corresponding":false},{"id":89844,"name":"Carl Barton","orcid":"0000-0003-1589-0432","position":62,"is_corresponding":false},{"id":89845,"name":"Cen Wan","orcid":"0000-0002-3872-0340","position":63,"is_corresponding":false},{"id":89846,"name":"Nicholas Abad","orcid":null,"position":64,"is_corresponding":false},{"id":13091,"name":"Klev Diamanti","orcid":"0000-0002-4922-8415","position":65,"is_corresponding":false},{"id":13073,"name":"Lars Feuerbach","orcid":"0000-0003-1503-437X","position":66,"is_corresponding":false},{"id":13301,"name":"Benedikt Brors","orcid":"0000-0001-5940-3101","position":67,"is_corresponding":false},{"id":12577,"name":"Yichao Li","orcid":"0000-0001-5791-096X","position":68,"is_corresponding":false},{"id":89848,"name":"Sebastian Röner","orcid":"0000-0002-8578-1269","position":69,"is_corresponding":false},{"id":89849,"name":"Pyaree Mohan Dash","orcid":"0000-0002-1005-0437","position":70,"is_corresponding":false},{"id":89851,"name":"Onuralp Soylemez","orcid":null,"position":72,"is_corresponding":false},{"id":89852,"name":"Andreas Møller","orcid":"0000-0002-4073-5568","position":73,"is_corresponding":false},{"id":89853,"name":"Gabija Kavaliauskaite","orcid":"0000-0001-7719-9108","position":74,"is_corresponding":false},{"id":89854,"name":"Jesper Madsen","orcid":null,"position":75,"is_corresponding":false},{"id":89855,"name":"Zhixiu 
Lu","orcid":"0009-0000-9904-9741","position":76,"is_corresponding":false},{"id":89856,"name":"Owen Queen","orcid":"0009-0009-1675-5313","position":77,"is_corresponding":false},{"id":89857,"name":"Ashley Babjac","orcid":"0000-0002-0991-7726","position":78,"is_corresponding":false},{"id":89858,"name":"Scott Emrich","orcid":"0000-0002-5741-4517","position":79,"is_corresponding":false},{"id":89859,"name":"Konstantinos Kardamiliotis","orcid":"0000-0002-5175-0552","position":80,"is_corresponding":false},{"id":89860,"name":"Konstantinos Kyriakidis","orcid":"0000-0002-1696-4838","position":81,"is_corresponding":false},{"id":89861,"name":"Andigoni Malousi","orcid":"0000-0002-1968-7020","position":82,"is_corresponding":false},{"id":89862,"name":"Ashok Palaniappan","orcid":"0000-0003-2841-9527","position":83,"is_corresponding":false},{"id":89863,"name":"Krishnakant Gupta","orcid":null,"position":84,"is_corresponding":false},{"id":89864,"name":"Prasanna Kumar S","orcid":null,"position":85,"is_corresponding":false},{"id":89865,"name":"Jake Bradford","orcid":null,"position":86,"is_corresponding":false},{"id":6606,"name":"Dimitri Perrin","orcid":"0000-0002-4007-5256","position":87,"is_corresponding":false},{"id":89866,"name":"Robert Salomone","orcid":"0000-0002-6808-6918","position":88,"is_corresponding":false},{"id":89867,"name":"Carl Schmitz","orcid":"0009-0005-6620-633X","position":89,"is_corresponding":false},{"id":89868,"name":"Chen JiaXing","orcid":null,"position":90,"is_corresponding":false},{"id":89869,"name":"Wang JingZhe","orcid":null,"position":91,"is_corresponding":false},{"id":89870,"name":"Yang AiWei","orcid":null,"position":92,"is_corresponding":false},{"id":89871,"name":"Sun Kim","orcid":null,"position":93,"is_corresponding":false},{"id":29633,"name":"Prisca Liberali","orcid":"0000-0003-0695-6081","position":95,"is_corresponding":false},{"id":17840,"name":"Ivan V. 
Kulakovskiy","orcid":"0000-0002-6554-8128","position":97,"is_corresponding":false},{"id":3029,"name":"Carl G. de Boer","orcid":"0000-0001-8935-5921","position":99,"is_corresponding":false},{"id":89875,"name":"N. S. KIM","orcid":"0000-0003-1222-8742","position":100,"is_corresponding":false},{"id":89876,"name":"G. A. Meshcheryakov","orcid":"0000-0003-0751-8286","position":101,"is_corresponding":false},{"id":89877,"name":"Byeongchan Kim","orcid":"0009-0007-7113-9906","position":102,"is_corresponding":false},{"id":89878,"name":"Maria‐Anna Trapotsi","orcid":"0000-0002-9177-4241","position":103,"is_corresponding":false},{"id":89879,"name":"Tsai‐Min Chen","orcid":"0000-0002-1143-0677","position":105,"is_corresponding":false},{"id":89880,"name":"Edward S.C. Shih","orcid":"0000-0002-8175-0393","position":106,"is_corresponding":false},{"id":89881,"name":"Chih‐Hsun Wu","orcid":"0000-0003-4613-5915","position":107,"is_corresponding":false},{"id":89882,"name":"P W Greaves","orcid":null,"position":108,"is_corresponding":false},{"id":89883,"name":"Nicholas Allen Baclig Abad","orcid":"0009-0004-8322-564X","position":109,"is_corresponding":false},{"id":15079,"name":"Onuralp Söylemez","orcid":"0000-0001-8308-6855","position":110,"is_corresponding":false},{"id":89884,"name":"Jesper Grud Skat Madsen","orcid":"0000-0002-0518-0800","position":111,"is_corresponding":false},{"id":89885,"name":"Krishna Kant Gupta","orcid":"0000-0002-4703-6452","position":112,"is_corresponding":false},{"id":89886,"name":"Prasanna Kumar Saravanam","orcid":"0000-0002-8238-0419","position":113,"is_corresponding":false},{"id":89887,"name":"Jacob Bradford","orcid":"0000-0003-3619-5682","position":114,"is_corresponding":false},{"id":89888,"name":"Jiaxing Chen","orcid":"0000-0001-5795-6722","position":115,"is_corresponding":false},{"id":89889,"name":"Jingzhe Wang","orcid":"0000-0001-8332-7997","position":116,"is_corresponding":false},{"id":89890,"name":"Sun 
Kim","orcid":"0000-0001-5385-9546","position":117,"is_corresponding":false},{"id":89793,"name":"Abdul Muntakim Rafi","orcid":"0000-0002-0387-5430","position":0,"is_corresponding":true}],"reference_count":71,"raw_metadata":{"citation_network_status":"fetched"},"created_at":"2026-03-01T18:20:47.508186Z","pmid":"39394483","pmcid":"PMC12339383","fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":"hybrid","license":"cc-by","views":0,"total_file_size_bytes":0,"version_count":0,"fair_f":52.5,"fair_a":67.5,"fair_i":25.0,"fair_r":41.6667,"fair_zscore":0.1316,"fair_rationale":{"fair_score":46.67,"has_llm":true,"dimensions":{"F":{"name":"Findable","score":52.5,"criteria":[{"key":"f_has_doi","label":"Has a persistent DOI","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"DOI present","rationale":null},{"key":"f_repository_presence","label":"Indexed in repositories / literature DBs","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"datacite=0, pmcid=True, pmid=True","rationale":null},{"key":"f_persistent_ids","label":"Resolvable scholarly identifiers (OpenAlex)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no OpenAlex id","rationale":null},{"key":"f_metadata_richness","label":"Rich, machine-readable metadata","kind":"llm","weight":1.0,"fraction":0.25,"signal":null,"rationale":"The paper provides a DOI and links to data repositories (GEO, Zenodo) but does not describe machine-readable metadata (e.g., structured metadata schemas, ontologies, or standardized data descriptors)."}]},"A":{"name":"Accessible","score":67.5,"criteria":[{"key":"a_open_access","label":"Open Access / files deposited","kind":"deterministic","weight":1.5,"fraction":1.0,"signal":"Open Access","rationale":null},{"key":"a_retrievable","label":"Free full text retrievable","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"0 OA location(s)","rationale":null},{"key":"a_access_protocol","label":"Clear data/code access 
protocol","kind":"llm","weight":1.0,"fraction":0.75,"signal":null,"rationale":"The paper states that data are available from GEO (GSE254493) and Zenodo (10.5281/zenodo.10633252) and code from GitHub, providing clear access protocols, though no explicit authentication or download instructions are detailed."}]},"I":{"name":"Interoperable","score":25.0,"criteria":[{"key":"i_linked_data","label":"Linked datasets / DataCite relations","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"linked_datasets=0, datacite=0","rationale":null},{"key":"i_standard_ids","label":"References data via standard accessions","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"accessions=0, trials=0","rationale":null},{"key":"i_standards","label":"Standard formats, vocabularies & identifiers","kind":"llm","weight":1.0,"fraction":0.5,"signal":null,"rationale":"The paper uses standard file formats (e.g., FASTA, CSV) and common identifiers (e.g., GEO accession numbers), but does not specify use of controlled vocabularies or community standards for data representation."}]},"R":{"name":"Reusable","score":41.67,"criteria":[{"key":"r_license","label":"Clear, open reuse license","kind":"deterministic","weight":1.5,"fraction":0.0,"signal":"no license","rationale":null},{"key":"r_downloads","label":"Demonstrated reuse (downloads)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"downloads=0","rationale":null},{"key":"r_version","label":"Versioned / maintained","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no version chain","rationale":null},{"key":"r_dataset","label":"Classified as a data resource","kind":"deterministic","weight":0.5,"fraction":1.0,"signal":"is_dataset","rationale":null},{"key":"r_reusability","label":"Data-availability statement, license & reproducibility","kind":"llm","weight":2.0,"fraction":0.667,"signal":null,"rationale":"The paper includes a data-availability statement, a Creative Commons license (CC BY 4.0), and open-source code, 
but lacks explicit reproducibility instructions (e.g., containerization, environment specifications) and a formal license for the code."}]}},"suggestions":["Provide machine-readable metadata using a schema like DataCite or schema.org, including structured descriptions of datasets and variables.","Include explicit authentication or download instructions for data access, such as API keys or direct download links.","Adopt controlled vocabularies (e.g., Gene Ontology, EDAM) for describing data types and experimental methods.","Add a formal software license (e.g., MIT, Apache 2.0) to the GitHub repository and include a reproducibility checklist (e.g., container, environment.yml).","Provide a detailed data dictionary or README file describing column names, units, and encoding schemes for all datasets."],"model":"deepseek/deepseek-v4-flash","agent_version":"fair_agent_v2","fulltext_source":"epmc_xml"},"fair_model":"deepseek/deepseek-v4-flash","fair_agent_version":"fair_agent_v2","fair_fulltext_source":"epmc_xml","fair_has_llm":true,"fair_computed_at":"2026-06-18T00:45:04.732458Z","clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}