{"doi":"10.1038/msb4100050","title":"Construction of Escherichia coli K‐12 in‐frame, single‐gene knockout mutants: the Keio collection","abstract":"We have systematically made a set of precisely defined, single-gene deletions of all nonessential genes in Escherichia coli K-12. Open-reading frame coding regions were replaced with a kanamycin cassette flanked by FLP recognition target sites by using a one-step method for inactivation of chromosomal genes and primers designed to create in-frame deletions upon excision of the resistance cassette. Of 4288 genes targeted, mutants were obtained for 3985. To alleviate problems encountered in high-throughput studies, two independent mutants were saved for every deleted gene. These mutants, the 'Keio collection', provide a new resource not only for systematic analyses of unknown gene functions and gene regulatory networks but also for genome-wide testing of mutational effects in a common strain background, E. coli K-12 BW25113. We were unable to disrupt 303 genes, including 37 of unknown function, which are candidates for essential genes. 
Distribution is being handled via GenoBase (http://ecoli.aist-nara.ac.jp/).","journal":"Molecular Systems Biology","year":2006,"id":2384,"datarank":19.156811278950798,"base_score":8.996156562033445,"endowment":8.996156562033445,"self_citation_contribution":1.3494234843050168,"citation_network_contribution":17.80738779464578,"self_endowment_contribution":1.3494234843050168,"citer_contribution":17.80738779464578,"corpus_percentile":93.73474369406021,"corpus_rank":78,"citation_count":8210,"citer_count":194,"citers_with_citation_signal":194,"citers_with_endowment":194,"datacite_reuse_total":0,"is_dataset":true,"is_dataset_confidence":0.5407,"is_oa":true,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":"2006-01-01","fair_score":41.4583,"fair_percentile":20.734388742304308,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[{"id":29258,"name":"Takeshi Ara","orcid":"0000-0003-1754-2837","position":1,"is_corresponding":false},{"id":29259,"name":"Miki Hasegawa","orcid":"0000-0003-0082-2576","position":2,"is_corresponding":false},{"id":29260,"name":"Yuki Takai","orcid":"0000-0001-6506-5313","position":3,"is_corresponding":false},{"id":29261,"name":"Yoshiko Okumura","orcid":null,"position":4,"is_corresponding":false},{"id":29262,"name":"Miki Baba","orcid":null,"position":5,"is_corresponding":false},{"id":29263,"name":"Kirill A. Datsenko","orcid":null,"position":6,"is_corresponding":false},{"id":29264,"name":"Masaru Tomita","orcid":"0000-0003-3423-377X","position":7,"is_corresponding":false},{"id":29265,"name":"Barry L. 
Wanner","orcid":"0000-0002-8703-1517","position":8,"is_corresponding":false},{"id":29266,"name":"Hirotada Mori","orcid":"0000-0003-3855-778X","position":9,"is_corresponding":false},{"id":29257,"name":"Tomoya Baba","orcid":"0000-0001-8986-5905","position":0,"is_corresponding":true}],"reference_count":53,"raw_metadata":{"citation_network_status":"fetched"},"created_at":"2026-03-01T18:20:47.508186Z","pmid":"16738554","pmcid":"PMC1681482","fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":null,"license":null,"views":0,"total_file_size_bytes":0,"version_count":0,"fair_f":52.5,"fair_a":55.0,"fair_i":25.0,"fair_r":33.3333,"fair_zscore":-0.3395,"fair_rationale":{"fair_score":41.46,"has_llm":true,"dimensions":{"F":{"name":"Findable","score":52.5,"criteria":[{"key":"f_has_doi","label":"Has a persistent DOI","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"DOI present","rationale":null},{"key":"f_repository_presence","label":"Indexed in repositories / literature DBs","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"datacite=0, pmcid=True, pmid=True","rationale":null},{"key":"f_persistent_ids","label":"Resolvable scholarly identifiers (OpenAlex)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no OpenAlex id","rationale":null},{"key":"f_metadata_richness","label":"Rich, machine-readable metadata","kind":"llm","weight":1.0,"fraction":0.25,"signal":null,"rationale":"The paper provides detailed narrative metadata (e.g., strain, gene lists, primer sequences) but lacks machine-readable structured metadata (e.g., schema.org/JSON-LD, formal ontology terms) for the mutant collection."}]},"A":{"name":"Accessible","score":55.0,"criteria":[{"key":"a_open_access","label":"Open Access / files deposited","kind":"deterministic","weight":1.5,"fraction":1.0,"signal":"Open Access","rationale":null},{"key":"a_retrievable","label":"Free full text retrievable","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"0 OA 
location(s)","rationale":null},{"key":"a_access_protocol","label":"Clear data/code access protocol","kind":"llm","weight":1.0,"fraction":0.5,"signal":null,"rationale":"The paper states distribution via GenoBase (http://ecoli.aist-nara.ac.jp/) and mentions supplementary tables, but does not provide a direct, persistent link to the actual data/code repository or any formal data access protocol (e.g., API, download instructions)."}]},"I":{"name":"Interoperable","score":25.0,"criteria":[{"key":"i_linked_data","label":"Linked datasets / DataCite relations","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"linked_datasets=0, datacite=0","rationale":null},{"key":"i_standard_ids","label":"References data via standard accessions","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"accessions=0, trials=0","rationale":null},{"key":"i_standards","label":"Standard formats, vocabularies & identifiers","kind":"llm","weight":1.0,"fraction":0.5,"signal":null,"rationale":"Standard identifiers (e.g., GenBank accessions AY048744, AY048746) and standard plasmid names are used, but the paper does not employ community standard vocabularies for phenotypes or growth data, nor does it reference formal data formats for the supplementary tables."}]},"R":{"name":"Reusable","score":33.33,"criteria":[{"key":"r_license","label":"Clear, open reuse license","kind":"deterministic","weight":1.5,"fraction":0.0,"signal":"no license","rationale":null},{"key":"r_downloads","label":"Demonstrated reuse (downloads)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"downloads=0","rationale":null},{"key":"r_version","label":"Versioned / maintained","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no version chain","rationale":null},{"key":"r_dataset","label":"Classified as a data resource","kind":"deterministic","weight":0.5,"fraction":1.0,"signal":"is_dataset","rationale":null},{"key":"r_reusability","label":"Data-availability statement, license & 
reproducibility","kind":"llm","weight":2.0,"fraction":0.5,"signal":null,"rationale":"The paper explicitly states distribution via a public repository and provides extensive supplementary tables, yet it lacks an explicit data-availability statement, a usage license, and detailed metadata for growth data reproducibility (e.g., raw plate-reader output files)."}]}},"suggestions":["Provide machine-readable structured metadata (e.g., JSON-LD with schema.org/Dataset) for the mutant collection.","Include a persistent identifier (e.g., DOI) for the data repository and specify a formal access protocol (e.g., FTP, REST API).","Use community standard vocabularies (e.g., Ontology for bacterial phenotypes) and standard data formats (e.g., CSV with explicit column definitions) for all supplementary data.","Add an explicit data-availability statement with a reuse license (e.g., CC-BY) and include raw growth curve data in a standard format (e.g., ODM or CSV) to enable full reproducibility."],"model":"deepseek/deepseek-v4-flash","agent_version":"fair_agent_v2","fulltext_source":"epmc_xml"},"fair_model":"deepseek/deepseek-v4-flash","fair_agent_version":"fair_agent_v2","fair_fulltext_source":"epmc_xml","fair_has_llm":true,"fair_computed_at":"2026-06-18T00:27:02.696639Z","clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}