{"doi":"10.1038/s41598-019-39576-6","title":"Metagenome-assembled genomes provide new insight into the microbial diversity of two thermal pools in Kamchatka, Russia","abstract":"Culture-independent methods have contributed substantially to our understanding of global microbial diversity. Recently developed algorithms to construct whole genomes from environmental samples have further refined, corrected and revolutionized understanding of the tree of life. Here, we assembled draft metagenome-assembled genomes (MAGs) from environmental DNA extracted from two hot springs within an active volcanic ecosystem on the Kamchatka peninsula, Russia. This hydrothermal system has been intensively studied previously with regard to geochemistry, chemoautotrophy, microbial isolation, and microbial diversity. We assembled genomes of bacteria and archaea using DNA that had previously been characterized via 16S rRNA gene clone libraries. We recovered 36 MAGs, 29 of medium to high quality, and inferred their placement in a phylogenetic tree consisting of 3,240 publicly available microbial genomes. We highlight MAGs that were taxonomically assigned to groups previously underrepresented in available genome data. This includes several archaea (Korarchaeota, Bathyarchaeota and Aciduliprofundum) and one potentially new species within the bacterial genus Sulfurihydrogenibium. Putative functions in both pools were compared and are discussed in the context of their diverging geochemistry. This study adds comprehensive information about phylogenetic diversity and functional potential within two hot springs in the caldera of Kamchatka.","journal":"Scientific Reports","year":2019,"id":2438,"datarank":2.710565498963514,"base_score":4.68213122712422,"endowment":4.68213122712422,"self_citation_contribution":0.7023196840686331,"citation_network_contribution":2.008245814894881,"self_endowment_contribution":0.7023196840686331,"citer_contribution":2.008245814894881,"corpus_percentile":66.80227827502034,"corpus_rank":409,"citation_count":108,"citer_count":115,"citers_with_citation_signal":74,"citers_with_endowment":74,"datacite_reuse_total":0,"is_dataset":true,"is_dataset_confidence":0.7621,"is_oa":true,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":"2019-02-28","fair_score":52.9167,"fair_percentile":79.11169744942832,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[{"id":29674,"name":"Cassandra L. Ettinger","orcid":"0000-0001-7334-403X","position":1,"is_corresponding":false},{"id":29675,"name":"Guillaume Jospin","orcid":"0000-0002-8746-2632","position":2,"is_corresponding":false},{"id":7454,"name":"Jonathan A. Eisen","orcid":"0000-0002-0159-2197","position":3,"is_corresponding":false},{"id":29673,"name":"Laetitia G. E. Wilkins","orcid":"0000-0003-0756-8339","position":0,"is_corresponding":true}],"reference_count":101,"raw_metadata":{"citation_network_status":"fetched"},"created_at":"2026-03-01T18:20:47.508186Z","pmid":"30816235","pmcid":"PMC6395817","fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":"gold","license":"cc-by","views":0,"total_file_size_bytes":0,"version_count":0,"fair_f":65.0,"fair_a":67.5,"fair_i":37.5,"fair_r":41.6667,"fair_zscore":0.697,"fair_rationale":{"fair_score":52.92,"has_llm":true,"dimensions":{"F":{"name":"Findable","score":65.0,"criteria":[{"key":"f_has_doi","label":"Has a persistent DOI","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"DOI present","rationale":null},{"key":"f_repository_presence","label":"Indexed in repositories / literature DBs","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"datacite=0, pmcid=True, pmid=True","rationale":null},{"key":"f_persistent_ids","label":"Resolvable scholarly identifiers (OpenAlex)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no OpenAlex id","rationale":null},{"key":"f_metadata_richness","label":"Rich, machine-readable metadata","kind":"llm","weight":1.0,"fraction":0.5,"signal":null,"rationale":"The paper provides detailed textual metadata and links to online repositories, but does not include structured machine-readable metadata such as an ISA-Tab or Schema.org annotation."}]},"A":{"name":"Accessible","score":67.5,"criteria":[{"key":"a_open_access","label":"Open Access / files deposited","kind":"deterministic","weight":1.5,"fraction":1.0,"signal":"Open Access","rationale":null},{"key":"a_retrievable","label":"Free full text retrievable","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"0 OA location(s)","rationale":null},{"key":"a_access_protocol","label":"Clear data/code access protocol","kind":"llm","weight":1.0,"fraction":0.75,"signal":null,"rationale":"The Data Availability section explicitly lists NCBI SRA, BioProject, and DASH identifiers for data access, but does not provide a software repository or container for computational reproducibility."}]},"I":{"name":"Interoperable","score":37.5,"criteria":[{"key":"i_linked_data","label":"Linked datasets / DataCite relations","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"linked_datasets=0, datacite=0","rationale":null},{"key":"i_standard_ids","label":"References data via standard accessions","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"accessions=0, trials=0","rationale":null},{"key":"i_standards","label":"Standard formats, vocabularies & identifiers","kind":"llm","weight":1.0,"fraction":0.75,"signal":null,"rationale":"Standard file formats (FASTA, Newick) and identifiers (NCBI, KEGG) are used, but no formal community ontology or vocabulary (e.g., MIxS) is cited and MAG quality is based on subjective manual binning."}]},"R":{"name":"Reusable","score":41.67,"criteria":[{"key":"r_license","label":"Clear, open reuse license","kind":"deterministic","weight":1.5,"fraction":0.0,"signal":"no license","rationale":null},{"key":"r_downloads","label":"Demonstrated reuse (downloads)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"downloads=0","rationale":null},{"key":"r_version","label":"Versioned / maintained","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no version chain","rationale":null},{"key":"r_dataset","label":"Classified as a data resource","kind":"deterministic","weight":0.5,"fraction":1.0,"signal":"is_dataset","rationale":null},{"key":"r_reusability","label":"Data-availability statement, license & reproducibility","kind":"llm","weight":2.0,"fraction":0.667,"signal":null,"rationale":"Data are publicly archived under a permissive CC-BY license with detailed methods, but no explicit statement of the license for the code/scripts and no clear description of reuse conditions for third-party materials."}]}},"suggestions":["Include structured machine-readable metadata (e.g., ISA-Tab or DataCite XML) alongside the paper.","Deposit analysis scripts and workflows in a version-controlled public repository with a persistent identifier.","Provide explicit license statements for all software and code used in the analysis.","Cite and use community standards (e.g., MIxS for environmental sample metadata) in a dedicated metadata section."],"model":"deepseek/deepseek-v4-flash","agent_version":"fair_agent_v2","fulltext_source":"epmc_xml"},"fair_model":"deepseek/deepseek-v4-flash","fair_agent_version":"fair_agent_v2","fair_fulltext_source":"epmc_xml","fair_has_llm":true,"fair_computed_at":"2026-06-18T00:39:40.517651Z","clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}