{"doi":"10.1186/s13059-025-03865-3","title":"HGMT: a database of human gut microbiota for tumors and immunotherapy response","abstract":"HGMT is a database designed to analyze, explore, and visualize gut microbiomes from diverse tumor types. We process metagenomic datasets from 18,630 stool samples across 37 tumor types, including 2,207 samples from immunotherapy-treated patients across 12 tumor types. HGMT provides an interactive portal for querying taxonomic and functional profiles, visualizing cross-dataset differential abundance taxa in tumors, and identifying their pan-tumor associations. Our analysis reveals the capability of gut microbiota in diagnosing gastrointestinal tumors and predicting immunotherapy response for non-small cell lung carcinoma. HGMT represents a valuable resource for investigating the roles of gut microbiota in tumors and immunotherapy response.","journal":"Genome Biology","year":2025,"id":12150,"datarank":0.10397207708399181,"base_score":0.6931471805599453,"endowment":0.6931471805599453,"self_citation_contribution":0.10397207708399181,"citation_network_contribution":0.0,"self_endowment_contribution":0.10397207708399181,"citer_contribution":0.0,"corpus_percentile":37.91700569568755,"corpus_rank":716,"citation_count":1,"citer_count":1,"citers_with_citation_signal":0,"citers_with_endowment":0,"datacite_reuse_total":0,"is_dataset":true,"is_dataset_confidence":0.9458,"is_oa":true,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":"2025-11-24","fair_score":49.7917,"fair_percentile":77.9023746701847,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[{"id":49475,"name":"Mingyu Wang","orcid":"0009-0002-9177-9509","position":1,"is_corresponding":false},{"id":96410,"name":"Chentao Xu","orcid":null,"position":2,"is_corresponding":false},{"id":39637,"name":"Longhao Jia","orcid":"0000-0002-3490-840X","position":3,"is_corresponding":false},{"id":39638,"name":"Senying Lai","orcid":"0000-0003-0557-2393","position":4,"is_corresponding":false},{"id":96411,"name":"Zi-Chao Zhang","orcid":"0000-0001-5747-3093","position":5,"is_corresponding":false},{"id":96412,"name":"Jinglong Zhang","orcid":"0000-0001-6795-1423","position":6,"is_corresponding":false},{"id":23546,"name":"Wei-Hua Chen","orcid":"0000-0001-5160-4398","position":7,"is_corresponding":false},{"id":59139,"name":"Yucheng T. Yang","orcid":"0000-0002-6873-5279","position":8,"is_corresponding":false},{"id":96413,"name":"C. F. Xu","orcid":"0009-0005-8484-9849","position":10,"is_corresponding":false},{"id":23552,"name":"Xing‐Ming Zhao","orcid":"0000-0002-4531-3970","position":13,"is_corresponding":false},{"id":58054,"name":"Jinxin Liu","orcid":"0000-0003-0753-5342","position":0,"is_corresponding":true}],"reference_count":126,"raw_metadata":null,"created_at":"2026-03-01T18:20:47.508186Z","pmid":"41286929","pmcid":"PMC12642048","fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":"gold","license":"cc-by-nc-nd","views":0,"total_file_size_bytes":0,"version_count":0,"fair_f":65.0,"fair_a":67.5,"fair_i":25.0,"fair_r":41.6667,"fair_zscore":0.4143,"fair_rationale":{"fair_score":49.79,"has_llm":true,"dimensions":{"F":{"name":"Findable","score":65.0,"criteria":[{"key":"f_has_doi","label":"Has a persistent DOI","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"DOI present","rationale":null},{"key":"f_repository_presence","label":"Indexed in repositories / literature DBs","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"datacite=0, pmcid=True, pmid=True","rationale":null},{"key":"f_persistent_ids","label":"Resolvable scholarly identifiers (OpenAlex)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no OpenAlex id","rationale":null},{"key":"f_metadata_richness","label":"Rich, machine-readable metadata","kind":"llm","weight":1.0,"fraction":0.5,"signal":null,"rationale":"The paper provides curated metadata (country, sex, age, BMI, disease status) and uses MeSH terminology, but does not describe machine-readable metadata standards (e.g., schema.org, DCAT) or persistent identifiers for samples beyond NCBI accessions."}]},"A":{"name":"Accessible","score":67.5,"criteria":[{"key":"a_open_access","label":"Open Access / files deposited","kind":"deterministic","weight":1.5,"fraction":1.0,"signal":"Open Access","rationale":null},{"key":"a_retrievable","label":"Free full text retrievable","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"0 OA location(s)","rationale":null},{"key":"a_access_protocol","label":"Clear data/code access protocol","kind":"llm","weight":1.0,"fraction":0.75,"signal":null,"rationale":"The data are publicly accessible via a web portal (https://mai.fudan.edu.cn/hgmt) and source code via GitHub and Zenodo, but the text does not specify a formal access protocol (e.g., API, authentication requirements) or guarantee long-term availability."}]},"I":{"name":"Interoperable","score":25.0,"criteria":[{"key":"i_linked_data","label":"Linked datasets / DataCite relations","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"linked_datasets=0, datacite=0","rationale":null},{"key":"i_standard_ids","label":"References data via standard accessions","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"accessions=0, trials=0","rationale":null},{"key":"i_standards","label":"Standard formats, vocabularies & identifiers","kind":"llm","weight":1.0,"fraction":0.5,"signal":null,"rationale":"The paper uses standard formats (e.g., FASTA, relative abundance tables) and NCBI Taxonomy IDs, but does not mention use of community-standard vocabularies (e.g., MIxS, OBO Foundry) or formal semantic interoperability measures."}]},"R":{"name":"Reusable","score":41.67,"criteria":[{"key":"r_license","label":"Clear, open reuse license","kind":"deterministic","weight":1.5,"fraction":0.0,"signal":"no license","rationale":null},{"key":"r_downloads","label":"Demonstrated reuse (downloads)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"downloads=0","rationale":null},{"key":"r_version","label":"Versioned / maintained","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no version chain","rationale":null},{"key":"r_dataset","label":"Classified as a data resource","kind":"deterministic","weight":0.5,"fraction":1.0,"signal":"is_dataset","rationale":null},{"key":"r_reusability","label":"Data-availability statement, license & reproducibility","kind":"llm","weight":2.0,"fraction":0.667,"signal":null,"rationale":"A data-availability statement is provided with a license (MIT for code, CC BY-NC-ND 4.0 for article), but the license for the data itself is not explicitly stated, and reproducibility is limited by the lack of a complete software environment specification (e.g., container, workflow)."}]}},"suggestions":["Add machine-readable metadata (e.g., JSON-LD with schema.org) to the web portal to improve findability by search engines.","Provide a formal data access protocol (e.g., REST API documentation) and a persistent identifier (e.g., DOI) for the database itself.","Adopt community standards like MIxS for metadata and use OBO Foundry ontologies for terms to enhance interoperability.","Specify a clear license for the data (e.g., CC0 or CC BY) and include a reproducible computational environment (e.g., Docker/Singularity container) for the analysis pipeline.","Include versioning information for the database and code, and provide a citation format for users to reference the resource."],"model":"deepseek/deepseek-v4-flash","agent_version":"fair_agent_v1","fulltext_source":"epmc_xml"},"fair_model":"deepseek/deepseek-v4-flash","fair_agent_version":"fair_agent_v1","fair_fulltext_source":"epmc_xml","fair_has_llm":true,"fair_computed_at":"2026-06-17T23:02:15.149171Z","clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}