{"doi":"10.1534/g3.116.033514","title":"Comprehensive Cross-Population Analysis of High-Grade Serous Ovarian Cancer Supports No More Than Three Subtypes","abstract":"Four gene expression subtypes of high-grade serous ovarian cancer (HGSC) have been previously described. In these early studies, a fraction of samples that did not fit well into the four subtype classifications were excluded. Therefore, we sought to systematically determine the concordance of transcriptomic HGSC subtypes across populations without removing any samples. We created a bioinformatics pipeline to independently cluster the five largest mRNA expression datasets using k-means and nonnegative matrix factorization (NMF). We summarized differential expression patterns to compare clusters across studies. While previous studies reported four subtypes, our cross-population comparison does not support four. Because these results contrast with previous reports, we attempted to reproduce analyses performed in those studies. Our results suggest that early results favoring four subtypes may have been driven by the inclusion of serous borderline tumors. In summary, our analysis suggests that either two or three, but not four, gene expression subtypes are most consistent across datasets.","journal":"G3 Genes|Genomes|Genetics","year":2016,"id":3124,"datarank":1.3899912238767747,"base_score":3.8918202981106265,"endowment":3.8918202981106265,"self_citation_contribution":0.5837730447165941,"citation_network_contribution":0.8062181791601807,"self_endowment_contribution":0.5837730447165941,"citer_contribution":0.8062181791601807,"corpus_percentile":61.18795768917819,"corpus_rank":478,"citation_count":50,"citer_count":30,"citers_with_citation_signal":21,"citers_with_endowment":21,"datacite_reuse_total":0,"is_dataset":true,"is_dataset_confidence":0.7391,"is_oa":true,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":"2016-12-01","fair_score":46.6667,"fair_percentile":43.733509234828496,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[{"id":6566,"name":"James Rudd","orcid":"0000-0002-3435-4558","position":1,"is_corresponding":false},{"id":34354,"name":"Chen Wang","orcid":"0000-0003-2638-3081","position":2,"is_corresponding":false},{"id":34355,"name":"Habib Hamidi","orcid":"0000-0003-0196-9852","position":3,"is_corresponding":false},{"id":34356,"name":"Brooke L. Fridley","orcid":"0000-0001-7739-7956","position":4,"is_corresponding":false},{"id":33595,"name":"Gottfried E. Konecny","orcid":"0000-0001-6083-657X","position":5,"is_corresponding":false},{"id":7630,"name":"Ellen L. Goode","orcid":"0000-0002-9094-8326","position":6,"is_corresponding":false},{"id":308,"name":"Casey S. Greene","orcid":"0000-0001-8713-9213","position":7,"is_corresponding":false},{"id":309,"name":"Jennifer Anne Doherty","orcid":"0000-0002-1454-8187","position":8,"is_corresponding":false},{"id":301,"name":"Gregory P. Way","orcid":"0000-0002-0503-9348","position":0,"is_corresponding":true}],"reference_count":30,"raw_metadata":{"citation_network_status":"fetched"},"created_at":"2026-03-01T18:20:47.508186Z","pmid":"27729437","pmcid":"PMC5144978","fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":"gold","license":"cc-by","views":0,"total_file_size_bytes":0,"version_count":0,"fair_f":52.5,"fair_a":67.5,"fair_i":25.0,"fair_r":41.6667,"fair_zscore":0.1316,"fair_rationale":{"fair_score":46.67,"has_llm":true,"dimensions":{"F":{"name":"Findable","score":52.5,"criteria":[{"key":"f_has_doi","label":"Has a persistent DOI","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"DOI present","rationale":null},{"key":"f_repository_presence","label":"Indexed in repositories / literature DBs","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"datacite=0, pmcid=True, pmid=True","rationale":null},{"key":"f_persistent_ids","label":"Resolvable scholarly identifiers (OpenAlex)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no OpenAlex id","rationale":null},{"key":"f_metadata_richness","label":"Rich, machine-readable metadata","kind":"llm","weight":1.0,"fraction":0.25,"signal":null,"rationale":"The paper provides human-readable metadata (e.g., GEO accession GSE74357, DOI) but lacks machine-readable structured metadata (e.g., JSON-LD, RDF) or formal metadata schema compliance."}]},"A":{"name":"Accessible","score":67.5,"criteria":[{"key":"a_open_access","label":"Open Access / files deposited","kind":"deterministic","weight":1.5,"fraction":1.0,"signal":"Open Access","rationale":null},{"key":"a_retrievable","label":"Free full text retrievable","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"0 OA location(s)","rationale":null},{"key":"a_access_protocol","label":"Clear data/code access protocol","kind":"llm","weight":1.0,"fraction":0.75,"signal":null,"rationale":"The paper states that data are publicly available via GEO (GSE74357) and provides a Docker image and open-source code, but does not specify a formal access protocol or authentication requirements."}]},"I":{"name":"Interoperable","score":25.0,"criteria":[{"key":"i_linked_data","label":"Linked datasets / DataCite relations","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"linked_datasets=0, datacite=0","rationale":null},{"key":"i_standard_ids","label":"References data via standard accessions","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"accessions=0, trials=0","rationale":null},{"key":"i_standards","label":"Standard formats, vocabularies & identifiers","kind":"llm","weight":1.0,"fraction":0.5,"signal":null,"rationale":"The paper uses standard microarray platforms and gene symbols, but does not employ formal ontologies, controlled vocabularies, or persistent identifiers for variables or samples beyond GEO accession."}]},"R":{"name":"Reusable","score":41.67,"criteria":[{"key":"r_license","label":"Clear, open reuse license","kind":"deterministic","weight":1.5,"fraction":0.0,"signal":"no license","rationale":null},{"key":"r_downloads","label":"Demonstrated reuse (downloads)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"downloads=0","rationale":null},{"key":"r_version","label":"Versioned / maintained","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no version chain","rationale":null},{"key":"r_dataset","label":"Classified as a data resource","kind":"deterministic","weight":0.5,"fraction":1.0,"signal":"is_dataset","rationale":null},{"key":"r_reusability","label":"Data-availability statement, license & reproducibility","kind":"llm","weight":2.0,"fraction":0.667,"signal":null,"rationale":"The paper includes a data-availability statement, open-source code under a permissive license (CC BY 4.0), and a Docker container for reproducibility, but lacks a formal license for the code and does not specify reuse conditions for all derived data."}]}},"suggestions":["Provide machine-readable metadata (e.g., JSON-LD or RDF) with structured descriptions of datasets, variables, and methods.","Include a formal data access protocol (e.g., specifying authentication, download procedures, or API endpoints) for all datasets.","Adopt community-standard ontologies (e.g., OBI, EFO) and persistent identifiers (e.g., ORCID for authors, RRID for resources) to enhance interoperability.","Add a clear software license (e.g., MIT or Apache 2.0) to the code repository and specify reuse conditions for all supplementary files.","Deposit all processed data (e.g., cluster assignments, moderated t-scores) in a FAIR-aligned repository with versioning and a DOI."],"model":"deepseek/deepseek-v4-flash","agent_version":"fair_agent_v2","fulltext_source":"epmc_xml"},"fair_model":"deepseek/deepseek-v4-flash","fair_agent_version":"fair_agent_v2","fair_fulltext_source":"epmc_xml","fair_has_llm":true,"fair_computed_at":"2026-06-18T00:43:07.790480Z","clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}