{"doi":"10.1093/bioinformatics/btr260","title":"Molecular signatures database (MSigDB) 3.0","abstract":"Motivation: Well-annotated gene sets representing the universe of the biological processes are critical for meaningful and insightful interpretation of large-scale genomic data. The Molecular Signatures Database (MSigDB) is one of the most widely used repositories of such sets.\nResults: We report the availability of a new version of the database, MSigDB 3.0, with over 6700 gene sets, a complete revision of the collection of canonical pathways and experimental signatures from publications, enhanced annotations and upgrades to the web site.\nAvailability and Implementation: MSigDB is freely available for non-commercial use at http://www.broadinstitute.org/msigdb.\nContact: gsea@broadinstitute.org","journal":"Bioinformatics","year":2011,"id":12022,"datarank":17.96915732749616,"base_score":8.912338567117548,"endowment":8.912338567117548,"self_citation_contribution":1.3368507850676323,"citation_network_contribution":16.632306542428527,"self_endowment_contribution":1.3368507850676323,"citer_contribution":16.632306542428527,"corpus_percentile":91.70056956875509,"corpus_rank":103,"citation_count":7680,"citer_count":193,"citers_with_citation_signal":193,"citers_with_endowment":193,"datacite_reuse_total":0,"is_dataset":true,"is_dataset_confidence":0.9591,"is_oa":true,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":"2011-05-05","fair_score":36.75,"fair_percentile":18.535620052770447,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[{"id":16653,"name":"Aravind Subramanian","orcid":null,"position":1,"is_corresponding":false},{"id":20428,"name":"Helga Thorvaldsdottir","orcid":null,"position":3,"is_corresponding":false},{"id":32013,"name":"Pablo 
Tamayo","orcid":"0000-0002-9360-4668","position":4,"is_corresponding":false},{"id":25136,"name":"Jill P. Mesirov","orcid":"0000-0002-9755-2818","position":5,"is_corresponding":false},{"id":35265,"name":"Reid M. Pinchback","orcid":null,"position":6,"is_corresponding":false},{"id":28642,"name":"Arthur Liberzon","orcid":null,"position":0,"is_corresponding":true}],"reference_count":13,"raw_metadata":{"citation_network_status":"fetched"},"created_at":"2026-03-01T18:20:47.508186Z","pmid":"21546393","pmcid":"PMC3106198","fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":null,"license":null,"views":0,"total_file_size_bytes":0,"version_count":0,"fair_f":64.0,"fair_a":58.0,"fair_i":5.0,"fair_r":20.0,"fair_zscore":-0.7654,"fair_rationale":{"fair_score":36.75,"has_llm":true,"dimensions":{"F":{"name":"Findable","score":64.0,"criteria":[{"key":"f_has_doi","label":"Has a persistent DOI","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"DOI present","rationale":null},{"key":"f_repository_presence","label":"Indexed in repositories / literature DBs","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"datacite=0, pmcid=True, pmid=True","rationale":null},{"key":"f_persistent_ids","label":"Resolvable scholarly identifiers (OpenAlex)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no OpenAlex id","rationale":null},{"key":"f_metadata_richness","label":"Rich, machine-readable metadata","kind":"llm","weight":1.0,"fraction":0.0,"signal":null,"rationale":"No machine-readable metadata or structured description of data is mentioned; the text only provides a brief abstract and a web link."}]},"A":{"name":"Accessible","score":58.0,"criteria":[{"key":"a_open_access","label":"Open Access / files deposited","kind":"deterministic","weight":1.5,"fraction":1.0,"signal":"Open Access","rationale":null},{"key":"a_retrievable","label":"Free full text retrievable","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"0 OA 
location(s)","rationale":null},{"key":"a_access_protocol","label":"Clear data/code access protocol","kind":"llm","weight":1.0,"fraction":0.5,"signal":null,"rationale":"The paper states MSigDB is freely available for non-commercial use at a URL, but does not describe any authentication, API, or download protocol beyond the link."}]},"I":{"name":"Interoperable","score":5.0,"criteria":[{"key":"i_linked_data","label":"Linked datasets / DataCite relations","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"linked_datasets=0, datacite=0","rationale":null},{"key":"i_standard_ids","label":"References data via standard accessions","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"accessions=0, trials=0","rationale":null},{"key":"i_standards","label":"Standard formats, vocabularies & identifiers","kind":"llm","weight":1.0,"fraction":0.25,"signal":null,"rationale":"The text mentions 'annotations' but gives no specifics about standard formats (e.g., file format), controlled vocabularies, or identifiers used for gene sets."}]},"R":{"name":"Reusable","score":20.0,"criteria":[{"key":"r_license","label":"Clear, open reuse license","kind":"deterministic","weight":1.5,"fraction":0.0,"signal":"no license","rationale":null},{"key":"r_downloads","label":"Demonstrated reuse (downloads)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"downloads=0","rationale":null},{"key":"r_version","label":"Versioned / maintained","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no version chain","rationale":null},{"key":"r_dataset","label":"Classified as a data resource","kind":"deterministic","weight":0.5,"fraction":1.0,"signal":"is_dataset","rationale":null},{"key":"r_reusability","label":"Data-availability statement, license & reproducibility","kind":"llm","weight":2.0,"fraction":0.333,"signal":null,"rationale":"The license is partially stated ('free for non-commercial use') but no data-availability statement for the paper itself, nor any code or 
reproducibility details are provided."}]}},"suggestions":["Provide structured metadata (e.g., in JSON-LD or XML) describing MSigDB content, version, and download methods to improve findability.","Specify access protocols including API endpoints, file formats (e.g., GMT, XML), and any authentication requirements for automated access.","Use standard community vocabularies (e.g., Gene Ontology terms) and persistent identifiers (e.g., DOIs) for gene sets to enhance interoperability.","Include a formal data-availability statement with a license (e.g., Creative Commons) and reference to the exact version of the database used in the paper.","Describe software and scripts used to generate or analyze MSigDB, and deposit them in a repository with version control to enable reproducibility."],"model":"deepseek/deepseek-v4-flash","agent_version":"fair_agent_v2","fulltext_source":"abstract_only"},"fair_model":"deepseek/deepseek-v4-flash","fair_agent_version":"fair_agent_v2","fair_fulltext_source":"abstract_only","fair_has_llm":true,"fair_computed_at":"2026-06-18T00:27:27.936849Z","clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}