{"doi":"10.1093/nar/gkaa1074","title":"The STRING database in 2021: customizable protein–protein networks, and functional characterization of user-uploaded gene/measurement sets","abstract":"Cellular life depends on a complex web of functional associations between biomolecules. Among these associations, protein-protein interactions are particularly important due to their versatility, specificity and adaptability. The STRING database aims to integrate all known and predicted associations between proteins, including both physical interactions as well as functional associations. To achieve this, STRING collects and scores evidence from a number of sources: (i) automated text mining of the scientific literature, (ii) databases of interaction experiments and annotated complexes/pathways, (iii) computational interaction predictions from co-expression and from conserved genomic context and (iv) systematic transfers of interaction evidence from one organism to another. STRING aims for wide coverage; the upcoming version 11.5 of the resource will contain more than 14 000 organisms. In this update paper, we describe changes to the text-mining system, a new scoring-mode for physical interactions, as well as extensive user interface features for customizing, extending and sharing protein networks. In addition, we describe how to query STRING with genome-wide, experimental data, including the automated detection of enriched functionalities and potential biases in the user's query data. The STRING resource is available online, at https://string-db.org/.","journal":"Nucleic Acids Research","year":2020,"id":8667,"datarank":11.610564491083593,"base_score":9.009691898489343,"endowment":9.009691898489343,"self_citation_contribution":1.3514537847734018,"citation_network_contribution":10.259110706310192,"self_endowment_contribution":1.3514537847734018,"citer_contribution":10.259110706310192,"corpus_percentile":82.01790073230268,"corpus_rank":222,"citation_count":8484,"citer_count":194,"citers_with_citation_signal":194,"citers_with_endowment":194,"datacite_reuse_total":0,"is_dataset":true,"is_dataset_confidence":0.9362,"is_oa":true,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":"2020-11-25","fair_score":58.125,"fair_percentile":91.84256816182938,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[{"id":54669,"name":"Annika L. Gable","orcid":"0000-0002-8965-0848","position":1,"is_corresponding":false},{"id":54670,"name":"David Lyon","orcid":"0000-0001-5794-0456","position":3,"is_corresponding":false},{"id":75001,"name":"Rebecca Kirsch","orcid":null,"position":4,"is_corresponding":false},{"id":75002,"name":"Sampo Pyysalo","orcid":"0000-0002-6279-5000","position":5,"is_corresponding":false},{"id":54675,"name":"Nadezhda T. Doncheva","orcid":"0000-0002-8806-6850","position":6,"is_corresponding":false},{"id":75003,"name":"Marc Legeay","orcid":"0000-0001-7984-326X","position":7,"is_corresponding":false},{"id":75004,"name":"Tao Fang","orcid":"0000-0001-9659-7726","position":8,"is_corresponding":false},{"id":19778,"name":"Peer Bork","orcid":"0000-0002-2627-833X","position":9,"is_corresponding":false},{"id":13184,"name":"Christian von Mering","orcid":"0000-0001-7734-9102","position":11,"is_corresponding":false},{"id":75006,"name":"Katerina Nastou","orcid":"0000-0003-3611-5726","position":12,"is_corresponding":false},{"id":54677,"name":"Lars Juhl Jensen","orcid":"0000-0001-7885-715X","position":13,"is_corresponding":false},{"id":54668,"name":"Damian Szklarczyk","orcid":"0000-0002-4052-5069","position":0,"is_corresponding":true}],"reference_count":60,"raw_metadata":{"citation_network_status":"fetched"},"created_at":"2026-03-01T18:20:47.508186Z","pmid":"33237311","pmcid":"PMC7779004","fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":"gold","license":"cc-by","views":0,"total_file_size_bytes":0,"version_count":0,"fair_f":65.0,"fair_a":80.0,"fair_i":37.5,"fair_r":50.0,"fair_zscore":1.1681,"fair_rationale":{"fair_score":58.12,"has_llm":true,"dimensions":{"F":{"name":"Findable","score":65.0,"criteria":[{"key":"f_has_doi","label":"Has a persistent DOI","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"DOI present","rationale":null},{"key":"f_repository_presence","label":"Indexed in repositories / literature DBs","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"datacite=0, pmcid=True, pmid=True","rationale":null},{"key":"f_persistent_ids","label":"Resolvable scholarly identifiers (OpenAlex)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no OpenAlex id","rationale":null},{"key":"f_metadata_richness","label":"Rich, machine-readable metadata","kind":"llm","weight":1.0,"fraction":0.5,"signal":null,"rationale":"The paper mentions the database content and download availability but lacks explicit description of rich, machine-readable metadata standards."}]},"A":{"name":"Accessible","score":80.0,"criteria":[{"key":"a_open_access","label":"Open Access / files deposited","kind":"deterministic","weight":1.5,"fraction":1.0,"signal":"Open Access","rationale":null},{"key":"a_retrievable","label":"Free full text retrievable","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"0 OA location(s)","rationale":null},{"key":"a_access_protocol","label":"Clear data/code access protocol","kind":"llm","weight":1.0,"fraction":1.0,"signal":null,"rationale":"Clear access protocol provided: website, REST API, Cytoscape app, R package, and free distribution under CC BY 4.0."}]},"I":{"name":"Interoperable","score":37.5,"criteria":[{"key":"i_linked_data","label":"Linked datasets / DataCite relations","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"linked_datasets=0, datacite=0","rationale":null},{"key":"i_standard_ids","label":"References data via standard accessions","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"accessions=0, trials=0","rationale":null},{"key":"i_standards","label":"Standard formats, vocabularies & identifiers","kind":"llm","weight":1.0,"fraction":0.75,"signal":null,"rationale":"Uses standard identifiers (UniProt, KEGG) and licenses, but does not specify standard data formats or vocabularies for export."}]},"R":{"name":"Reusable","score":50.0,"criteria":[{"key":"r_license","label":"Clear, open reuse license","kind":"deterministic","weight":1.5,"fraction":0.0,"signal":"no license","rationale":null},{"key":"r_downloads","label":"Demonstrated reuse (downloads)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"downloads=0","rationale":null},{"key":"r_version","label":"Versioned / maintained","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no version chain","rationale":null},{"key":"r_dataset","label":"Classified as a data resource","kind":"deterministic","weight":0.5,"fraction":1.0,"signal":"is_dataset","rationale":null},{"key":"r_reusability","label":"Data-availability statement, license & reproducibility","kind":"llm","weight":2.0,"fraction":0.833,"signal":null,"rationale":"Data freely available under CC BY 4.0 license with downloadable content, but lacks detailed reproducibility steps for the scoring pipeline."}]}},"suggestions":["Provide machine-readable metadata using schemas like schema.org or DCAT.","Document all export data formats with explicit standard schemas (e.g., JSON, XML).","Assign a persistent identifier (e.g., DOI) to the dataset for easy citation.","Include a reproducibility section for the scoring and evidence integration pipeline.","Add versioning information for all downloadable files."],"model":"deepseek/deepseek-v4-flash","agent_version":"fair_agent_v2","fulltext_source":"epmc_xml"},"fair_model":"deepseek/deepseek-v4-flash","fair_agent_version":"fair_agent_v2","fair_fulltext_source":"epmc_xml","fair_has_llm":true,"fair_computed_at":"2026-06-18T00:26:27.722374Z","clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}