{"doi":"10.1371/journal.pone.0244839","title":"A scientometric overview of CORD-19","abstract":"As the COVID-19 pandemic unfolds, researchers from all disciplines are coming together and contributing their expertise. CORD-19, a dataset of COVID-19 and coronavirus publications, has been made available alongside calls to help mine the information it contains and to create tools to search it more effectively. We analyse the delineation of the publications included in CORD-19 from a scientometric perspective. Based on a comparison to the Web of Science database, we find that CORD-19 provides an almost complete coverage of research on COVID-19 and coronaviruses. CORD-19 contains not only research that deals directly with COVID-19 and coronaviruses, but also research on viruses in general. Publications from CORD-19 focus mostly on a few well-defined research areas, in particular: coronaviruses (primarily SARS-CoV, MERS-CoV and SARS-CoV-2); public health and viral epidemics; molecular biology of viruses; influenza and other families of viruses; immunology and antivirals; clinical medicine. CORD-19 publications that appeared in 2020, especially editorials and letters, are disproportionately popular on social media. While we fully endorse the CORD-19 initiative, it is important to be aware that CORD-19 extends beyond research on COVID-19 and coronaviruses.","journal":"PLOS ONE","year":2021,"id":8388,"datarank":2.9930970772571346,"base_score":4.356708826689592,"endowment":4.356708826689592,"self_citation_contribution":0.6535063240034389,"citation_network_contribution":2.3395907532536957,"self_endowment_contribution":0.6535063240034389,"citer_contribution":2.3395907532536957,"corpus_percentile":67.860048820179,"corpus_rank":396,"citation_count":80,"citer_count":75,"citers_with_citation_signal":60,"citers_with_endowment":60,"datacite_reuse_total":0,"is_dataset":true,"is_dataset_confidence":0.9212,"is_oa":true,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":"2021-01-07","fair_score":41.4583,"fair_percentile":20.734388742304308,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[{"id":846,"name":"Rodrigo Costas","orcid":"0000-0002-7465-6462","position":1,"is_corresponding":false},{"id":12102,"name":"Vincent A. Traag","orcid":"0000-0003-3170-3879","position":2,"is_corresponding":false},{"id":235,"name":"Nees Jan van Eck","orcid":"0000-0001-8448-4521","position":3,"is_corresponding":false},{"id":236,"name":"Ludo Waltman","orcid":"0000-0001-8249-1752","position":5,"is_corresponding":false},{"id":16458,"name":"Thed N. van Leeuwen","orcid":"0000-0001-7238-6289","position":6,"is_corresponding":false},{"id":6451,"name":"Giovanni Colavizza","orcid":"0000-0002-9806-084X","position":0,"is_corresponding":true}],"reference_count":41,"raw_metadata":null,"created_at":"2026-03-01T18:20:47.508186Z","pmid":"33411846","pmcid":"PMC7790270","fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":"gold","license":"cc-by","views":0,"total_file_size_bytes":0,"version_count":0,"fair_f":52.5,"fair_a":55.0,"fair_i":25.0,"fair_r":33.3333,"fair_zscore":-0.3395,"fair_rationale":{"fair_score":41.46,"has_llm":true,"dimensions":{"F":{"name":"Findable","score":52.5,"criteria":[{"key":"f_has_doi","label":"Has a persistent DOI","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"DOI present","rationale":null},{"key":"f_repository_presence","label":"Indexed in repositories / literature DBs","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"datacite=0, pmcid=True, pmid=True","rationale":null},{"key":"f_persistent_ids","label":"Resolvable scholarly identifiers (OpenAlex)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no OpenAlex id","rationale":null},{"key":"f_metadata_richness","label":"Rich, machine-readable metadata","kind":"llm","weight":1.0,"fraction":0.25,"signal":null,"rationale":"The paper provides a DOI and ORCIDs for authors, but no machine-readable metadata beyond the text is described."}]},"A":{"name":"Accessible","score":55.0,"criteria":[{"key":"a_open_access","label":"Open Access / files deposited","kind":"deterministic","weight":1.5,"fraction":1.0,"signal":"Open Access","rationale":null},{"key":"a_retrievable","label":"Free full text retrievable","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"0 OA location(s)","rationale":null},{"key":"a_access_protocol","label":"Clear data/code access protocol","kind":"llm","weight":1.0,"fraction":0.5,"signal":null,"rationale":"The paper states that code is available at a GitHub repository and that some analyses require access to proprietary services, but does not specify a clear protocol for accessing the underlying data."}]},"I":{"name":"Interoperable","score":25.0,"criteria":[{"key":"i_linked_data","label":"Linked datasets / DataCite relations","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"linked_datasets=0, datacite=0","rationale":null},{"key":"i_standard_ids","label":"References data via standard accessions","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"accessions=0, trials=0","rationale":null},{"key":"i_standards","label":"Standard formats, vocabularies & identifiers","kind":"llm","weight":1.0,"fraction":0.5,"signal":null,"rationale":"The paper uses standard identifiers like DOIs and PubMed IDs, but does not mention use of standard data formats or controlled vocabularies for the data itself."}]},"R":{"name":"Reusable","score":33.33,"criteria":[{"key":"r_license","label":"Clear, open reuse license","kind":"deterministic","weight":1.5,"fraction":0.0,"signal":"no license","rationale":null},{"key":"r_downloads","label":"Demonstrated reuse (downloads)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"downloads=0","rationale":null},{"key":"r_version","label":"Versioned / maintained","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no version chain","rationale":null},{"key":"r_dataset","label":"Classified as a data resource","kind":"deterministic","weight":0.5,"fraction":1.0,"signal":"is_dataset","rationale":null},{"key":"r_reusability","label":"Data-availability statement, license & reproducibility","kind":"llm","weight":2.0,"fraction":0.5,"signal":null,"rationale":"The paper includes a data availability statement pointing to a GitHub repository and mentions a Creative Commons license, but states that replication requires access to proprietary services, limiting full reusability."}]}},"suggestions":["Provide structured metadata (e.g., JSON-LD) in the paper or repository to enhance machine findability.","Specify a clear, step-by-step protocol for accessing the proprietary data (Altmetric, Dimensions, etc.) used in the study.","Use standard data formats (e.g., CSV, JSON) and controlled vocabularies (e.g., MeSH) for the released code and data.","Include a formal data citation with a persistent identifier (e.g., DOI) for the dataset used.","Document all software dependencies and versions in the repository to improve reproducibility."],"model":"deepseek/deepseek-v4-flash","agent_version":"fair_agent_v2","fulltext_source":"epmc_xml"},"fair_model":"deepseek/deepseek-v4-flash","fair_agent_version":"fair_agent_v2","fair_fulltext_source":"epmc_xml","fair_has_llm":true,"fair_computed_at":"2026-06-18T00:40:52.094084Z","clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}