{"doi":"10.1101/2024.09.16.613258","title":"Updated science-wide author databases of standardized citation indicators including retraction data","abstract":"Citation metrics are widely used in research appraisal, but they provide incomplete views of scientists’ impact and research track record. Other indicators of research practices should be linked to citation data. We have updated a Scopus-based database of highly-cited scientists (top-2% in each scientific subfield according to a composite citation indicator) to incorporate retraction data. Using data from the Retraction Watch database (RWDB), retraction records were linked to Scopus citation data. Of 55,237 items in RWDB as of August 15, 2024, we excluded non-retractions, retractions clearly not due to any author error, retractions where the paper had been republished, and items not linkable to Scopus records. Eventually 39,468 eligible retractions were linked to Scopus. Among 217,097 top-cited scientists in career-long impact and 223,152 in single recent year (2023) impact, 7,083 (3.3%) and 8,747 (4.0%), respectively, had at least one retraction. Scientists with retracted publications had younger publication age, higher self-citation rates, and larger publication volume than those without any retracted publications. Retractions were more common in the life sciences and rare or nonexistent in several other disciplines. In several developing countries, very high proportions of top-cited scientists had retractions (highest in Senegal (66.7%), Ecuador (28.6%) and Pakistan (27.8%) in career-long citation impact lists). Variability in retraction rates across fields and countries suggests differences in research practices, scrutiny, and ease of retraction. Addition of retraction data enhances the granularity of top-cited scientists’ profiles, aiding in responsible research evaluation. 
However, caution is needed when interpreting retractions, as they do not always signify misconduct; further analysis on a case-by-case basis is essential. The database should hopefully provide a resource for meta-research and deeper insights into scientific practices.","journal":null,"year":2024,"id":6753,"datarank":0.9864578712581109,"base_score":3.091042453358316,"endowment":3.091042453358316,"self_citation_contribution":0.4636563680037475,"citation_network_contribution":0.5228015032543634,"self_endowment_contribution":0.4636563680037475,"citer_contribution":0.5228015032543634,"corpus_percentile":58.25874694873881,"corpus_rank":514,"citation_count":27,"citer_count":22,"citers_with_citation_signal":9,"citers_with_endowment":9,"datacite_reuse_total":0,"is_dataset":true,"is_dataset_confidence":0.8756,"is_oa":true,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":"2024-09-17","fair_score":27.0833,"fair_percentile":8.597185576077397,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[{"id":4553,"name":"Angelo Maria Pezzullo","orcid":"0000-0002-8252-4654","position":1,"is_corresponding":false},{"id":13027,"name":"Antonio Cristiano","orcid":"0000-0001-7055-8577","position":2,"is_corresponding":false},{"id":3357,"name":"Stefania Boccia","orcid":"0000-0002-1864-749X","position":3,"is_corresponding":false},{"id":11483,"name":"Jeroen Baas","orcid":"0000-0001-8005-4153","position":4,"is_corresponding":false},{"id":148,"name":"John P. A. 
Ioannidis","orcid":"0000-0003-3118-6859","position":0,"is_corresponding":true}],"reference_count":23,"raw_metadata":null,"created_at":"2026-03-01T18:20:47.508186Z","pmid":null,"pmcid":null,"fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":"green","license":"cc-by","views":0,"total_file_size_bytes":0,"version_count":0,"fair_f":37.0,"fair_a":48.0,"fair_i":10.0,"fair_r":13.3333,"fair_zscore":-1.6398,"fair_rationale":{"fair_score":27.08,"has_llm":true,"dimensions":{"F":{"name":"Findable","score":37.0,"criteria":[{"key":"f_has_doi","label":"Has a persistent DOI","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"DOI present","rationale":null},{"key":"f_repository_presence","label":"Indexed in repositories / literature DBs","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"datacite=0, pmcid=False, pmid=False","rationale":null},{"key":"f_persistent_ids","label":"Resolvable scholarly identifiers (OpenAlex)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no OpenAlex id","rationale":null},{"key":"f_metadata_richness","label":"Rich, machine-readable metadata","kind":"llm","weight":1.0,"fraction":0.25,"signal":null,"rationale":"The paper mentions using Scopus and Retraction Watch databases but provides no details about machine-readable metadata or structured metadata descriptions for the new database."}]},"A":{"name":"Accessible","score":48.0,"criteria":[{"key":"a_open_access","label":"Open Access / files deposited","kind":"deterministic","weight":1.5,"fraction":1.0,"signal":"Open Access","rationale":null},{"key":"a_retrievable","label":"Free full text retrievable","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"0 OA location(s)","rationale":null},{"key":"a_access_protocol","label":"Clear data/code access protocol","kind":"llm","weight":1.0,"fraction":0.0,"signal":null,"rationale":"No access protocol (e.g., repository link, data file location, or code availability) is stated anywhere in the provided 
text."}]},"I":{"name":"Interoperable","score":10.0,"criteria":[{"key":"i_linked_data","label":"Linked datasets / DataCite relations","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"linked_datasets=0, datacite=0","rationale":null},{"key":"i_standard_ids","label":"References data via standard accessions","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"accessions=0, trials=0","rationale":null},{"key":"i_standards","label":"Standard formats, vocabularies & identifiers","kind":"llm","weight":1.0,"fraction":0.5,"signal":null,"rationale":"The paper uses standard databases (Scopus, Retraction Watch) and some field-standard indicators (top-2%, composite citation indicator), but does not specify standard formats, vocabularies, or persistent identifiers used in the final database."}]},"R":{"name":"Reusable","score":13.33,"criteria":[{"key":"r_license","label":"Clear, open reuse license","kind":"deterministic","weight":1.5,"fraction":0.0,"signal":"no license","rationale":null},{"key":"r_downloads","label":"Demonstrated reuse (downloads)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"downloads=0","rationale":null},{"key":"r_version","label":"Versioned / maintained","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no version chain","rationale":null},{"key":"r_dataset","label":"Classified as a data resource","kind":"deterministic","weight":0.5,"fraction":1.0,"signal":"is_dataset","rationale":null},{"key":"r_reusability","label":"Data-availability statement, license & reproducibility","kind":"llm","weight":2.0,"fraction":0.0,"signal":null,"rationale":"No data-availability statement, license, or reproducibility instructions are provided in the text; the database is merely described as a resource, but no means to access or reuse it are given."}]}},"suggestions":["Provide a clear data-availability statement with a persistent identifier (e.g., DOI) and a public repository for the linked database.","Include a detailed description of 
the data format (e.g., CSV, JSON) and any controlled vocabularies used to link retractions to citations.","Publish the code and methodology for linking Retraction Watch and Scopus records as a separate reproducible package.","Add a license (e.g., CC-BY or CC0) to the database to clarify reuse rights.","Provide a data dictionary or schema of the generated database fields, including identifiers like ORCID for authors and DOI for papers."],"model":"deepseek/deepseek-v4-flash","agent_version":"fair_agent_v2","fulltext_source":"abstract_only"},"fair_model":"deepseek/deepseek-v4-flash","fair_agent_version":"fair_agent_v2","fair_fulltext_source":"abstract_only","fair_has_llm":true,"fair_computed_at":"2026-06-18T00:46:03.048804Z","clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}