{"doi":"10.1101/2024.04.19.590243","title":"The Single-cell Pediatric Cancer Atlas: Data portal and open-source tools for single-cell transcriptomics of pediatric tumors","abstract":"The Single-cell Pediatric Cancer Atlas (ScPCA) Portal (https://scpca.alexslemonade.org/) is a data resource for uniformly processed single-cell and single-nuclei RNA sequencing (RNA-seq) data and de-identified metadata from pediatric tumor samples. Originally comprised of data from 10 projects funded by Alex’s Lemonade Stand Foundation (ALSF), the Portal currently contains summarized gene expression data for over 700 samples across 55 cancer types from ALSF-funded and community-contributed datasets. Downloads include gene expression data as SingleCellExperiment or AnnData objects containing raw and normalized counts, PCA and UMAP coordinates, and automated cell type annotations, along with summary reports. Some samples have additional data from bulk RNA-seq, spatial transcriptomics, and/or feature barcoding (e.g., CITE-seq and cell hashing) included in the download. All data on the Portal were uniformly processed using scpca-nf, an efficient and open-source Nextflow workflow that uses alevin-fry to quantify gene expression. 
Comprehensive documentation, including descriptions of file contents and a guide to getting started, is available at https://scpca.readthedocs.io .","journal":null,"year":2024,"id":3639,"datarank":0.42378465534224197,"base_score":2.3978952727983707,"endowment":2.3978952727983707,"self_citation_contribution":0.3596842909197557,"citation_network_contribution":0.06410036442248629,"self_endowment_contribution":0.3596842909197557,"citer_contribution":0.06410036442248629,"corpus_percentile":50.20341741253051,"corpus_rank":613,"citation_count":11,"citer_count":11,"citers_with_citation_signal":5,"citers_with_endowment":5,"datacite_reuse_total":0,"is_dataset":true,"is_dataset_confidence":0.9516,"is_oa":true,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":"2024-04-24","fair_score":34.5833,"fair_percentile":16.402814423922603,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[{"id":29534,"name":"Joshua A. Shapiro","orcid":"0000-0002-6224-0347","position":1,"is_corresponding":false},{"id":32643,"name":"Stephanie J. Spielman","orcid":"0000-0002-9090-4788","position":2,"is_corresponding":false},{"id":37315,"name":"David S. Mejia","orcid":"0000-0003-1679-0353","position":3,"is_corresponding":false},{"id":37316,"name":"Deepashree Venkatesh Prasad","orcid":"0000-0001-5756-4083","position":4,"is_corresponding":false},{"id":37317,"name":"Nozomi Ichihara","orcid":null,"position":5,"is_corresponding":false},{"id":37318,"name":"Arkadii Yakovets","orcid":null,"position":6,"is_corresponding":false},{"id":37319,"name":"Avrohom M. Gottlieb","orcid":null,"position":7,"is_corresponding":false},{"id":32644,"name":"Chanté J. Bethell","orcid":"0000-0001-9653-8128","position":9,"is_corresponding":false},{"id":2966,"name":"Steven M. Foltz","orcid":"0000-0002-9526-8194","position":10,"is_corresponding":false},{"id":308,"name":"Casey S. 
Greene","orcid":"0000-0001-8713-9213","position":12,"is_corresponding":false},{"id":2967,"name":"Jaclyn N. Taroni","orcid":"0000-0003-4734-4508","position":13,"is_corresponding":false},{"id":37322,"name":"K. Wheeler","orcid":"0000-0002-0640-2903","position":14,"is_corresponding":false},{"id":37323,"name":"Jennifer T. O’Malley","orcid":null,"position":15,"is_corresponding":false},{"id":37314,"name":"Allegra G. Hawkins","orcid":"0000-0001-6026-3660","position":0,"is_corresponding":true}],"reference_count":87,"raw_metadata":null,"created_at":"2026-03-01T18:20:47.508186Z","pmid":null,"pmcid":null,"fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":"green","license":"cc-by","views":0,"total_file_size_bytes":0,"version_count":0,"fair_f":42.0,"fair_a":63.0,"fair_i":10.0,"fair_r":23.3333,"fair_zscore":-0.9614,"fair_rationale":{"fair_score":34.58,"has_llm":true,"dimensions":{"F":{"name":"Findable","score":42.0,"criteria":[{"key":"f_has_doi","label":"Has a persistent DOI","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"DOI present","rationale":null},{"key":"f_repository_presence","label":"Indexed in repositories / literature DBs","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"datacite=0, pmcid=False, pmid=False","rationale":null},{"key":"f_persistent_ids","label":"Resolvable scholarly identifiers (OpenAlex)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no OpenAlex id","rationale":null},{"key":"f_metadata_richness","label":"Rich, machine-readable metadata","kind":"llm","weight":1.0,"fraction":0.5,"signal":null,"rationale":"The paper mentions de-identified metadata and automated cell type annotations, but does not describe machine-readable metadata standards or structured metadata schemas."}]},"A":{"name":"Accessible","score":63.0,"criteria":[{"key":"a_open_access","label":"Open Access / files deposited","kind":"deterministic","weight":1.5,"fraction":1.0,"signal":"Open 
Access","rationale":null},{"key":"a_retrievable","label":"Free full text retrievable","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"0 OA location(s)","rationale":null},{"key":"a_access_protocol","label":"Clear data/code access protocol","kind":"llm","weight":1.0,"fraction":0.75,"signal":null,"rationale":"The paper provides a clear URL for the data portal and documentation, but does not specify an access protocol (e.g., API, authentication) or conditions for data access."}]},"I":{"name":"Interoperable","score":10.0,"criteria":[{"key":"i_linked_data","label":"Linked datasets / DataCite relations","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"linked_datasets=0, datacite=0","rationale":null},{"key":"i_standard_ids","label":"References data via standard accessions","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"accessions=0, trials=0","rationale":null},{"key":"i_standards","label":"Standard formats, vocabularies & identifiers","kind":"llm","weight":1.0,"fraction":0.5,"signal":null,"rationale":"The paper uses standard formats (SinglecellExperiment, AnnData) and mentions uniform processing, but does not specify use of standard vocabularies or persistent identifiers for samples or cell types."}]},"R":{"name":"Reusable","score":23.33,"criteria":[{"key":"r_license","label":"Clear, open reuse license","kind":"deterministic","weight":1.5,"fraction":0.0,"signal":"no license","rationale":null},{"key":"r_downloads","label":"Demonstrated reuse (downloads)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"downloads=0","rationale":null},{"key":"r_version","label":"Versioned / maintained","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no version chain","rationale":null},{"key":"r_dataset","label":"Classified as a data resource","kind":"deterministic","weight":0.5,"fraction":1.0,"signal":"is_dataset","rationale":null},{"key":"r_reusability","label":"Data-availability statement, license & 
reproducibility","kind":"llm","weight":2.0,"fraction":0.5,"signal":null,"rationale":"The paper states data are available under a license (implied by open-source tools) and provides documentation, but lacks an explicit data-availability statement, license for the data, and reproducibility details for the processing workflow."}]}},"suggestions":["Include a formal data-availability statement with a license (e.g., CC0 or CC-BY) for the data.","Add machine-readable metadata using structured schemas (e.g., schema.org, DCAT) and persistent identifiers (e.g., DOIs) for datasets.","Specify an access protocol (e.g., REST API, OAuth) and conditions for data access in the paper.","Use standard vocabularies (e.g., Cell Ontology, Uberon) for cell types and anatomical sites, and reference them in the paper.","Provide a reproducible workflow container (e.g., Docker/Singularity) and versioned code repository for the processing pipeline."],"model":"deepseek/deepseek-v4-flash","agent_version":"fair_agent_v2","fulltext_source":"abstract_only"},"fair_model":"deepseek/deepseek-v4-flash","fair_agent_version":"fair_agent_v2","fair_fulltext_source":"abstract_only","fair_has_llm":true,"fair_computed_at":"2026-06-18T00:49:43.108175Z","clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}