{"doi":"10.1038/s41597-019-0220-5","title":"Tracing diagnosis trajectories over millions of patients reveal an unexpected risk in schizophrenia","abstract":"The identification of novel disease associations using big-data for patient care has had limited success. In this study, we created a longitudinal disease network of traced readmissions (disease trajectories), merging data from over 10.4 million inpatients through the Healthcare Cost and Utilization Project, which allowed the representation of disease progression mapping over 300 diseases. From these disease trajectories, we discovered an interesting association between schizophrenia and rhabdomyolysis, a rare muscle disease (incidence < 1E-04) (relative risk, 2.21 [1.80-2.71, confidence interval = 0.95], P-value 9.54E-15). We validated this association by using independent electronic medical records from over 830,000 patients at the University of California, San Francisco (UCSF) medical center. A case review of 29 rhabdomyolysis incidents in schizophrenia patients at UCSF demonstrated that 62% are idiopathic, without the use of any drug known to lead to this adverse event, suggesting a warning to physicians to watch for this unexpected risk of schizophrenia. Large-scale analysis of disease trajectories can help physicians understand potential sequential events in their patients.","journal":"Scientific Data","year":2019,"id":1240,"datarank":0.9163355019620426,"base_score":2.833213344056216,"endowment":2.833213344056216,"self_citation_contribution":0.42498200160843247,"citation_network_contribution":0.49135350035361014,"self_endowment_contribution":0.42498200160843247,"citer_contribution":0.49135350035361014,"corpus_percentile":57.36371033360456,"corpus_rank":525,"citation_count":16,"citer_count":14,"citers_with_citation_signal":12,"citers_with_endowment":12,"datacite_reuse_total":0,"is_dataset":true,"is_dataset_confidence":0.8197,"is_oa":true,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":"2019-10-15","fair_score":49.7917,"fair_percentile":77.9023746701847,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[{"id":15914,"name":"Matthew J. Kan","orcid":"0000-0003-4840-2917","position":1,"is_corresponding":false},{"id":15915,"name":"Nadav Rappoport","orcid":"0000-0002-7218-2558","position":2,"is_corresponding":false},{"id":15916,"name":"Dexter Hadley","orcid":"0000-0003-0990-4674","position":3,"is_corresponding":false},{"id":2824,"name":"Marina Sirota","orcid":"0000-0002-7246-6083","position":4,"is_corresponding":false},{"id":11850,"name":"BIN CHEN","orcid":"0000-0001-8858-874X","position":5,"is_corresponding":false},{"id":15917,"name":"Udi Manber","orcid":null,"position":6,"is_corresponding":false},{"id":15918,"name":"Seong Beom Cho","orcid":null,"position":7,"is_corresponding":false},{"id":51,"name":"Atul Janardhan Butte","orcid":"0000-0002-7433-2740","position":8,"is_corresponding":false},{"id":15919,"name":"Harikrishna Paik","orcid":"0000-0002-3994-0695","position":9,"is_corresponding":false},{"id":15913,"name":"Hyojung Paik","orcid":null,"position":0,"is_corresponding":true}],"reference_count":32,"raw_metadata":{"citation_network_status":"fetched"},"created_at":"2026-03-01T18:20:47.508186Z","pmid":"31615985","pmcid":"PMC6794302","fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":"gold","license":"cc-by","views":0,"total_file_size_bytes":0,"version_count":0,"fair_f":52.5,"fair_a":67.5,"fair_i":37.5,"fair_r":41.6667,"fair_zscore":0.4143,"fair_rationale":{"fair_score":49.79,"has_llm":true,"dimensions":{"F":{"name":"Findable","score":52.5,"criteria":[{"key":"f_has_doi","label":"Has a persistent DOI","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"DOI present","rationale":null},{"key":"f_repository_presence","label":"Indexed in repositories / literature DBs","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"datacite=0, pmcid=True, pmid=True","rationale":null},{"key":"f_persistent_ids","label":"Resolvable scholarly identifiers (OpenAlex)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no OpenAlex id","rationale":null},{"key":"f_metadata_richness","label":"Rich, machine-readable metadata","kind":"llm","weight":1.0,"fraction":0.25,"signal":null,"rationale":"The paper does not describe any machine-readable metadata or structured metadata files for the datasets."}]},"A":{"name":"Accessible","score":67.5,"criteria":[{"key":"a_open_access","label":"Open Access / files deposited","kind":"deterministic","weight":1.5,"fraction":1.0,"signal":"Open Access","rationale":null},{"key":"a_retrievable","label":"Free full text retrievable","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"0 OA location(s)","rationale":null},{"key":"a_access_protocol","label":"Clear data/code access protocol","kind":"llm","weight":1.0,"fraction":0.75,"signal":null,"rationale":"The paper clearly states that HCUP data are available via material transfer agreement and UCSF EHRs via inter-institutional agreement, with code on GitHub, but access is not fully open."}]},"I":{"name":"Interoperable","score":37.5,"criteria":[{"key":"i_linked_data","label":"Linked datasets / DataCite relations","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"linked_datasets=0, datacite=0","rationale":null},{"key":"i_standard_ids","label":"References data via standard accessions","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"accessions=0, trials=0","rationale":null},{"key":"i_standards","label":"Standard formats, vocabularies & identifiers","kind":"llm","weight":1.0,"fraction":0.75,"signal":null,"rationale":"The paper uses standard ICD-9-CM codes and common programming languages, but does not mention use of standard identifiers or formal data formats."}]},"R":{"name":"Reusable","score":41.67,"criteria":[{"key":"r_license","label":"Clear, open reuse license","kind":"deterministic","weight":1.5,"fraction":0.0,"signal":"no license","rationale":null},{"key":"r_downloads","label":"Demonstrated reuse (downloads)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"downloads=0","rationale":null},{"key":"r_version","label":"Versioned / maintained","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no version chain","rationale":null},{"key":"r_dataset","label":"Classified as a data resource","kind":"deterministic","weight":0.5,"fraction":1.0,"signal":"is_dataset","rationale":null},{"key":"r_reusability","label":"Data-availability statement, license & reproducibility","kind":"llm","weight":2.0,"fraction":0.667,"signal":null,"rationale":"The paper provides a data availability statement, code on GitHub, and is open access under CC BY 4.0, but data access restrictions limit full reproducibility."}]}},"suggestions":["Provide machine-readable metadata (e.g., DataCite XML) for the datasets to enhance findability.","Make the code and data more openly accessible, e.g., by depositing a synthetic or de-identified subset in a public repository with a DOI.","Document the data schema and variable definitions in a standard format (e.g., CSV with codebook) to improve interoperability.","Add a license to the code repository (e.g., MIT or Apache 2.0) to clarify reuse terms."],"model":"deepseek/deepseek-v4-flash","agent_version":"fair_agent_v2","fulltext_source":"epmc_xml"},"fair_model":"deepseek/deepseek-v4-flash","fair_agent_version":"fair_agent_v2","fair_fulltext_source":"epmc_xml","fair_has_llm":true,"fair_computed_at":"2026-06-18T00:48:38.116862Z","clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}