{"doi":"10.1038/s41586-025-09140-6","title":"Complex genetic variation in nearly complete human genomes","abstract":"Diverse sets of complete human genomes are required to construct a pangenome reference and to understand the extent of complex structural variation. Here we sequence 65 diverse human genomes and build 130 haplotype-resolved assemblies (median continuity of 130 Mb), closing 92% of all previous assembly gaps<sup>1,2</sup> and reaching telomere-to-telomere status for 39% of the chromosomes. We highlight complete sequence continuity of complex loci, including the major histocompatibility complex (MHC), SMN1/SMN2, NBPF8 and AMY1/AMY2, and fully resolve 1,852 complex structural variants. In addition, we completely assemble and validate 1,246 human centromeres. We find up to 30-fold variation in α-satellite higher-order repeat array length and characterize the pattern of mobile element insertions into α-satellite higher-order repeat arrays. Although most centromeres predict a single site of kinetochore attachment, epigenetic analysis suggests the presence of two hypomethylated regions for 7% of centromeres. Combining our data with the draft pangenome reference<sup>1</sup> significantly enhances genotyping accuracy from short-read data, enabling whole-genome inference<sup>3</sup> to a median quality value of 45. Using this approach, 26,115 structural variants per individual are detected, substantially increasing the number of structural variants now amenable to downstream disease association studies.","journal":"Nature","year":2025,"id":6477,"datarank":0.8622397659511468,"base_score":3.6888794541139363,"endowment":3.6888794541139363,"self_citation_contribution":0.5533319181170905,"citation_network_contribution":0.30890784783405634,"self_endowment_contribution":0.5533319181170905,"citer_contribution":0.30890784783405634,"corpus_percentile":56.712774613506916,"corpus_rank":533,"citation_count":67,"citer_count":50,"citers_with_citation_signal":15,"citers_with_endowment":15,"datacite_reuse_total":0,"is_dataset":true,"is_dataset_confidence":0.9358,"is_oa":true,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":"2025-07-23","fair_score":61.25,"fair_percentile":92.70008795074759,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[{"id":19693,"name":"Peter Ebert","orcid":"0000-0001-7441-532X","position":1,"is_corresponding":false},{"id":19694,"name":"Peter A. Audano","orcid":"0000-0002-5187-0415","position":2,"is_corresponding":false},{"id":19695,"name":"Mark Loftus","orcid":"0000-0002-6279-6855","position":3,"is_corresponding":false},{"id":19696,"name":"David Porubsky","orcid":"0000-0001-8414-8966","position":4,"is_corresponding":false},{"id":19697,"name":"Feyza Yilmaz","orcid":"0000-0001-8795-5800","position":6,"is_corresponding":false},{"id":19698,"name":"Pille Hallast","orcid":"0000-0002-0588-3987","position":7,"is_corresponding":false},{"id":19699,"name":"Timofey Prodanov","orcid":"0000-0001-7469-6651","position":8,"is_corresponding":false},{"id":19700,"name":"DongAhn Yoo","orcid":"0000-0003-0033-3721","position":9,"is_corresponding":false},{"id":19701,"name":"Carolyn A. Paisie","orcid":"0000-0003-4306-4154","position":10,"is_corresponding":false},{"id":19702,"name":"William T. Harvey","orcid":"0000-0003-0646-7528","position":11,"is_corresponding":false},{"id":19703,"name":"Xuefang Zhao","orcid":"0000-0003-4036-9577","position":12,"is_corresponding":false},{"id":19704,"name":"Gianni V. Martino","orcid":"0009-0005-4143-7465","position":13,"is_corresponding":false},{"id":19705,"name":"Mir Henglin","orcid":"0000-0003-3604-4868","position":14,"is_corresponding":false},{"id":19706,"name":"Katherine M. Munson","orcid":"0000-0001-8413-6498","position":15,"is_corresponding":false},{"id":2121,"name":"Chen-Shan Chin","orcid":"0000-0003-4394-2455","position":17,"is_corresponding":false},{"id":19708,"name":"Bida Gu","orcid":"0000-0001-8575-997X","position":18,"is_corresponding":false},{"id":19709,"name":"Hufsah Ashraf","orcid":"0000-0001-7760-0627","position":19,"is_corresponding":false},{"id":19726,"name":"Stephan Scholz","orcid":"0009-0000-0268-1979","position":20,"is_corresponding":false},{"id":19710,"name":"Olanrewaju Austine-Orimoloye","orcid":"0000-0002-4390-1437","position":21,"is_corresponding":false},{"id":19711,"name":"Parithi Balachandran","orcid":"0000-0003-3256-1403","position":22,"is_corresponding":false},{"id":19712,"name":"Marc Jan Bonder","orcid":"0000-0002-8431-3180","position":23,"is_corresponding":false},{"id":1288,"name":"Haoyu Cheng","orcid":"0000-0002-9209-5793","position":24,"is_corresponding":false},{"id":13351,"name":"Zechen Chong","orcid":"0000-0001-5750-1808","position":25,"is_corresponding":false},{"id":19713,"name":"Jonathan Crabtree","orcid":"0000-0002-7286-5690","position":26,"is_corresponding":false},{"id":42990,"name":"Grigorios Georgolopoulos","orcid":"0000-0002-9906-4797","position":27,"is_corresponding":false},{"id":19714,"name":"Lisbeth A. Guethlein","orcid":"0000-0002-1301-8301","position":28,"is_corresponding":false},{"id":19715,"name":"Patrick Hasenfeld","orcid":"0000-0003-2319-2482","position":29,"is_corresponding":false},{"id":6314,"name":"Hickey, Glenn","orcid":"0000-0002-2280-9404","position":30,"is_corresponding":false},{"id":19716,"name":"Kendra Hoekzema","orcid":"0000-0002-8058-0177","position":31,"is_corresponding":false},{"id":19717,"name":"Sarah E. Hunt","orcid":"0000-0002-8350-1235","position":32,"is_corresponding":false},{"id":19718,"name":"Matthew Jensen","orcid":"0000-0002-5153-8543","position":33,"is_corresponding":false},{"id":19719,"name":"Yunzhe Jiang","orcid":"0000-0001-8768-0050","position":34,"is_corresponding":false},{"id":2118,"name":"Sergey Koren","orcid":"0000-0002-1472-8962","position":35,"is_corresponding":false},{"id":19743,"name":"Young-Jun Kwon","orcid":"0000-0002-5024-2134","position":36,"is_corresponding":false},{"id":19721,"name":"Chong Li","orcid":"0000-0003-1949-4074","position":37,"is_corresponding":false},{"id":30887,"name":"Alexandra P. Lewis","orcid":"0000-0002-6195-4786","position":38,"is_corresponding":false},{"id":18070,"name":"Jiaqi Li","orcid":"0000-0003-1587-5910","position":39,"is_corresponding":false},{"id":19722,"name":"Paul J. Norman","orcid":"0000-0001-8370-7703","position":40,"is_corresponding":false},{"id":19723,"name":"Keisuke K. Oshima","orcid":"0009-0002-2229-8998","position":41,"is_corresponding":false},{"id":30899,"name":"Nathan D. Olson","orcid":"0000-0003-2585-3037","position":42,"is_corresponding":false},{"id":2122,"name":"Adam  M. Phillippy","orcid":"0000-0003-2983-8934","position":43,"is_corresponding":false},{"id":19724,"name":"Nicholas R. Pollock","orcid":"0000-0003-0114-528X","position":44,"is_corresponding":false},{"id":13956,"name":"Tobias Rausch","orcid":"0000-0001-5773-5620","position":45,"is_corresponding":false},{"id":30918,"name":"Allison A. Regier","orcid":"0000-0002-1932-8714","position":46,"is_corresponding":false},{"id":19727,"name":"Yuwei Song","orcid":"0000-0003-2537-4343","position":47,"is_corresponding":false},{"id":19728,"name":"Arda Soylev","orcid":"0000-0003-2198-1920","position":48,"is_corresponding":false},{"id":19729,"name":"Arvis Sulovari","orcid":"0000-0003-4354-9020","position":49,"is_corresponding":false},{"id":19730,"name":"Likhitha Surapaneni","orcid":"0000-0002-0575-7673","position":50,"is_corresponding":false},{"id":19731,"name":"Vasiliki Tsapalou","orcid":"0009-0002-3588-7003","position":51,"is_corresponding":false},{"id":19732,"name":"Weichen Zhou","orcid":"0000-0003-4755-1072","position":52,"is_corresponding":false},{"id":14882,"name":"Ying Zhou","orcid":"0000-0002-8107-3927","position":53,"is_corresponding":false},{"id":19733,"name":"Qihui Zhu","orcid":"0000-0003-2401-8443","position":54,"is_corresponding":false},{"id":6273,"name":"Michael C. Zody","orcid":"0000-0001-6594-7199","position":55,"is_corresponding":false},{"id":19734,"name":"Ryan E. Mills","orcid":"0000-0003-3425-6998","position":56,"is_corresponding":false},{"id":19735,"name":"Scott E. Devine","orcid":"0000-0001-7629-8331","position":57,"is_corresponding":false},{"id":19736,"name":"Xinghua Shi","orcid":"0000-0003-4662-3177","position":58,"is_corresponding":false},{"id":1025,"name":"Michael E. Talkowski","orcid":"0000-0003-2889-0992","position":59,"is_corresponding":false},{"id":19738,"name":"Mark J. P. Chaisson","orcid":"0000-0001-5395-1457","position":60,"is_corresponding":false},{"id":19739,"name":"Alexander T Dilthey","orcid":"0000-0002-6394-4581","position":61,"is_corresponding":false},{"id":19740,"name":"Miriam K. Konkel","orcid":"0000-0002-3190-1667","position":62,"is_corresponding":false},{"id":65595,"name":"Natalia Koralewska","orcid":"0000-0001-7096-0128","position":63,"is_corresponding":false},{"id":19741,"name":"Charles Lee","orcid":"0000-0001-7317-6662","position":64,"is_corresponding":false},{"id":19742,"name":"Christine R. Beck","orcid":"0000-0001-7821-8489","position":65,"is_corresponding":false},{"id":2125,"name":"Evan E. Eichler","orcid":"0000-0002-8246-4014","position":66,"is_corresponding":false},{"id":6321,"name":"Tobias Marschall","orcid":"0000-0002-9376-1030","position":67,"is_corresponding":false},{"id":19707,"name":"K Siddique-e Rabbani","orcid":"0009-0004-1448-2167","position":69,"is_corresponding":false},{"id":19692,"name":"Glennis A. Logsdon","orcid":"0000-0003-2396-0656","position":0,"is_corresponding":true}],"reference_count":136,"raw_metadata":{"citation_network_status":"fetched"},"created_at":"2026-03-01T18:20:47.508186Z","pmid":"40702183","pmcid":"PMC12350169","fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":"hybrid","license":"cc-by","views":0,"total_file_size_bytes":0,"version_count":0,"fair_f":77.5,"fair_a":80.0,"fair_i":37.5,"fair_r":50.0,"fair_zscore":1.4508,"fair_rationale":{"fair_score":61.25,"has_llm":true,"dimensions":{"F":{"name":"Findable","score":77.5,"criteria":[{"key":"f_has_doi","label":"Has a persistent DOI","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"DOI present","rationale":null},{"key":"f_repository_presence","label":"Indexed in repositories / literature DBs","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"datacite=0, pmcid=True, pmid=True","rationale":null},{"key":"f_persistent_ids","label":"Resolvable scholarly identifiers (OpenAlex)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no OpenAlex id","rationale":null},{"key":"f_metadata_richness","label":"Rich, machine-readable metadata","kind":"llm","weight":1.0,"fraction":0.75,"signal":null,"rationale":"The paper provides extensive structured metadata (e.g., accession numbers, sample IDs, population groups, assembly statistics) as part of the text, but the full machine-readability and exposure in a searchable registry beyond the journal and data repositories are not detailed in the text itself."}]},"A":{"name":"Accessible","score":80.0,"criteria":[{"key":"a_open_access","label":"Open Access / files deposited","kind":"deterministic","weight":1.5,"fraction":1.0,"signal":"Open Access","rationale":null},{"key":"a_retrievable","label":"Free full text retrievable","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"0 OA location(s)","rationale":null},{"key":"a_access_protocol","label":"Clear data/code access protocol","kind":"llm","weight":1.0,"fraction":1.0,"signal":null,"rationale":"The paper clearly specifies multiple data access protocols, including primary sequence data accessions (PRJEB58376, etc.), a dedicated FTP and Globus endpoint for released resources, and a public GitHub repository for code, all of which are described in the Data availability and Code availability sections."}]},"I":{"name":"Interoperable","score":37.5,"criteria":[{"key":"i_linked_data","label":"Linked datasets / DataCite relations","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"linked_datasets=0, datacite=0","rationale":null},{"key":"i_standard_ids","label":"References data via standard accessions","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"accessions=0, trials=0","rationale":null},{"key":"i_standards","label":"Standard formats, vocabularies & identifiers","kind":"llm","weight":1.0,"fraction":0.75,"signal":null,"rationale":"The paper uses standard genomic formats (FASTA, BED, VCF), common reference genomes (T2T-CHM13, GRCh38), and community ontologies (e.g., GENCODE, IPD-IMGT/HLA), but explicit reference to a formal standard for data interchange or vocabulary beyond genomics is not provided."}]},"R":{"name":"Reusable","score":50.0,"criteria":[{"key":"r_license","label":"Clear, open reuse license","kind":"deterministic","weight":1.5,"fraction":0.0,"signal":"no license","rationale":null},{"key":"r_downloads","label":"Demonstrated reuse (downloads)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"downloads=0","rationale":null},{"key":"r_version","label":"Versioned / maintained","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no version chain","rationale":null},{"key":"r_dataset","label":"Classified as a data resource","kind":"deterministic","weight":0.5,"fraction":1.0,"signal":"is_dataset","rationale":null},{"key":"r_reusability","label":"Data-availability statement, license & reproducibility","kind":"llm","weight":2.0,"fraction":0.833,"signal":null,"rationale":"The paper explicitly licenses content under CC BY 4.0, provides comprehensive data availability statements with accession numbers, and includes code and workflows in a public repository, ensuring reproducibility, though the license covers the article and data per the open-access statement, but not all third-party components are confirmed under the same license."}]}},"suggestions":["Add a formal machine-readable metadata record (e.g., DataCite or schema.org JSON-LD) to the paper's landing page to enhance Findability.","Include a persistent identifier (e.g., DOI) for the exact version of the analysis code and workflows, beyond the general GitHub repository.","Specify which formal standards (e.g., GA4GH, VCF v4.3, BED v1.0) are followed explicitly for variant and annotation data files.","For Accessible, ensure that the FTP and Globus endpoints are accompanied by a stable, versioned DOI for the entire data release.","Provide a comprehensive reusable data dictionary that defines all column headers in supplementary tables, improving Interoperability and Reusability."],"model":"deepseek/deepseek-v4-flash","agent_version":"fair_agent_v2","fulltext_source":"epmc_xml"},"fair_model":"deepseek/deepseek-v4-flash","fair_agent_version":"fair_agent_v2","fair_fulltext_source":"epmc_xml","fair_has_llm":true,"fair_computed_at":"2026-06-18T00:42:07.794651Z","clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}