{"doi":"10.1101/2022.12.01.518724","title":"The complete sequence of a human Y chromosome","abstract":"The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure including long palindromes, tandem repeats, and segmental duplications 1–3. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished 4, 5. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029 base pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, revealing the complete ampliconic structures of TSPY, DAZ, and RBMY gene families; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a prior assembly of the CHM13 genome 4 and mapped available population variation, clinical variants, and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human 
chromosomes.","journal":null,"year":2022,"id":5410,"datarank":0.9307200161918967,"base_score":3.7612001156935624,"endowment":3.7612001156935624,"self_citation_contribution":0.5641800173540344,"citation_network_contribution":0.36653999883786226,"self_endowment_contribution":0.5641800173540344,"citer_contribution":0.36653999883786226,"corpus_percentile":57.44507729861676,"corpus_rank":524,"citation_count":42,"citer_count":25,"citers_with_citation_signal":19,"citers_with_endowment":19,"datacite_reuse_total":0,"is_dataset":true,"is_dataset_confidence":0.9432,"is_oa":true,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":"2022-12-01","fair_score":47.9167,"fair_percentile":44.83289357959543,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[{"id":24611,"name":"Sergey Nurk","orcid":"0000-0003-1301-5749","position":1,"is_corresponding":false},{"id":49014,"name":"Savannah J. Hoyt","orcid":"0000-0001-7804-3236","position":3,"is_corresponding":false},{"id":53912,"name":"Dylan J. Taylor","orcid":"0000-0001-5806-4494","position":4,"is_corresponding":false},{"id":24431,"name":"Nicolas Altemose","orcid":"0000-0002-7231-6026","position":5,"is_corresponding":false},{"id":53913,"name":"Paul W. Hook","orcid":"0000-0002-3912-1999","position":6,"is_corresponding":false},{"id":2118,"name":"Sergey Koren","orcid":"0000-0002-1472-8962","position":7,"is_corresponding":false},{"id":30918,"name":"Allison A. Regier","orcid":"0000-0002-1932-8714","position":8,"is_corresponding":false},{"id":24428,"name":"Ivan A. Alexandrov","orcid":"0000-0003-4342-2003","position":9,"is_corresponding":false},{"id":19730,"name":"Likhitha Surapaneni","orcid":"0000-0002-0575-7673","position":10,"is_corresponding":false},{"id":24434,"name":"Mobin Asri","orcid":"0000-0002-7194-5138","position":11,"is_corresponding":false},{"id":49009,"name":"Andrey V. 
Bzikadze","orcid":"0000-0002-7928-7950","position":12,"is_corresponding":false},{"id":53914,"name":"Nae-Chyun Chen","orcid":"0000-0002-4140-4568","position":13,"is_corresponding":false},{"id":2121,"name":"Chen-Shan Chin","orcid":"0000-0003-4394-2455","position":14,"is_corresponding":false},{"id":16842,"name":"Mark Diekhans","orcid":"0000-0002-0430-0989","position":15,"is_corresponding":false},{"id":20075,"name":"David B. Jaffe","orcid":"0000-0001-8739-568X","position":16,"is_corresponding":false},{"id":21287,"name":"Giulio Formenti","orcid":"0000-0002-7554-5991","position":17,"is_corresponding":false},{"id":49023,"name":"Arkarachai Fungtammasan","orcid":"0000-0003-2398-0358","position":18,"is_corresponding":false},{"id":30915,"name":"Carlos García Girón","orcid":"0000-0002-0935-7271","position":19,"is_corresponding":false},{"id":62052,"name":"Nicholas F. Parrish","orcid":"0000-0002-6971-8016","position":20,"is_corresponding":false},{"id":49012,"name":"Ariel Gershman","orcid":"0000-0001-8899-8781","position":21,"is_corresponding":false},{"id":49051,"name":"Patrick G. S. Grady","orcid":"0000-0003-0180-7810","position":23,"is_corresponding":false},{"id":19201,"name":"Andrea Guarracino","orcid":"0000-0001-9744-131X","position":24,"is_corresponding":false},{"id":24475,"name":"Leanne Haggerty","orcid":"0000-0001-8843-3596","position":25,"is_corresponding":false},{"id":53915,"name":"Reza Halabian","orcid":"0000-0002-0360-4171","position":26,"is_corresponding":false},{"id":7658,"name":"Nancy F. Hansen","orcid":"0000-0002-0950-0699","position":27,"is_corresponding":false},{"id":19563,"name":"Robert S. Harris","orcid":"0000-0001-5464-6892","position":28,"is_corresponding":false},{"id":49025,"name":"Gabrielle A. Hartley","orcid":"0000-0002-5672-2171","position":29,"is_corresponding":false},{"id":19702,"name":"William T. 
Harvey","orcid":"0000-0003-0646-7528","position":30,"is_corresponding":false},{"id":13532,"name":"Eoghan Harrington","orcid":"0000-0002-4850-2486","position":31,"is_corresponding":false},{"id":39977,"name":"Jakob Heinz","orcid":"0000-0002-9218-2643","position":32,"is_corresponding":false},{"id":30880,"name":"Thibaut Hourlier","orcid":"0000-0003-4894-7773","position":33,"is_corresponding":false},{"id":53916,"name":"Robert M. Hubley","orcid":"0000-0001-9261-3821","position":34,"is_corresponding":false},{"id":19717,"name":"Sarah E. Hunt","orcid":"0000-0002-8350-1235","position":35,"is_corresponding":false},{"id":53917,"name":"Stephen Hwang","orcid":"0000-0003-0299-569X","position":36,"is_corresponding":false},{"id":21329,"name":"Erich  D. Jarvis","orcid":"0000-0001-8931-5049","position":37,"is_corresponding":false},{"id":53918,"name":"Rupesh K. Kesharwani","orcid":"0000-0002-4678-5419","position":38,"is_corresponding":false},{"id":30887,"name":"Alexandra P. Lewis","orcid":"0000-0002-6195-4786","position":39,"is_corresponding":false},{"id":19692,"name":"Glennis A. Logsdon","orcid":"0000-0003-2396-0656","position":41,"is_corresponding":false},{"id":24603,"name":"Julian K. Lucas","orcid":"0000-0001-9163-2756","position":42,"is_corresponding":false},{"id":53919,"name":"Wojciech Makalowski","orcid":"0000-0003-2303-9541","position":43,"is_corresponding":false},{"id":19693,"name":"Peter Ebert","orcid":"0000-0001-7441-532X","position":44,"is_corresponding":false},{"id":24505,"name":"Fergal J. Martin","orcid":"0000-0002-1672-050X","position":45,"is_corresponding":false},{"id":24445,"name":"Ann M. Mc Cartney","orcid":"0000-0003-3191-3200","position":46,"is_corresponding":false},{"id":49046,"name":"Rajiv C. McCoy","orcid":"0000-0003-0615-146X","position":47,"is_corresponding":false},{"id":30894,"name":"Jennifer McDaniel","orcid":"0000-0003-1987-0914","position":48,"is_corresponding":false},{"id":53920,"name":"Brandy M. 
McNulty","orcid":null,"position":49,"is_corresponding":false},{"id":53921,"name":"Paul Medvedev","orcid":"0000-0003-3143-594X","position":50,"is_corresponding":false},{"id":49010,"name":"Alla Mikheenko","orcid":"0000-0003-3400-9719","position":51,"is_corresponding":false},{"id":19706,"name":"Katherine M. Munson","orcid":"0000-0001-8413-6498","position":52,"is_corresponding":false},{"id":2099,"name":"Terence D. Murphy","orcid":"0000-0001-9311-9745","position":53,"is_corresponding":false},{"id":30898,"name":"Hugh E. Olsen","orcid":"0000-0002-7293-8853","position":54,"is_corresponding":false},{"id":30899,"name":"Nathan D. Olson","orcid":"0000-0003-2585-3037","position":55,"is_corresponding":false},{"id":53922,"name":"Luis F. Paulin","orcid":"0000-0003-2567-3773","position":56,"is_corresponding":false},{"id":19696,"name":"David Porubsky","orcid":"0000-0001-8414-8966","position":57,"is_corresponding":false},{"id":30861,"name":"Tamara Potapova","orcid":"0000-0003-2761-1795","position":58,"is_corresponding":false},{"id":24536,"name":"Fedor Ryabov","orcid":"0000-0001-8728-9465","position":59,"is_corresponding":false},{"id":4334,"name":"Steven L. Salzberg","orcid":"0000-0002-8859-7432","position":60,"is_corresponding":false},{"id":53923,"name":"Michael E.G. Sauria","orcid":"0000-0001-5556-9446","position":61,"is_corresponding":false},{"id":49032,"name":"Fritz J. Sedlazeck","orcid":"0000-0001-6040-2691","position":62,"is_corresponding":false},{"id":24544,"name":"Kishwar Shafin","orcid":"0000-0001-5252-3434","position":63,"is_corresponding":false},{"id":53924,"name":"Valery A. Shepelev","orcid":null,"position":64,"is_corresponding":false},{"id":49034,"name":"Alaina Shumate","orcid":"0000-0002-4450-1857","position":65,"is_corresponding":false},{"id":49039,"name":"Jessica M. Storer","orcid":"0000-0002-9619-5265","position":66,"is_corresponding":false},{"id":53925,"name":"Angela M. 
Taravella Oill","orcid":"0000-0002-4408-7211","position":68,"is_corresponding":false},{"id":2130,"name":"Françoise Thibaud‐Nissen","orcid":"0000-0003-4957-7807","position":69,"is_corresponding":false},{"id":49049,"name":"Winston Timp","orcid":"0000-0003-2083-6027","position":70,"is_corresponding":false},{"id":53926,"name":"Marta Tomaszkiewicz","orcid":"0000-0003-1523-200X","position":71,"is_corresponding":false},{"id":24561,"name":"Mitchell R. Vollger","orcid":"0000-0002-8651-1615","position":72,"is_corresponding":false},{"id":4363,"name":"Brian P. Walenz","orcid":"0000-0001-8431-1428","position":73,"is_corresponding":false},{"id":53927,"name":"Allison C. Watwood","orcid":"0000-0003-1868-057X","position":74,"is_corresponding":false},{"id":53928,"name":"Matthias H. Weissensteiner","orcid":"0000-0001-9302-798X","position":75,"is_corresponding":false},{"id":24565,"name":"Aaron M. Wenger","orcid":"0000-0003-1183-0432","position":76,"is_corresponding":false},{"id":53929,"name":"Melissa A. Wilson","orcid":"0000-0002-2614-0285","position":77,"is_corresponding":false},{"id":49044,"name":"Samantha Zarate","orcid":"0000-0001-5570-2059","position":78,"is_corresponding":false},{"id":29191,"name":"Yiming Zhu","orcid":"0009-0001-3486-2368","position":79,"is_corresponding":false},{"id":30911,"name":"Aleksey V. Zimin","orcid":"0000-0001-5091-3092","position":80,"is_corresponding":false},{"id":2125,"name":"Evan E. Eichler","orcid":"0000-0002-8246-4014","position":81,"is_corresponding":false},{"id":49048,"name":"Rachel J. O’Neill","orcid":"0000-0002-1525-6821","position":82,"is_corresponding":false},{"id":24539,"name":"Michael C. Schatz","orcid":"0000-0002-4118-4446","position":83,"is_corresponding":false},{"id":19591,"name":"Kateryna D. Makova","orcid":"0000-0002-6212-9526","position":85,"is_corresponding":false},{"id":2122,"name":"Adam  M. Phillippy","orcid":"0000-0003-2983-8934","position":86,"is_corresponding":false},{"id":21289,"name":"Jack  A. 
Medico","orcid":"0000-0003-1855-0855","position":87,"is_corresponding":false},{"id":53930,"name":"В. А. Шепелев","orcid":"0000-0002-9321-7692","position":88,"is_corresponding":false},{"id":21320,"name":"Arang Rhie","orcid":"0000-0002-9809-8127","position":0,"is_corresponding":true}],"reference_count":160,"raw_metadata":null,"created_at":"2026-03-01T18:20:47.508186Z","pmid":null,"pmcid":null,"fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":null,"license":null,"views":0,"total_file_size_bytes":0,"version_count":0,"fair_f":45.0,"fair_a":67.5,"fair_i":37.5,"fair_r":41.6667,"fair_zscore":0.2447,"fair_rationale":{"fair_score":47.92,"has_llm":true,"dimensions":{"F":{"name":"Findable","score":45.0,"criteria":[{"key":"f_has_doi","label":"Has a persistent DOI","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"DOI present","rationale":null},{"key":"f_repository_presence","label":"Indexed in repositories / literature DBs","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"datacite=0, pmcid=False, pmid=False","rationale":null},{"key":"f_persistent_ids","label":"Resolvable scholarly identifiers (OpenAlex)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no OpenAlex id","rationale":null},{"key":"f_metadata_richness","label":"Rich, machine-readable metadata","kind":"llm","weight":1.0,"fraction":0.5,"signal":null,"rationale":"The paper provides rich descriptive metadata and tables about the sequence, but no evidence of machine-readable or structured metadata (e.g., JSON-LD, schema.org) is presented."}]},"A":{"name":"Accessible","score":67.5,"criteria":[{"key":"a_open_access","label":"Open Access / files deposited","kind":"deterministic","weight":1.5,"fraction":1.0,"signal":"Open Access","rationale":null},{"key":"a_retrievable","label":"Free full text retrievable","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"0 OA location(s)","rationale":null},{"key":"a_access_protocol","label":"Clear data/code access 
protocol","kind":"llm","weight":1.0,"fraction":0.75,"signal":null,"rationale":"Clear protocols for assembly, validation, and variant calling are described, and a 'Data Availability' section lists URLs for download, but no explicit persistent identifier (e.g., DOI) for the exact dataset is provided."}]},"I":{"name":"Interoperable","score":37.5,"criteria":[{"key":"i_linked_data","label":"Linked datasets / DataCite relations","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"linked_datasets=0, datacite=0","rationale":null},{"key":"i_standard_ids","label":"References data via standard accessions","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"accessions=0, trials=0","rationale":null},{"key":"i_standards","label":"Standard formats, vocabularies & identifiers","kind":"llm","weight":1.0,"fraction":0.75,"signal":null,"rationale":"Standard formats (FASTA, VCF) and community vocabularies (RefSeq, GENCODE, Dfam) are used, but no evidence of adherence to formal data standards like MIAME or ISA-Tab."}]},"R":{"name":"Reusable","score":41.67,"criteria":[{"key":"r_license","label":"Clear, open reuse license","kind":"deterministic","weight":1.5,"fraction":0.0,"signal":"no license","rationale":null},{"key":"r_downloads","label":"Demonstrated reuse (downloads)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"downloads=0","rationale":null},{"key":"r_version","label":"Versioned / maintained","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no version chain","rationale":null},{"key":"r_dataset","label":"Classified as a data resource","kind":"deterministic","weight":0.5,"fraction":1.0,"signal":"is_dataset","rationale":null},{"key":"r_reusability","label":"Data-availability statement, license & reproducibility","kind":"llm","weight":2.0,"fraction":0.667,"signal":null,"rationale":"A data-availability statement and code repository are provided, but no explicit license is mentioned for the data or code, and reproducibility steps are described 
but not fully automated for reuse."}]}},"suggestions":["Deposit the exact assembly and annotation files at a public repository with a permanent DOI and include that DOI in the paper.","Add a clear license (e.g., CC0 for data, MIT for code) in the data-availability section and code repository.","Include machine-readable metadata (e.g., schema.org markup or an ISA-Tab file) describing the data, methods, and provenance.","Provide a containerized or fully automated workflow (e.g., Docker/Singularity and Nextflow/Snakemake) to exactly reproduce the assembly and analyses.","Use standard identifiers (e.g., ORCID for authors, RRID for cell lines, BioSample accession for HG002) throughout the manuscript."],"model":"deepseek/deepseek-v4-flash","agent_version":"fair_agent_v2","fulltext_source":"unpaywall_pdf"},"fair_model":"deepseek/deepseek-v4-flash","fair_agent_version":"fair_agent_v2","fair_fulltext_source":"unpaywall_pdf","fair_has_llm":true,"fair_computed_at":"2026-06-18T00:43:37.593690Z","clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}