{"doi":"10.1038/nature04338","title":"Genome sequence, comparative analysis and haplotype structure of the domestic dog","abstract":"Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.","journal":"Nature","year":2005,"id":12413,"datarank":11.589161914757835,"base_score":7.857093864902493,"endowment":7.857093864902493,"self_citation_contribution":1.178564079735374,"citation_network_contribution":10.410597835022461,"self_endowment_contribution":1.178564079735374,"citer_contribution":10.410597835022461,"corpus_percentile":81.85516680227828,"corpus_rank":224,"citation_count":2623,"citer_count":135,"citers_with_citation_signal":135,"citers_with_endowment":135,"datacite_reuse_total":0,"is_dataset":true,"is_dataset_confidence":0.8933,"is_oa":true,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":"2005-12-01","fair_score":28.125,"fair_percentile":9.916446789797714,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[{"id":96610,"name":"Claire M. 
Wade","orcid":"0000-0003-3413-4771","position":2,"is_corresponding":false},{"id":7053,"name":"Tarjei S. Mikkelsen","orcid":"0000-0002-8133-3135","position":3,"is_corresponding":false},{"id":96554,"name":"Elinor K. Karlsson","orcid":"0000-0002-4343-3776","position":4,"is_corresponding":false},{"id":20075,"name":"David B. Jaffe","orcid":"0000-0001-8739-568X","position":5,"is_corresponding":false},{"id":96553,"name":"Michael Kamal","orcid":null,"position":6,"is_corresponding":false},{"id":20072,"name":"Michele Clamp","orcid":null,"position":7,"is_corresponding":false},{"id":96558,"name":"Edward J. Kulbokas","orcid":null,"position":9,"is_corresponding":false},{"id":6273,"name":"Michael C. Zody","orcid":"0000-0001-6594-7199","position":10,"is_corresponding":false},{"id":25077,"name":"Evan Mauceli","orcid":null,"position":11,"is_corresponding":false},{"id":7153,"name":"Xiaohui Xie","orcid":"0000-0002-5479-6345","position":12,"is_corresponding":false},{"id":98039,"name":"Matthew Breen","orcid":"0000-0002-8901-4155","position":13,"is_corresponding":false},{"id":98040,"name":"Robert K. Wayne","orcid":"0000-0003-3537-2245","position":14,"is_corresponding":false},{"id":98041,"name":"Elaine A. Ostrander","orcid":"0000-0001-6075-9738","position":15,"is_corresponding":false},{"id":49036,"name":"Arian F. A. Smit","orcid":"0000-0003-2088-3165","position":17,"is_corresponding":false},{"id":98043,"name":"Douglas R. Smith","orcid":null,"position":18,"is_corresponding":false},{"id":98044,"name":"Pieter J. 
deJong","orcid":null,"position":19,"is_corresponding":false},{"id":98045,"name":"Ewen Kirkness","orcid":null,"position":20,"is_corresponding":false},{"id":98046,"name":"Pablo Alvarez","orcid":null,"position":21,"is_corresponding":false},{"id":98047,"name":"Tara Biagi","orcid":null,"position":22,"is_corresponding":false},{"id":98048,"name":"William Brockman","orcid":null,"position":23,"is_corresponding":false},{"id":98049,"name":"Jonathan Butler","orcid":null,"position":24,"is_corresponding":false},{"id":98050,"name":"Chee-Wye Chin","orcid":null,"position":25,"is_corresponding":false},{"id":19041,"name":"Kathryn Beal","orcid":"0000-0001-5271-8733","position":26,"is_corresponding":false},{"id":32862,"name":"Vyacheslav Amstislavskiy","orcid":"0000-0002-1384-7599","position":27,"is_corresponding":false},{"id":24722,"name":"Coleen Damcott","orcid":"0000-0001-6233-7395","position":28,"is_corresponding":false},{"id":85505,"name":"David DeCaprio","orcid":"0000-0001-8931-9461","position":29,"is_corresponding":false},{"id":20074,"name":"Sante Gnerre","orcid":null,"position":30,"is_corresponding":false},{"id":50709,"name":"Manfred Grabherr","orcid":"0000-0001-8792-6508","position":31,"is_corresponding":false},{"id":14693,"name":"Sharon L. R. Kardia","orcid":"0000-0002-9853-3379","position":32,"is_corresponding":false},{"id":98052,"name":"Michael Kleber","orcid":null,"position":33,"is_corresponding":false},{"id":98053,"name":"Carolyne Bardeleben","orcid":null,"position":34,"is_corresponding":false},{"id":96541,"name":"Leo Goodstadt","orcid":null,"position":35,"is_corresponding":false},{"id":94853,"name":"Andreas Heger","orcid":"0000-0001-7720-0447","position":36,"is_corresponding":false},{"id":98054,"name":"Christophe Hitte","orcid":"0000-0003-1714-437X","position":37,"is_corresponding":false},{"id":98056,"name":"Heidi G. Parker","orcid":"0000-0002-9707-6380","position":40,"is_corresponding":false},{"id":98057,"name":"John P. 
Pollinger","orcid":"0000-0001-7278-2660","position":41,"is_corresponding":false},{"id":37973,"name":"Stephen M. J. Searle","orcid":null,"position":42,"is_corresponding":false},{"id":98058,"name":"Nathan B. Sutter","orcid":"0000-0003-3541-4986","position":43,"is_corresponding":false},{"id":98059,"name":"Rachael Thomas","orcid":"0000-0002-3029-8798","position":44,"is_corresponding":false},{"id":19978,"name":"Nick Goldman","orcid":"0000-0001-8486-2211","position":45,"is_corresponding":false},{"id":18887,"name":"Matthew E. Hurles","orcid":"0000-0002-2333-7015","position":46,"is_corresponding":false},{"id":18085,"name":"Kerstin Lindblad‐Toh","orcid":"0000-0001-8338-0253","position":47,"is_corresponding":false},{"id":89180,"name":"Michèle Clamp","orcid":null,"position":48,"is_corresponding":false},{"id":96600,"name":"Andrew R. Smith","orcid":"0000-0001-8580-278X","position":49,"is_corresponding":false},{"id":98060,"name":"Pieter DeJong","orcid":null,"position":50,"is_corresponding":false},{"id":98061,"name":"Ewen F. Kirkness","orcid":null,"position":51,"is_corresponding":false},{"id":98062,"name":"Pablo Álvarez","orcid":"0000-0002-0079-3392","position":52,"is_corresponding":false},{"id":98063,"name":"William W. Brockman","orcid":null,"position":53,"is_corresponding":false},{"id":96523,"name":"Jonathan A. 
Butler","orcid":"0000-0002-1323-8611","position":54,"is_corresponding":false},{"id":30839,"name":"Klaus‐Peter Koepfli","orcid":"0000-0001-7281-0676","position":55,"is_corresponding":false},{"id":41916,"name":"Broad Institute Sequencing Platform and Whole Genome Assembly Team","orcid":null,"position":0,"is_corresponding":true}],"reference_count":119,"raw_metadata":{"citation_network_status":"fetched"},"created_at":"2026-03-01T18:20:47.508186Z","pmid":null,"pmcid":null,"fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":null,"license":null,"views":0,"total_file_size_bytes":0,"version_count":0,"fair_f":20.0,"fair_a":42.5,"fair_i":25.0,"fair_r":25.0,"fair_zscore":-1.5456,"fair_rationale":{"fair_score":28.12,"has_llm":true,"dimensions":{"F":{"name":"Findable","score":20.0,"criteria":[{"key":"f_has_doi","label":"Has a persistent DOI","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"DOI present","rationale":null},{"key":"f_repository_presence","label":"Indexed in repositories / literature DBs","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"datacite=0, pmcid=False, pmid=False","rationale":null},{"key":"f_persistent_ids","label":"Resolvable scholarly identifiers (OpenAlex)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no OpenAlex id","rationale":null},{"key":"f_metadata_richness","label":"Rich, machine-readable metadata","kind":"llm","weight":1.0,"fraction":0.0,"signal":null,"rationale":"The paper text does not provide any machine-readable metadata (e.g., structured keywords, ontologies, or schema.org markup) for the genome sequence, SNP map, or haplotype data."}]},"A":{"name":"Accessible","score":42.5,"criteria":[{"key":"a_open_access","label":"Open Access / files deposited","kind":"deterministic","weight":1.5,"fraction":1.0,"signal":"Open Access","rationale":null},{"key":"a_retrievable","label":"Free full text retrievable","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"0 OA 
location(s)","rationale":null},{"key":"a_access_protocol","label":"Clear data/code access protocol","kind":"llm","weight":1.0,"fraction":0.25,"signal":null,"rationale":"Data access is vaguely implied via 'public release' and URLs like http://www.broad.mit.edu/tools/data.html, but no explicit protocol (e.g., license, authentication steps, or repository name) is given for the final data products."}]},"I":{"name":"Interoperable","score":25.0,"criteria":[{"key":"i_linked_data","label":"Linked datasets / DataCite relations","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"linked_datasets=0, datacite=0","rationale":null},{"key":"i_standard_ids","label":"References data via standard accessions","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"accessions=0, trials=0","rationale":null},{"key":"i_standards","label":"Standard formats, vocabularies & identifiers","kind":"llm","weight":1.0,"fraction":0.5,"signal":null,"rationale":"Standard formats such as FASTA and dbSNP are mentioned, and UCSC genome browser identifiers (e.g., hg17, mm5) are used, but no formal vocabulary or data dictionary is described for the SNP annotations or gene models."}]},"R":{"name":"Reusable","score":25.0,"criteria":[{"key":"r_license","label":"Clear, open reuse license","kind":"deterministic","weight":1.5,"fraction":0.0,"signal":"no license","rationale":null},{"key":"r_downloads","label":"Demonstrated reuse (downloads)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"downloads=0","rationale":null},{"key":"r_version","label":"Versioned / maintained","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no version chain","rationale":null},{"key":"r_dataset","label":"Classified as a data resource","kind":"deterministic","weight":0.5,"fraction":1.0,"signal":"is_dataset","rationale":null},{"key":"r_reusability","label":"Data-availability statement, license & reproducibility","kind":"llm","weight":2.0,"fraction":0.333,"signal":null,"rationale":"No explicit 
data-availability statement, license, or reproducibility instructions are provided; data are said to be 'deposited' or available at a web address, but terms of reuse, provenance documentation, and versioning are not stated."}]}},"suggestions":["Deposit the genome assembly and SNP data in a certified repository (e.g., NCBI GenBank/dbSNP) with explicit persistent identifiers and a data-access license.","Provide a formal data-availability statement in the main text or a dedicated section, including a clear reuse license (e.g., Creative Commons or equivalent).","Describe the machine-readable metadata format (e.g., XML-based genome annotations, structured JSON for variants) and any controlled vocabularies used (e.g., SO terms, GO terms).","Specify the exact version of the assembly (CanFam1.0 vs 2.0) and the exact software/parameter settings (e.g., ARACHNE version) required to reproduce the analysis, ideally as a computational workflow.","Add a reproducibility section that lists all input data, scripts, and command-line parameters, or link to a code repository with an open-source license."],"model":"deepseek/deepseek-v4-flash","agent_version":"fair_agent_v2","fulltext_source":"unpaywall_pdf"},"fair_model":"deepseek/deepseek-v4-flash","fair_agent_version":"fair_agent_v2","fair_fulltext_source":"unpaywall_pdf","fair_has_llm":true,"fair_computed_at":"2026-06-18T00:29:59.430408Z","clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}