{"doi":"10.1002/gepi.21647","title":"Genotype Imputation for <scp>A</scp>frican <scp>A</scp>mericans Using Data From <scp>H</scp>ap<scp>M</scp>ap Phase <scp>II</scp> Versus 1000 <scp>G</scp>enomes <scp>P</scp>rojects","abstract":"<jats:p>Genotype imputation provides imputation of untyped single nucleotide polymorphisms (<jats:styled-content style=\"fixed-case\">SNP</jats:styled-content>s) that are present on a reference panel such as those from the <jats:styled-content style=\"fixed-case\">H</jats:styled-content>ap<jats:styled-content style=\"fixed-case\">M</jats:styled-content>ap Project. It is popular for increasing statistical power and comparing results across studies using different platforms. Imputation for <jats:styled-content style=\"fixed-case\">A</jats:styled-content>frican <jats:styled-content style=\"fixed-case\">A</jats:styled-content>merican populations is challenging because their linkage disequilibrium blocks are shorter and also because no ideal reference panel is available due to admixture. In this paper, we evaluated three imputation strategies for <jats:styled-content style=\"fixed-case\">A</jats:styled-content>frican <jats:styled-content style=\"fixed-case\">A</jats:styled-content>mericans. The intersection strategy used a combined panel consisting of <jats:styled-content style=\"fixed-case\">SNP</jats:styled-content>s polymorphic in both <jats:styled-content style=\"fixed-case\">CEU</jats:styled-content> and <jats:styled-content style=\"fixed-case\">YRI</jats:styled-content>. The union strategy used a panel consisting of <jats:styled-content style=\"fixed-case\">SNP</jats:styled-content>s polymorphic in either <jats:styled-content style=\"fixed-case\">CEU</jats:styled-content> or <jats:styled-content style=\"fixed-case\">YRI</jats:styled-content>. The merge strategy merged results from two separate imputations, one using <jats:styled-content style=\"fixed-case\">CEU</jats:styled-content> and the other using YRI. Because recent investigators are increasingly using the data from the 1000 Genomes (1<jats:styled-content style=\"fixed-case\">KG</jats:styled-content>) <jats:styled-content style=\"fixed-case\">Project</jats:styled-content> for genotype imputation, we evaluated both 1<jats:styled-content style=\"fixed-case\">KG</jats:styled-content>‐based imputations and <jats:styled-content style=\"fixed-case\">H</jats:styled-content>ap<jats:styled-content style=\"fixed-case\">M</jats:styled-content>ap‐based imputations. We used 23,707 <jats:styled-content style=\"fixed-case\">SNP</jats:styled-content>s from chromosomes 21 and 22 on <jats:styled-content style=\"fixed-case\">A</jats:styled-content>ffymetrix <jats:styled-content style=\"fixed-case\">SNP</jats:styled-content> Array 6.0 genotyped for 1,075 <jats:styled-content style=\"fixed-case\">H</jats:styled-content>yper<jats:styled-content style=\"fixed-case\">GEN A</jats:styled-content>frican <jats:styled-content style=\"fixed-case\">A</jats:styled-content>mericans. We found that <jats:styled-content style=\"fixed-case\">1</jats:styled-content>KG‐based imputations provided a substantially larger number of variants than <jats:styled-content style=\"fixed-case\">H</jats:styled-content>ap<jats:styled-content style=\"fixed-case\">M</jats:styled-content>ap‐based imputations, about three times as many common variants and eight times as many rare and low‐frequency variants. This higher yield is expected because the 1<jats:styled-content style=\"fixed-case\">KG</jats:styled-content> panel includes more <jats:styled-content style=\"fixed-case\">SNP</jats:styled-content>s. Accuracy rates using 1<jats:styled-content style=\"fixed-case\">KG</jats:styled-content> data were slightly lower than those using <jats:styled-content style=\"fixed-case\">H</jats:styled-content>ap<jats:styled-content style=\"fixed-case\">M</jats:styled-content>ap data before filtering, but slightly higher after filtering. The union strategy provided the highest imputation yield with next highest accuracy. The intersection strategy provided the lowest imputation yield but the highest accuracy. The merge strategy provided the lowest imputation accuracy. We observed that <jats:styled-content style=\"fixed-case\">SNP</jats:styled-content>s polymorphic only in <jats:styled-content style=\"fixed-case\">CEU</jats:styled-content> had much lower accuracy, reducing the accuracy of the union strategy. Our findings suggest that 1<jats:styled-content style=\"fixed-case\">KG</jats:styled-content>‐based imputations can facilitate discovery of significant associations for <jats:styled-content style=\"fixed-case\">SNP</jats:styled-content>s across the whole <jats:styled-content style=\"fixed-case\">MAF</jats:styled-content> spectrum. Because the 1<jats:styled-content style=\"fixed-case\">KG</jats:styled-content> Project is still under way, we expect that later versions will provide better imputation performance. Genet. Epidemiol. 36:508‐516, 2012. © 2012 Wiley Periodicals, Inc.</jats:p>","journal":"Genetic Epidemiology","year":2012,"id":40433,"datarank":0.9324565948414008,"base_score":2.70805020110221,"endowment":2.70805020110221,"self_citation_contribution":0.40620753016533157,"citation_network_contribution":0.5262490646760692,"self_endowment_contribution":0.40620753016533157,"citer_contribution":0.5262490646760692,"corpus_percentile":null,"corpus_rank":null,"citation_count":14,"citer_count":14,"citers_with_citation_signal":13,"citers_with_endowment":13,"datacite_reuse_total":2,"is_dataset":false,"is_dataset_confidence":null,"is_oa":false,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":null,"fair_score":null,"fair_percentile":null,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[{"id":24752,"name":"C. Charles Gu","orcid":"0000-0002-8527-8145","position":1,"is_corresponding":false},{"id":25027,"name":"Hemant K. Tiwari","orcid":null,"position":2,"is_corresponding":false},{"id":24678,"name":"Donna K. Arnett","orcid":"0000-0003-2219-657X","position":3,"is_corresponding":false},{"id":24701,"name":"Ulrich Broeckel","orcid":null,"position":4,"is_corresponding":false},{"id":196462,"name":"Dabeeru C. Rao","orcid":null,"position":5,"is_corresponding":false},{"id":196461,"name":"Yun J. Sung","orcid":null,"position":0,"is_corresponding":false}],"reference_count":0,"raw_metadata":{"has_enrichment":true,"base_score":2.70805020110221,"endowment":2.70805020110221,"datacite_reuse_total":2,"file_count":0,"downloads":0,"views":0,"has_version_chain":false,"is_dataset":false,"is_oa":false,"pmid":"22644746","pmcid":"PMC3703942","openalex_id":"https://openalex.org/W2126214064","authors":[],"funders":[{"funder_name":"NHLBI NIH HHS","grant_id":"HL54473","title":null},{"funder_name":"NHLBI NIH HHS","grant_id":"R01 HL055673","title":null},{"funder_name":"NHLBI NIH HHS","grant_id":"U01 HL054473","title":null},{"funder_name":"NIGMS NIH HHS","grant_id":"R01 GM028719","title":null},{"funder_name":"NHLBI NIH HHS","grant_id":"HL55673","title":null},{"funder_name":"NHLBI NIH HHS","grant_id":"HL72507","title":null},{"funder_name":"NHLBI NIH HHS","grant_id":"U10 HL054473","title":null},{"funder_name":"NIGMS NIH HHS","grant_id":"GM28719","title":null},{"funder_name":"NHLBI NIH HHS","grant_id":"U01 HL072507","title":null}],"total_grants":9,"fwci":1.8396,"citation_percentile":0.86267138,"influential_citations":1,"citation_trend":[{"year":2012,"count":3},{"year":2013,"count":2},{"year":2014,"count":3},{"year":2015,"count":1},{"year":2016,"count":1},{"year":2018,"count":3},{"year":2022,"count":1}],"oa_status":"closed","license":"http://onlinelibrary.wiley.com/termsAndConditions#vor","oa_locations":[{"url":"https://europepmc.org/articles/pmc3703942?pdf=render","host_type":"GREEN"},{"url":"https://api.wiley.com/onlinelibrary/tdm/v1/articles/10.1002%2Fgepi.21647","host_type":"publisher"},{"url":"https://onlinelibrary.wiley.com/doi/pdf/10.1002/gepi.21647","host_type":"publisher"},{"url":"https://doi.org/10.1002/gepi.21647","host_type":"journal"},{"url":"https://pubmed.ncbi.nlm.nih.gov/22644746","host_type":"repository"},{"url":"https://www.ncbi.nlm.nih.gov/pmc/articles/3703942","host_type":"repository"}],"fields_of_study":["Genetic Associations and Epidemiology","Genetic and phenotypic traits in livestock","Medicine","Biology","Black or African American","Algorithms","Chromosome Mapping","Genetic Linkage","Genome","Genome, Human","Genotype","Humans","Linkage Disequilibrium","Models, Genetic","Oligonucleotide Array Sequence Analysis","Polymorphism, Genetic","Polymorphism, Single Nucleotide","Reproducibility of Results","Software"],"mesh_terms":["Algorithms","Black or African American","Chromosome Mapping","Genotype","Humans","Genetic Linkage","Models, Genetic","Polymorphism, Genetic","Software","Reproducibility of Results","Linkage Disequilibrium","Genome, Human","Genome","Oligonucleotide Array Sequence Analysis","Polymorphism, Single Nucleotide"],"keywords":["International HapMap Project","Imputation (statistics)","Single-nucleotide polymorphism","Linkage disequilibrium","1000 Genomes Project","Genome-wide association study","Genetics","Genotype","Biology","Computational biology","Statistics","Missing data","Mathematics","Gene"],"sdg_mappings":[{"sdg_number":0,"sdg_label":"Partnerships for the goals"}],"linked_datasets":[{"doi":"10.6084/m9.figshare.21331194.v1","title":"Additional file 1 of A joint use of pooling and imputation for genotyping SNPs","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.21331194","title":"Additional file 1 of A joint use of pooling and imputation for genotyping SNPs","publisher":"figshare","resource_type":"JournalArticle"}],"clinical_trials":[],"software_tools":[],"database_accessions":[],"source":"live","citation_network_status":"fetched"},"created_at":"2026-06-12T06:03:54.013997Z","pmid":null,"pmcid":null,"fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":null,"license":null,"views":0,"total_file_size_bytes":0,"version_count":0,"fair_f":null,"fair_a":null,"fair_i":null,"fair_r":null,"fair_zscore":null,"fair_rationale":null,"fair_model":null,"fair_agent_version":null,"fair_fulltext_source":null,"fair_has_llm":null,"fair_computed_at":null,"clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}