{"doi":"10.1371/journal.pone.0291437","title":"Assessing HLA imputation accuracy in a West African population","abstract":"<jats:p>The Human Leukocyte Antigen (HLA) region plays an important role in autoimmune and infectious diseases. HLA is a highly polymorphic region and thus difficult to impute. We, therefore, sought to evaluate HLA imputation accuracy, specifically in a West African population, since they are understudied and are known to harbor high genetic diversity. The study sets were selected from 315 Gambian individuals within the Gambian Genome Variation Project (GGVP) Whole Genome Sequence datasets. Two different arrays, Illumina Omni 2.5 and Human Hereditary and Health in Africa (H3Africa), were assessed for the appropriateness of their markers, and these were used to test several imputation panels and tools. The reference panels were chosen from the 1000 Genomes (1kg-All), 1000 Genomes African (1kg-Afr), 1000 Genomes Gambian (1kg-Gwd), H3Africa, and the HLA Multi-ethnic datasets. HLA-A, HLA-B, and HLA-C alleles were imputed using HIBAG, SNP2HLA, CookHLA, and Minimac4, and concordance rate was used as an assessment metric. The best performing tool was found to be HIBAG, with a concordance rate of 0.84, while the best performing reference panel was the H3Africa panel, with a concordance rate of 0.62. Minimac4 (0.75) was shown to increase HLA-B allele imputation accuracy compared to HIBAG (0.71), SNP2HLA (0.51) and CookHLA (0.17). The H3Africa and Illumina Omni 2.5 array performances were comparable, showing that genotyping arrays have less influence on HLA imputation in West African populations. The findings show that using a larger population-specific reference panel and the HIBAG tool improves the accuracy of HLA imputation in a West African population.</jats:p>","journal":"PLOS ONE","year":2023,"id":16330,"datarank":0.2550781624460599,"base_score":1.3862943611198906,"endowment":1.3862943611198906,"self_citation_contribution":0.20794415416798362,"citation_network_contribution":0.04713400827807628,"self_endowment_contribution":0.20794415416798362,"citer_contribution":0.04713400827807628,"corpus_percentile":45.2,"corpus_rank":708,"citation_count":3,"citer_count":2,"citers_with_citation_signal":1,"citers_with_endowment":1,"datacite_reuse_total":0,"is_dataset":true,"is_dataset_confidence":null,"is_oa":false,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":null,"fair_score":45.8333,"fair_percentile":43.51363236587511,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[{"id":119076,"name":"Mamana Mbiyavanga","orcid":"0000-0003-3431-895X","position":1,"is_corresponding":false},{"id":119077,"name":"Suhaila Hashim","orcid":"0009-0001-4736-8422","position":2,"is_corresponding":false},{"id":119078,"name":"Santie de Villiers","orcid":null,"position":3,"is_corresponding":false},{"id":62302,"name":"Nicola Mulder","orcid":"0000-0003-4905-0941","position":4,"is_corresponding":false},{"id":119075,"name":"Ruth Nanjala","orcid":"0000-0002-8878-8685","position":0,"is_corresponding":false}],"reference_count":0,"raw_metadata":{"has_enrichment":true,"base_score":1.3862943611198906,"endowment":1.3862943611198906,"datacite_reuse_total":0,"file_count":0,"downloads":0,"views":0,"has_version_chain":false,"is_dataset":false,"is_oa":false,"pmid":"37768905","pmcid":"PMC10538777","openalex_id":"https://openalex.org/W4387141664","authors":[],"funders":[{"funder_name":"NIH","grant_id":"U24HG006941","title":null},{"funder_name":"National Institutes of Health","grant_id":"5U24HG006941-07","title":"H3ABioNet: informatics solutions for H3Africa"}],"total_grants":2,"fwci":0.4675,"citation_percentile":0.69472379,"influential_citations":0,"citation_trend":[{"year":2024,"count":1},{"year":2025,"count":1},{"year":2026,"count":1}],"oa_status":"gold","license":"cc-by","oa_locations":[{"url":"https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0291437&type=printable","host_type":"journal"},{"url":"https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0291437&type=printable","host_type":"GOLD"},{"url":"https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0291437&type=printable","host_type":"publisher"},{"url":"https://dx.plos.org/10.1371/journal.pone.0291437","host_type":"publisher"},{"url":"https://doi.org/10.1371/journal.pone.0291437","host_type":"journal"},{"url":"https://pubmed.ncbi.nlm.nih.gov/37768905","host_type":"repository"},{"url":"https://www.ncbi.nlm.nih.gov/pmc/articles/10538777","host_type":"repository"},{"url":"https://doaj.org/article/f1ce9bac79954c0dbce6ea8350bc58a4","host_type":"repository"},{"url":"https://pmc.ncbi.nlm.nih.gov/articles/PMC10538777/pdf/pone.0291437.pdf","host_type":"repository"},{"url":"https://europepmc.org/articles/PMC10538777","host_type":"Europe_PMC"},{"url":"https://europepmc.org/articles/PMC10538777?pdf=render","host_type":"Europe_PMC"},{"url":"https://doi.org/10.1101/2023.01.23.525129","host_type":""},{"url":"https://pubmed.ncbi.nlm.nih.gov/36747714","host_type":""},{"url":"http://dx.doi.org/10.1371/journal.pone.0291437","host_type":""},{"url":"http://dx.doi.org/10.1101/2023.01.23.525129","host_type":""},{"url":"https://doaj.org/article/92b7e0675f564bbfb3af02b00402c792","host_type":""}],"fields_of_study":["T-cell and B-cell Immunology","Immune Cell Function and Interaction","vaccines and immunoinformatics approaches","Medicine","0301 basic medicine","0303 health sciences","03 medical and health sciences","Humans","Genome-Wide Association Study","Genotype","Histocompatibility Antigens Class I","Histocompatibility Antigens Class II","HLA Antigens","HLA-B Antigens","Polymorphism, Single Nucleotide","West African People"],"mesh_terms":["West African People","Histocompatibility Antigens Class II","Genotype","HLA Antigens","Humans","HLA-B Antigens","Histocompatibility Antigens Class I","Polymorphism, Single Nucleotide","Genome-Wide Association Study"],"keywords":["Imputation (statistics)","Concordance","1000 Genomes Project","Genotyping","Human leukocyte antigen","Genome-wide association study","Genetics","Population","Biology","Genome","Allele","Allele frequency","Computational biology","Genotype","Medicine","Statistics","Single-nucleotide polymorphism","Missing data","Gene","Mathematics","Antigen","Science","Q","Histocompatibility Antigens Class I","R","Histocompatibility Antigens Class II","Polymorphism, Single Nucleotide","Article","HLA Antigens","HLA-B Antigens","Humans","Research Article","West African People"],"sdg_mappings":[{"sdg_number":3,"sdg_label":"3. Good health"},{"sdg_number":0,"sdg_label":"Good health and well-being"}],"linked_datasets":[],"clinical_trials":[],"software_tools":[],"database_accessions":[],"source":"live","citation_network_status":"fetched"},"created_at":"2026-06-01T21:29:57.942983Z","pmid":"37768905","pmcid":"PMC10538777","fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":"gold","license":"cc-by","views":0,"total_file_size_bytes":0,"version_count":0,"fair_f":52.5,"fair_a":72.5,"fair_i":25.0,"fair_r":33.3333,"fair_zscore":0.0563,"fair_rationale":{"fair_score":45.83,"has_llm":true,"dimensions":{"F":{"name":"Findable","score":52.5,"criteria":[{"key":"f_has_doi","label":"Has a persistent DOI","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"DOI present","rationale":null},{"key":"f_repository_presence","label":"Indexed in repositories / literature DBs","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"datacite=0, pmcid=True, pmid=True","rationale":null},{"key":"f_persistent_ids","label":"Resolvable scholarly identifiers (OpenAlex)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no OpenAlex id","rationale":null},{"key":"f_metadata_richness","label":"Rich, machine-readable metadata","kind":"llm","weight":1.0,"fraction":0.25,"signal":null,"rationale":"The paper itself does not provide machine-readable metadata (e.g., structured JSON-LD, schema.org markup) beyond basic citation fields in PMC."}]},"A":{"name":"Accessible","score":72.5,"criteria":[{"key":"a_open_access","label":"Open Access / files deposited","kind":"deterministic","weight":1.5,"fraction":0.5,"signal":"files/OA location present but not flagged OA","rationale":null},{"key":"a_retrievable","label":"Free full text retrievable","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"16 OA location(s)","rationale":null},{"key":"a_access_protocol","label":"Clear data/code access protocol","kind":"llm","weight":1.0,"fraction":0.75,"signal":null,"rationale":"The data access statements and GitHub pipeline URL are provided, but the protocol for accessing the H3Africa reference panel requires manual request steps and is not fully automated or directly downloadable."}]},"I":{"name":"Interoperable","score":25.0,"criteria":[{"key":"i_linked_data","label":"Linked datasets / DataCite relations","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"linked_datasets=0, datacite=0","rationale":null},{"key":"i_standard_ids","label":"References data via standard accessions","kind":"deterministic","weight":1.0,"fraction":0.0,"signal":"accessions=0, trials=0","rationale":null},{"key":"i_standards","label":"Standard formats, vocabularies & identifiers","kind":"llm","weight":1.0,"fraction":0.5,"signal":null,"rationale":"Standard file formats (e.g., VCF, PLINK) are implied but not explicitly listed, and while standard tools are used, formal vocabularies and persistent identifiers for data objects are not stated."}]},"R":{"name":"Reusable","score":33.33,"criteria":[{"key":"r_license","label":"Clear, open reuse license","kind":"deterministic","weight":1.5,"fraction":0.0,"signal":"no license","rationale":null},{"key":"r_downloads","label":"Demonstrated reuse (downloads)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"downloads=0","rationale":null},{"key":"r_version","label":"Versioned / maintained","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no version chain","rationale":null},{"key":"r_dataset","label":"Classified as a data resource","kind":"deterministic","weight":0.5,"fraction":1.0,"signal":"is_dataset","rationale":null},{"key":"r_reusability","label":"Data-availability statement, license & reproducibility","kind":"llm","weight":2.0,"fraction":0.5,"signal":null,"rationale":"The paper includes a data-availability statement, a pipeline GitHub repository, and a liberal license (CC BY 4.0), but the absence of a formal license for the pipeline and lack of a curated, reusable dataset package limit reproducibility."}]}},"suggestions":["Provide explicit, structured machine-readable metadata (e.g., JSON-LD) describing the dataset and study parameters for automated discovery.","Make the H3Africa reference panel directly downloadable or integrate with a programmatic API instead of requiring manual request submission.","Include a formal license (e.g., MIT or Apache 2.0) for the pipeline repository and deposit the final imputed datasets in a persistent repository with a DOI."],"model":"deepseek/deepseek-v4-flash","agent_version":"fair_agent_v1","fulltext_source":"epmc_xml"},"fair_model":"deepseek/deepseek-v4-flash","fair_agent_version":"fair_agent_v1","fair_fulltext_source":"epmc_xml","fair_has_llm":true,"fair_computed_at":"2026-06-17T22:59:52.277697Z","clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}