{"doi":"10.1093/bioinformatics/btx431","title":"DeepLoc: prediction of protein subcellular localization using deep learning","abstract":"<jats:title>Abstract</jats:title>\n               <jats:sec>\n                  <jats:title>Motivation</jats:title>\n                  <jats:p>The prediction of eukaryotic protein subcellular localization is a well-studied topic in bioinformatics due to its relevance in proteomics research. Many machine learning methods have been successfully applied in this task, but in most of them, predictions rely on annotation of homologues from knowledge databases. For novel proteins where no annotated homologues exist, and for predicting the effects of sequence variants, it is desirable to have methods for predicting protein properties from sequence information only.</jats:p>\n               </jats:sec>\n               <jats:sec>\n                  <jats:title>Results</jats:title>\n                  <jats:p>Here, we present a prediction algorithm using deep neural networks to predict protein subcellular localization relying only on sequence information. At its core, the prediction model uses a recurrent neural network that processes the entire protein sequence and an attention mechanism identifying protein regions important for the subcellular localization. The model was trained and tested on a protein dataset extracted from one of the latest UniProt releases, in which experimentally annotated proteins follow more stringent criteria than previously. We demonstrate that our model achieves a good accuracy (78% for 10 categories; 92% for membrane-bound or soluble), outperforming current state-of-the-art algorithms, including those relying on homology information.</jats:p>\n               </jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation</jats:title>\n                  <jats:p>The method is available as a web server at http://www.cbs.dtu.dk/services/DeepLoc. Example code is available at https://github.com/JJAlmagro/subcellular_localization. The dataset is available at http://www.cbs.dtu.dk/services/DeepLoc/data.php.</jats:p>\n               </jats:sec>","journal":"Bioinformatics","year":2017,"id":16703,"datarank":10.768377479390118,"base_score":7.104144092987527,"endowment":7.104144092987527,"self_citation_contribution":1.0656216139481292,"citation_network_contribution":9.702755865441988,"self_endowment_contribution":1.0656216139481292,"citer_contribution":9.702755865441988,"corpus_percentile":null,"corpus_rank":null,"citation_count":1216,"citer_count":200,"citers_with_citation_signal":200,"citers_with_endowment":200,"datacite_reuse_total":25,"is_dataset":false,"is_dataset_confidence":null,"is_oa":false,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":null,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[{"id":122333,"name":"Casper Kaae Sønderby","orcid":null,"position":1,"is_corresponding":false},{"id":122334,"name":"Søren Kaae Sønderby","orcid":null,"position":2,"is_corresponding":false},{"id":17917,"name":"Henrik Nielsen","orcid":"0000-0002-9412-9643","position":3,"is_corresponding":false},{"id":122335,"name":"Ole Winther","orcid":null,"position":4,"is_corresponding":false},{"id":21518,"name":"José Juan Almagro Armenteros","orcid":"0000-0003-0111-1362","position":0,"is_corresponding":false}],"reference_count":0,"raw_metadata":{"has_enrichment":true,"base_score":7.104144092987527,"endowment":7.104144092987527,"datacite_reuse_total":25,"file_count":0,"downloads":0,"views":0,"has_version_chain":false,"is_dataset":false,"is_oa":false,"pmid":"29036616","pmcid":null,"openalex_id":"https://openalex.org/W2730472814","authors":[],"funders":[],"total_grants":0,"fwci":34.1996,"citation_percentile":0.99862732,"influential_citations":9,"citation_trend":[{"year":2017,"count":2},{"year":2018,"count":27},{"year":2019,"count":97},{"year":2020,"count":178},{"year":2021,"count":270},{"year":2022,"count":189},{"year":2023,"count":178},{"year":2024,"count":142},{"year":2025,"count":92},{"year":2026,"count":41}],"oa_status":"bronze","license":"https://academic.oup.com/journals/pages/about_us/legal/notices","oa_locations":[{"url":"https://academic.oup.com/bioinformatics/article-pdf/33/21/3387/25166063/btx431.pdf","host_type":"journal"},{"url":"https://academic.oup.com/bioinformatics/article-pdf/33/21/3387/25166063/btx431.pdf","host_type":"BRONZE"},{"url":"https://academic.oup.com/bioinformatics/article-pdf/33/21/3387/25166063/btx431.pdf","host_type":"publisher"},{"url":"https://academic.oup.com/bioinformatics/article-pdf/33/21/3387/50315453/bioinformatics_33_21_3387.pdf","host_type":"publisher"},{"url":"https://doi.org/10.1093/bioinformatics/btx431","host_type":"journal"},{"url":"https://pubmed.ncbi.nlm.nih.gov/29036616","host_type":"repository"},{"url":"https://researchprofiles.ku.dk/da/publications/069607c2-8fed-4683-bea8-9595b330c70b","host_type":"repository"},{"url":"https://curis.ku.dk/portal/da/publications/deeploc(069607c2-8fed-4683-bea8-9595b330c70b).html","host_type":"repository"}],"fields_of_study":["Machine Learning in Bioinformatics","Cell Image Analysis Techniques","vaccines and immunoinformatics approaches","Computer Science","Medicine","Biology","Computational Biology","Eukaryota","Eukaryotic Cells","Machine Learning","Models, Biological","Molecular Sequence Annotation","Neural Networks, Computer","Protein Transport","Sequence Analysis, Protein","Software"],"mesh_terms":["Machine Learning","Eukaryotic Cells","Models, Biological","Software","Neural Networks, Computer","Computational Biology","Sequence Analysis, Protein","Protein Transport","Eukaryota","Molecular Sequence Annotation"],"keywords":["Artificial intelligence","Computer science","Subcellular localization","Deep learning","Protein subcellular localization prediction","Pattern recognition (psychology)","Machine learning","Computational biology","Biology","Biochemistry","Cytoplasm","Gene"],"sdg_mappings":[{"sdg_number":0,"sdg_label":"Quality Education"}],"linked_datasets":[{"doi":"10.6075/j0ks6q2h","title":"Protein Embedding Analysis. In Data Science &amp; Engineering Master of Advanced Study (DSE MAS) Capstone Projects","publisher":"UC San Diego Library Digital Collections","resource_type":"Dataset"},{"doi":"10.6084/m9.figshare.13059460.v1","title":"Additional file 1 of Proteotranscriptomics assisted gene annotation and spatial proteomics of Bombyx mori BmN4 cell line","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.13059460","title":"Additional file 1 of Proteotranscriptomics assisted gene annotation and spatial proteomics of Bombyx mori BmN4 cell line","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.13088020.v1","title":"Additional file 1 of Integrated omics unveil the secondary metabolic landscape of a basal dinoflagellate","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.13088020","title":"Additional file 1 of Integrated omics unveil the secondary metabolic landscape of a basal dinoflagellate","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.13088023.v1","title":"Additional file 2 of Integrated omics unveil the secondary metabolic landscape of a basal dinoflagellate","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.13088023","title":"Additional file 2 of Integrated omics unveil the secondary metabolic landscape of a basal dinoflagellate","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.13620077.v1","title":"Additional file 1 of Translational landscape and protein biogenesis demands of the early secretory pathway in Komagataella phaffii","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.13620077","title":"Additional file 1 of Translational landscape and protein biogenesis demands of the early secretory pathway in Komagataella phaffii","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.13620080.v1","title":"Additional file 2 of Translational landscape and protein biogenesis demands of the early secretory pathway in Komagataella phaffii","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.13620080","title":"Additional file 2 of Translational landscape and protein biogenesis demands of the early secretory pathway in Komagataella phaffii","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.13620092.v1","title":"Additional file 6 of Translational landscape and protein biogenesis demands of the early secretory pathway in Komagataella phaffii","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.13620092","title":"Additional file 6 of Translational landscape and protein biogenesis demands of the early secretory pathway in Komagataella phaffii","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.13620095.v1","title":"Additional file 7 of Translational landscape and protein biogenesis demands of the early secretory pathway in Komagataella phaffii","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.13620095","title":"Additional file 7 of Translational landscape and protein biogenesis demands of the early secretory pathway in Komagataella phaffii","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.13620101.v1","title":"Additional file 9 of Translational landscape and protein biogenesis demands of the early secretory pathway in Komagataella phaffii","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.13620101","title":"Additional file 9 of Translational landscape and protein biogenesis demands of the early secretory pathway in Komagataella phaffii","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.14060009.v1","title":"Additional file 1 of Addressing uncertainty in genome-scale metabolic model reconstruction and analysis","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.14060009","title":"Additional file 1 of Addressing uncertainty in genome-scale metabolic model reconstruction and analysis","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.14183894.v1","title":"Additional file 1 of riboCIRC: a comprehensive database of translatable circRNAs","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.14183894","title":"Additional file 1 of riboCIRC: a comprehensive database of translatable circRNAs","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.14183897.v1","title":"Additional file 2 of riboCIRC: a comprehensive database of translatable circRNAs","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.14183897","title":"Additional file 2 of riboCIRC: a comprehensive database of translatable circRNAs","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.14183900.v1","title":"Additional file 3 of riboCIRC: a comprehensive database of translatable circRNAs","publisher":"figshare","resource_type":"JournalArticle"},{"doi":"10.6084/m9.figshare.14183900","title":"Additional file 3 of riboCIRC: a comprehensive database of translatable circRNAs","publisher":"figshare","resource_type":"JournalArticle"}],"clinical_trials":[],"software_tools":[],"database_accessions":[],"source":"live","citation_network_status":"fetched"},"created_at":"2026-06-02T13:40:47.516595Z","pmid":null,"pmcid":null,"fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":null,"license":null,"views":0,"total_file_size_bytes":0,"version_count":0,"clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}