{"doi":"10.1093/bioinformatics/bts635","title":"STAR: ultrafast universal RNA-seq aligner","abstract":"<jats:title>Abstract</jats:title>\n                  <jats:p>Motivation: Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases.</jats:p>\n                  <jats:p>Results: To align our large (&gt;80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure. STAR outperforms other aligners by a factor of &gt;50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80–90% success rate, corroborating the high precision of the STAR mapping strategy.</jats:p>\n                  <jats:p>Availability and implementation: STAR is implemented as a standalone C++ code. 
STAR is free open source software distributed under GPLv3 license and can be downloaded from http://code.google.com/p/rna-star/.</jats:p>\n                  <jats:p>Contact:  dobin@cshl.edu.</jats:p>","journal":"Bioinformatics","year":2013,"id":12777,"datarank":1.637815886792173,"base_score":10.918772578614485,"endowment":10.918772578614485,"self_citation_contribution":1.637815886792173,"citation_network_contribution":0.0,"self_endowment_contribution":1.637815886792173,"citer_contribution":0.0,"corpus_percentile":null,"corpus_rank":null,"citation_count":55202,"citer_count":0,"citers_with_citation_signal":0,"citers_with_endowment":0,"datacite_reuse_total":0,"is_dataset":false,"is_dataset_confidence":0.0492,"is_oa":true,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":"2013-01-01","fair_score":67.5,"fair_percentile":95.9,"algorithm_id":"datarank_citation_only_1hop_v6","ranking_scope":"data_only","authors":[],"reference_count":22,"raw_metadata":{"citation_network_status":"fetched"},"created_at":"2026-03-01T18:20:47.508186Z","pmid":null,"pmcid":null,"fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":null,"license":null,"views":0,"total_file_size_bytes":0,"version_count":0,"fair_f":100.0,"fair_a":70.0,"fair_i":100.0,"fair_r":0.0,"fair_zscore":null,"fair_rationale":{"fair_score":67.5,"has_llm":false,"dimensions":{"F":{"name":"Findable","score":100.0,"criteria":[{"key":"f_has_doi","label":"Has a persistent DOI","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"DOI present","rationale":null},{"key":"f_repository_presence","label":"Indexed in repositories / literature DBs","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"datacite=25, pmcid=True, pmid=True","rationale":null},{"key":"f_persistent_ids","label":"Resolvable scholarly identifiers (OpenAlex)","kind":"deterministic","weight":0.5,"fraction":1.0,"signal":"OpenAlex id 
present","rationale":null}]},"A":{"name":"Accessible","score":70.0,"criteria":[{"key":"a_open_access","label":"Open Access / files deposited","kind":"deterministic","weight":1.5,"fraction":0.5,"signal":"files/OA location present but not flagged OA","rationale":null},{"key":"a_retrievable","label":"Free full text retrievable","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"8 OA location(s)","rationale":null}]},"I":{"name":"Interoperable","score":100.0,"criteria":[{"key":"i_linked_data","label":"Linked datasets / DataCite relations","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"linked_datasets=25, datacite=25","rationale":null},{"key":"i_standard_ids","label":"References data via standard accessions","kind":"deterministic","weight":1.0,"fraction":1.0,"signal":"accessions=1, trials=0","rationale":null}]},"R":{"name":"Reusable","score":0.0,"criteria":[{"key":"r_license","label":"Clear, open reuse license","kind":"deterministic","weight":1.5,"fraction":0.0,"signal":"no license","rationale":null},{"key":"r_downloads","label":"Demonstrated reuse (downloads)","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"downloads=0","rationale":null},{"key":"r_version","label":"Versioned / maintained","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"no version chain","rationale":null},{"key":"r_dataset","label":"Classified as a data resource","kind":"deterministic","weight":0.5,"fraction":0.0,"signal":"not a dataset","rationale":null}]}},"suggestions":["Attach a clear, open reuse license (e.g. 
CC-BY or CC0).","Maintain explicit versioning for the dataset.","Make the paper/data Open Access or deposit the files in an open repository."],"model":null,"agent_version":"fair_agent_v2","fulltext_source":"abstract_only"},"fair_model":null,"fair_agent_version":"fair_agent_v2","fair_fulltext_source":"abstract_only","fair_has_llm":false,"fair_computed_at":"2026-06-23T20:19:55.520119Z","clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}