{"doi":"10.1613/jair.301","title":"Reinforcement Learning: A Survey","abstract":"<jats:p>This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word “reinforcement.” The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.</jats:p>","journal":"Journal of Artificial Intelligence Research","year":1996,"id":4517,"datarank":15.645612000470296,"base_score":9.078978053779355,"endowment":9.078978053779355,"self_citation_contribution":1.3618467080669034,"citation_network_contribution":14.283765292403393,"self_endowment_contribution":1.3618467080669034,"citer_contribution":14.283765292403393,"corpus_percentile":88.5,"corpus_rank":1935,"citation_count":8768,"citer_count":196,"citers_with_citation_signal":196,"citers_with_endowment":196,"datacite_reuse_total":0,"is_dataset":false,"is_oa":true,"file_count":0,"downloads":0,"has_version_chain":false,"published_date":"1996-05-01","authors":[{"id":45724,"name":"M. L. Littman","orcid":null,"position":1,"is_corresponding":false},{"id":45725,"name":"A. W. Moore","orcid":null,"position":2,"is_corresponding":false},{"id":45726,"name":"Leslie Pack Kaelbling","orcid":null,"position":3,"is_corresponding":false},{"id":45727,"name":"Michael L. Littman","orcid":"0000-0002-5596-1840","position":4,"is_corresponding":false},{"id":45728,"name":"Andrew Moore","orcid":"0000-0002-3395-0841","position":5,"is_corresponding":false},{"id":45723,"name":"L. P. Kaelbling","orcid":null,"position":0,"is_corresponding":true}],"reference_count":189,"raw_metadata":{"citation_network_status":"fetched"},"created_at":"2026-03-01T18:20:47.508186Z","pmid":null,"pmcid":null,"fwci":null,"citation_percentile":null,"influential_citations":0,"oa_status":null,"license":null,"views":0,"total_file_size_bytes":0,"version_count":0,"clinical_trials":[],"software_tools":[],"db_accessions":[],"linked_datasets":[],"topics":[]}