EBOOK
Title Markov decision processes in artificial intelligence : MDPs, beyond MDPs and applications / edited by Olivier Sigaud, Olivier Buffet.
Imprint London : Wiley, 2013.

LOCATION CALL # STATUS MESSAGE
 OHIOLINK WILEY EBOOKS    ONLINE  
View online
Series ISTE.
Subject Artificial intelligence -- Mathematics.
Markov processes.
Alt Name Sigaud, Olivier.
Buffet, Olivier.
Ohio Library and Information Network.
Uniform Title Processus décisionnels de Markov en intelligence artificielle. English
Description 1 online resource (481 pages).
polychrome (rdacc)
Contents Cover; Title Page; Copyright Page; Table of Contents; Preface; List of Authors; PART 1. MDPS: MODELS AND METHODS; Chapter 1. Markov Decision Processes; 1.1. Introduction; 1.2. Markov decision problems; 1.2.1. Markov decision processes; 1.2.2. Action policies; 1.2.3. Performance criterion; 1.3. Value functions; 1.3.1. The finite criterion; 1.3.2. The β-discounted criterion; 1.3.3. The total reward criterion; 1.3.4. The average reward criterion; 1.4. Markov policies; 1.4.1. Equivalence of history-dependent and Markov policies; 1.4.2. Markov policies and valued Markov chains
1.5. Characterization of optimal policies; 1.5.1. The finite criterion; 1.5.1.1. Optimality equations; 1.5.1.2. Evaluation of a deterministic Markov policy; 1.5.2. The discounted criterion; 1.5.2.1. Evaluation of a stationary Markov policy; 1.5.2.2. Optimality equations; 1.5.3. The total reward criterion; 1.5.4. The average reward criterion; 1.5.4.1. Evaluation of a stationary Markov policy; 1.5.4.2. Optimality equations; 1.6. Optimization algorithms for MDPs; 1.6.1. The finite criterion; 1.6.2. The discounted criterion; 1.6.2.1. Linear programming; 1.6.2.2. The value iteration algorithm
1.6.2.3. The policy iteration algorithm; 1.6.3. The total reward criterion; 1.6.3.1. Positive MDPs; 1.6.3.2. Negative MDPs; 1.6.4. The average criterion; 1.6.4.1. Relative value iteration algorithm; 1.6.4.2. Modified policy iteration algorithm; 1.7. Conclusion and outlook; 1.8. Bibliography; Chapter 2. Reinforcement Learning; 2.1. Introduction; 2.1.1. Historical overview; 2.2. Reinforcement learning: a global view; 2.2.1. Reinforcement learning as approximate dynamic programming; 2.2.2. Temporal, non-supervised and trial-and-error based learning; 2.2.3. Exploration versus exploitation
2.2.4. General preliminaries on estimation methods; 2.3. Monte Carlo methods; 2.4. From Monte Carlo to temporal difference methods; 2.5. Temporal difference methods; 2.5.1. The TD(0) algorithm; 2.5.2. The SARSA algorithm; 2.5.3. The Q-learning algorithm; 2.5.4. The TD, SARSA and Q algorithms; 2.5.5. Eligibility traces and TD; 2.5.6. From TD to SARSA; 2.5.7. Q; 2.5.8. The R-learning algorithm; 2.6. Model-based methods: learning a model; 2.6.1. Dyna architectures; 2.6.2. The E3 algorithm; 2.6.3. The Rmax algorithm; 2.7. Conclusion; 2.8. Bibliography
Chapter 3. Approximate Dynamic Programming; 3.1. Introduction; 3.2. Approximate value iteration (AVI); 3.2.1. Sample-based implementation and supervised learning; 3.2.2. Analysis of the AVI algorithm; 3.2.3. Numerical illustration; 3.3. Approximate policy iteration (API); 3.3.1. Analysis in L∞-norm of the API algorithm; 3.3.2. Approximate policy evaluation; 3.3.3. Linear approximation and least-squares methods; 3.3.3.1. TD; 3.3.3.2. Least-squares methods; 3.3.3.3. Linear approximation of the state-action value function; 3.4. Direct minimization of the Bellman residual
3.5. Towards an analysis of dynamic programming in Lp-norm
Summary Markov Decision Processes (MDPs) are a mathematical framework for modeling sequential decision problems under uncertainty as well as Reinforcement Learning problems. Written by experts in the field, this book provides a global view of current research using MDPs in Artificial Intelligence. It starts with an introductory presentation of the fundamental aspects of MDPs (planning in MDPs, Reinforcement Learning, Partially Observable MDPs, Markov games and the use of non-classical criteria). Then it presents more advanced research trends in the domain and gives some concrete examples using illustrations.
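To give a flavor of the book's subject, the value iteration algorithm listed in the contents (section 1.6.2.2) can be sketched as follows. The 2-state MDP below (its states, actions, transition probabilities, and rewards) is an invented toy example for illustration only, not drawn from the book.

```python
# A minimal sketch of value iteration for a finite MDP under the
# discounted criterion. Iterates the Bellman optimality update
#   V(s) <- max_a sum_s' P(s'|s,a) * (R(s,a,s') + gamma * V(s'))
# until the largest change falls below a tolerance.

# Toy MDP (made up for illustration):
# transitions[s][a] = list of (probability, next_state, reward)
transitions = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 2.0)], "go": [(1.0, 0, 0.0)]},
}
gamma = 0.9  # discount factor of the beta-discounted criterion

def value_iteration(transitions, gamma, tol=1e-8):
    V = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s in transitions:
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in transitions[s].values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

V = value_iteration(transitions, gamma)
# In this toy MDP, staying in state 1 yields reward 2 per step,
# so V[1] converges to 2 / (1 - gamma) = 20.
```

The same loop structure underlies the relative and modified variants covered in section 1.6.4, which differ only in how the update and stopping rule are defined.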
Bibliography Note Includes bibliographical references and index.
Access Available to OhioLINK libraries.
Note Description based upon print version of record.
ISBN 9781118557426 (electronic bk.)
1118557425 (electronic bk.)
9781118619872 (electronic bk.)
1118619870 (electronic bk.)
OCLC # 830161640
Additional Format Print version: Sigaud, Olivier. Markov Decision Processes in Artificial Intelligence. London : Wiley, c2013. 9781848211674.