توضیحاتی در مورد کتاب Algorithms for Decision Making
نام کتاب : Algorithms for Decision Making
عنوان ترجمه شده به فارسی : الگوریتم های تصمیم گیری
سری :
نویسندگان : Mykel J. Kochenderfer, Tim A Wheeler, Kyle H Wray
ناشر : MIT Press
سال نشر : 2022
تعداد صفحات : 701
ISBN (شابک) : 2021038701 , 0262047012
زبان کتاب : English
فرمت کتاب : pdf
حجم کتاب : 12 مگابایت
بعد از تکمیل فرایند پرداخت لینک دانلود کتاب ارائه خواهد شد. درصورت ثبت نام و ورود به حساب کاربری خود قادر خواهید بود لیست کتاب های خریداری شده را مشاهده فرمایید.
فهرست مطالب :
Preface
Acknowledgments
Introduction
Decision Making
Applications
Methods
History
Societal Impact
Overview
I Probabilistic Reasoning
Representation
Degrees of Belief and Probability
Probability Distributions
Joint Distributions
Conditional Distributions
Bayesian Networks
Conditional Independence
Summary
Exercises
Inference
Inference in Bayesian Networks
Inference in Naive Bayes Models
Sum-Product Variable Elimination
Belief Propagation
Computational Complexity
Direct Sampling
Likelihood Weighted Sampling
Gibbs Sampling
Inference in Gaussian Models
Summary
Exercises
Parameter Learning
Maximum Likelihood Parameter Learning
Bayesian Parameter Learning
Nonparametric Learning
Learning with Missing Data
Summary
Exercises
Structure Learning
Bayesian Network Scoring
Directed Graph Search
Markov Equivalence Classes
Partially Directed Graph Search
Summary
Exercises
Simple Decisions
Constraints on Rational Preferences
Utility Functions
Utility Elicitation
Maximum Expected Utility Principle
Decision Networks
Value of Information
Irrationality
Summary
Exercises
II Sequential Problems
Exact Solution Methods
Markov Decision Processes
Policy Evaluation
Value Function Policies
Policy Iteration
Value Iteration
Asynchronous Value Iteration
Linear Program Formulation
Linear Systems with Quadratic Reward
Summary
Exercises
Approximate Value Functions
Parametric Representations
Nearest Neighbor
Kernel Smoothing
Linear Interpolation
Simplex Interpolation
Linear Regression
Neural Network Regression
Summary
Exercises
Online Planning
Receding Horizon Planning
Lookahead with Rollouts
Forward Search
Branch and Bound
Sparse Sampling
Monte Carlo Tree Search
Heuristic Search
Labeled Heuristic Search
Open-Loop Planning
Summary
Exercises
Policy Search
Approximate Policy Evaluation
Local Search
Genetic Algorithms
Cross Entropy Method
Evolution Strategies
Isotropic Evolutionary Strategies
Summary
Exercises
Policy Gradient Estimation
Finite Difference
Regression Gradient
Likelihood Ratio
Reward-to-Go
Baseline Subtraction
Summary
Exercises
Policy Gradient Optimization
Gradient Ascent Update
Restricted Gradient Update
Natural Gradient Update
Trust Region Update
Clamped Surrogate Objective
Summary
Exercises
Actor-Critic Methods
Actor-Critic
Generalized Advantage Estimation
Deterministic Policy Gradient
Actor-Critic with Monte Carlo Tree Search
Summary
Exercises
Policy Validation
Performance Metric Evaluation
Rare Event Simulation
Robustness Analysis
Trade Analysis
Adversarial Analysis
Summary
Exercises
III Model Uncertainty
Exploration and Exploitation
Bandit Problems
Bayesian Model Estimation
Undirected Exploration Strategies
Directed Exploration Strategies
Optimal Exploration Strategies
Exploration with Multiple States
Summary
Exercises
Model-Based Methods
Maximum Likelihood Models
Update Schemes
Exploration
Bayesian Methods
Bayes-Adaptive Markov Decision Processes
Posterior Sampling
Summary
Exercises
Model-Free Methods
Incremental Estimation of the Mean
Q-Learning
Sarsa
Eligibility Traces
Reward Shaping
Action Value Function Approximation
Experience Replay
Summary
Exercises
Imitation Learning
Behavioral Cloning
Data Set Aggregation
Stochastic Mixing Iterative Learning
Maximum Margin Inverse Reinforcement Learning
Maximum Entropy Inverse Reinforcement Learning
Generative Adversarial Imitation Learning
Summary
Exercises
IV State Uncertainty
Beliefs
Belief Initialization
Discrete State Filter
Kalman Filter
Extended Kalman Filter
Unscented Kalman Filter
Particle Filter
Particle Injection
Summary
Exercises
Exact Belief State Planning
Belief-State Markov Decision Processes
Conditional Plans
Alpha Vectors
Pruning
Value Iteration
Linear Policies
Summary
Exercises
Offline Belief State Planning
Fully Observable Value Approximation
Fast Informed Bound
Fast Lower Bounds
Point-Based Value Iteration
Randomized Point-Based Value Iteration
Sawtooth Upper Bound
Point Selection
Sawtooth Heuristic Search
Triangulated Value Functions
Summary
Exercises
Online Belief State Planning
Lookahead with Rollouts
Forward Search
Branch and Bound
Sparse Sampling
Monte Carlo Tree Search
Determinized Sparse Tree Search
Gap Heuristic Search
Summary
Exercises
Controller Abstractions
Controllers
Policy Iteration
Nonlinear Programming
Gradient Ascent
Summary
Exercises
V Multiagent Systems
Multiagent Reasoning
Simple Games
Response Models
Dominant Strategy Equilibrium
Nash Equilibrium
Correlated Equilibrium
Iterated Best Response
Hierarchical Softmax
Fictitious Play
Gradient Ascent
Summary
Exercises
Sequential Problems
Markov Games
Response Models
Nash Equilibrium
Fictitious Play
Gradient Ascent
Nash Q-Learning
Summary
Exercises
State Uncertainty
Partially Observable Markov Games
Policy Evaluation
Nash Equilibrium
Dynamic Programming
Summary
Exercises
Collaborative Agents
Decentralized Partially Observable Markov Decision Processes
Subclasses
Dynamic Programming
Iterated Best Response
Heuristic Search
Nonlinear Programming
Summary
Exercises
Appendices
Mathematical Concepts
Measure Spaces
Probability Spaces
Metric Spaces
Normed Vector Spaces
Positive Definiteness
Convexity
Information Content
Entropy
Cross Entropy
Relative Entropy
Gradient Ascent
Taylor Expansion
Monte Carlo Estimation
Importance Sampling
Contraction Mappings
Graphs
Probability Distributions
Computational Complexity
Asymptotic Notation
Time Complexity Classes
Space Complexity Classes
Decidability
Neural Representations
Neural Networks
Feedforward Networks
Parameter Regularization
Convolutional Neural Networks
Recurrent Networks
Autoencoder Networks
Adversarial Networks
Search Algorithms
Search Problems
Search Graphs
Forward Search
Branch and Bound
Dynamic Programming
Heuristic Search
Problems
Hex World
2048
Cart-Pole
Mountain Car
Simple Regulator
Aircraft Collision Avoidance
Crying Baby
Machine Replacement
Catch
Prisoner\'s Dilemma
Rock-Paper-Scissors
Traveler\'s Dilemma
Predator-Prey Hex World
Multicaregiver Crying Baby
Collaborative Predator-Prey Hex World
Julia
Types
Functions
Control Flow
Packages
Convenience Functions
References
Index
Пустая страница