FinUniversity Electronic Library


Details

Lanham, Micheal. Hands-on reinforcement learning for games: implementing self-learning agents in games using artificial intelligence techniques / Micheal Lanham. — 1 online resource (1 volume) : illustrations — <URL:http://elib.fa.ru/ebsco/2346941.pdf>.

Record creation date: 6/17/2020

Subject: Machine learning; Artificial intelligence; Reinforcement learning; Computer games — Programming; Application software — Development.

Collections: EBSCO

Allowed Actions:

The 'Read' and 'Download' actions will be available if you log in or access the site from another network.

Group: Anonymous

Network: Internet

Annotation

The AI revolution is here and it is embracing games. Game developers are being challenged to enlist cutting-edge AI as part of their games. In this book, you will look at the journey of building capable AI using reinforcement learning algorithms and techniques. You will learn to solve complex tasks and build next-generation games using a ...

Document access rights

Network                       User group   Action
FinUniversity Local Network   All          Read, Print, Download
Internet                      Readers      Read, Print
-> Internet                   Anonymous

Table of Contents

  • Cover
  • Title Page
  • Copyright and Credits
  • Dedication
  • About Packt
  • Contributors
  • Table of Contents
  • Preface
  • Section 1: Exploring the Environment
  • Chapter 1: Understanding Rewards-Based Learning
    • Technical requirements
    • Understanding rewards-based learning
      • The elements of RL
      • The history of RL
      • Why RL in games?
    • Introducing the Markov decision process
      • The Markov property and MDP
      • Building an MDP
    • Using value learning with multi-armed bandits
      • Coding a value learner
      • Implementing a greedy policy
      • Exploration versus exploitation
    • Exploring Q-learning with contextual bandits
      • Implementing a Q-learning agent
      • Removing discounted rewards
    • Summary
    • Questions
  • Chapter 2: Dynamic Programming and the Bellman Equation
    • Introducing DP
      • Regular programming versus DP
      • Enter DP and memoization
    • Understanding the Bellman equation
      • Unraveling the finite MDP
      • The Bellman optimality equation
    • Building policy iteration 
      • Installing OpenAI Gym
      • Testing Gym
      • Policy evaluation
      • Policy improvement
    • Building value iteration
    • Playing with policy versus value iteration
    • Exercises
    • Summary
  • Chapter 3: Monte Carlo Methods
    • Understanding model-based and model-free learning
    • Introducing the Monte Carlo method
      • Solving for π
      • Implementing Monte Carlo
      • Plotting the guesses
    • Adding RL
      • Monte Carlo control
    • Playing the FrozenLake game
    • Using prediction and control
      • Incremental means
    • Exercises
    • Summary
  • Chapter 4: Temporal Difference Learning
    • Understanding the TCA problem
    • Introducing TDL
      • Bootstrapping and backup diagrams
      • Applying TD prediction
      • TD(0) or one-step TD
      • Tuning hyperparameters
    • Applying TDL to Q-learning
    • Exploring TD(0) in Q-learning
      • Exploration versus exploitation revisited
      • Teaching an agent to drive a taxi
    • Running off- versus on-policy
    • Exercises
    • Summary
  • Chapter 5: Exploring SARSA
    • Exploring SARSA on-policy learning
    • Using continuous spaces with SARSA
      • Discretizing continuous state spaces
      • Expected SARSA
    • Extending continuous spaces 
    • Working with TD(λ) and eligibility traces
      • Backward views and eligibility traces
    • Understanding SARSA(λ)
      • SARSA lambda and the Lunar Lander
    • Exercises
    • Summary
  • Section 2: Exploiting the Knowledge
  • Chapter 6: Going Deep with DQN
    • DL for RL
      • DL frameworks for DRL
    • Using PyTorch for DL
      • Computational graphs with tensors
      • Training a neural network – computational graph
    • Building neural networks with Torch
    • Understanding DQN in PyTorch
      • Refreshing the environment
      • Partially observable Markov decision process
      • Constructing DQN
      • The replay buffer
      • The DQN class
      • Calculating loss and training
    • Exercising DQN
      • Revisiting the LunarLander and beyond
    • Exercises
    • Summary
  • Chapter 7: Going Deeper with DDQN
    • Understanding visual state
      • Encoding visual state
    • Introducing CNNs
    • Working with a DQN on Atari
      • Adding CNN layers
    • Introducing DDQN
      • Double DQN or the fixed Q targets
      • Dueling DQN or the real DDQN
    • Extending replay with prioritized experience replay
    • Exercises
    • Summary
  • Chapter 8: Policy Gradient Methods
    • Understanding policy gradient methods
      • Policy gradient ascent
    • Introducing REINFORCE
    • Using advantage actor-critic
      • Actor-critic
      • Training advantage AC
    • Building a deep deterministic policy gradient
      • Training DDPG
    • Exploring trust region policy optimization
      • Conjugate gradients
      • Trust region methods
      • The TRPO step
    • Exercises
    • Summary
  • Chapter 9: Optimizing for Continuous Control
    • Understanding continuous control with Mujoco
    • Introducing proximal policy optimization
      • The hows of policy optimization
      • PPO and clipped objectives
    • Using PPO with recurrent networks
    • Deciding on synchronous and asynchronous actors
      • Using A2C
      • Using A3C
    • Building actor-critic with experience replay
    • Exercises
    • Summary
  • Chapter 10: All about Rainbow DQN
    • Rainbow – combining improvements in deep reinforcement learning
    • Using TensorBoard
    • Introducing distributional RL
      • Back to TensorBoard
    • Understanding noisy networks
      • Noisy networks for exploration and importance sampling
    • Unveiling Rainbow DQN
      • When does training fail?
    • Exercises
    • Summary
  • Chapter 11: Exploiting ML-Agents
    • Installing ML-Agents
    • Building a Unity environment
      • Building for Gym wrappers
    • Training a Unity environment with Rainbow
    • Creating a new environment
      • Coding an agent/environment
    • Advancing RL with ML-Agents
      • Curriculum learning
      • Behavioral cloning
      • Curiosity learning
      • Training generalized reinforcement learning agents
    • Exercises
    • Summary
  • Chapter 12: DRL Frameworks
    • Choosing a framework
    • Introducing Google Dopamine
    • Playing with Keras-RL
    • Exploring RL Lib
    • Using TF-Agents
    • Exercises
    • Summary
  • Section 3: Reward Yourself
  • Chapter 13: 3D Worlds
    • Reasoning on 3D worlds
    • Training a visual agent
    • Generalizing 3D vision
      • ResNet for visual observation encoding
    • Challenging the Unity Obstacle Tower Challenge
      • Pre-training the agent
      • Prierarchy – implicit hierarchies
    • Exploring Habitat – embodied agents by FAIR
      • Installing Habitat
      • Training in Habitat
    • Exercises
    • Summary
  • Chapter 14: From DRL to AGI
    • Learning meta learning
      • Learning 2 learn
      • Model-agnostic meta learning
      • Training a meta learner
    • Introducing meta reinforcement learning
      • MAML-RL
    • Using hindsight experience replay
    • Imagination and reasoning in RL
      • Generating imagination
    • Understanding imagination-augmented agents
    • Exercises
    • Summary
  • Other Books You May Enjoy
  • Index

Usage statistics

Access count: 0
Last 30 days: 0