Deep reinforcement learning for environments with small state spaces.
- NumPy
- OpenAI Gym 0.8
- TensorFlow 1.0
Uses environments provided by OpenAI Gym.
The network has a single hidden layer of 40 rectified linear units (ReLUs). The output layer has one node per action, and each output node estimates the expected utility of that action.
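The forward pass of such a network can be sketched in plain NumPy as below. This is a minimal illustration, not the code used in this repository: the helper names (`init_params`, `q_values`, `greedy_action`), the weight initialisation, and the linear (no activation) output layer are all assumptions; only the 40 ReLU hidden units and the one-output-per-action layout come from the description above.

```python
import numpy as np

HIDDEN_UNITS = 40  # hidden layer size stated in the description


def init_params(state_dim, num_actions, seed=0):
    """Create weights for a state_dim -> 40 -> num_actions network.

    The small Gaussian initialisation is an assumption for illustration.
    """
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, size=(state_dim, HIDDEN_UNITS)),
        "b1": np.zeros(HIDDEN_UNITS),
        "W2": rng.normal(0.0, 0.1, size=(HIDDEN_UNITS, num_actions)),
        "b2": np.zeros(num_actions),
    }


def q_values(params, states):
    """Return one utility estimate per action for each state in the batch."""
    states = np.atleast_2d(states)  # accept a single state or a batch
    hidden = np.maximum(0.0, states @ params["W1"] + params["b1"])  # ReLU
    # Linear output layer (assumed): one node per action.
    return hidden @ params["W2"] + params["b2"]


def greedy_action(params, state):
    """Pick the action whose output node has the highest estimate."""
    return int(np.argmax(q_values(params, state)[0]))


# Example with hypothetical dimensions: 4 state variables, 2 actions.
params = init_params(state_dim=4, num_actions=2)
state = np.array([0.01, -0.02, 0.03, 0.04])
estimates = q_values(params, state)    # shape (1, 2)
action = greedy_action(params, state)  # 0 or 1
```

Because the network outputs an estimate for every action in a single pass, choosing the greedy action needs only one evaluation per state rather than one per state-action pair.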
Heavily influenced by DeepMind's seminal papers 'Playing Atari with Deep Reinforcement Learning' (Mnih et al., 2013) and 'Human-level control through deep reinforcement learning' (Mnih et al., 2015).