DDQN-with-PyTorch-for-OpenAI-Gym

Implementation of Double DQN reinforcement learning for OpenAI Gym environments with PyTorch.

Stars: 62

Implementation of Double DQN reinforcement learning for OpenAI Gym environments with discrete action spaces. The algorithm aims to improve sample efficiency by using two uncorrelated Q-Networks to prevent overestimation of Q-values. Only one network receives gradient updates; the parameters of the target network are periodically updated to match it, which saves computation time and improves training performance. The implementation is based on the Double Q-learning method proposed by Hasselt in 2010.

README:

DDQN with PyTorch for OpenAI Gym

Implementation of Double DQN reinforcement learning for OpenAI Gym environments with discrete action spaces. Performance is defined as the sample efficiency of the algorithm, i.e. how high the average reward is after x episodes of interaction with the environment have been used for training.
The related paper can be found here: Hasselt, 2010

Double DQN

The standard DQN method has been shown to overestimate the true Q-value, because the target uses an argmax over estimated Q-values. When some values are overestimated and some underestimated, the overestimated values are therefore more likely to be selected.

Standard DQN target:
Q(s_t, a_t) = r_t + \gamma \, Q(s_{t+1}, \arg\max_a Q(s_{t+1}, a)),

where \gamma is the discount factor.
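The bias can be seen with a quick numerical check (an illustration only, not part of this repository): even when the individual Q-value estimates are unbiased, taking a maximum over them is biased upward.

```python
import torch

torch.manual_seed(0)

true_q = torch.zeros(5)                   # the true Q-value of every action is 0
noise = 0.5 * torch.randn(10_000, 5)      # zero-mean noise on the estimates
estimates = true_q + noise                # noisy Q-value estimates for 10,000 states

print(estimates.mean().item())                    # ~0.0  -> the estimates themselves are unbiased
print(estimates.max(dim=1).values.mean().item())  # ~0.58 -> the max over the estimates is biased upward
```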

By using two uncorrelated Q-Networks we can prevent this overestimation. To save computation time, we perform gradient updates for only one of the Q-Networks and periodically update the parameters of the target Q-Network to match the parameters of the Q-Network that is being trained.
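A minimal sketch of this setup in PyTorch could look as follows; the network architecture, dimensions, and synchronization interval below are illustrative and not taken from this repository's code:

```python
import copy
import torch.nn as nn

class QNetwork(nn.Module):
    """Small fully connected Q-network (illustrative architecture)."""
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x):
        return self.layers(x)

q_net = QNetwork(obs_dim=4, n_actions=2)   # network that receives gradient updates
target_net = copy.deepcopy(q_net)          # copy used only for target computation

SYNC_EVERY = 1000                          # hypothetical synchronization interval (in steps)

def maybe_sync_target(step):
    # Periodically copy the trained parameters into the target network.
    if step % SYNC_EVERY == 0:
        target_net.load_state_dict(q_net.state_dict())
```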

The Double DQN target then becomes:
Q(s_t, a_t) = r_t + \gamma \, Q_\theta(s_{t+1}, \arg\max_a Q_{\text{target}}(s_{t+1}, a))

And the loss function is given by:
L(\theta) = \left(Q(s_t, a_t) - Q_\theta(s_t, a_t)\right)^2,

where Q(s_t, a_t) is the Double DQN target defined above.
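Assuming Q_\theta denotes the network that receives gradient updates and Q_\text{target} the periodically synchronized copy, the target and loss above could be computed for a batch of transitions roughly as follows. This is a sketch with illustrative tensor shapes and discount factor, not this repository's code:

```python
import torch
import torch.nn.functional as F

GAMMA = 0.99  # illustrative discount factor

def double_dqn_loss(q_net, target_net, states, actions, rewards, next_states, dones):
    """Double DQN loss for a batch: states (B, obs_dim), actions (B,) long,
    rewards (B,), next_states (B, obs_dim), dones (B,) with 1.0 at terminal steps."""
    # Q_theta(s_t, a_t): value of the action actually taken, from the updated network.
    q_pred = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)

    with torch.no_grad():
        # argmax_a Q_target(s_{t+1}, a): action selection with the target network.
        next_actions = target_net(next_states).argmax(dim=1, keepdim=True)
        # Q_theta(s_{t+1}, ...): evaluation of the selected action, as in the formula above.
        next_q = q_net(next_states).gather(1, next_actions).squeeze(1)
        # r_t + gamma * ..., with bootstrapping cut off at terminal states.
        target = rewards + GAMMA * next_q * (1.0 - dones)

    # Squared error between the target and Q_theta(s_t, a_t).
    return F.mse_loss(q_pred, target)
```

The key point is that action selection and action evaluation for the target use different networks, so a value that is overestimated by one network is less likely to be both selected and used in the target.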
