Videos

Deep Reinforcement Learning in Python Tutorial – A Course on How to Implement Deep Learning Papers



freeCodeCamp.org

In this intermediate deep learning tutorial, you will learn how to go from reading a paper on deep deterministic policy gradients to implementing the concepts in TensorFlow. This process can be applied to any deep learning paper, not just deep reinforcement learning.

In the second part, you will learn how to code a deep deterministic policy gradient (DDPG) agent using Python and PyTorch, to beat the continuous lunar lander environment (a classic machine learning problem).
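
For orientation, here is a minimal sketch of loading that environment with OpenAI Gym and running a random policy. The environment id 'LunarLanderContinuous-v2' and the classic 4-tuple step API are assumptions that depend on your gym and Box2D versions, and this is not the course's code.

import gym

# Minimal sketch (assumptions noted above): run one episode with random
# actions in the continuous lunar lander environment. Assumes the classic
# gym API, where reset() returns an observation and step() returns a 4-tuple.
env = gym.make('LunarLanderContinuous-v2')
observation = env.reset()
done = False
score = 0.0
while not done:
    action = env.action_space.sample()              # 2-D continuous action
    observation, reward, done, info = env.step(action)
    score += reward
print('random-policy score:', score)
env.close()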

DDPG combines the best of Deep Q Learning and Actor Critic Methods into an algorithm that can solve environments with continuous action spaces. We will have an actor network that learns the (deterministic) policy, coupled with a critic network to learn the action-value functions. We will make use of a replay buffer to maximize sample efficiency, as well as target networks to assist in algorithm convergence and stability.
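
To make those moving parts concrete, here is a condensed PyTorch sketch of the actor, the critic, and the soft target-network update. It is not the course's exact code; the layer sizes and the tau value are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Actor(nn.Module):
    def __init__(self, state_dim, action_dim, max_action=1.0):
        super().__init__()
        self.fc1 = nn.Linear(state_dim, 400)
        self.fc2 = nn.Linear(400, 300)
        self.mu = nn.Linear(300, action_dim)
        self.max_action = max_action

    def forward(self, state):
        x = F.relu(self.fc1(state))
        x = F.relu(self.fc2(x))
        # Deterministic policy: one action vector per state, bounded by tanh.
        return self.max_action * torch.tanh(self.mu(x))

class Critic(nn.Module):
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.fc1 = nn.Linear(state_dim + action_dim, 400)
        self.fc2 = nn.Linear(400, 300)
        self.q = nn.Linear(300, 1)

    def forward(self, state, action):
        x = F.relu(self.fc1(torch.cat([state, action], dim=1)))
        x = F.relu(self.fc2(x))
        return self.q(x)                             # Q(s, a)

def soft_update(target, source, tau=0.001):
    # Polyak averaging keeps the target networks slowly tracking the online
    # ones, which stabilises the bootstrapped critic targets
    # y_i = r_i + gamma * Q'(s_{i+1}, mu'(s_{i+1})).
    for t_param, param in zip(target.parameters(), source.parameters()):
        t_param.data.copy_(tau * param.data + (1.0 - tau) * t_param.data)

The small tau means the targets change slowly between updates, which is the convergence and stability trick the description above refers to.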

🎥 Course created by Phil Tabor. Check out his YouTube channel: https://www.youtube.com/channel/UC58v9cLitc8VaCjrcKyAbrw

⭐️ Course Contents ⭐️
⌨️ (0:00:00) Introduction
⌨️ (0:04:58) How to Implement Deep Learning Papers
⌨️ (1:59:00) Deep Deterministic Policy Gradients are Easy in Pytorch

Learn to code for free and get a developer job: https://www.freecodecamp.org

Read hundreds of articles on programming: https://www.freecodecamp.org/news

40 thoughts on “Deep Reinforcement Learning in Python Tutorial – A Course on How to Implement Deep Learning Papers”
  1. It's very advanced for me, I guess (still, I watched for 20 minutes)… Hope to get some advice from Phil for beginners, to really reach a level of implementing papers… Any advice on a learning road map would be helpful. I have subscribed to your channel also. Thanks, Phil. 🙂

  2. At 43:01, you say "i is each element of that minibatch transitions", which is wrong: i is just the index into the replay memory, i.e. state i+1 follows state i.
    And thanks for your great explanation; it helped me a lot.

  3. Thank you so much.
    I have one question: when implementing D3QN in a dynamic environment, where obstacles are continuously moving, how can one implement it on hardware? And which one is better, DDPG or D3QN, in the scenario stated above?

  4. Thank you, this is an amazing tutorial, but I want to ask you about the traveling thief problem and about that environment: if I want to solve it by deep reinforcement learning, can you give me some advice about this approach?

  5. Hello, I'm trying to decompose the whole problem, and I have a question about where you use OUActionNoise, based on the Ornstein-Uhlenbeck process:
    x = (self.x_prev
         + self.theta * (self.mu - self.x_prev) * self.dt
         + self.sigma * np.sqrt(self.dt) * np.random.normal(size=self.mu.shape))

    I checked the equations of the OU process, but I don't see how this "np.sqrt(self.dt)" is a valid implementation of the differential. (See the Ornstein-Uhlenbeck sketch after the comments.)

  6. Hi, I don't know if you're the right person, and I'm ignorant about machine learning. That being said, I have just one simple question, to decide whether to jump into the world of ML. The problem is the development of a question-answering system. Think of a project spanning many disciplines, with 200-300 people, where the information is dynamically spread across many documents and wikis and the data changes over time. Is it possible to have a natural-language question-answering system that can understand the progression of time? Two pieces of information had a relationship in the past, but now they are not related, and for the current question the system should refrain from mixing the past information with the new. The system can show how the answer changed over time, but it should not infer relationships between past events and current ones.

  7. Hello, thank you for your tutorial. I have only one issue: I tried to replicate your code, but I get the error "cannot import name 'plotLearning' from 'utils'". Do you have any idea how I can fix that? (A stand-in is sketched just below.)
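
On the plotLearning import error above: that helper lives in the utils.py that accompanies the course code, so the file has to sit next to your training script. If you only need something that runs, the stand-in below plots the score history with a running average; its name, signature, and behaviour are assumptions, not the course's exact utility.

import numpy as np
import matplotlib.pyplot as plt

def plotLearning(scores, filename, window=100):
    # Hypothetical stand-in: plot raw episode scores and a trailing running
    # average, then save the figure to disk.
    running_avg = [np.mean(scores[max(0, i - window):i + 1])
                   for i in range(len(scores))]
    plt.plot(scores, alpha=0.3, label='score')
    plt.plot(running_avg, label='%d-episode average' % window)
    plt.xlabel('episode')
    plt.ylabel('score')
    plt.legend()
    plt.savefig(filename)
    plt.close()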

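On the Ornstein-Uhlenbeck question in comment 5: the sqrt(dt) factor comes from the Euler-Maruyama discretisation of the SDE dx = theta * (mu - x) * dt + sigma * dW, where the Wiener increment dW over a step of length dt is Gaussian with variance dt, i.e. sqrt(dt) * N(0, 1). Below is a self-contained sketch of a noise class along those lines; the parameter defaults are illustrative and not necessarily the course's.

import numpy as np

class OUActionNoise:
    # Euler-Maruyama discretisation of the Ornstein-Uhlenbeck process.
    def __init__(self, mu, sigma=0.2, theta=0.15, dt=1e-2, x0=None):
        self.mu = mu
        self.sigma = sigma
        self.theta = theta
        self.dt = dt
        self.x0 = x0
        self.reset()

    def __call__(self):
        # dx = theta * (mu - x) * dt + sigma * dW,  with dW = sqrt(dt) * N(0, 1)
        x = (self.x_prev
             + self.theta * (self.mu - self.x_prev) * self.dt
             + self.sigma * np.sqrt(self.dt)
             * np.random.normal(size=self.mu.shape))
        self.x_prev = x
        return x

    def reset(self):
        self.x_prev = self.x0 if self.x0 is not None else np.zeros_like(self.mu)

Typical usage would be noise = OUActionNoise(mu=np.zeros(n_actions)) and then adding noise() to the actor's output at each step during training, to drive exploration in the continuous action space.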