Cutting-Edge AI: Deep Reinforcement Learning in Python

Apply deep learning to artificial intelligence and reinforcement learning using evolution strategies, A2C, and DDPG

Register for this Course

$29.99 USD (regularly $199.99, 85% off!)


Course Data

Lectures: 51
Length: 8h 38m
Skill Level: All Levels
Languages: English
Includes: Lifetime access, certificate of completion (shareable on LinkedIn, Facebook, and Twitter), Q&A forum

Course Description

Welcome to Cutting-Edge AI!

This is technically part 11 of my Deep Learning in Python series, and my 3rd reinforcement learning course.

Deep Reinforcement Learning is actually the combination of 2 topics: Reinforcement Learning and Deep Learning (Neural Networks).

While both of these have been around for quite some time, it’s only been recently that Deep Learning has really taken off, and along with it, Reinforcement Learning.

The maturation of deep learning has propelled advances in reinforcement learning, which has been around since the 1980s, although some aspects of it, such as the Bellman equation, have been around for much longer.

Recently, these advances have allowed us to showcase just how powerful reinforcement learning can be.

We’ve seen how AlphaZero can master the game of Go using only self-play.

This came just a few years after the original AlphaGo beat a world champion in Go.



We’ve seen real-world robots learn how to walk, and even recover after being kicked over, despite only being trained using simulation.

Simulation is nice because it doesn’t require actual hardware, which is expensive. If your agent falls down, no real damage is done.



We’ve seen real-world robots learn hand dexterity, which is no small feat.

Walking is one thing, but that involves coarse movements. Hand dexterity is complex - you have many degrees of freedom and many of the forces involved are extremely subtle.

Imagine using your foot to do something you usually do with your hand, and you immediately understand why this would be difficult.



Last but not least - video games.

Even just considering the past few months, we’ve seen some amazing developments. AIs are now beating professional players in CS:GO and Dota 2.





So what makes this course different from the first two?

Now that we know deep learning works with reinforcement learning, the question becomes: how do we improve these algorithms?

This course is going to show you a few different ways, including the powerful A2C (Advantage Actor-Critic) algorithm, the DDPG (Deep Deterministic Policy Gradient) algorithm, and evolution strategies.
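
To give a flavor of what the "advantage" in Advantage Actor-Critic refers to, here is a minimal sketch of the one-step advantage estimate. The numbers and variable names are hypothetical placeholders, not code from the course, which builds this up with neural networks, parallel environments, and n-step returns.

    # One-step advantage estimate used by actor-critic methods:
    #   A(s, a) = r + gamma * V(s') - V(s)
    # All values below are hypothetical placeholders, for illustration only.
    gamma = 0.99        # discount factor
    reward = 1.0        # reward received after taking action a in state s
    v_s = 0.5           # critic's value estimate for the current state s
    v_s_next = 0.7      # critic's value estimate for the next state s'
    done = False        # whether the episode ended at s'

    advantage = reward + gamma * v_s_next * (1.0 - done) - v_s

    # The actor is then updated to make actions with positive advantage
    # more likely, roughly: actor_loss = -log_prob(a | s) * advantage
    print(advantage)    # 1.193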

Evolution strategies are a new, fresh take on reinforcement learning: they set aside much of the classical theory in favor of a more "black box" approach inspired by biological evolution.
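
To make the "black box" idea concrete, here is a minimal NumPy sketch of the basic evolution strategies update on a toy objective. It only illustrates the general principle (the objective, hyperparameters, and dimensions here are made up); the course applies the same idea to neural network policies and real environments.

    import numpy as np

    def f(w):
        # Toy "reward" to maximize; in RL this would be the total episode
        # reward earned by a policy with parameters w.
        return -np.sum((w - 3.0) ** 2)

    np.random.seed(0)
    w = np.zeros(5)           # current parameter vector
    sigma = 0.1               # standard deviation of the parameter noise
    alpha = 0.01              # learning rate
    population_size = 50

    for iteration in range(500):
        # Sample a population of random perturbations of the parameters.
        noise = np.random.randn(population_size, len(w))
        rewards = np.array([f(w + sigma * n) for n in noise])

        # Standardize the rewards so the update scale is well-behaved.
        advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)

        # Move the parameters toward the perturbations that scored well.
        w = w + alpha / (population_size * sigma) * noise.T.dot(advantages)

    print(w)   # should end up close to [3, 3, 3, 3, 3]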





What’s also great about this new course is the variety of environments we get to look at.

First, we’re going to look at the classic Atari environments. These are important because they show that reinforcement learning agents can learn based on images alone.
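
To make "images alone" concrete, here is a minimal sketch using the OpenAI Gym API the course works with (the environment name and the exact reset/step signatures depend on your installed Gym version, so treat this as illustrative):

    import gym

    # A classic Atari environment; the observation is just the raw screen.
    env = gym.make("Breakout-v0")

    obs = env.reset()
    print(obs.shape)                 # e.g. (210, 160, 3): RGB pixels only

    done = False
    total_reward = 0.0
    while not done:
        action = env.action_space.sample()          # random policy, for illustration
        obs, reward, done, info = env.step(action)
        total_reward += reward

    print("Episode reward with a random policy:", total_reward)
    env.close()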



Second, we’re going to look at MuJoCo, which is a physics simulator. This is the first step toward building a robot that can navigate the real world and understand physics: we first have to show it can work with simulated physics.
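
For contrast with Atari, here is a similar hedged sketch for a MuJoCo task (this assumes the MuJoCo bindings are installed, and the environment name and version may differ on your system). The observations are low-dimensional physical states and the actions are continuous torques, which is exactly the setting DDPG is designed for:

    import gym

    # A standard MuJoCo locomotion task (requires the MuJoCo bindings).
    env = gym.make("HalfCheetah-v2")

    print(env.observation_space)     # Box(...): joint positions and velocities
    print(env.action_space)          # Box(...): continuous torques

    obs = env.reset()
    action = env.action_space.sample()              # a random continuous action
    obs, reward, done, info = env.step(action)
    print("One-step reward:", reward)
    env.close()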



Finally, we’re going to look at Flappy Bird, everyone’s favorite mobile game just a few years ago.



What do you get if you sign up for the VIP version of this course? A brand new exclusive section covering an entirely new algorithm: TD3! As usual, both theory and code for this powerful state-of-the-art algorithm are provided.

Thanks for reading, and I’ll see you in class!



Suggested Prerequisites:

  • Calculus
  • Matrix arithmetic (adding, multiplying)
  • Probability
  • Object-oriented programming
  • Python coding: if/else, loops, lists, dicts, sets
  • Numpy coding: matrix and vector operations, loading a CSV file
  • Linear regression, logistic regression
  • Neural networks and backpropagation
  • Building a convolutional neural network (CNN) in TensorFlow
  • Markov Decision Processes (MDPs), Q-Learning

Testimonials and Success Stories


I am one of your students. Yesterday, I presented my paper at ICCV 2019. You have a significant part in this, so I want to sincerely thank you for your in-depth guidance to the puzzle of deep learning. Please keep making awesome courses that teach us!

I just watched your short video on “Predicting Stock Prices with LSTMs: One Mistake Everyone Makes.” Giggled with delight.

You probably already know this, but some of us really and truly appreciate you. BTW, I spent a reasonable amount of time making a learning roadmap based on your courses and have started the journey.

Looking forward to your new stuff.

Thank you for doing this! I wish everyone who calls themselves a Data Scientist would take the time to do this, either as a refresher or to learn the material. I have had to work with so many people in prior roles who wanted to jump right into machine learning on my teams and didn’t even understand the first thing about the basics you have in here!!

I am signing up so that I have an easy refresher when needed and to see what you consider important, as well as to support your great work. Thank you.

Thank you, I think you have opened my eyes. I was using APIs to implement deep learning algorithms, and each time I felt I was missing out on some things. So thank you very much.

I have been intending to send you an email expressing my gratitude for the work that you have done to create all of these data science courses in Machine Learning and Artificial Intelligence. I have been looking long and hard for courses that have mathematical rigor relative to the application of the ML & AI algorithms, as opposed to just exhibiting some 'canned routine' and then voila, here is your neural network or logistic regression. ...


I have now taken a few classes from some well-known AI profs at Stanford (Andrew Ng, Christopher Manning, …) with an overall average mark in the mid-90s. Just so you know, you are as good as any of them. But I hope that you already know that.

I wish you a happy and safe holiday season. I am glad you chose to share your knowledge with the rest of us.

Hi Sir, I am a student from India. I've been wanting to write a note to thank you for the courses that you've made, because they have changed my career. I wanted to work in the field of data science but did not have proper guidance; then I stumbled upon your "Logistic Regression" course in March and since then, there's been no looking back. I learned ANNs, CNNs, RNNs, TensorFlow, NLP, and whatnot by going through your lectures. The knowledge that I gained enabled me to get a job as a Business Technology Analyst at one of my dream firms, even in the midst of this pandemic. For that, I shall always be grateful to you. Please keep making more courses with the level of detail that you do in low-level libraries like Theano.

I just wanted to reach out and thank you for your most excellent course that I am nearing finishing.

And, I couldn't agree more with some of your "rants", and found myself nodding vigorously!

You are an excellent teacher, and a rare breed.

And, your courses are frankly, more digestible and teach a student far more than some of the top-tier courses from ivy leagues I have taken in the past.

(I plan to go through many more courses, one by one!)

I know you must be deluged with complaints, in spite of having the best content around. That's just human nature.

Also, satisfied people rarely take the time to write, so I thought I would write in for a change. :)

Hello, Lazy Programmer!

In the process of completing my Master’s at Hunan University, China, I am writing this feedback to you in order to express my deep gratitude for all the knowledge and skills I have obtained studying your courses and following your recommendations.

The first course of yours I took was on Convolutional Neural Networks (“Deep Learning p.5”, as far as I remember). Answering one of my questions on the Q&A board, you suggested I should start from the beginning: the Linear and Logistic Regression courses. Even though I assumed I already knew many of the basics at that time, I overcame my “pride” and decided to start my journey in Deep Learning from scratch. ...


By the way, if you are interested to hear: I used the HMM classification, much as it was in your course (95% of the script, with a few small adjustments), for the customer-care department of a big, well-known fintech company, to predict who will call them, so they can call the customer before the rush hours and improve the service. Instead of a poem, I had a sequence of the customer's last 24 hours of events, like "Loaded money", "Usage in the food service", "Entering the app", "Trying to change the password", etc. The label was whether or not the customer called. The outcome was great. They use it for their VIP customers. Our data science department and I got a lot of praise.

Lectures

Welcome

3 Lectures · 16min
  1. Introduction (03:46) (FREE preview available)
  2. Outline (07:47)
  3. Where to get the code (04:37)

Review of Fundamental Reinforcement Learning Concepts

8 Lectures · 01hr 20min
  1. Review Section Introduction (04:01)
  2. The Explore-Exploit Dilemma (13:35)
  3. Markov Decision Processes (MDPs) (20:19)
  4. Monte Carlo Methods (07:54)
  5. Temporal Difference Learning (TD) (17:17)
  6. OpenAI Gym Warmup (06:46)
  7. Review Section Summary (07:29)
  8. Suggestion Box (03:10)

A2C (Advantage Actor-Critic)

11 Lectures · 01hr 39min
  1. A2C Section Introduction (07:54)
  2. A2C Theory (part 1) (20:40)
  3. A2C Theory (part 2) (06:48)
  4. A2C Theory (part 3) (03:14)
  5. A2C Demo (03:09)
  6. A2C Code - Rough Sketch (07:11)
  7. Multiple Processes (08:47)
  8. Environment Wrappers (11:49)
  9. Convolutional Neural Network (05:31)
  10. A2C (17:22)
  11. A2C Section Summary (06:40)

DDPG (Deep Deterministic Policy Gradient)

7 Lectures · 01hr 20min
  1. DDPG Section Introduction (03:37)
  2. Deep Q-Learning (DQN) Review (09:21)
  3. DDPG Theory (18:42)
  4. MuJoCo (18:53)
  5. DDPG Code (part 1) (18:31)
  6. DDPG Code (part 2) (06:36)
  7. DDPG Section Summary (04:25)

ES (Evolution Strategies)

9 Lectures · 01hr 28min
  1. ES Section Introduction (06:26)
  2. ES Theory (20:17)
  3. Notes on Evolution Strategies (08:50)
  4. ES for Optimizing a Function (06:33)
  5. ES for Supervised Learning (06:40)
  6. Flappy Bird (12:09)
  7. ES for Flappy Bird in Code (15:02)
  8. ES for MuJoCo in Code (07:50)
  9. ES Section Summary (05:02)

Setting Up Your Environment (Appendix/FAQ by Student Request)

3 Lectures · 42min
  1. Pre-Installation Check (04:13)
  2. Anaconda Environment Setup (20:21)
  3. How to install Numpy, Scipy, Matplotlib, Pandas, IPython, Theano, and TensorFlow (17:33)

Extra Help With Python Coding for Beginners (Appendix/FAQ by Student Request)

4 Lectures · 42min
  1. How to Code Yourself (part 1) (15:55)
  2. How to Code Yourself (part 2) (09:24)
  3. Proof that using Jupyter Notebook is the same as not using it (12:29)
  4. Python 2 vs Python 3 (04:38)

Effective Learning Strategies for Machine Learning (Appendix/FAQ by Student Request)

4 Lectures · 59min
  1. How to Succeed in this Course (Long Version) (10:25)
  2. Is this for Beginners or Experts? Academic or Practical? Fast or slow-paced? (22:05)
  3. What order should I take your courses in? (part 1) (11:19)
  4. What order should I take your courses in? (part 2) (16:07)

Appendix / FAQ Finale

2 Lectures · 08min
  1. What is the Appendix? (02:48)
  2. Where to get discount coupons and FREE deep learning material (05:49)

Extras

  • TD3 (Twin Delayed DDPG) Theory
  • TD3 Code