Control of a Quadrotor with Reinforcement Learning

March 3, 2017

Artis Modus

aslteam

In this video, we demonstrate a method to control a quadrotor with a neural network trained using reinforcement learning techniques. With reinforcement learning, a common network can be trained to map state directly to actuator commands, making any predefined control structure obsolete for training.

More details about the paper can be found at https://arxiv.org/abs/1707.05110
and the implementation is available at https://bitbucket.org/leggedrobotics/rai
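
To make the idea of mapping state directly to actuator commands concrete, here is a minimal sketch of such a policy network in plain Python. It is not the authors' implementation: the state size, the hidden-layer width, the tanh activations, the output scaling to [0, 1], and all function names are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import math
import random

# All sizes below are illustrative assumptions, not values taken from the post.
STATE_DIM = 18   # e.g. rotation matrix (9) + position (3) + linear and angular velocity (3 + 3)
ACTION_DIM = 4   # one command per rotor
HIDDEN = 64      # hypothetical hidden-layer width


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(layer, x, activation=None):
    """Apply one fully connected layer to the input vector x."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def make_policy(seed=0):
    """Build an untrained policy: two hidden layers and one output layer."""
    rng = random.Random(seed)
    return [
        init_layer(STATE_DIM, HIDDEN, rng),
        init_layer(HIDDEN, HIDDEN, rng),
        init_layer(HIDDEN, ACTION_DIM, rng),
    ]


def policy(params, state):
    """Map a state vector directly to four rotor commands in [0, 1]."""
    if len(state) != STATE_DIM:
        raise ValueError(f"expected {STATE_DIM} state values, got {len(state)}")
    h = dense(params[0], state, math.tanh)
    h = dense(params[1], h, math.tanh)
    raw = dense(params[2], h)
    # Squash each output into [0, 1], read here as a normalized rotor command.
    return [0.5 * (math.tanh(v) + 1.0) for v in raw]


if __name__ == "__main__":
    params = make_policy(seed=0)
    rng = random.Random(1)
    state = [rng.uniform(-1.0, 1.0) for _ in range(STATE_DIM)]
    print(policy(params, state))
```

In a sketch like this there is no separate attitude or position loop: the network output is used as the rotor command. Training would adjust the weights so that the commands stabilize the vehicle, a step that is omitted here.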

19 thoughts on “Control of a Quadrotor with Reinforcement Learning”

123kid says:

March 3, 2017 at 4:18 pm

Good job, guys. Can you share details of the platform you're using? Is the machine learning platform TensorFlow/Python-based, and is the quadrotor system running on ROS?
Ahmed E.A Abdalla says:

March 14, 2017 at 11:06 am

This looks amazing!
I have experience with ROS and tensorflow (very basic). I'm highly interested in applying your findings to a quadcopter I'm working on.
Can you please point me in the right direction?
What can I start with and what must I learn?
Thanks!
Praveen Gudhi says:

July 1, 2017 at 12:05 pm

You guys are amazing, thank you for showing the direction
cypreessDK says:

November 6, 2017 at 7:34 am

What simulation engine did you use? Was that Unity? How did you deal with the reality gap? What was your approach here? Did the copter fly on a policy trained only in simulation?
Niharranjan Pradhan says:

November 12, 2017 at 12:44 am

Sir!
Which simulator did you use for training?
Kuan-Ho Lao says:

December 20, 2017 at 12:14 am

Cool work! I just completed a similar project, but I used DDPG with SNNs. Nice to see an implementation on a real quadcopter. Have you tried different tasks?
Rooster says:

March 29, 2018 at 6:30 am

mY GOSH!
Anush Manukyan says:

July 17, 2018 at 5:22 am

I've read the paper; however, I could not understand how you define the 4 actions. Each action is one rotor's velocity, right? But then how do they choose the velocity in the beginning?
Ebert says:

November 24, 2018 at 3:30 pm

Random question here: I'm doing something similar using PPO on a quadrotor (simple simulation using OpenAI gym). I'm trying to get the cable suspended load case now, but still struggling with the reward function. Terminal states are way more difficult to handle than the standard quad.
Yuanye Ma says:

January 16, 2019 at 7:00 pm

The performance of the quadrotor is excellent.
But I have a question about the reinforcement learning. What role does the RL algorithm play in the control system: navigation or attitude control?
Does the RL algorithm just decide where to move, with a PID controller handling attitude control? Or are both attitude control and position control the RL algorithm's duty?

Thanks for your response.
Inviaz says:

June 9, 2019 at 1:36 am

Can these 4 PWM actions be performed at the same time in one iteration loop?
Edin12n says:

August 26, 2019 at 5:11 am

Hello, what a great video. I'm new to the subject of reinforcement learning and hoped I could ask a question. Here goes: does the ability of the drone to recover depend on input from the various sensors, e.g. the gyro? Say the drone flew from shade to sunlight: would there need to be a temperature sensor on board to allow it to cope with any sudden movement associated with moving from hot to cold? Or does the reinforcement learning model not care about any of that and just learn to deal with whatever it encounters, so you could throw away the temperature sensor and it would stabilize just fine with any sudden movement associated with hot to cold? Thanks
srinath tangudu says:

November 6, 2019 at 2:22 am

What drone did you use? I am planning to buy a drone for my research to do similar things. Can someone suggest a drone?
Tran Cong Nguyen says:

February 12, 2020 at 10:14 pm

Thank you very much, that is really awesome to see
The Nuke Gaming FR says:

May 24, 2020 at 7:42 am

Hello, good job. Excuse me, I'm a beginner, and I don't understand why the value network is essential for learning. Could a single value, the distance between the quad and the point, not be enough?
Pratik Prajapati says:

June 26, 2020 at 5:19 am

This is great indeed
Loop says:

December 2, 2020 at 5:19 am

where is fucking source code
jingwei wu says:

July 23, 2021 at 8:44 pm

I have a lot of simulation files for quadrotors. If anyone needs them, contact me by email: wujingwei1995@gmail.com
Yug says:

March 8, 2022 at 4:40 am

Amazing work! How did you guys handle the noise in the sensors (gyro/accelerometer)?
