
AI learns to Speedrun QWOP using Machine Learning



Wesley Liao

UPDATE:
AI was able to surpass the World Record in my new video:
https://youtu.be/82sTpO_EpEc

AI bot learns to play QWOP like a human and achieves a top 10 speedrun (1m 8s). Trained using Reinforcement Learning and Imitation Learning.

Writeup:
https://towardsdatascience.com/achieving-human-level-performance-in-qwop-using-reinforcement-learning-and-imitation-learning-81b0a9bbac96

Github repo:
https://github.com/Wesleyliao/QWOP-RL

Papers mentioned:
– Sample Efficient Actor-Critic with Experience Replay
https://arxiv.org/pdf/1611.01224.pdf
– Deep Q-learning from Demonstrations
https://arxiv.org/pdf/1704.03732.pdf

Kurodo’s channel:
https://www.youtube.com/channel/UCLxJfj_Dq8Ks89tUVR3z7ug

QWOP speedrun leaderboard:
https://www.speedrun.com/qwop


23 thoughts on “AI learns to Speedrun QWOP using Machine Learning”
  1. Right away I'm not such a fan of the fact that you had to demonstrate a stride and then it used that. It feels like it limits the possibilities only to that which humans can already achieve.

    A more general-purpose AI might take longer to train, but it would have been able to discover special moves on its own. They usually find glitches in the physics engine and exploit those in a way no human can.

    Great video and thanks for the upload and explanation!

  2. did you try training for 20 hours without pretraining? i’m curious how much the extra 12 hours of training helped compared to pretraining

  3. we live in a strange time where we know flash games from childhood and are intelligent enough to program ai to accomplish what we couldn’t lol

  4. I believe the biggest problem for AI in this instance was the lack of Kurodial arteries, therefore restricting the bandwidth.

  5. Well, this reinforces my confidence that computers will never take over the world.
    If THAT guy was chasing after me… I'd take his lunch money and then look for his brother.

  6. Blast from the past. I remember being in high school when this game was at the height of its popularity. Almost every computer in the lab during lunch had this on it.

  7. I was very excited to see the final result with 65 hours.. but it turns out it was what we've already seen, what a disappointment

  8. Ah, Semi-Supervised Learning. That is how machine learning really needs to be done.

    Or really…
    – A combination of Supervised Learning, Semi-Supervised Learning, and Unsupervised Learning
    – Redundancies to prevent it from just always taking the laziest path.
    – Hard-coded rules manually put in by a human.
    – Humans always kept in the loop, along with Non-AI redundancies.
    – Everything done in a safe and controlled environment, instead of a live deployment where it's possible for the AI to do real damage.

  9. This game should be the ultimate AI algorithm benchmark. I've known about the game since it first came out but didn't know much about programming until recently, so a lot made sense easily once I understood the proper vocabulary.
