How computers learn to recognize objects instantly | Joseph Redmon

August 18, 2017 | Artis Modus

TED

Ten years ago, researchers thought that getting a computer to tell the difference between a cat and a dog would be almost impossible. Today, computer vision systems do it with greater than 99 percent accuracy. How? Joseph Redmon works on the YOLO (You Only Look Once) system, an open-source method of object detection that can identify objects in images and video — from zebras to stop signs — with lightning-quick speed. In a remarkable live demo, Redmon shows off this important step forward for applications like self-driving cars, robotics and even cancer detection.

Check out more TED talks: http://www.ted.com

The TED Talks channel features the best talks and performances from the TED Conference, where the world’s leading thinkers and doers give the talk of their lives in 18 minutes (or less). Look for talks on Technology, Entertainment and Design — plus science, business, global issues, the arts and more.



41 thoughts on “How computers learn to recognize objects instantly | Joseph Redmon”

WebTomekK says:

November 30, 2019 at 4:42 am

3:22 says "skateboard" on the shoes. Slow it down to 0.25x and you'll see it. lol.
Best Buy says:

December 1, 2019 at 11:13 pm

Wow, amazing. It's really impressive!
Amee Sami says:

December 2, 2019 at 8:21 am

Awesome! Thank you for making it open source.
Jack Soder says:

December 8, 2019 at 12:40 pm

Too many people. We need a new plague.
Chengyuan Tang says:

December 8, 2019 at 3:54 pm

5:42 does the system just recognize a parrot as a teddy bear?
Victor says:

December 8, 2019 at 9:57 pm

the damage that can be done to segregate societies and track people is crazy real.
Piotr Adamowicz says:

December 9, 2019 at 6:14 am

Spends 7 minutes, never actually says how they work, except that his works well. I want my 7 minutes back.
Chad Thaddeus says:

December 10, 2019 at 5:27 am

China took it and built an advanced facial recognition system. Thanks, Darknet!
SumriseHD says:

December 10, 2019 at 5:38 am

This audience is so horrible.. urgh…
Angery Angery says:

December 10, 2019 at 5:46 am

5:42 Parrots are now Teddy Bears. Change my mind, because you can't.
x X _ K J C O M P U T Ξ R _ X x says:

December 10, 2019 at 5:47 am

3:22 it said skateboard
Manos Chalvatzopoulos says:

December 10, 2019 at 7:15 am

he went fullmetal alchemist on his presentation
Ítalo de Pontes Oliveira says:

December 10, 2019 at 9:51 am

3:10 detects "remote" on the arm
Ola Østbyhaug says:

December 10, 2019 at 11:48 am

hold da phone, is that obama?
luit meinen says:

December 11, 2019 at 10:47 am

Of course this is really amazing, but I do find it funny that at certain points the system thought a parrot was a pizza, a stop sign was a frisbee and a tripod was skis :p
Vlad Gidea says:

December 11, 2019 at 1:00 pm

Is that Bara?
Jonathan Lutz says:

December 11, 2019 at 2:21 pm

– Doctor: Let's try this in the body
– AI: I found a suitcase

I'm joking, very good work ! Thank you
LifeInTags says:

December 11, 2019 at 5:32 pm

Where can I download this application?
Feldenkrais with Alfons says:

December 12, 2019 at 11:27 am

"thanks for putting this into the public domain free for anyone to use" – China.
movax20h says:

December 12, 2019 at 11:47 am

A stop sign is a frisbee. A parrot is a person. It detects a backpack where there is no backpack. Pretty good, but I wonder how current state-of-the-art systems perform.
Eric Adair says:

December 12, 2019 at 7:39 pm

point it out some mmore 2 late yet ?
Reuben M.D. says:

December 13, 2019 at 2:17 am

3:22 Computer thinks there's a skateboard there 🙂
Nogardtist says:

December 13, 2019 at 1:06 pm

i wonder why kinect died
and youtube AI gets more weapons
Feralz says:

December 13, 2019 at 6:52 pm

hooman detected. must kill. hooman.
Classix says:

December 14, 2019 at 4:51 am

3:22 skateboard
Akira says:

December 15, 2019 at 1:31 am

maybe they should put someone else in charge to name things
Genomul Uman says:

December 15, 2019 at 6:34 am

I like it how all maniacs use misleading examples like "self driven cars" instead of just stating the truth: we need this kind of technology for instant face recognition for mass surveillance. Good job "Washington University graduates" from all over the world!
Gul Rukh Khan says:

January 8, 2020 at 3:41 am

Hi, can you send me the code for Matlab? I am really impressed and want to know more about this.
If you could kindly send me the code for Matlab, it would be highly appreciated. Thanks again.
Best regards
Gul Rukh Khan
WOOSAL says:

January 16, 2020 at 2:31 pm

What is his distro?
Uthael Killeanea says:

January 19, 2020 at 1:28 pm

4:11 and 15s after that is the closest to "HOW computers learn…"
I hoped for more details when I clicked on this.
norkator says:

January 23, 2020 at 11:32 am

I'm using YOLOv3 in a security camera application. YOLO is perfect for quickly sorting, for example, cars and people on slower hardware. You can then run other, more specific algorithms on these sorted objects, like car -> license plates and people -> faces (train the model to detect persons). Yes, it makes mistakes, but computer vision is not simple. This video doesn't answer any questions, so it is better to start from their paper: https://pjreddie.com/media/files/papers/YOLOv3.pdf
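The two-stage setup described in this comment (a fast general detector first, then a specialised step per class) could be sketched roughly as follows. This is only an illustration under assumptions: the `Detection` type, the `route` function, and the two handler functions are hypothetical names, not part of YOLO or Darknet, and the handlers are placeholders for whatever plate reader or face detector is actually used.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Detection:
    """One first-stage result: class label, score, and bounding box."""
    label: str
    confidence: float
    box: Tuple[int, int, int, int]  # (x, y, width, height), assumed layout


# Hypothetical second-stage handlers. Real ones would crop the box from
# the frame and run a specialised model on the crop.
def read_license_plate(det: Detection) -> str:
    return f"run plate reader on box {det.box}"


def find_face(det: Detection) -> str:
    return f"run face detector on box {det.box}"


# Map each first-stage class to the specialised step that follows it.
SECOND_STAGE: Dict[str, Callable[[Detection], str]] = {
    "car": read_license_plate,
    "person": find_face,
}


def route(detections: List[Detection],
          min_confidence: float = 0.5) -> List[Tuple[str, str]]:
    """Send each confident detection to its class-specific handler.

    Detections below the threshold, or with no registered handler,
    are skipped.
    """
    results = []
    for det in detections:
        if det.confidence < min_confidence:
            continue
        handler = SECOND_STAGE.get(det.label)
        if handler is not None:
            results.append((det.label, handler(det)))
    return results
```

With this sketch, a confident "car" detection is passed to the plate step, while a "dog" (no handler) or a low-confidence "person" is dropped. The threshold of 0.5 is an arbitrary placeholder.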
ETechno Tricks says:

January 29, 2020 at 2:09 pm

Watch this interesting video about face recognition:

https://www,youtube.com/watch?v=5JtDSi4FUPQ&t=42s
Oussa bly says:

February 3, 2020 at 1:47 pm

Where can I get the Maven dependency, please, if it is usable from Java…
Tapa says:

February 12, 2020 at 6:20 pm

Now you can do this in 3 lines of code. Jeez, how far we've come in 3 years
Sarvagya Mamgain says:

February 13, 2020 at 7:26 am

Can someone help? When I train YOLO for custom detection on my own dataset, it keeps giving the error "STB Reason : Could not fopen" for the image path on the training line being executed.
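Judging from the message alone, one likely cause of this kind of error is that a path in the training list does not point to a readable file (a wrong directory, a typo, or stray whitespace on the line). Below is a minimal sketch for checking this, assuming the training list is a plain text file with one image path per line. The function name `find_missing_images` is illustrative and not part of Darknet.

```python
import os
from typing import List, Tuple


def find_missing_images(list_file: str) -> List[Tuple[int, str]]:
    """Return (line_number, path) for every listed image that is not a file.

    Assumes one image path per line. Blank lines are ignored, and
    surrounding whitespace is stripped before the check.
    """
    missing = []
    with open(list_file, encoding="utf-8") as handle:
        for line_number, raw in enumerate(handle, start=1):
            path = raw.strip()
            if not path:
                continue
            # Relative paths are resolved against the current working
            # directory, so run this from where training is launched.
            if not os.path.isfile(path):
                missing.append((line_number, path))
    return missing
```

Calling `find_missing_images("train.txt")` (the file name here is just an example) and printing the result shows which lines of the list to fix. An empty list means every listed path exists, in which case the cause lies elsewhere.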
ismail oujaa says:

February 19, 2020 at 6:58 am

Very good
Godwin George says:

February 24, 2020 at 1:02 am

This is really great, but I had some issues while installing darkflow on my laptop.
Is anyone who knows it willing to help me?
dumbcreaknuller says:

February 26, 2020 at 12:50 pm

This still doesn't solve the unsupervised learning problem. If you solved unsupervised learning, you would not need to pretrain on any object at all; the program could figure out on its own how to classify each object from outside information alone, with nothing to build on at the start other than the learning algorithm itself.
For example, what if I wanted the network to recognize sounds or words at the same time as detecting people and animals? What if I wanted the program to assume that a sound it heard was directly related to a portion of the screen, and to treat that portion as a separate object, repeating this process over and over with different versions of that part of the screen until it can tell the background from the actual object? The program would have to find the part of an image that is more of one object than the other parts, with each version of this audio-based guess narrowing down what the object is and discarding what is not an object, or is a different object from what the previous version took to be the object of interest.
Let's say a program just cuts out a section of the image based on proximity and angle, starts learning whatever pattern is in it, and decides it's a dog, even though it could also be part of a table or a bed, but through repetition it becomes able to separate the real dog from other objects. What I mean is: what if you start by training on junk labeled with jumbled keywords and let the network, over time, detect real objects and assign real names to them by guessing? This way, you should be able to make unsupervised learning work. Just think of how a child thinks the word "dog" is spelled "dol" and over time learns it's spelled "dog".
Anyway, if a system like YOLO incorporated something like what I suggest, it could have a real advantage: real-time tracking of what at first would be detected junk and jumbled classifications. This way, you don't need to pretrain on any data; you let it train itself over time. It would be like a human getting everything wrong at first, past the baby stage, and then becoming more and more human as it learns. I hope there is some real superintelligent human out there who wants to make an unsupervised version of YOLO that starts out as a complete idiot of a program and over time becomes very accurate at predicting which object is which, and what its name is.
Swagatam Guha says:

March 2, 2020 at 10:34 am

Hello. How does this system work?
Emad Boctor says:

March 8, 2020 at 11:45 pm

Image annotation interface (for object detection): check https://github.com/emadboctorx/labelpix
