World Economic Forum
“Our laptops, tablets, and smartphones will become precision instruments that will be able to measure three-dimensional objects in our environment”, says Michael Bronstein in this video for the World Economic Forum. The associate professor from the University of Lugano, Switzerland, says 3D sensors are key to future intelligent machines and will transform the way we interact with our computers.
Click on the link for the full presentation, or read selected quotes below.
On making machines see
“I would say that an intelligent machine should be able to sense the environment and to understand the environment it finds itself in. If we look at the sensory information that our body is exposed to – that we perceive through our eyes, through our nose or our tongue or our skin – the majority of this information, over 90 percent, actually comes from vision. It would make sense to equip an intelligent machine with the ability to see, and to understand the world around it.”
“We humans have a very well-developed visual system. But it starts developing from the moment we are born, and we acquire the ability to analyze visual information well before we start walking or learn how to talk. It is so natural for us to analyze objects that we are not fully aware of how complex this task is, and of what it takes for the brain to do this every second that our eyes are open.”
“I work in the field of computer vision where we try to replicate or imitate some of these wonderful capabilities of our visual system on a computer. Vision is really hard, not only for a machine; it is hard for humans and we can easily be fooled and get it wrong.”
On envisioning the future
“I’m sure that many of you have seen the movie Minority Report, which came out in 2002. It is a dystopian vision of how our world (hopefully not) might look in the 2020s. One of the most famous scenes in this movie is the one in which Tom Cruise uses his hands to manipulate virtual objects on a giant holographic screen. This is also the filmmakers’ vision of what our interaction with future intelligent machines might look like.”
“If you want to design such an interface based on computer vision, you need to solve what we call the hand-tracking problem. You need to detect and recognize the different parts of the fingers that make up our hands, and there are many degrees of freedom and many ambiguities. For example, the fingers can be hidden from view. This is why the problem is so challenging; it is a notoriously hard problem in computer vision.”
“Fast forward several years after Minority Report: Microsoft came up with a very successful product called Kinect. It was an add-on to the Xbox gaming console that allowed users to control their games with their bare hands. You can move your hands to animate or activate your virtual self in a game, or interact in this natural way with your computer. This was essentially the same capability Tom Cruise had in that futuristic science fiction movie, but without the light-emitting gloves that made the task in the movie much easier.”
On 3D sensors
“In this case no special equipment was required to interact with the machine. This capability came from a novel 3D sensor that projected invisible laser light onto the objects in front of it and, using triangulation techniques, extracted the geometric structure of those objects to an accuracy of several millimeters. It turned out that this three-dimensional information resolved many of the degrees of freedom and ambiguities that exist in standard two-dimensional images. Suddenly the hand-tracking problem becomes much easier in three dimensions, because many of these ambiguities are gone.”
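The triangulation principle described above can be sketched in a few lines of Python. Note that the baseline, focal length and disparity values below are illustrative assumptions for the sake of the example, not the calibration of Kinect or any other shipping sensor:

```python
# Minimal sketch of depth from triangulation, the idea behind
# structured-light 3D sensors: a projector and a camera sit a known
# distance apart (the baseline), and the apparent shift (disparity) of
# the projected pattern on an object encodes that object's distance.

def depth_from_disparity(disparity_px: float, baseline_m: float, focal_px: float) -> float:
    """Classic triangulation relation: depth = baseline * focal_length / disparity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return baseline_m * focal_px / disparity_px

# Hypothetical numbers: 7.5 cm baseline, 580 px focal length,
# 20 px observed disparity.
z = depth_from_disparity(20.0, 0.075, 580.0)
print(z)  # -> 2.175 (meters)
```

The inverse relationship is the key design constraint: nearby objects produce large disparities and are measured precisely, while far objects produce tiny disparities, which is why such sensors are accurate to millimeters only within a limited working range.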
“Kinect was a revolutionary product in the sense that technology that had existed only in the lab, and cost a fortune, suddenly became a commodity. Of course it was designed for gaming, and manufacturers of laptops, tablets and smartphones fight for every gram and millimeter in the design of their gadgets; no one would want a smartphone that weighs a kilogram and requires an external power source. So, together with my colleagues in Israel, I was involved in a startup company that tried to take this dream of a 3D-seeing machine one step further. We designed a technology that allows the 3D sensor to be shrunk to dimensions that fit into the display of a laptop or a tablet.”
“These technological capabilities all exist today. We are not talking about the future; we are talking about the present. I believe that 3D sensing technology is a key ingredient for a paradigm shift in the way we interact with our intelligent machines. It will bring us closer to a more natural way of interacting with computers, replacing traditional input devices such as the keyboard, touch screen or mouse.”