Machine vision is partially blind. Of all the information a camera can capture, one essential part is always missing. Contrary to popular belief, 3D capture is not about rendering a 3D environment but rather about giving vision sensors the power to integrate depth, much as our own eyes (and brain) do. How we perceive and understand our world is strongly influenced by our ability to integrate the spatial dimension. Gaile Gordon, senior director of Technology at Enlighted, who has spent most of her career researching 3D vision, sat down with us to tell us more, ahead of her talk at the upcoming LDV Vision Summit.
– A little about you, what is your background?
I’ve been in the computer vision community for over 20 years, starting with my grad school research at MIT and Harvard in 3D face recognition, transitioning to government and commercial R&D, and then co-founding TYZX, one of the pioneers in real-time 3D image sensors. TYZX was acquired by Intel in 2012 and became branded as RealSense. I helped launch our technology into the consumer market in 2015 as integrated 3D cameras for laptops and tablets. In January, I joined Enlighted. I have pretty much always had one foot on the cutting edge while the other was grounded in the very real and practical side of making products that people can use.
– What does Enlighted do?
Enlighted is the major smart building success story with an Internet-of-Things approach to energy savings for offices, warehouses, factories, and other commercial buildings. We have dense sensor networks installed in over 100 million sq ft of enterprise space to control lighting and HVAC. Along with our partners, we are working on new applications that will run on these installed networks.
– You have done a lot of work/research on 3D. What has attracted you to 3D vision?
The world is simply not flat. To really understand what is going on around you, it seems clear that we need to bring in the full structure of the scene – the richer the data, the better the analytics. There have been many approaches to bringing this extra dimension into our reasoning, from dense 3D sensors using a variety of technologies, to structure from motion, to learning approaches using many 2D views of the same scene or object.
I became very interested in face recognition during graduate school and was particularly struck by failures of 2D methods related to confounding head pose differences with identity differences. Cognitive psychologists told us that our brains deal differently with familiar and unfamiliar faces, so I was interested in whether that could be related to better-learned 3D representations of the familiar faces – and this led to working on face recognition algorithms based on detailed 3D input data. I worked with Cyberware to build a database of cylindrical laser scans of faces and developed several very effective methods of classifying faces from this detailed 3D data.
Although in 1990 this millimeter-accuracy 3D data was not easy to produce … I’ve always felt that we would eventually have 3D sensors built into many devices, in much the same way as we transitioned from black-and-white image sensors to color image sensors many years ago. With my two cofounders, I started TYZX, which commercialized real-time 3D sensors based on passive and active stereo, as well as distributed networks of these 3D sensors for person detection and tracking. TYZX made computationally intensive algorithms feasible for embedded vision applications with custom ASICs, taking the final step to the consumer price point after our acquisition by Intel in 2012.
– What does 3D vision bring to the visual tech space?
Any application that needs to understand what is going on around us in an interactive way either requires or benefits from having 3D data available. The most obvious applications today are in navigation for drones or self-driving cars, and in accurately computing headset position and scene structure for augmented reality or virtual reality. But as the price point falls, I’m sure we will see 3D simplify many visual computing tasks, such as basic scene segmentation.
– Why haven’t we seen more 3D technology in consumer cameras?
Dedicated consumer cameras are most likely on the way out. We should look today at what technology is going into the most commonly used consumer devices in general – devices like phones, tablets, and gaming devices. 3D is making its way into these devices from all sides. Microsoft, Apple, and Intel have all made 3D sensor acquisitions in the recent past, and that technology is making its way into consumer products. Consider Google Tango, Microsoft Xbox and HoloLens, Intel’s RealSense, and Apple… well, we don’t yet know where PrimeSense technology will end up! Intel’s RealSense 3D cameras were introduced into general-purpose laptops, tablets, and phablets in 2014/15, which was a huge step made possible by reducing the power and physical footprint to the point where they could be practically integrated into a 3mm-thick battery-powered device. The next step is to take these types of technology advances into fully integrated embedded products like drones, cars, or perhaps even lights!
– Is Light Field photography useful for 3D vision?
At the consumer level, using light field photography for 3D image computation hasn’t produced commercially successful results compared to other sensors. Time will tell whether the same will be true for the professional content creation market.
– What excites you in the visual tech space today?
It’s hard to choose since so much is going on in the area today: pushing the limits of what machine learning can accomplish, continuing to shrink visual processor power and footprint, seamlessly incorporating user intent into visual capture, fusing location data with visual sensing.
– What do you expect from LDV Vision Summit and why?
Computer vision is exploding in terms of new capabilities and applications. I’m hoping to be amazed by new entrepreneurs’ creativity.
– What would you like to see Enlighted create that technology cannot yet deliver?
It’s my job to help Enlighted bring many new developments onto our sensor networks in the near future, so I can’t talk in much detail on this subject. Focus your imagination on how to take advantage of a densely distributed, powered network of sensors that is already installed and funded for energy management – and let me know what you come up with! Our current sensors include passive IR, ambient light, temperature, and power metering. Sensor fusion of multiple basic sensors already brings surprisingly robust solutions for occupancy, motion analysis, and other applications – and we will continue to add new, richer sensors to the mix.
Kaptur is a proud media sponsor of the LDV Vision Summit. On top of advance previews of speakers and panels, we offer our readers 25% discount pricing. Act fast: there are only 10 tickets available.
Go to www.ldv.co/visionsummit/2016/tickets and enter KAPTUR25
Photo by “Stròlic Furlàn” – Davide Gabino
Author: Paul Melcher
Paul Melcher is the founder of Kaptur and Managing Director of Melcher System, a consultancy for visual technology firms. He is an entrepreneur, advisor, and consultant with a rich background in visual tech, content licensing, business strategy, and technology, with more than 20 years’ experience in developing world-renowned photo-based companies and two successful exits to date.