
Questions for a Computer Vision Scientist: Serge Belongie

Computer vision by qthomasbower

We caught up with Serge Belongie, co-organizer and judge for the upcoming Entrepreneurial Computer Vision Challenges, part of the LDV Vision Summit, to learn more about him, the competition, and computer vision.

Serge Belongie
Professor at Cornell NYC Tech

– First, a little bit about you. What is your background?

I’m currently a professor of Computer Science at Cornell University’s newly created NYC Tech campus. I visited here for the inaugural “beta semester” in spring of 2013, and I liked it so much that I decided to join full-time. Before moving to Cornell, I was a professor in the Computer Science & Engineering department at UC San Diego, which I joined in 2001 after completing my PhD at Berkeley. My main research areas are Computer Vision and Machine Learning. In parallel with my work in academia, I’ve been involved with several startup companies in the areas of fingerprint biometrics, vehicle recognition, and personal photo album organization.

– Besides recognizing people and objects in images, what practical applications can we expect from computer vision and machine learning?

Recognition of text in images is something that will dramatically improve the quality of life for blind individuals. Optical Character Recognition (OCR) on scanned document images has been a mature technology for quite some time, but recent advances in Computer Vision and Machine Learning have extended the scope of OCR to include unconstrained photos of everyday scenes, ranging from subway stops to grocery store shelves to restaurant menus.

– Tell us more about your involvement with the LDV Vision Summit. Why did you decide to collaborate on the event?

When I first moved to NYC, I was introduced to Evan by our mutual friend Karen Moon, founder of the fashion & beauty analytics company Trendalytics. Evan invited me to a dinner with a group of NYC-based, entrepreneurially minded people working in tech, and I was inspired by the sense of community he cultivated within his extended network. When Evan proposed the idea of a summit, I thought it was a great idea, very much in line with Cornell Tech’s role in bolstering the tech community in NYC.

– What do you expect to gain from it? In other words, what would be a key sign of success?

I look forward to making new connections with researchers, inventors, investors, and other people in the local tech community. A sign of success would be a new spinoff company, founded by one of the teams competing in the Entrepreneurial Computer Vision Challenge and funded by, or through an introduction from, investors in attendance at the summit.

Screenshot of Visipedia for the iPad

– What have you seen lately that excites you the most?

One specific product I find very exciting is OrCam, a wearable computer vision system for low-vision individuals with the capability to read text and recognize objects. I’m also excited about the surge of interest in so-called “fine-grained visual categorization” in the research community. Most of the advances in visual recognition from the last 10 years have targeted basic-level categories, e.g., determining that an object is a bicycle as opposed to a television. While this is a problem of fundamental importance in a wide variety of applications, its solutions don’t necessarily capture the imagination of users who aren’t computer-savvy. Imagine a mom-and-pop user snapping a photo of a bird on the windowsill, submitting it to a recognition system, and getting back the response “bird.” Chances are that user already knew it was a bird. What they really wanted was fine-grained information, i.e., details such as the species that they didn’t already know. Projects such as Leafsnap and Birdsnap from Columbia and UMD, and Visipedia from my lab and Caltech, are helping to advance the state of the art in fine-grained visual recognition, and this will result in some very exciting technology for the general public.

OrCam demo

Prove that you or your team are computer vision experts! The deadline for the Computer Vision Challenges is May 14th, 2014. For details and to register:

Author: Paul Melcher

Paul Melcher is a highly influential and visionary leader in visual tech, with 20+ years of experience in licensing, tech innovation, and entrepreneurship. He is the Managing Director of MelcherSystem and has held executive roles at Corbis, Stipple, and more. Melcher received a Digital Media Licensing Association Award, is a board member of Plus Coalition, Clippn, and Anthology, and has been named among the “100 most influential individuals in American photography.”
