In between the real world and the digital world is Catchoom. The 3-year-old Spanish company, born from the creative mind of Augmented Reality pioneer David Marimon, offers developers the tools to automatically link what users see to, among other things, products they can buy online. Using advanced image recognition and Augmented Reality, it automatically answers the question “What am I looking at?” with virtual answers. Since Sandler Research recently forecast that the global mobile AR market will grow at a CAGR of 96.52% over the period 2014-2019, naming Catchoom as one of its key players, we had to sit down with co-founder and CEO David Marimon to learn more:
– A little about yourself: what is your background?
I am the former Project Manager of Mobile Augmented Reality and Contextual Services at Telefónica R&D. I have a Ph.D. in Computer Vision and Augmented Reality from EPFL, Switzerland. I have been asked to speak at leading industry events, including CES 2014, and have spoken at Augmented Reality Meetups in New York and Barcelona and at MWC 2011, 2013, and 2014. I am considered an AR pioneer, with more than 11 years of experience under my belt.
– In a few words, what is Catchoom and what does it solve?
Catchoom is a new world order image recognition technology that brings you the tools you need to create digital experiences for the real world. Catchoom is focused on dissolving the barriers between physical objects and the digital world. Fewer gimmicks, more actionable AR. Catchoom is a private, VC-funded company and the first spin-off of Telefónica R&D. Incorporated in November 2011, Catchoom is an image recognition and augmented reality software and solutions provider. The company licenses its software both on-premise and as SaaS. Catchoom’s flagship product is CraftAR, the “toolbox” for Augmented Reality: web-based content management and creation, plus consumption through its Image Recognition and Augmented Reality SDKs. Catchoom’s platform is compatible with Android and iOS, with plugins for PhoneGap/Cordova and Unity, and with HMDs such as Google Glass. The company works with a number of brands, companies, and agencies, including Intel, Condé Nast, and Times Mobile, and powers leading AR providers.
– So Catchoom is in part a visual product search. It is becoming a crowded space. What makes you different from other companies like Slyce or Cortexica?
Visual recognition, or visual search, involves different technologies, and visual search engines can produce two main types of output: instance matching or object classification.
Instance matching lets you search for a very specific object inside a database of images. Typically, a user scans a book cover or wine bottle label with a specific (branded) app, which leads to a product comparison, a rating, or even a purchase. This approach does not interpret the content of the image but tries to match it exactly, even under difficult angles, illumination conditions, or occlusions.
Object classification lets you identify the elements present in a picture by category. Categories can range from very generic (dog, cat, chair, train) to very specific (west highland white terrier, white wine-cup). It never searches through a specific pre-populated database of images; instead, it assigns categories learned from a database of labeled images.
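To make the idea of instance matching concrete, here is a minimal sketch in Python. It is a toy under stated assumptions, not Catchoom's method: each image is reduced to a handful of made-up 8-bit binary descriptors, two descriptors "match" when their Hamming distance is small, and the reference image with the most matches wins if it clears a minimum count. All names (`DATABASE`, `match_instance`, the thresholds) are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical reference database: item id -> toy 8-bit binary descriptors.
# Real engines extract many local features per image; these values are made up.
DATABASE: Dict[str, List[int]] = {
    "wine_label_A": [0b10110010, 0b01101100, 0b11110000, 0b00011101],
    "book_cover_B": [0b00000111, 0b11001001, 0b01010101, 0b10101010],
}


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def count_matches(query: List[int], reference: List[int], max_dist: int = 2) -> int:
    """Count query descriptors that have a close counterpart in the reference."""
    return sum(
        1 for q in query
        if any(hamming(q, r) <= max_dist for r in reference)
    )


def match_instance(
    query: List[int],
    database: Dict[str, List[int]],
    min_matches: int = 3,
) -> Optional[str]:
    """Return the id of the best-matching reference, or None if nothing is close enough."""
    best_id, best_score = None, 0
    for item_id, reference in database.items():
        score = count_matches(query, reference)
        if score > best_score:
            best_id, best_score = item_id, score
    return best_id if best_score >= min_matches else None


# A noisy, partially occluded view of wine_label_A: one descriptor is missing
# and two of the remaining three have a flipped bit.
noisy_query = [0b10110011, 0b01101100, 0b11110100]
print(match_instance(noisy_query, DATABASE))

# A query that resembles nothing in the database.
print(match_instance([0b00000000, 0b11111111], DATABASE))
```

The point of the sketch is the contrast drawn above: the matcher returns a specific database entry or nothing at all, whereas a classifier would return a category label even for an object it has never seen as an individual image.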
– Catchoom uses image matching rather than content recognition. Why the preference?
Catchoom offers image matching and is doing R&D on content recognition. The results of this research are not disclosed, but we can say that we leverage the 12+ years of experience in multimedia retrieval of our CTO, Tomasz Adamek.
– How many active apps use Catchoom overall today? In which countries?
Catchoom operates globally as a software licensor. We power apps that operate locally in countries in North America, Europe, and APAC, as well as apps with tens of millions of downloads worldwide. In April 2015, we passed 420 million image recognitions in our 3 years of existence through the apps that we power.
– Who is your target user?
The most prominent sectors where our software is used are advertising and print, followed by retail associated with e-commerce (e.g. wine bottle e-commerce).
Our target users for our SaaS platform are mainly developers and agencies who build apps for brands and retailers. For our on-premise product, called Enterprise Image Recognition Engine, our target customers are either businesses for which image recognition (IR) is core to their app or business, or customers that prefer to keep the images on their own server infrastructure rather than on someone else’s servers.
– Catchoom offers the possibility to do a complete visual search offline. You can’t possibly have all products in a database downloaded to a user’s phone. How does this work?
That’s correct; typically you’ll have 100-1000 images on-device for offline image recognition. If your entire database is larger than that, we recommend looking at the popularity curve of your products, which is probably shaped as a head and a long tail. The head, the most commonly requested items, is ideal to go on-device for rapid responses, adding to a wish list, etc., whereas the long tail is well suited to cloud image recognition.
Catchoom has patented technology to cover such hybrid cases, which we call extended search.
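The head and long-tail split described above can be sketched in a few lines of Python. This is only an illustration of the general idea (rank items by popularity, keep the top few on-device, fall back to the cloud for the rest); it is not Catchoom's patented extended search. The product names, request counts, and function names are hypothetical, and plain dictionary lookups stand in for real image recognizers.

```python
from typing import Callable, Dict, List, Optional, Tuple


def split_head_tail(
    popularity: Dict[str, int], on_device_budget: int
) -> Tuple[List[str], List[str]]:
    """Rank items by request count; the top ones go on-device, the rest stay in the cloud."""
    ranked = sorted(popularity, key=lambda item: popularity[item], reverse=True)
    return ranked[:on_device_budget], ranked[on_device_budget:]


def hybrid_search(
    query: str,
    on_device_lookup: Callable[[str], Optional[str]],
    cloud_lookup: Callable[[str], Optional[str]],
) -> Tuple[Optional[str], str]:
    """Try the small on-device index first; query the cloud only on a miss."""
    result = on_device_lookup(query)
    if result is not None:
        return result, "on-device"
    result = cloud_lookup(query)
    if result is not None:
        return result, "cloud"
    return None, "none"


# Hypothetical request counts for a small wine catalogue.
popularity = {"merlot": 900, "rioja": 750, "cava": 40, "sherry": 12, "txakoli": 3}
head, tail = split_head_tail(popularity, on_device_budget=2)

# Stand-ins for recognizers: the "query" is just the product name here,
# whereas a real system would pass a camera frame.
on_device_index = {name: name for name in head}
cloud_index = {name: name for name in popularity}

print(head)
print(hybrid_search("rioja", on_device_index.get, cloud_index.get))
print(hybrid_search("sherry", on_device_index.get, cloud_index.get))
```

In practice the on-device budget would be in the range mentioned above (roughly 100 to 1000 images), and the split would be recomputed as popularity shifts.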
– Catchoom also offers an AR engine. Is your target Google Glass apps and similar devices? Aren’t you worried about their lack of traction?
Let me clarify several things. We offer Augmented Reality tools, not fully immersive experiences as in VR. We also offer our Cloud Image Recognition API for Google Glass. Glass, in particular, is a device that is not well suited for AR, as the screen sits in a corner of your field of view. We’ve done a collaboration involving see-through AR glasses from ODG, and there using AR makes perfect sense.
Regarding the traction of Google Glass, we’re agnostic about the device used in combination with our SDKs and APIs, whether smart watches, smart glasses, or smartphones, all equipped with a camera. What we do invest in is empowering people with a seamless bridge between the real and the digital world.
– Your company has offices in Spain, California and Australia. What is the breakdown and what are the benefits? Are engineers better in Spain?
Our headquarters are in Spain for historical reasons, as we spun off from the R&D center of Telefónica in Barcelona. Local and foreign engineers, and talent in general, are abundant in Barcelona, especially thanks to its attractive weather and top schools.
– What would you like to create for Catchoom’s customers that technology does not yet allow you to build?
We want to broaden the range of objects that can be matched. As of today, we require a few visual features to be present and unique. There are challenges with very minimalistic logos or labels when they are matched against hundreds of thousands of images. Less textured objects such as apparel and more complex shapes such as engine parts are our focus for the coming years.
Paul Melcher is the founder of Kaptur and Managing Director of Melcher System, a consultancy for visual technology firms. He is an entrepreneur, advisor, and consultant with a rich background in visual tech, content licensing, business strategy, and technology, with more than 20 years’ experience in developing world-renowned photo-based companies and two successful exits already.