It’s been ten years since Instagram launched and, not long after, the selfie. It has taken the same amount of time for visual recognition to learn how to read our faces. If anything, 2019 has been the year when faces took center stage in visual tech, for good and bad…
All of your faces are belong to us
The most frequent use of our faces is facial recognition for surveillance and security. It has been the most visible because it is the most controversial. Fueled by a competitive market made up of state-owned agencies and private enterprises, facial recognition has been at the forefront of commercial image recognition services. Scores of small, medium, and large companies compete to grab shares of what is expected to become a $7 to $15 billion market by 2024.
As with AI in general, these companies disregard ethics in favor of profits. A client is always offered a solution, regardless of who they are and what use they intend. The only criterion is their ability to pay.
There are three components to facial recognition: the algorithm, the teaching pool, and the index.
- Facial recognition algorithms are probably the most mature in the visual recognition space, mostly because they have been studied for a long time, do not need to consider context, and, with no offense to anyone, a face is a rather easy object to recognize. Today’s algorithms take into account anywhere from 80 to 500 data points on a face (like the distance between the eyes and the width of the nose) to create a unique digital fingerprint of a person.
- The teaching pool is the set of images used to train the algorithm to recognize a person. The more images of one person you have, the better it is at identifying them. Countries or companies with more forceful leadership have the most complete data pools.
- The index is what allows you to take a new image, compare it with the content of your data pool, and retrieve a match. Probably the simplest and most straightforward part of the process.
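To make the three components concrete, here is a minimal sketch of the index step only. It assumes the algorithm has already reduced each face to a short list of measurements. The names `build_index` and `find_match`, the three-value fingerprints, and the `max_distance` threshold are all hypothetical and chosen for illustration. Real systems use far more data points and more sophisticated matching.

```python
import math


def build_index(pool):
    """Average each person's fingerprints into one reference vector.

    `pool` maps a name to a list of fingerprints (lists of floats),
    standing in for the teaching pool described above.
    """
    index = {}
    for name, prints in pool.items():
        size = len(prints[0])
        index[name] = [sum(p[i] for p in prints) / len(prints) for i in range(size)]
    return index


def find_match(index, probe, max_distance=0.5):
    """Return the closest person in the index, or None if nobody is close enough.

    `max_distance` is an arbitrary illustrative threshold, not a real-world value.
    """
    best_name, best_dist = None, math.inf
    for name, ref in index.items():
        dist = math.dist(ref, probe)  # Euclidean distance between fingerprints
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None


# Made-up measurements for two people, two images each.
pool = {
    "alice": [[6.2, 3.4, 11.0], [6.4, 3.6, 11.2]],
    "bob": [[5.1, 4.0, 12.5], [5.3, 4.2, 12.3]],
}
index = build_index(pool)

print(find_match(index, [6.3, 3.5, 11.0]))  # alice
print(find_match(index, [9.0, 9.0, 9.0]))   # None
```

The sketch also hints at why the size of the data pool matters so much: the index can only ever return someone who is already in it.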
Clearly, getting the most extensive data pool is key to success, and this is where all efforts are being made. In countries with low respect for individual liberties, like China, the effort amounts to an almost complete cataloging of the population for policing purposes. Others, like the USA, use a more subtle approach, starting with felons, expanding to travelers, and perhaps one day even accessing our selfies to build their data pool. The result, however, will be the same.
Soon our faces will be part of databases whether we like it or not. They will be used partly for policing, partly for marketing. Either way, we will have little or no say in how we are being tracked.
Of course, there are positives, like instant identification and personalization. Financial companies see facial recognition as a potent, unique identifier to access accounts and transactions, and phone manufacturers use it to let people protect their information. It could soon be used in cars and homes, in place of keys, and to trigger personal comfort settings.
The greatest challenge is, as with any new technology, the lack of appropriate legislation. Without a proper debate on how, where, and when facial recognition is used, it is bound to slide into abusive and destructive applications quickly.
It is not real, or is it?
Deepfakes exploded in 2019, not so much in volume as in visibility, as yet another source of concern brought by technology. And while anything can be replicated via synthetic data, deepfakes mostly replace faces: those of celebrities placed on porn performers, or those of politicians made to say or do disturbing things.
As with facial recognition, ethical barriers are no match for destructive intent. And while our attention is entirely focused on deepfake faces, the real damage will come from unsuspected content, probably with no faces or no recognizable faces involved. As long as deepfakes play in the realm of famous people, movie stars, and politicians, they will be easy to identify: those involved will certainly report the deception loudly.
However, when AI and synthetic data match cinematic quality, scenes of fake events will completely obliterate the already shattered credibility of visual content. It will be almost impossible to tell reality from fakery.
And the damage will be double: from the lies of the deepfakes and from the declining credibility of genuine visual content. All content will become suspect. It already happened with photographs, and it is now, thanks to AI, extending to videos.
Buying a new face
2019 was also the breakthrough year for GAN faces. The launch of the website thispersondoesnotexist.com was a revelation to many. Before, it took a human, some Photoshop skills, and some time to generate a photo of someone who didn’t exist. It was a creative process. With generative adversarial networks, or GANs, not anymore. An unsupervised computer can get to the same result in milliseconds and with incredible accuracy.
Recognizing faces is one of humankind’s most expert abilities. We are hardwired to do it. Faces are what babies recognize first, those of mom and pop. It is how we navigate society, quickly identifying gender, age, origin, intent, social status, and many other non-verbal cues.
The ability of an algorithm to generate all by itself, with no human supervision, a face that humans recognize as real is thus an enormous leap. Already available as free stock photos for anyone to use (fake avatars, anyone?), computer-generated faces will soon completely replace real models everywhere they are used in photos. Next, it will be videos. And not far away, you will see deepfakes with perfectly reproduced faces along with generated voices of people who never, ever existed.
GAN-generated faces might become, quite ironically, one of the most vigorous defenses against abuses of facial recognition technology. Using fake faces, one could create two or three parallel identities, just to navigate identification barriers.
There is no doubt that what emerged in 2019 will continue through 2020 and beyond. The desire for control, combined with the pleasure of being recognized, will undoubtedly make our faces a central point of our visual technology. It will help us open doors and suggest foods or clothes, while at the same time quickly separating the good guys from the bad. The question is, who decides who and what is bad?
Opening image: Photo by Sharon McCutcheon from Pexels
Author: Paul Melcher
Paul Melcher is the founder of Kaptur and Managing Director of Melcher System, a consultancy for visual technology firms. He is an entrepreneur, advisor, and consultant with a rich background in visual tech, content licensing, business strategy, and technology, with more than 20 years of experience in developing world-renowned photo-based companies and two successful exits.