The year is coming to an end and, unlike in previous years, things are not quieting down. In fact, the pace is picking up. 2022 is undoubtedly the year of Generative AI, and with it has come not only a flurry of applications but many, many questions, if not anxieties. While other events happened in the visual space this past year, none will be remembered as much as the shockwave created by the successive public releases of DALL-E 2, Midjourney and Stable Diffusion. It’s too bad, because other technologies, like Neural Radiance Fields (NeRF), could certainly benefit from more exposure. If anything, 2022 has demonstrated that nothing is ever settled in the world of visual tech.

Generative AI in numbers

Launched in the otherwise sleepy month of July 2022, OpenAI’s DALL-E 2 has more than 1.5 million users generating more than two million images daily. After the release of the first version a year earlier, the quality of DALL-E 2 made everyone, skeptics included, pay attention. The same month, Midjourney released its version. It was quickly followed by Stable Diffusion, released by a collaboration of Stability AI, CompVis LMU, and Runway with support from EleutherAI and LAION. In the space of a month, computer-generated synthetic media had taken its place on everyone’s favourite computing device.

It took less than a month for the first AI-generated image to win the first prize at an art competition.

In just six months, websites offering generative AI have come to draw almost as much web traffic as major stock photo houses (Getty Images was founded in 1995). Via https://profgmedia.com/

In September, Stability AI announced it had raised $101 million in funding from Coatue, Lightspeed Venture Partners, and O’Shaughnessy Ventures LLC. That works out to roughly 60 cents for each of the 170 million images it helped create in its first few months of existence.

Experts estimate that as much as 90 per cent of online content may be synthetically generated by 2026, according to a new report from the European law enforcement group Europol.

In its massive wake, the technology has triggered huge questions about:

  • Copyright infringement: The training sets of some of these generators were acquired by scraping sites with clearly defined copyrighted content, without authorisation or indemnification.
  • Artist rights: Some of these generators allow for the easy copying of an artist’s style, also with no authorisation and/or compensation.
  • Ethics: While some loosely limit what types of images can be generated, others leave the door wide open to possible misuses like violence, racism, abuse/bullying, deepfakes and yet-to-be-defined horrors.

In 2023, generative AI will need to clean up its act and self-regulate, or risk crippling legislative setbacks. China has already implemented some regulation, and the European Union, which has a history of punishing American-made technology it sees as too pushy and invasive, is certainly not far behind. The technology will also need to mature beyond ego-satisfying selfie avatars, dog paintings in the style of famous illustrators, and apocalyptic landscapes, and deliver real-world business solutions. Already, companies like Bria and Claid offer some strong practical applications for businesses. We would like to see more.

 

2023 will also see generative AI as the new search.
Rather than searching for content to reuse, create it. Because now you can create content practically for free. What would cost thousands in flights, hotels, gear, models, and assistants and take weeks to execute can now be done from a desktop, for pennies, with a few phrases, in minutes. So why stock and archive thousands of images on expensive servers, only to eventually re-use an image that might not even be exactly what you need, when you can create an exact match in seconds? Create an image, use it, trash it. Repeat. No need for large image databases, inefficient search engines and inadequate queries. The new search box is now the prompt box.
The new search bar
And in a few years, auto-generation.
The next step will be the auto-generation of content based on user persona. Images will self-create every time the page is loaded, generated based on who is visiting. If you are an old white guy, the image you are served will be different from the one served to a teenage Asian girl. And both will be extremely effective at drawing attention and delivering the message, thanks to the market intelligence used to generate the most appropriate visual. We will all be served on-the-fly, auto-generated, customized content exactly tailored to our demographic, environment, time of day, income, geography, and weather in order to trigger the highest response. Our visual experience will be unique to us. Never the same image as our neighbour, yet always the same message.
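To make the idea concrete, here is a minimal sketch of how a page could turn one fixed message into a different prompt for each visitor. Everything in it is an assumption for illustration: the `Persona` fields, the `build_prompt` function and the prompt wording are hypothetical, and no real image generator is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    """Hypothetical visitor profile built from signals like those listed above."""
    age_group: str
    location: str
    time_of_day: str
    weather: str


def build_prompt(message: str, persona: Persona) -> str:
    """Combine one fixed campaign message with persona-specific visual cues."""
    return (
        f"Advertising photo conveying '{message}', "
        f"appealing to a {persona.age_group} viewer in {persona.location}, "
        f"{persona.time_of_day} light, {persona.weather} weather"
    )


if __name__ == "__main__":
    message = "our coffee keeps you going"
    visitors = [
        Persona("retired", "Lisbon", "morning", "sunny"),
        Persona("teenage", "Seoul", "evening", "rainy"),
    ]
    for visitor in visitors:
        # A real system would hand this string to an image generator on page load.
        print(build_prompt(message, visitor))
```

The message stays constant while the visual cues change per visitor, which is the "never the same image, always the same message" pattern described above.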

Truth matters

Yet, in a world of computer-generated content, truth matters more than ever. The Content Authenticity Initiative and its wider implementation, the C2PA, have made big strides this year. The impressive advancement in synthetic media quality has helped fuel their adoption. In January, the C2PA released version 1.0 of its technical specification for digital provenance. It quickly became available in Photoshop and, before the end of the year, was adopted by companies like Leica and Nikon. Associated Press, Reuters, The New York Times, and the Wall Street Journal are all coalition members with plans to use the open standard to support credibility in visual content.
The verify screen from the Content Authenticity Initiative

It is an existential question. Without trust in what we see, we will all lose our capacity for judgement and sense, as we will soon no longer be able to differentiate between images taken via light impression and those built via diffusion models. Knowing how and by whom an image was created becomes necessary if we are to continue to count on images as sources of information.

And a couple of disappointments…

Meta misery
A view of the Metaverse, as imagined by DALL-E 2

Deflated hype, poor timing, and disappointing experiences: the Metaverse has not lived up to anyone’s expectations. Some have spent millions on digital real estate, only to be the only ones to visit. One key disappointment is certainly the 1980s-style visuals, reminiscent of Dire Straits’ MTV video for “Money for Nothing”, at a time when every movie or game delivers better-than-reality CGI. While it would have been perfect when the world was locked down, a technology that requires us to be stuck inside our houses with a mask on our heads, staring at an electronic screen, is the last thing most of us want right now. Or ever.

 

Non-Fungible, is it?

It held all the promise that early adopters of the blockchain claimed it would. It was to revolutionise art and how we consume it. It was decentralized and fair. Everyone could participate and eventually make millions. Until it didn’t. While we all (most of us) understood that the obscene valuations of NFTs were a bubble bound to burst on the technology’s path to respectability, we also hoped the community would ultimately defend it against greedy speculators. But that did not happen. Once the greedy speculators left the scene, there was nothing left. Not even a technology that could be useful for anything or anyone in the visual space.

The last straw came when it was announced that most marketplaces did not implement the very promising ability of smart contracts to deliver royalties to creators. Greed won, blockchain lost. Again. Not sure if there is an option for a third revival here…

Looking ahead.

2022 has confirmed that visual content is at the centre of the human experience. It is impossible for companies today to sell services or products without the help and support of visual content. And it is equally impossible for us individuals to communicate with each other without using visual content.  2023 will continue to prove how visual tech is instrumental in powering this need. And, of course, we will be here to report on it.

Happy Holidays.

 

Main photo by Dmitry Ratushny on Unsplash

Author: Paul Melcher

Paul Melcher is the founder of Kaptur and Managing Director of Melcher System, a consultancy for visual technology firms. He is an entrepreneur, advisor, and consultant with a rich background in visual tech, content licensing, business strategy, and technology, with more than 20 years’ experience in developing world-renowned photo-based companies and two successful exits already.
