Discussions of AI training data often center on the unauthorized use of copyrighted content, but it’s equally important to spotlight the companies paving the way for ethical data practices. Responsible use of data in AI development is a critical issue, and a select group of visual AI platforms is leading by example, demonstrating that AI can flourish while respecting intellectual property rights and fairly compensating creators.
At the forefront of this movement are four pioneering platforms that have taken proactive steps to ensure their AI models are trained exclusively on properly licensed content, while also implementing safeguards to prevent users from generating copyrighted material. Their approach mitigates legal risk, fosters trust with creators, and showcases the benefits of prioritizing ethical data practices.
- Adobe Firefly
Adobe’s Firefly platform is trained on the company’s Adobe Stock library of more than 230 million images, supplemented by specially commissioned works to fill in gaps. This ensures that AI-generated content is based solely on ethically sourced material. Adobe compensates contributors through a bonus system based on their historical performance on Adobe Stock, taking into account the total number of approved images and the licenses generated within a 12-month period. Since its launch in March 2023, Firefly has generated more than 6.5 billion images, demonstrating its popularity and the demand for ethically sourced AI-generated content.
- Getty Images
Getty Images, with its collection of over 415 million images, has partnered with NVIDIA to offer a text-to-image generator accessible to registered users. Priced at $14.99 per 100 generations, the tool is trained solely on Getty’s non-editorial content. Contributors are compensated through a model similar to Adobe’s, based on the historical popularity of their content. While Getty has not disclosed specific usage volumes for this relatively new tool, the company is leveraging the technology to develop other AI-based functionalities to enhance its existing database, with plans to make it available to third-party partners in the future.
- Shutterstock
Shutterstock, with more than 400 million images and 25 million videos, offers various subscription and licensing options, including licensing for AI training data. In connection with its text-to-image generator, created in association with OpenAI and similar to DALL-E 3, Shutterstock pays contributors in two ways:
1) For the use of their data to train a generative model (i.e., when a deal is made with the likes of OpenAI, NVIDIA, Meta, LG, etc.).
2) When somebody licenses an image created with the AI Image Generator, Shutterstock pays a royalty into the Contributor Fund, which is divided in proportion to the amount of each contributor’s data that went into training the model used for that particular image (right now DALL-E 3 and Imagen 2, with more model options to be introduced in the future).
While reserved for Shutterstock users, the company’s AI image platform has generated over 1 million images since its launch, reflecting the growing user base and demand for ethically sourced AI image generation.
- Bria.ai
Bria.ai partners with diverse content providers, including Getty Images, Alamy, Envato, Danita Delimont, TONL, and many others, through revenue-sharing agreements. Under this approach, creators are paid each time their content is used to generate synthetic data, with no ceiling on potential earnings. With an estimated training dataset of over 850 million images, Bria.ai has the largest and most comprehensive legally sourced dataset among the four platforms. Its text-to-image generator is available to licensed partners.
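To make the proportional royalty split described for Shutterstock’s Contributor Fund more concrete, here is a minimal Python sketch of a pro-rata division. It is an illustration only: the article does not give Shutterstock’s actual formula, and the function name `split_royalty`, the contributor names, and all amounts are hypothetical. It assumes each contributor’s share is proportional to the number of their items in the training set, and it hands out any leftover cents using a largest-remainder rule, which is also an assumption.

```python
def split_royalty(pool_cents: int, training_items: dict[str, int]) -> dict[str, int]:
    """Divide a royalty pool (in cents) in proportion to each contributor's
    number of training items. Hypothetical model, not Shutterstock's formula."""
    total = sum(training_items.values())
    if total <= 0:
        raise ValueError("at least one training item is required")

    # Whole-cent share for each contributor, rounded down.
    shares = {name: pool_cents * n // total for name, n in training_items.items()}

    # Cents lost to rounding go to the largest fractional remainders,
    # with ties broken alphabetically so the result is deterministic.
    leftover = pool_cents - sum(shares.values())
    order = sorted(
        training_items,
        key=lambda name: (-(pool_cents * training_items[name] % total), name),
    )
    for name in order[:leftover]:
        shares[name] += 1
    return shares


# Hypothetical example: a $10.00 pool and three made-up contributors.
payout = split_royalty(1000, {"alice": 500, "bob": 300, "carol": 200})
print(payout)  # {'alice': 500, 'bob': 300, 'carol': 200}
```

Working in integer cents avoids floating-point rounding, and the payouts always add up to exactly the pool amount.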
Notably, none of these platforms waited for legislation before adopting a clean data approach. Instead, they proactively took steps to prioritize ethical AI development, setting a positive example for the entire industry. By using properly licensed content and ensuring fair compensation for creators, these companies are demonstrating the potential benefits of responsible data practices, including improved model accuracy, reduced legal risk, greater trust among creators, and commercial success.
As the AI industry continues its rapid growth, it is crucial for more developers to follow suit and contribute to a future where AI thrives while respecting the rights and contributions of creators. These four visual AI platforms showcase the viability and advantages of ethical data practices, paving the way for a more responsible and sustainable AI ecosystem.
Main image via Bria.ai