Only 3% of the photographs published on the web still have their metadata; the remaining 97% are stripped of all of it. Why? How? By whom? What are the solutions? At a time when we are confronting a surge of fake news, these questions are worth asking.
photo: Olivier Jobard /MYOP.

October 2017. Nighttime in South Sudan, in a war zone called Jonglei, near the village of Akobo. It is 8 pm when French photographer Olivier Jobard, having dined on canned sardines and some bananas, opens the hood of the Land Rover rented for cash a few days earlier. He connects the crocodile clips to the battery and, at the other end of a cable several meters long, to a transformer that powers his computer. He then starts the engine so that the vehicle’s battery does not discharge. Sitting in the passenger seat, the computer on his lap, the photographer copies his memory card to an external hard drive. He then imports into Lightroom the photos he took that day: the military training of a group of peasant-warriors.

The longest part of editing the selected images (about forty out of nearly two hundred shot) consists of lightening the very dark faces against a very blue sky.

Metadata editing by the photographer. (photo: Olivier Jobard /MYOP)

The metadata preset for this assignment automatically fills in the same information for each image: copyright, name of the author, name of the agency through which Olivier distributes his photos, country, region, category, date and time of shooting.

The only thing left is to fill in the “description” box. The same description can apply to all the pictures taken during, for example, the “obstacle course”, but another is required for the “shooting training” and yet another for the “final parade”. He must not forget to add the name of the commander on the pictures where the latter harangues his men.

In the “keywords” box – those famous words without which Olivier’s photos would remain invisible to all the search engines – the photographer types a long military list: Africa, East Africa, armed conflict, combat training, Jonglei, Nuers, South Sudan, war … etc. He re-reads all this information one last time, corrects some spelling errors and, satisfied, saves the images to an export folder.

Luckily, in the area where he stands, his phone picks up a relatively strong signal. Olivier immediately transfers his images to Paris, France. Miraculously, the network stays stable that night throughout the transmission. It is almost 11 pm when the photographer can finally switch off the engine of the Land Rover, shut down his computer, unplug the transformer cable and the hard drive, carefully store everything in a small waterproof bag and go to sleep in his tent. He must get up in four hours, well before dawn, to get closer to the fighting zone.

Unfortunately, not much will come of these long hours devoted to documenting his images as accurately as possible – the famous metadata – which give Olivier’s photographs all their journalistic value. The metadata will be deleted, discarded, erased by the news sites that publish them. This is also the case for the vast majority of the three billion photos published daily on the web, whether yours or mine, those of professionals published on The New York Times’ website or Liberation’s, or those of amateurs exchanged by smartphone and published on Facebook, Instagram, Snapchat, etc.

While Elon Musk wins the admiration of the media by putting a stupid car into orbit, the same media outlets do not care if the metadata, and with it the sources, disappear from the photos published on their websites and across the web. Yet the same media are outraged, of course, by the multiple scandals related to fake news ….

Let’s follow the photos taken and sent by Olivier Jobard. Their metadata are not erased by the agency that received the photos and relayed them (with metadata) to the publishers that licensed them. It is the publishers who are responsible for their disappearance.

metadata vs web optimization

The photos will probably be correctly captioned and credited in the print publication as well as on its website. But on the latter, photos will be resized to be smaller so they can appear in the blink of an eye on the screens of our computers or our smartphones – now the leading viewing devices. To optimize images as much as possible, the news site’s IT department compresses them and, out of an old bad habit, strips them of all metadata, which is considered unnecessary weight. This obsolete habit dates back to the early days of the internet, when connections were slow. It is no longer necessary today. As proof, some newer news sites no longer delete metadata.

Look at the table below carefully. It is the result of a study by IMATAG on the metadata of about 120,000 images analyzed on more than twenty news sites worldwide. To my knowledge, it is the first time such a study has been published. The result is stunning:


Statistics of presence and relevance of metadata on the images of some representative press sites (North America and Europe).
As you can see, there are good and bad performers. When I questioned the IT directors of the latter, they agreed that there are hardly any technical reasons for stripping metadata, that it is “a new culture that will have to be learned”, that it is “a question of ignorance, worse, negligence” or that “it takes people like you to draw our attention to this issue” … For the New York Times, it’s even a question of confidentiality, to protect the journalist!

There is certainly no need to leave the photographer’s address, email and phone number in the metadata, but at least his name, the source of the image (agency, collective or news organization) and its description should remain. Otherwise, as soon as Olivier Jobard’s reportage goes online, the article and his photographs will be downloaded and shared, screenshots will multiply and be shared in turn, and everything will be broadcast worldwide with neither source nor information.
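
This selective approach, keeping the credit while dropping private contact details, can be sketched in a few lines of Python. It is only an illustration, not any publisher’s actual pipeline: the field names are hypothetical labels, and the metadata is assumed to have already been read from the image file into a plain dictionary.

```python
# Hypothetical field names (an assumption for this sketch); a real pipeline
# would map them to the IPTC/XMP properties exposed by its own image library.
PUBLIC_FIELDS = {"creator", "credit", "description", "copyright"}


def keep_public_metadata(metadata: dict) -> dict:
    """Return a copy of the metadata holding only the fields safe to publish."""
    return {key: value for key, value in metadata.items() if key in PUBLIC_FIELDS}


original = {
    "creator": "Olivier Jobard",
    "credit": "MYOP",
    "description": "Military training of a group of peasant-warriors, Jonglei, South Sudan",
    "email": "photographer@example.com",  # placeholder private detail, dropped below
    "phone": "+00 0 00 00 00 00",         # placeholder private detail, dropped below
}

# Name, source and description survive; email and phone do not.
published = keep_public_metadata(original)
```

An allow-list is used here rather than a block-list so that any field nobody thought about is dropped by default instead of leaking.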



The first step is, of course, to fill in the IPTC fields with the information that will constitute the famous metadata of a photograph.

Of course, metadata can also be modified. In that case, ways must be found to verify the integrity of the information. A very promising avenue would be the creation of a secure and tamper-proof registry that could verify the metadata on demand, or even give access to the original metadata. The solution might lie in blockchain technology, with the creation of a decentralized register coupled with a scalable image identification system, one that will have to work at web scale, that is, for billions of images. Companies such as Binded or KODAKOne are taking this path, but their emphasis is on proof of precedence rather than on the protection of metadata. Binded only references the description field, and we do not yet know exactly what KODAKOne’s position on metadata is.
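
The idea of verifying metadata on demand can be illustrated with a simple fingerprint registry. The Python sketch below is a deliberately simplified, in-memory stand-in: it is not a blockchain and not the API of any company named here, and `MetadataRegistry`, its methods and the image identifier are hypothetical names. It records a SHA-256 fingerprint of the original metadata, then reports whether a copy found later still matches.

```python
import hashlib
import json


def fingerprint(metadata: dict) -> str:
    """Hash the metadata in a canonical form so that key order does not matter."""
    canonical = json.dumps(metadata, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class MetadataRegistry:
    """Hypothetical in-memory registry; a real one would be shared and append-only."""

    def __init__(self) -> None:
        self._records = {}  # maps image identifier -> fingerprint of its metadata

    def register(self, image_id: str, metadata: dict) -> None:
        # Refuse to overwrite, so the first registration stays the reference.
        if image_id in self._records:
            raise ValueError(f"{image_id} is already registered")
        self._records[image_id] = fingerprint(metadata)

    def verify(self, image_id: str, metadata: dict) -> bool:
        """Return True only if the image is known and its metadata is unchanged."""
        return self._records.get(image_id) == fingerprint(metadata)


registry = MetadataRegistry()
original = {"creator": "Olivier Jobard", "credit": "MYOP"}
registry.register("example-image-001", original)  # hypothetical identifier

tampered = dict(original, creator="Unknown")
untouched_ok = registry.verify("example-image-001", original)
tampered_ok = registry.verify("example-image-001", tampered)
```

Storing only a fingerprint proves whether metadata was altered but cannot restore it; giving access to the original metadata, as suggested above, would require the registry to store the metadata itself as well.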

Identification of the image remains the centerpiece of the system. Marking photographs with an invisible watermark is currently the most reliable technology for identifying a photo and linking it back to its original metadata. Its adoption is still far from global. Digimarc was one of the first to offer such marking, but it has no policy on metadata and focuses mostly on the corporate market. Its watermark is also not fully resistant to compression, inversion or reframing.

The most effective at protecting metadata is clearly Imatag. For 10 € per month, Imatag allows you to store, tag, share and trace your photographs on the web and in print. Coupled with its reverse search engine, which references over 1 million images on the web every day – including yours as soon as they are marked – Imatag delivers indisputable proof of ownership while reuniting a photo with its original metadata, even if the metadata have been deleted or edited.

While some news sites are leading the way by not deleting metadata (Spiegel, Le Monde, Le Figaro or Huffington Post, for example), what remains is to turn the tide and convince the others that keeping it is the easiest way to reference sources and authors while effectively combating false information. This solution is viable only if search engines and social networks also decide to reference the metadata. But there is no point in publishers lamenting the phenomenon of fake news, or begging the EU to adopt neighboring rights so that GAFA pay for the use they make of their articles, as long as they continue to delete the metadata of images. Publishers are the entry point for many pictures on the web and should therefore set the benchmark. It is not too late.