IBC
The Coalition for Content Provenance and Authenticity (C2PA) is an organisation developing technical methods to document the origin and history of digital-media files, both real and fake.
In April 2022 a video styled as a BBC news report claimed that Ukraine was behind a missile attack on a Donbas station that killed 57 people. The video opened with a BBC logo and had the broadcaster’s watermark in the corner. It was a fake, as a BBC Verify journalist pointed out on X,
but it was also a wake-up call to the broadcaster to do something about rising deepfake
disinformation.
“Everyone was horrified to see the fake video but the only
thing we could do was tweet denials,” says Laura Ellis, Head of Technology
Forecasting, BBC. “For some it was the ‘Aha!’ moment when they fully realised
we needed to do more.”
Fortunately, the Corporation was already pioneering efforts
to go beyond flagging deepfakes after the event and to show audiences the
source of video it publishes up front.
“The work of BBC Verify is key in terms of fact checking and
signalling to the audience if we’ve not been able to check it, but we wanted to
raise the bar by turning the question on its head. We want to positively assert
media provenance by showing audiences how this media came to us and how it was
made.”
“Most people said that it was ambitious, that it was almost
undoable. But we are very stubborn and thought that this is something R&D
should be looking at.”
The idea of media provenance or data integrity has been
gaining ground in the tech community as a way of combatting the rush of
AI-generated fakes. It is news media which is particularly vulnerable to this
sort of attack (truth being the first casualty of war and also of political
elections). So, in a bid to take the initiative, news organisations including the BBC, Canada’s CBC and the New York Times joined forces to ensure their own integrity as trustworthy news sources did not fall victim.
“What we’re seeing is a really fundamental shift in the
media ecosystem that we need to act on,” says Judy Parnall, BBC Head of
Standards and Industry. “I kind of wish the elections [including UK and the US]
were not this year. We’ll be in a better position in 2025 when efforts set in
train a few years ago really come to fruition.”
Project Origin
Project
Origin was formed by the consortium in 2018 to secure trust in news through
technology. They were later joined by Microsoft. As the project progressed,
they found a fellow traveller in Adobe, which had established its own similar Content Authenticity Initiative (CAI). In 2020, they combined efforts into the Coalition for Content
Provenance and Authenticity (C2PA) to work on a
set of open standards which would allow content to contain provenance details.
“We were looking at a similar problem [to CAI] so we agreed
to work towards one technical standard,” Parnall explains. “C2PA is the
underlying technical standard and CAI and Project Origin the two user
communities feeding into it. We are absolutely hand in glove. At Project Origin
we are better placed to bring in larger news and tech organisations and Adobe
is bringing in more individual users. C2PA pulls them all together.”
A first technical standard for attaching cryptographically
secure metadata to image and video files was released in 2021. It is free and open source, released under the Linux Foundation.
Images that have been authenticated by the C2PA system can
include a “cr”
icon in the corner; users can click on it to see whatever information is
available for that image: when and how the image was created, who first published it, what tools were used to manipulate it, how it was altered, and so on.
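For developers, a minimal sketch of how such an inspection might be scripted is shown below in Python. It assumes the CAI’s open-source c2patool command-line utility is installed and on the PATH, and that it prints the manifest store as JSON (behaviour may vary by version); the file name is a placeholder.

# A minimal sketch: read an asset's Content Credentials by shelling out to the
# CAI's open-source c2patool utility. Assumes the tool is installed and prints
# the manifest store as JSON; "example.jpg" is a placeholder file name.
import json
import subprocess

def read_content_credentials(path: str) -> dict:
    """Return the manifest store reported by c2patool for the given asset."""
    result = subprocess.run(
        ["c2patool", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)

if __name__ == "__main__":
    manifest = read_content_credentials("example.jpg")
    # The manifest describes who signed the asset, how it was created and
    # what edits have been recorded, where those assertions are present.
    print(json.dumps(manifest, indent=2))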
A number of vendors have begun incorporating the standard into their products, with more announcements pending. The most recent signatories include Sony, joining Nikon, Canon and Leica in developing cameras capable of capturing C2PA data at acquisition. In its press conference, Sony likened
content credentials to a “birth certificate for an image.”
Most significantly, at the start of the year OpenAI said it would implement C2PA digital credentials for images generated by
DALL-E 3, the latest version of its AI-powered image generator. It said this was to prevent the use of its
Gen-AI products for misinformation ahead of the US Presidential Election in
November.
In parallel OpenAI said it was experimenting with
a “provenance classifier” for detecting images generated by DALL-E.
“Our internal testing has shown promising early results,
even where images have been subject to common types of modifications. We plan
to soon make it available to our first group of testers – including
journalists, platforms, and researchers – for feedback.”
How Content Credentials are implemented
The CAI’s work
is focused on three main areas: capture, edit, and publish, explains Santiago
Lyon, Head of CAI Advocacy and Education, Adobe. "With capture, we work
with camera and smartphone manufacturers to integrate provenance technology
into their hardware devices at production, allowing us to empirically establish
a file’s provenance from the moment a photo is taken, or a video or audio file
is recorded.”
The next area
concerns editing. Here, provenance technology is integrated into multiple tools, including Adobe’s own creative applications such as Photoshop, allowing any editing changes made to a file to be captured and securely stored, creating a secure ‘edit history’ of the file in question.
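As a purely hypothetical illustration of the idea, the sketch below models an edit history as structured data that grows with each change; the field names are assumptions for illustration and are not the C2PA manifest schema.

# Hypothetical, simplified model of an edit history attached to a file.
# Field names and values are illustrative assumptions, not the C2PA schema.
edit_history = {
    "asset": "example.jpg",                  # placeholder file name
    "claim_generator": "ExampleEditor/1.0",  # tool that recorded the claim
    "actions": [
        {"action": "created", "when": "2024-02-01T09:12:00Z", "device": "ExampleCam X1"},
        {"action": "cropped", "when": "2024-02-01T10:03:00Z", "tool": "ExampleEditor"},
    ],
}

# Each subsequent edit appends another action, so the full chain of changes
# travels with the file once the record is embedded and cryptographically signed.
edit_history["actions"].append(
    {"action": "resized", "when": "2024-02-01T11:30:00Z", "tool": "ExampleEditor"}
)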
“When digital
files are published, metadata can sometimes be removed, so we are also actively
working with news publishers, social media platforms and others to retain and
display this underlying provenance information through a universal icon
displayed next to each published asset.”
This icon and
the underlying provenance information are the Content Credentials, which Lyon
says are the equivalent of a digital ‘nutrition label’ on food.
“The consumer
can then inspect the Content Credential published alongside each digital file
and better understand where it came from and what changes have been made to it.
Over time, our hope is that consumers will naturally expect to see Content
Credentials displayed alongside online images, videos, audio recordings and
other file types, to discern what is trustworthy.”
Lyon adds,
“Ahead of multiple elections happening across the world this year, we cannot
let misinformation erode trust, endanger creative and digital economies, and
even threaten democracy itself.”
Building a standard
Standards work most effectively when adopted by a broad user group and C2PA is “hoping for a critical mass” to achieve its goal. One aim is to get the standard ratified by international standards bodies.
“We’re talking to everyone,” says Parnall. “A number of
groups have come together and many are not yet announced or are working out how
to integrate it into their system. The bigger the organisation the more
complicated this is.”
News workflows are particularly complex and also time
sensitive. It is important that adding C2PA signals into the chain doesn’t add
a processing delay.
Innovation incubator BBC
News Labs is testing the integration of C2PA signals on the BBC website and trialling the same with an unnamed social platform. This work is now being picked up and developed as a pilot by Origin partner Media City Bergen which, with the IPTC, joined the Origin consortium as
members last year. One aim is to prove that a third party can come in and work
with signals off-the-shelf.
“Our in-house research team has found evidence that adding
provenance to images increases trust in content amongst those who don't
typically consume our content,” Ellis says. “We also found evidence that
provenance evens out trust across a range of images we use (editorial, stock
and user generated content).”
The idea is to build a range of options through which
organisations can employ provenance signals, from signing directly at the point content is published to using functionality offered by manufacturers.
Ellis’s group is also exploring the idea of “service
centres” to which publishers could send their images for validation and
certification; the images would be returned with cryptographically hashed
metadata validating their authenticity.
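As a rough, conceptual sketch of that flow (not the BBC’s system, and with hypothetical names throughout), a service centre could hash the incoming image, sign the hash and return the signed metadata alongside the file:

# Conceptual sketch of a "service centre": hash the image bytes, sign the hash
# and return signed provenance metadata. Names are hypothetical; this is not
# the BBC's or C2PA's implementation. Requires the 'cryptography' package.
import hashlib
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def certify(image_bytes: bytes, publisher: str, signing_key: Ed25519PrivateKey) -> dict:
    """Return signed metadata binding the publisher to this exact file content."""
    digest = hashlib.sha256(image_bytes).hexdigest()     # fingerprint of the bytes
    signature = signing_key.sign(bytes.fromhex(digest))  # signature over the hash
    return {"publisher": publisher, "sha256": digest, "signature": signature.hex()}

if __name__ == "__main__":
    key = Ed25519PrivateKey.generate()   # the service centre's signing key
    record = certify(b"raw image bytes", "Example News", key)
    print(json.dumps(record, indent=2))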
The underlying technology
The technology is a lighter-touch system than blockchain. It is being designed to work whether a user is connected to the internet or not, and C2PA members want a solution that has a lighter carbon footprint than blockchain.
“We need to make the tools as low friction as possible and
to automate the process so the journalist has the minimal amount to do,” says
Ellis.
Another consideration is allowing the journalist to redact information from the signals, for instance to protect the identity of a
source or a vulnerable person being interviewed.
C2PA is based on cryptographically hashed metadata, a
technique that forms a small and unique representation of the underlying data.
If the data changes in any way, even by a single digital bit, the hash will no
longer match the data.
“Protecting the integrity of the hash through a
cryptographic signature is an effective way of keeping the integrity of the
whole data,” Ellis explains, “by proving the signature was a witness to the
hash and checking the data still generates the same protected hash value.”
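A minimal Python illustration of that principle (not the C2PA implementation itself, and with deliberately simplified key handling) shows how flipping a single bit changes the hash, and how a signature over the hash exposes the change:

# Minimal illustration of hash-plus-signature integrity checking. Not the
# actual C2PA implementation; key handling is deliberately simplified.
# Requires the 'cryptography' package.
import hashlib
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

data = b"asset bytes or metadata"                  # placeholder content
tampered = bytes([data[0] ^ 0x01]) + data[1:]      # flip a single bit

# Even a one-bit change produces a completely different hash value.
print(hashlib.sha256(data).hexdigest())
print(hashlib.sha256(tampered).hexdigest())

# Signing the hash lets anyone with the public key check integrity and origin.
key = Ed25519PrivateKey.generate()
signature = key.sign(hashlib.sha256(data).digest())
public_key = key.public_key()

public_key.verify(signature, hashlib.sha256(data).digest())  # passes silently
try:
    public_key.verify(signature, hashlib.sha256(tampered).digest())
except InvalidSignature:
    print("tampering detected: data no longer matches the signed hash")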
Fox’s Verify system
Fox has gone a different route and developed an in-house
system called Verify, an open-source protocol just launched in beta and, like C2PA, designed to establish the history and origin of registered media. It is built on a blockchain
developed by researchers at Polygon Labs.
Fox Corp launched a closed beta of Verify on August 23,
coinciding with the first Fox News GOP debate. To date, 89,000 pieces of
content, spanning text and images, have been signed to Verify, from Fox News,
Fox Business, Fox Sports, and Fox TV affiliates.
“With this technology, readers will know for sure that an
article or image that purportedly comes from a publisher in fact originated at
the source,” Fox
explained.
Additionally, Verify establishes a bridge between media
companies and AI platforms. Fox says Verify creates new commercial opportunities
for content owners by utilising smart contracts to set programmatic conditions
for access to content.
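Conceptually, a protocol of this kind binds a content hash to the publisher that registered it. The sketch below is an illustrative, in-memory stand-in for such a registry; the real Verify protocol records these bindings on the Polygon blockchain and exposes them through its own tooling, which is not used here.

# Illustrative, in-memory stand-in for a content registry of the kind Verify
# describes: it maps a content hash to the publisher that registered it.
# The real protocol stores these bindings on a blockchain; no Fox APIs are used.
import hashlib

registry: dict[str, str] = {}

def register(content: bytes, publisher: str) -> str:
    """Record the publisher against the content's hash and return the hash."""
    digest = hashlib.sha256(content).hexdigest()
    registry[digest] = publisher
    return digest

def check_origin(content: bytes):
    """Return the registered publisher for this exact content, if any."""
    return registry.get(hashlib.sha256(content).hexdigest())

article = b"Full text of a published article"
register(article, "Example Publisher")
print(check_origin(article))                          # "Example Publisher"
print(check_origin(b"Altered text of the article"))   # None: not the registered original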
Social media question mark
None of these efforts will have much impact if social media
platforms don’t get onboard. Viewers
will only see Content Credentials if they’re using a platform or
application that can read and display the data.
Meta is reportedly engaged with this issue, down to the practicalities of the additional
compute requirements needed for content watermarking. X boss Elon Musk has
voiced his support for AI regulation.
“The preference is for social media platforms to take
credentialled content and continue to display those credentials. It’s a bit
chicken and egg. They want to know there’s enough movement in the standard
before they go,” says Parnall. “They are very aware of [C2PA] and the work we
have been doing.”
Social media are the “key problem space,” says Ellis, who declined to comment further.
Chain of trust
The idea of content credentials and the work of the C2PA in
particular is gathering momentum. “It is remarkable watching the community come together around C2PA, which is very much a dominant force. It’s really the only standard in town.”
Nonetheless, rollout might be glacial relative to the eye-watering pace of generative AI and the saturation of media with deepfakes.
“This is a gradual rollout that needs to be introduced in
every part of the ecosystem and therefore requires a lot of collaboration,”
Parnall says. “Having the ability to drill down and nest all this material
together so people can access all the information about the way the news they
are consuming is produced is important but it is not a quick fix.”
Companies including Nvidia, Publicis Groupe, AFP, Reuters
and AP, Intel, Ateme and Truepic are filling in the gaps at different points in
content production. AWS is another member.
Adobe, for example, generates the relevant metadata for
every image that’s created with its image-generating tool, Firefly. Microsoft
does the same with its Bing Image Creator.
“It will take a while for all of these components to knit
together properly but why C2PA is so utterly essential is that if you’ve got
open standards and a body of people that want this to work then you have a
chance of making it happen.”
Can it be hacked?
There’s a misunderstanding that C2PA is easy to hack or
remove but that is not the point, says Ellis. “The point is that as a
trustworthy broadcaster we want to put these signals in to show that our
content is trustworthy and to give users the ability to interrogate the various
elements.”
The likelihood is that the integrity of C2PA will be
strengthened by combining it with other provenance marking technology such as
watermarking and fingerprinting.
Meanwhile, Microsoft is keeping an eye on developments in quantum computing which threaten to take computer-powered cyberattacks into
another realm of sophistication.
“We are working with the tech giants so when a quantum break
happens we are as prepared as we can be,” Ellis says. “A quantum future is
built into our thinking.”
The reliability of C2PA certification is vital. If somebody
spoofs the C2PA it is “instantly a disaster for us,” she says. “We need our output to have integrity so we’re putting a lot of effort into making it as secure as possible, and also into making sure people understand what it is.
“It is not a watermark,” she continues. “It is a way of
communicating with the audience that this media is from [the BBC] and that this
is how we made it. Similarly, if you don’t see those signals or those signals
are broken that is the time to be alert.”
Communicating the message
Getting that message across to the public demands a huge
programme of media literacy. “It is a mammoth task,” says Ellis. “We hope to
use [BBC] airwaves and websites to explain, but the issue is of enormous interest to everybody, including in the regulatory sector, at Ofcom, the Government and the House of Lords.”
There are plans to bring on partners to help educate the
media business, the wider public, schools and universities. Systems integrators
could advise companies on how to adopt the standard.
“It’s like running a startup and what we’re trying to do
next is scale up,” she says.
The principle is being enshrined in legislation like the
pending EU AI Act. The Biden administration issued an executive
order on AI that requires content authentication and labelling of synthetic
content.
“Technology moves faster than legislation. Let’s be
realistic, it might not be the most appropriate standard to use in five years’
time, but the principles of media provenance should be universal and perpetual.”