Metadata is crucial as an enabler of automated production
and targeted content. Would a standard help?
When a broadcaster shoots and distributes content, up to 95% of
the raw material is thrown away. For live broadcasts the figure
is closer to 99%. Yet this massive wasted asset could be monetised if it could be
accessed and shared online by anyone, or better still anything, within the
media company.
Technology is arriving in the form of machine learning
algorithms that will enable the automated production of video tailored to
individuals on specific social media platforms, smartphones, streamed channels,
and TV.
Some dub this Media 4.0 - the mass customisation and
distribution of video content targeted to different channels (broadcast,
digital and social media) using AI and metadata within an existing workflow.
Getting to this stage requires knowing what the content is
and where it resides.
TVU Networks Chief Executive Paul Shen says: “The effort of
locating content you’ve already shot is often costlier and potentially slower
than going out and re-shooting material. With the increasing demand from
consumers for customised video content combined with the coming 5G networks,
producing more sophisticated stories faster will be critical to satisfy the
market. The first step on this path has to be to index everything.”
Identifying and distributing video content so that media producers can follow
their audience to whichever device they are viewing on is possible now, but
the picture is fragmented.
Codifying standards
There have been many attempts to codify standards for metadata in the past, most notably a push by the EBU to adopt a standard known as EBU Core. There are also well established common descriptive and rights formats including TVA and ADI, and common identifiers such as EIDR (Entertainment ID Registry) and ISAN, and, of course, many technical metadata standards.
Each broadly supports four basic tenets: data structure,
content, value, and format/exchange.
Prime Focus Technologies Vice President and Global Head,
Marketing & Communications, T. Shobhana, says: “Together these provide the
rules for structuring content, allowing it to be reliably read, sorted,
indexed, retrieved, and shared. When metadata records are formatted to a common
standard, it facilitates the location and readability of the metadata by both
humans and machines.”
Avid VP Platform & Solutions Tim Claman adds: “We think
a standard for time-based metadata would aid in the discovery of content. It
should feel similar to enabling a search on the internet. If web pages were not
designed using a common language and if data were not represented in a
consistent form it would be impossible to find anything online. The industry
should learn from that and agree to a common language and a common structure
for time-based metadata.”
However, he cautions on the practicality of achieving this.
“The industry has a mixed track record of developing and implementing metadata
standards. You can’t be overly prescriptive without being restrictive.”
While a global metadata standard might ease broadcaster
workflows, one is not considered likely to emerge.
IPV Executive Vice President of Sales and Marketing Nigel
Booth says: “A lot of different standards already exist, but asset management
vendors want to differentiate themselves – and their use of metadata is one of
the ways they do this. So, standardising how content is tagged isn’t likely to
be popular.”
Tedial General Manager, US, Jay Batista agrees: “Different
AI vendors are supplying engines with various tagging options, and they
consider their logging parameters both proprietary and a competitive edge in
the marketplace.”
Broadcasters would benefit the most from a unified metadata
schema, he says. “Yet, many content producers believe they must maintain an
internal core metadata index for their unique productions and business
requirements, and often these internal data models are not shared.”
It is thought more realistic to develop a way of sharing
data than to standardise it.
“It’s more important to standardise how content can be
uniquely identified,” says Paul Shen. “If it is simple and transparent enough,
we may not even need a standards body.”
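As a rough sketch of what “simple and transparent” identification could look like (an illustration, not a description of TVU’s actual method), an identifier can be derived from the media essence itself, so any party holding the same file computes the same ID without a registry:

```python
import hashlib

def content_id(essence_path: str, chunk_size: int = 1 << 20) -> str:
    """Derive a content ID by hashing the media essence itself.

    Because the ID is a pure function of the bits, any system that
    holds the same file computes the same identifier -- no registry
    or standards body needs to hand IDs out.
    """
    digest = hashlib.sha256()
    with open(essence_path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return "urn:hash:sha256:" + digest.hexdigest()
```

An identifier derived this way survives copying between systems, though it changes if the essence is re-encoded, which is one reason registries such as EIDR and ISAN assign IDs independently of the bits.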
However, it can be challenging to do this even within one
company, let alone sharing data with third party systems and external media
partners.
“We have done MAM projects which have failed because it has
proved hard to get all stakeholders in one organisation to agree,” says Claman.
“Even when you do get agreement on the metadata governance it is often only for
a period of time.”
He explains that Avid conceives of metadata in strata. “What
these layers have in common is the ability to be expressed as individual frames
or moments. The more of those time-based strata you can aggregate, the more
discoverable your content becomes.”
Avid advocates the idea of a consortium to devise such a
standard, much like the way the industry united to forge standards for
carrying audio, video and ancillary data over IP.
“Some vendors go into these [standardisation efforts]
looking for an opportunity to differentiate and maybe claim intellectual
property and get an edge,” warns Claman. “A consortium will work best if
vendors follow the lead of users. It leaves less room for proprietary
technology to be introduced into the mix.”
“If MAM providers are required by large broadcasters to
standardise, it’s possible that vendors will be forced to collaborate to put
forward a single-solution way of working,” says Booth. “An example of where
this has happened is IMF (Interoperable Master Format).”
TVU revealed it is working with a number of equipment
manufacturers and major broadcasters - believed to include Disney - on the best
approaches to the issues. An initial meeting is being held in June.
“We want to create a consortium which would provide guidance
to both manufacturers and media companies,” says Shen. “Every part of the
industry needs to come together if [automated production] is to happen faster.
I don’t believe any one company can do the heavy lifting.”
Sharing, not conflicting
One aim is to address potential conflicts in working with metadata originated under different AI and asset management protocols.
Primestream Chief Operating Officer David Schleifer says:
“The immediate area where I would see conflict is in assuming that the value of
metadata in a file would be the same regardless of the AI-driven dataset that
generated it. As the area is still maturing, I would not assume that, for
example, face recognition from one system would be equal to face recognition
from another. In fact, one system may focus on known famous people while the
other might be a learning algorithm to build collections – therefore, different
data used in different ways built on similar technology.
“AI is a perfect
example of where an existing metadata schema would need to be expanded,” he
adds. “With AI we do not yet know where it is going or how it will be used, so
the schema needs to be extensible, allowing for growth. At a high level you can
sort all types of metadata into categories like ‘tied to the entire asset’ or
‘tied to moments in time or objects in the image at specific times’, and so on.
But in the end, creating the schema first will always lead to revisions later.”
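As a rough illustration of the categories Schleifer describes, a minimal record might separate whole-asset tags from time-based ones and leave explicit room for growth. The field names below are hypothetical:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class TimedTag:
    """A tag tied to a moment or span within the asset."""
    start_s: float   # seconds from the start of the asset
    end_s: float
    kind: str        # e.g. "face", "object", "speech"
    value: str       # e.g. a person's name or a transcript line
    source: str      # which engine produced it (provenance)

@dataclass
class AssetRecord:
    asset_id: str
    asset_tags: dict[str, str] = field(default_factory=dict)   # whole-asset metadata
    timed_tags: list[TimedTag] = field(default_factory=list)   # moment-level metadata
    extensions: dict[str, Any] = field(default_factory=dict)   # room for data we can't foresee
```

The open `extensions` field is the crude version of the extensibility he calls for: new engines can add data without forcing a schema revision.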
Standardising the ontologies (the terms used to describe media)
used within different domains would be useful when sharing content.
“Standardisation in this area would mean less confusion
across industries,” says Booth. “For example, IPV’s Curator uses controlled
vocabularies to ensure consistency and accuracy. Specific terms are selected
and tagged instead of having different operators selecting their own terms.”
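In its simplest form, a controlled vocabulary is a mapping from the free terms operators or engines emit to one canonical term. The vocabulary below is invented for illustration and does not reflect Curator’s actual term lists:

```python
# Hypothetical controlled vocabulary: free-text variants -> canonical term.
VOCABULARY = {
    "football": "soccer",
    "soccer": "soccer",
    "presser": "press conference",
    "press conf": "press conference",
    "press conference": "press conference",
}

def normalise(term: str) -> str:
    """Map an operator- or engine-supplied term onto the controlled vocabulary.

    Unknown terms are flagged rather than silently admitted, so the
    vocabulary stays consistent.
    """
    try:
        return VOCABULARY[term.strip().lower()]
    except KeyError:
        raise ValueError(f"'{term}' is not in the controlled vocabulary")
```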
An alternative is the use of technologies such as XML and REST
APIs, which are increasingly popular as formats for exchanging
data.
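For instance, a time-coded record could be serialised to XML with nothing but the standard library; the element layout here is invented, not a published schema:

```python
import xml.etree.ElementTree as ET

def record_to_xml(asset_id: str, timed_tags: list[dict]) -> str:
    """Serialise a metadata record to XML for exchange between systems."""
    root = ET.Element("asset", id=asset_id)
    for tag in timed_tags:
        ET.SubElement(
            root, "tag",
            start=str(tag["start_s"]), end=str(tag["end_s"]),
            kind=tag["kind"], source=tag["source"],
        ).text = tag["value"]
    return ET.tostring(root, encoding="unicode")

print(record_to_xml("clip-042", [
    {"start_s": 12.0, "end_s": 15.5, "kind": "speech",
     "source": "stt-engine", "value": "goal for the home side"},
]))
```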
“The challenge with descriptive metadata is that you don’t
know ahead of time what is going to be interesting after the fact,” says
Claman. “For this reason, for news and sports, you want as much automation of
metadata creation as possible.
“We need extensible data models if we’re going to see
widespread adoption.”
Metadata fusion
Booth calls for ‘metadata fusion’, a means of bringing together data that’s saved by contrasting systems and checking where it agrees. “Doing so means that you can improve reliability. An example of this is combining speech-to-text and object recognition – if they both identify similar metadata, it’s likely correct. The key thing is to understand the provenance of the metadata - as long as you capture it you can make a decision based on it.”
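A minimal sketch of the fusion idea, assuming each engine emits time-coded tags with a confidence score: where independent sources name the same term at roughly the same moment, confidence is boosted, and provenance is recorded either way.

```python
def fuse(stt_tags: list[dict], vision_tags: list[dict],
         window_s: float = 2.0) -> list[dict]:
    """Cross-check tags from two engines and boost those that agree.

    A speech-to-text tag and an object-recognition tag 'agree' when
    they carry the same value within window_s seconds of each other.
    """
    fused = []
    for s in stt_tags:
        match = next(
            (v for v in vision_tags
             if v["value"] == s["value"]
             and abs(v["time_s"] - s["time_s"]) <= window_s),
            None,
        )
        if match:
            fused.append({
                "value": s["value"],
                "time_s": s["time_s"],
                # noisy-OR: two agreeing independent sources -> higher confidence
                "confidence": 1 - (1 - s["confidence"]) * (1 - match["confidence"]),
                "sources": ["speech-to-text", "object-recognition"],
            })
        else:
            fused.append({**s, "sources": ["speech-to-text"]})
    # unmatched vision tags would be carried over the same way in practice
    return fused
```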
Downstream, licensing and reconciliation obligations need to be
considered and adhered to. Additionally, some content owners have clear
contractual rules which restrict platform operators from modifying their data.
Piksel Joint Managing Director Kristan Bullett says the answer
lies in “providing clear traceability of the origination of metadata, and also
providing a mechanism to lock or restrict modification of attributes that
should not be modified.”
Piksel is initiating its own metadata group. It is joining
up some disparate systems that will allow customers to purchase, ingest and manage
localised metadata on a per-title basis, enabling advanced recommendations and
content discovery functionalities.
The first metadata providers to join are Austrian content
discovery specialist XroadMedia, Bindinc Metadata from the Netherlands, France’s
Plurimedia and Mediadata TV from Spain. We don’t know at the moment whether
providers like ThinkAnalytics, Rovi/TiVo and Gracenote will be ‘invited into
the club’ or whether it will act as a purely competitive offer to these
alternatives.
Established primarily to aid content editors in the quest to
augment and enhance their existing metadata, Piksel said its ‘ecosystem’ will
prove particularly useful for customers dealing with multilingual or
cross-territory titles.
“Platform operators have been abstracted away from the
responsibility for their metadata needs and need to work with the data that has
been provided to them,” says Bullett. “Part of our vision is to bridge this gap
and put that decision-making process into the hands of the people who are responsible
for ensuring end customers get the best possible user experience.”
Automated production, personalised distribution
For production, TVU’s solution is MediaMind, which embeds metadata on ingest using speech-to-text recognition, as well as an AI that identifies objects and people in specific video frames in real time. Content can be searched with TVU’s own search engine, and an API allows broadcasters to integrate it with existing MAM systems for archival search.
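The general pattern, sketched here with invented names rather than TVU’s actual API, is an inverted index built at ingest: every recognised word, object or face is recorded against its timecode, so a later query returns moments rather than whole files.

```python
from collections import defaultdict

# term -> list of (asset_id, time_s) occurrences, built at ingest time
index: dict[str, list[tuple[str, float]]] = defaultdict(list)

def ingest(asset_id: str, recognised: list[tuple[float, str]]) -> None:
    """Record every recognised term (from speech, objects or faces)
    against its timecode as the material arrives."""
    for time_s, term in recognised:
        index[term.lower()].append((asset_id, time_s))

def find(term: str) -> list[tuple[str, float]]:
    """Return the exact moments a term occurs, not just the files."""
    return index.get(term.lower(), [])

ingest("match-highlights", [(74.2, "penalty"), (391.0, "penalty")])
print(find("penalty"))  # [('match-highlights', 74.2), ('match-highlights', 391.0)]
```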
“If the producer is just interested in a few frames out of a
twenty-minute file, today’s manual search processes can make locating the exact
frames time-consuming,” says Shen. “Using an Artificial Intelligence engine
with object and voice recognition will automate the process of tailoring and
distributing clips to the appropriate outlets.”
Tedial’s similar approach targets sports production. Its
SMARTLIVE tool uses AI logging to automatically create highlight clips and
pitch them to social media and distribution.
“Applications are
being developed, especially in reality television production, where
auto-sensing cameras follow motion,” says Batista. “AI tools such as facial
recognition augment the media logging function for faster edit decisions as
well as automatic social media deliveries.”
The current state of the art in AI merely assists news and
sports production, augmenting the human-curated event presentation with
automated storytelling.
The evolution of this suggests programmes will at some point
be created entirely automatically to cater for different consumer tastes.
“With the growing capability of technology to collect every
bit of data to analyse consumer behaviour, it could one day become plausible to
create a formula for how content should be produced based on the target
audience,” says Shen.
The BBC has been working on this, in the shape of
object-based media, for nearly a decade and is expecting to deliver it within
the next five years.
“The importance of metadata in this space will be crucial as
an enabler of targeted content,” says Batista. “Object-based media is a hugely
interesting concept and could transform the way content is consumed
dramatically. There are challenges, obviously but you can easily see from a
resource, storage, distribution, consumption and analytics perspective what
opportunities this could bring.”
The media company of the future could be a mass producer of
individually tailored content. A clue to how this would look is in music
distribution. Five years ago, consumers tended to download music tracks to add
to a personal collection. Today, they are more likely to prefer a
Spotify-like service to curate and stream the content they want.
“Media 4.0 will see video production move from a
programme-centric to a story-centric process, where the content is
automatically produced, targeted and distributed to the viewer,” says Shen.
“Producers create the video content, and the AI engine automates the assembly
of the material and delivers it in the most effective way to the target
audience.”
Whether this is desirable or not for all content is another
matter.
“The risk and challenge here is not in our ability to move
certain types of programming to an automated process, but rather the loss of
editorial judgement,” says Schleifer. “Systems that produce content in this
manner will adhere to specific rules, and as a result will produce consistent
content that will never challenge us to get out of our comfort zone. We need to
figure out how a system like this can continue to push the envelope and
challenge us.”