“Welcome to the
forefront of post-production evolution,” he says.
Kammes invites
post-production chiefs to take a look at a number of analytical tools. These
include StoryToolkitAI, an editing tool that uses AI to transcribe,
understand content and search for anything in your footage, integrated with
ChatGPT and other AI models. It began as a GitHub project by developer Octimot,
runs on OpenAI’s Whisper and Python, and can be used on Blackmagic Design’s
DaVinci Resolve among other professional editing systems.
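To make the transcribe-then-search idea concrete, here is a minimal sketch of a keyword lookup over timestamped transcript segments. It is not StoryToolkitAI's code: the `Segment` class, the `find_phrase` and `timecode` helpers, and the sample transcript are all hypothetical, chosen only to resemble the start, end and text fields that speech-to-text tools typically return.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed span of speech (hypothetical structure)."""
    start: float  # seconds from the start of the clip
    end: float
    text: str


def find_phrase(segments, phrase):
    """Return the segments whose text contains the phrase, ignoring case."""
    needle = phrase.lower()
    return [seg for seg in segments if needle in seg.text.lower()]


def timecode(seconds):
    """Format seconds as HH:MM:SS for display (frames are ignored)."""
    total = int(seconds)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


# Made-up transcript segments standing in for real speech-to-text output.
transcript = [
    Segment(0.0, 4.2, "Welcome back to the edit bay."),
    Segment(4.2, 9.8, "We shot the waterfall scene at dawn."),
    Segment(75.0, 81.5, "The Waterfall footage needs a color pass."),
]

for hit in find_phrase(transcript, "waterfall"):
    print(timecode(hit.start), hit.text)
```

A literal lookup like this only finds the exact words spoken; the conversational querying Kammes describes would sit on top of an index like this rather than replace it.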
“StoryToolkitAI
transforms how you interact with your own local media. Sure, it handles the
tasks we’ve come to expect from AI tools that work with media like
speech-to-text transcription. But it can understand and execute tasks that it
was never explicitly trained for,” he says.
He describes it as
a “conversational partner. You can use it to ask detailed questions about your
indexed content, just like you would talk with ChatGPT.”
Kammes likes that
StoryToolkitAI runs locally, so users keep their media private even though the
application itself is open source. He believes the app’s architecture is a
blueprint for how things should be done in the future.
“That is, media
processing should be done by an AI model of your choosing and can process media
independently of your creative software. Or better yet, tie this into a video
editing software’s plug-in structure, and then you have a complete media
analysis tool that’s local, and using the AI model that you choose.”
While many
analytical AI indexing solutions search your content based on literal keywords,
others perform a semantic search, using a search engine that interprets the
searcher’s intent and the context of the query rather than matching words
alone. This type of search is intended to improve the quality of search results.
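The difference between the two approaches can be sketched in a few lines. This is a toy illustration, not how any vendor mentioned here works: the three-number "embeddings" below are hand-made for the example (roughly: water, nature, indoor), whereas a real system would obtain its vectors from a trained model. The clip titles, `keyword_search` and `semantic_search` are likewise hypothetical.

```python
import math

# Hand-made toy vectors (water, nature, indoor); purely illustrative.
CLIPS = {
    "river rapids at sunrise": [0.9, 0.8, 0.0],
    "kitchen faucet dripping": [0.8, 0.1, 0.9],
    "city traffic at night": [0.0, 0.1, 0.2],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_search(query, clips):
    """Literal search: keep titles that share at least one word with the query."""
    words = query.lower().split()
    return [t for t in clips if any(w in t.lower().split() for w in words)]


def semantic_search(query_vector, clips, top_k=2):
    """Rank clips by vector similarity to the query and keep the best top_k."""
    ranked = sorted(clips, key=lambda t: cosine(query_vector, clips[t]), reverse=True)
    return ranked[:top_k]


# No title contains "running" or "water", so the literal search finds nothing.
print(keyword_search("running water", CLIPS))

# Hand-made vector standing in for the meaning of "running water".
query_vec = [1.0, 0.3, 0.3]
print(semantic_search(query_vec, CLIPS))
```

With these made-up numbers the keyword search comes back empty, while the similarity ranking surfaces the river and faucet clips ahead of the traffic clip, even though none of the titles contains the query words.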
This is what Twelve
Labs seems to have cracked. Its tech can be used for tasks like ad
insertion or even content moderation, says Kammes. “Like figuring out which
videos featuring running water or depicting natural scenes like rivers and
waterfalls or manmade objects like faucets and showers,” he explains.
“In order to do
this, you would need to be able to understand video the way a human understands
video, and what we mean by that is understanding the relationship between those
audio and video components and how it evolves over time, because context matters
the most.”
Cloud storage
developer Wasabi Technologies recently acquired Curio AI, a
technology developed by GrayMeta that uses AI and ML to automatically generate
a searchable index of unstructured data. GrayMeta President and CEO Aaron Edell
and his AI team are also joining Wasabi.
According to
Kammes, speaking ahead of the acquisition announcement, “Curio isn’t just a
tagging tool. It’s a pioneering approach to using AI for indexing and tagging
your content using their localized models. Traditionally, analytical AI
generated metadata can drown you in data and options and choices, overloading
and overwhelming you. GrayMeta simplifies the search process right in your web
browser.”
Wasabi is planning
to give its users exclusive access to Curio. It will allow them to easily
search their huge archives of unstructured data, something that was not
possible before, the company said.
“Imagine walking
into Widener Library at Harvard with 11 million volumes, and there’s no card
catalog,” David Friend, CEO of Wasabi, told Joseph Kovar at CRN. “That’s
what we have right now with unstructured data in the cloud. Our acquisition of
this machine learning technology is really going to be the most important
development since the introduction of object storage itself.”
He added, “Today
unstructured data is still in the dark ages. I believe that what we’re doing
here with Curio AI to automatically create an index of every face, every logo,
every object, every sound, every word, will really revolutionize the utility of
object storage for the storage of unstructured data.”
Wasabi plans to
fully integrate Curio into its cloud storage, and not offer it as a standalone
technology for other storage clouds.
“It’s going to be
one integrated product, and it’s going to be sold by the terabyte just like our
regular storage, but at a slightly higher price. And for that, you will get
unlimited use of the AI,” Friend detailed.
Curio will
automatically scan anything that’s put into Wasabi’s storage and produce an
index which can then be accessed using the Curio user interface and one of
several media asset management systems including Iconik, Strawberry and Avid.
The company expects to go to market with the product later this year “with
channel partners who sell into the media and entertainment industry.”
Wasabi thinks
its combination of object storage and Curio is a step ahead of even Amazon,
Google and Microsoft in terms of functionality.
“The hyperscalers
can’t do what we’re doing with Curio. I mean, they have a toolkit, and you can
assemble something like this if you have the time and money. But there’s
nothing equivalent to this that anybody else is offering as far as I know.”
Next, Kammes
addresses Code Project AI Server, which handles both analytical and
generative AI. He describes it as “Batman’s utility belt,” where each gadget and
tool on the belt represents a different analytical or generative AI function
designed for a specific task.
“And just like
Batman has a tool for just about any challenge, Code Project AI Server offers a
variety of AI tools that can be selectively deployed and integrated into your
systems, all without the hassle of cloud dependencies.”
This includes
object and face detection, scene recognition, text and license plate reading,
and even the transformation of faces into anime-style cartoons.
Additionally, it can generate text summaries and perform automatic background
removal from images.
The Server offers a
straightforward HTTP REST API for integration into a facility or workflow. “For
instance, integrating scene detection in your app is as simple as making a
JavaScript call to the server’s API. This makes it a bit more universal than a proprietary
standalone AI framework,” says Kammes.
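As a rough sketch of what such an integration might look like, the example below builds an HTTP POST that uploads one image for scene recognition, using only Python's standard library (Python rather than the JavaScript Kammes mentions, but the request is the same shape). The host and port, the `/v1/vision/scene` path, the `image` form field and the `success`, `label` and `confidence` reply fields are all assumptions made for illustration; check the server's own API documentation for the actual values.

```python
import json
import urllib.request
import uuid

# Assumed values: adjust to match your own server's documentation.
BASE_URL = "http://localhost:32168"   # assumed host and port
SCENE_PATH = "/v1/vision/scene"       # assumed endpoint path
IMAGE_FIELD = "image"                 # assumed form field name


def build_scene_request(image_bytes, filename="frame.jpg", base_url=BASE_URL):
    """Build (but do not send) a multipart/form-data POST carrying one image."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{IMAGE_FIELD}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return urllib.request.Request(
        base_url + SCENE_PATH,
        data=head + image_bytes + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def parse_scene_response(raw):
    """Pull a label and confidence out of a JSON reply (assumed field names)."""
    reply = json.loads(raw)
    if not reply.get("success"):
        return None
    return reply.get("label"), reply.get("confidence")


def classify_scene(image_path):
    """Send one image to the server; needs a running server to succeed."""
    with open(image_path, "rb") as fh:
        request = build_scene_request(fh.read())
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_scene_response(response.read())
```

Keeping the request builder and the reply parser separate from `classify_scene` means both can be checked without a server running, and only the last function touches the network.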
It also
allows for extensive customization and the addition of new modules to suit
specific needs.
Finally, Kammes
highlights Pinokio, “a playground for you to experiment with the
latest and greatest in generative AI.”
Pinokio is a
self-contained browser that allows you to install and run various analytical
and generative AI applications and models without knowing how to code. It does
this by taking GitHub code repositories (called repos) and automating the
complex setups of terminals, clones and environmental settings. “With Pinokio,
it’s all about easy one-click installation and deployment, all within its web
browser,” Kammes insists. “It enables you to experiment with various AI services
before they go mainstream.”
It is already
chock-full of diverse AI applications to play with, from image manipulation with
Stable Diffusion to voice cloning and AI-generated video tools. “Pinokio helps
to democratize access to AI tools by combining ease of use with a growing list
of modules. As AI continues to grow in various sectors, platforms like this are
vital in empowering users to explore and leverage AI’s full potential. The
cool part is that these models are constantly being developed and refined by
the community,” Kammes says.
“Plus, since it
runs local and it’s free, you can learn and experiment without being charged
per revision. Every week there are more analytical and generative AI tools
being developed and pushed to market.”