IBC
Live sports entertainment remains the most powerful driver of real-time engagement in media, but the format through which it’s delivered is rapidly evolving.
As vertical video becomes the dominant consumption format – not just for social clips but increasingly for live content – the question for broadcasters is no longer whether to adapt, but how efficiently they can scale.
For Fox Sports, the answer appears to be training machines to think like camera operators.
“Around 90% of our viewership comes from vertical video,” explains Ricardo Perez-Selsky, Senior Director of Digital Production Operations at Fox Sports. “That alone shows the scale of demand.”
Fox Sports broadcasts NFL, college football, MLB, motorsports including IndyCar and NASCAR, and global soccer properties such as the FIFA World Cup. Across that portfolio, vertical consumption is no longer a secondary format – it’s the primary one.
Its coverage of LIV Golf, the World Baseball Classic and IndyCar is already complemented by vertically formatted video highlights published to social media. When Fox broadcasts all 104 FIFA World Cup matches live across the US this summer, its streaming and TV coverage will be accompanied by extensive vertical video programming.
All of it is being delivered using technology developed with AWS Elemental, built on an AI model trained on Fox Sports content over two years of testing.
From dedicated workflow to machine learning
The origins of the project trace back to summer 2024, when Fox Sports Digital was covering the Euros and Copa América tournaments.
“For our digital-exclusive content (pregame, halftime, postgame, highlights) I basically built out a small control room dedicated to vertical video,” Perez-Selsky says. “It had its own director and producer and team of editors.”
When Amazon engineers visited the setup, they were struck by the fact that Fox Sports was taking 16:9 video and doing a full recut specifically for vertical platforms.
“I explained why that was so important to both our team and our audience.”
The concept sparked an internal hackathon at Amazon later that summer. That experimentation became the early stages of AWS Elemental Inference – an AI-driven system designed to automatically convert 16:9 broadcast feeds into 9:16 vertical video.
Teaching a machine visual storytelling
“While other tools can convert horizontal to vertical using ball tracking or player tracking, what sets this apart is the depth of development. We trained the model on how Fox Sports Digital produces vertical highlights – focusing on storytelling, smooth camera motion, and following the flow of play, not just the ball.”
The AI model had to learn not just where the action is, but how broadcast cameras behave. “There’s an art to sports camera work. It’s not just panning left and right. There’s a ramp-up and ramp-down. There’s smooth acceleration and deceleration, anticipating passes, not just reacting.”
Over 18 months, the model learned how to behave like a camera operator within a 16:9 frame – avoiding jerky movement and judging where the action is heading.
“It’s more like visual storytelling,” Perez-Selsky says. “Following the action of a play as opposed to just following the ball.”
For example, in soccer, “You’re covering Lionel Messi with the ball, then he passes it 50 feet ahead and there’s this jerky movement trying to catch up. Our technology can anticipate when that’s going to happen. You see a ramp-up and a ramp-down so it’s smooth motion in the highlight.”
That level of refinement took time. “The first time we put it through, it was a little choppy, a little rough. But by the second and third time, you could see that machine learning taking place. It improves with repetition. The machine had to learn how to act like a camera operator inside of a 16:9 frame.”
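The “ramp-up and ramp-down” Perez-Selsky describes is, in motion-graphics terms, ease-in/ease-out interpolation: the virtual camera accelerates gently out of its current position and decelerates into the new one instead of snapping. The sketch below illustrates the idea with a standard smoothstep curve; the function names and frame counts are hypothetical, not Fox’s or AWS’s actual implementation.

```python
# Hypothetical sketch of "ramp-up, ramp-down" camera motion: ease between
# crop-window positions with a curve whose velocity is zero at both ends.
# Names and values are illustrative, not the production system's API.

def smoothstep(t):
    """Ease-in/ease-out curve: starts and ends with zero velocity."""
    t = min(max(t, 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)

def pan_positions(start_x, end_x, frames):
    """Centre of the crop window for each frame of a pan."""
    return [start_x + (end_x - start_x) * smoothstep(i / (frames - 1))
            for i in range(frames)]

# A 5-frame pan accelerates gently out of 100 and decelerates into 900,
# rather than jumping at a constant (jerky-looking) rate.
path = pan_positions(100.0, 900.0, 5)
print([round(p) for p in path])  # -> [100, 225, 500, 775, 900]
```

In a real system the target position would come from the model’s prediction of where play is heading, which is what allows the ramp to begin before the ball arrives.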
Replacing 80% of manual editing
While live sports has historically been built for horizontal television screens, reformatting that experience for vertical, in real time, without doubling production costs has remained a stubborn technical and operational hurdle.
For instance, vertical conversion required manual keyframing inside a non-linear editing (NLE) system. “An editor would take 16:9 content and keyframe it into 9:16. That was probably 80% of the workload,” says Perez-Selsky.
Now, that process is fully automated. “The most time-consuming piece (keyframing) is entirely automated.”
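What was being keyframed by hand is, geometrically, a 9:16 window tracked across the 16:9 frame. A minimal sketch of that crop, assuming a full-height window clamped to the frame edges (the function and parameters are illustrative, not Fox’s workflow):

```python
# Hypothetical sketch: a 9:16 crop window inside a 16:9 broadcast frame.
# The editor (or the model) chooses center_x per keyframe; the geometry
# below is the part that was previously keyframed manually in an NLE.

def vertical_crop(frame_w, frame_h, center_x):
    """Return (x, y, w, h) of a 9:16 crop centred on center_x.

    The crop uses the full frame height, so its width is h * 9/16,
    and the window is clamped so it never leaves the frame.
    """
    crop_h = frame_h
    crop_w = round(crop_h * 9 / 16)
    x = min(max(center_x - crop_w // 2, 0), frame_w - crop_w)
    return (x, 0, crop_w, crop_h)

# A 4K UHD frame (3840x2160) yields a 1215x2160 vertical window.
print(vertical_crop(3840, 2160, 1900))  # -> (1293, 0, 1215, 2160)
```

Automating the choice of `center_x` frame by frame is exactly the piece the AI model took over.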
Nonetheless, Fox Sports retains vital human oversight. “Custom graphics, custom copy, publishing – you still want eyes on that. And I’d argue you’d want that to stay manual anyway.”
But the heavy lifting is handled by AI. What’s more, the ability to ‘self-improve’ applies across all sports, according to the Fox executive.
“You can feed it basketball, football, soccer – and it gets better. It learns from anything you feed through it. That said, if you want optimal 9:16 quality, the machine needs time to learn.”
No vertical cameras required
The automation also shifts the economics of vertical production. Rather than deploying separate vertical camera crews, Fox can leverage the primary broadcast feed.
“It depends what you’re trying to capture,” Perez-Selsky says. “Broadcast cameras, which are usually now in 4K, provide resolution and access to ultra-slow motion and multiple angles that an iPhone simply can’t.
“Looking ahead to something like the FIFA World Cup, you might have 20 cameras on the pitch. That variety gives you far more flexibility. Using AWS tools, we can take the world feed and generate high-quality vertical clips. An iPhone can provide a unique perspective, but it’s limited in access and scale.”
“Always-on” distribution strategy
Vertical video highlights are not currently treated as a premium upsell by Fox; the motive is reach and expanding its audience.
“Highlights and live streams help us maintain an ‘always-on’ presence. If someone follows Fox soccer or IndyCar on TikTok or Instagram, they’re consistently served high-quality vertical content. Our social accounts are round the clock, either with live programming, highlight reels, or evergreen content. Nothing is behind a paywall. That consistency helps grow our subscriber base and broaden our audience. If there are gaps in content, audiences go elsewhere.”
For major events, Fox even offers free vertical previews – for example, the first half-inning of the World Baseball Classic, or the first few minutes of FIFA World Cup matches.
“The goal is to meet audiences where they are and encourage them to tune in via broadcast or the Fox Sports app.”
Technically there’s nothing to stop whole matches being streamed live to mobile. “It’s certainly possible,” Perez-Selsky says. “The only thing that inhibits that now is potential media rights and distribution agreements.”
Fox is already experimenting with format-specific live vertical experiences. For the IndyCar Grand Prix, for example, it has served a vertical live stream sourced exclusively from in-car cameras. “It’s not the full broadcast, there’s no commentary – more of a raw experience,” Perez-Selsky says.
The source remains 16:9 broadcast video, processed through AWS Elemental Inference before distribution to TikTok, Instagram Reels, YouTube Shorts or the Fox Sports app.
Beyond sports
Fox’s model is sport-trained but the applications extend further. Award shows, concerts, and entertainment events are all viable with sufficient training data.
“If you fed it every Oscars or every Grammys from the last 10 years, absolutely. It’ll do a decent job right away. The only limitation is the model needs to learn it. Vertical will be part of the conversation for everything going forward, World Cup and beyond.”
Vertical is the essential play
AWS and Fox are not the first to target the market for vertical consumption. NBC Sports was another beta partner with AWS in developing Inference. Last August, ESPN launched Verts, a revamped mobile app featuring clips formatted for vertical viewing. A mobile-first highlights feed produced with Samsung was distributed during the recent Milano Cortina Winter Olympics.
In April 2025, OTT solutions provider Quickplay launched a version of its Quickplay Shorts tool for live sports. This includes an orchestration layer, CMS and front-end leveraging TwelveLabs’ multimodal AI models to analyse, understand and timestamp key moments in videos. Its customers include Philippine streamer Cignal, which has been running a ‘live shorts’ service since last April using content from local professional leagues such as the Philippine Basketball Association.
“With Quickplay Shorts, sports broadcasters can own the conversation around the game, to keep viewers engaged and to drive them to higher value, live game viewing opportunities,” says Juan Martin, Co-Founder and CTO, Quickplay. “The change in viewing behaviour requires a strategic reimagining of audience engagement.”
Bitmovin is building agentic workflows to automate clip generation, encoding and publishing to mobile-first platforms with demos expected at NAB 2026 and a release of its live vertical workflow tool in Q3.
“Vertical is a hot topic among customers,” says Jacob Arends, Senior Product Manager, Bitmovin. “Broadcasters, telcos and streaming platforms come to us to encode and optimise their content because they are challenged with how to compete with the wave of short-form, scrollable experiences.”
Bitmovin offers content metadata enrichment powered by AI scene analysis. It extracts granular metadata (objects, scenes, speech, actions) from video to optimise search, recommendations, content reuse and clipping.
“If it’s a goal, you want to follow the player who scored, track the ball into the net, and maintain contextual relevance. That requires machine learning models that understand what’s happening in the scene,” Arends says.
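To make the metadata-driven clipping idea concrete, a rough sketch of how timestamped scene records could be queried to find clip-worthy moments – the schema and field names here are entirely hypothetical, not Bitmovin’s actual format:

```python
# Hypothetical scene-metadata records and a clip lookup over them.
# The schema is illustrative only, not a real vendor's data model.

segments = [
    {"start": 752.0, "end": 760.5, "actions": ["shot", "goal"],
     "objects": ["ball", "player_10", "net"]},
    {"start": 910.0, "end": 915.0, "actions": ["corner_kick"],
     "objects": ["ball", "player_7"]},
]

def find_clips(segments, action):
    """Return (start, end) time ranges whose metadata contains the action."""
    return [(s["start"], s["end"]) for s in segments if action in s["actions"]]

print(find_clips(segments, "goal"))  # -> [(752.0, 760.5)]
```

Once moments are indexed this way, the same records can drive both clip extraction and the reframing logic that follows the scorer rather than the ball.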
Expectation not a trend
René van Koll, Senior Solutions Architect at Big Blue Marble thinks both formats will continue to coexist. “Look at radio or newspapers – everyone predicted their demise, yet they still have a market. Likewise, some content simply suits landscape better. Think of traditional cinema: that experience doesn’t translate naturally to vertical. But the market is clearly moving toward vertical, and its success shows there’s strong demand. I expect both formats to live side by side for the foreseeable future.”
From a business perspective, the strategy is about engagement first, but new revenue streams are on the radar.
“Live sports rights are among the most valuable assets in media. You can’t replicate a live or viral moment and if you’re not there, you miss that audience,” says Regina Rossi, Head of Product, AWS Media Services. “So this is about expanding reach and engagement – but also monetisation. By adding live metadata and unlocking vertical distribution, customers can create new revenue opportunities and extend their content to additional platforms.”
Martin says consumption is shifting from long-form, TV-first experiences to discovery-driven, engagement-driven formats. “People aren’t waiting for a scheduled time to sit in front of the TV. They’re flowing through content, engaging dynamically, following creators, watching news highlights and sports clips. Traditional broadcasters and rights holders need to adapt to that shift.”
It’s worth recalling that streaming was once viewed as a supplement to the main TV experience, but over the last decade it has become the dominant consumption platform.
“When most viewing happened on television, horizontal made sense,” says Rossi. “But mobile consumption has dramatically increased, and vertical viewing has become the norm. Live sports in vertical, optimised for scrolling and discoverability, is an expectation – not a trend. I believe it’s a long-term shift.”