How Google and Vevo Leverage AI-Enabled Localisation for Saving Costs and Driving Revenue


The case for AI-based content localisation is typically made in terms of how it streamlines localisation initiatives and dramatically reduces costs, particularly for smaller publishers that might not be able to reach into other markets without it.

In this clip from Streaming Media Connect 2025, execs from two larger operations, Google Cloud’s Albert Lai and Vevo’s Natasha Potashnik, share with IntelliVid’s Steve Vonder Haar not only the cost savings opportunities but also how AI-enabled localisation can drive new revenue opportunities as well.

Using AI and local experts to tailor content to specific cultural preferences and operational requirements in different territories

Vonder Haar says to Potashnik, “Captions are one way where AI is impacting things. But you're also using AI from a localisation standpoint in terms of figuring out what content to serve up to people?”

“We are distributed off of YouTube across FAST channels,” Potashnik says. “What that means is these are live linear broadcasts that are streamed digitally, but there's a set program, just like you have a TV guide, and every 30 minutes of the day, we'll have a program block. It's across many endpoints for us, everything from Samsung, Roku, Hulu, etc. And our challenge is we are live across many territories, and there's quite a lot of lift to getting a collection of channels live in a new territory. [The channels] span different genres, decades, and themes. And when you're getting it live in a new territory, [such as] Italy, you're not just going to copy-paste what you did in Spain, because that wouldn't work. So, localisation is at play when you think about cultural preferences, requirements in certain territories, and then just operational efficiency…assembling these schedules, generating them en masse without requiring a team of 20 new experts to be stood up in each new territory.”

She notes that a human-guided AI approach is very important for successfully localising programming. “We've used AI to create the foundation of that programming when we go into a new territory, and then we still retain two or three experts who can help guide that model. For example, in France, even if the top hits in France at any given month are mostly English-speaking or mostly English songs, we still need a minimum number of French songs in the programming. French people just don't want to see all English songs in their linear programming. And so that's something that we get from local experts, and they guide the model to ensure that we produce an outcome that is good for the market.”
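The guardrail Potashnik describes, a floor on the share of local-language songs in AI-generated programming, can be sketched as a simple post-processing step applied to a model's proposed block. This is a hypothetical illustration only, not Vevo's actual pipeline; the function name, data shapes, and quota values are all assumptions.

```python
# Hypothetical sketch: enforce a minimum quota of local-language songs
# in an AI-generated program block. Illustrative only, not Vevo's system.

def enforce_local_quota(block, local_pool, local_lang="fr", min_local=3):
    """Return a copy of `block` containing at least `min_local` songs
    in `local_lang`.

    block:      list of (title, language) tuples proposed by the model,
                in scheduled order.
    local_pool: ranked list of local-language (title, language) candidates
                supplied by the territory's human experts.
    """
    result = list(block)
    local_count = sum(1 for _, lang in result if lang == local_lang)
    pool = iter(local_pool)
    # Swap out non-local tracks, starting from the end of the block,
    # until the quota is satisfied or we run out of candidates.
    for i in range(len(result) - 1, -1, -1):
        if local_count >= min_local:
            break
        if result[i][1] != local_lang:
            try:
                result[i] = next(pool)
            except StopIteration:
                break  # no more expert-curated candidates available
            local_count += 1
    return result
```

The point of the sketch is the division of labour she outlines: the model generates the foundation of the schedule, while a small team of local experts supplies the rules and candidate lists that keep the output acceptable for the market.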

Advancements in multimodal AI models

Vonder Haar says to Lai, “It's not just all about cost savings. In some cases, we can leverage this localisation in a way that drives monetisation as well. What do you see happening in that realm? Is it possible to leverage AI for localisation applications in a way that drives more revenue to the balance sheet?”

“I'm going to start with the foundational technology change that has really opened up these conversations, especially in the last 12 to 18 months,” Lai says. “And that change, it's not just with kind of the broad AI LLM wave that all of us within the industry and even as consumers have been hearing people talk about, but it's this notion of multimodal models, which really just came about a year or so ago. And so fundamentally, when you think about what's different about these models, they are trained from the very beginning to understand video, text, audio, and images. They have advanced reasoning, and they have temporal understanding, that is, what's happening over time. I often talk about this notion of a customer saying, 'Oh, we're looking for a knife in this program,’ but with multimodal models, what is a knife? Is it the object that you see? Is it somebody speaking the word knife in multiple languages? Are the letters K-N-I-F-E written in the background? Is it somebody from an action standpoint using a knife in a not-so-nice way? Or is it Bobby Flay chopping vegetables with a knife? Today's models can understand the differences. And so, when you think about what people want to do with content, yes, it has started with captions and subtitles, improving their accuracy and automating them. But what we're also seeing is this now can expand into things like audio description…really, dubbing is not just audio in a different language. Dubbing is about the timing of the words to the original track.”
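Lai's "knife" example comes down to one concept surfacing in several modalities at once: as a visual object, a spoken word, on-screen text, or an action. A toy version of the unified index such a model could feed might look like the sketch below. The record schema and modality names here are purely illustrative assumptions, not any Google Cloud API.

```python
# Illustrative only: grouping multimodal detections of one concept.
# The detection records and modality labels are hypothetical.

from collections import defaultdict

def find_concept(detections, concept):
    """Group timestamped detections of `concept` by modality.

    detections: list of dicts with keys 'modality' (e.g. 'object',
                'speech', 'ocr', 'action'), 'label', and 'timestamp'
                in seconds.
    Returns {modality: sorted list of timestamps} for matching labels.
    """
    hits = defaultdict(list)
    for d in detections:
        if d["label"] == concept:
            hits[d["modality"]].append(d["timestamp"])
    return {modality: sorted(ts) for modality, ts in hits.items()}
```

Asking such an index for "knife" would return separate hit lists for the object seen on screen, the word spoken on the audio track, and the letters read from the background, which is the distinction Lai says today's models can draw on their own.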

How these advancements enhance viewer experience and drive monetisation by expanding content reach

Lai goes on to explain how these capabilities translate into a better viewer experience and new monetisation by extending the reach of existing content.

“Companies can now think about how they can take their existing content, expand the reach to audiences to improve engagement, and ideally create more inventory for top-line revenue,” he says. “This [is a] way to acquire new subscribers. Companies are also looking at content outside of their country and saying, ‘If I can acquire that content, if I can localise it, and it could be any number of captions to subtitles, to even dubbing, if I can do this properly within the contractual use of that, then I can offer more content to my audience that I wouldn't have been able to get before. But I can do this in a way that, from an ROI perspective, I can again monetise you through advertising or through subscriptions.’ So, we're seeing a lot more of this notion of saying, ‘How do I expand my content library without having to produce it from scratch?’ Which is difficult, time-consuming, and requires very specific creative talent.”

Join us in May 2025 for more thought leadership, actionable insights, and lively debate at Streaming Media Connect.
