The State of Video and AI 2018


The confidence level can be used to trigger manual workflows, where anything scoring below a set threshold is routed to a person to review, approve, or reject the suggestion.

“The Google video intelligence system is heavily trained on all of the assets that Google owns,” says Azimi. “If you upload a piece of video with a sneaker, Google may suggest this is a white sneaker.” Iconik can then do a keyword search identifying all white sneakers within existing footage.
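To make the white sneaker example concrete, here is a minimal sketch of how detected labels might be split into auto-approved tags and a human review queue, assuming the current google-cloud-videointelligence Python client; the bucket URI, threshold, and queue handling are illustrative assumptions, not iconik's actual implementation.

```python
from google.cloud import videointelligence

REVIEW_THRESHOLD = 0.7  # illustrative cutoff; tune per workflow

def label_video(gcs_uri):
    """Detect labels in a video, splitting them into auto-approved
    tags and a needs-review bucket based on model confidence."""
    client = videointelligence.VideoIntelligenceServiceClient()
    operation = client.annotate_video(
        request={
            "features": [videointelligence.Feature.LABEL_DETECTION],
            "input_uri": gcs_uri,  # e.g. "gs://my-bucket/footage.mp4" (hypothetical)
        }
    )
    result = operation.result(timeout=300)

    approved, needs_review = [], []
    for label in result.annotation_results[0].segment_label_annotations:
        name = label.entity.description        # e.g. "sneaker"
        confidence = label.segments[0].confidence
        bucket = approved if confidence >= REVIEW_THRESHOLD else needs_review
        bucket.append((name, confidence))
    return approved, needs_review

# approved tags can be indexed for keyword search ("white sneaker");
# needs_review items go to a human approve/reject workflow
```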

Use Case: Personalisation

IRIS.TV is a video personalisation and programming platform; its core technology is used to serve contextually relevant content. “We work with publishers and marketers to serve the right content to audiences in real time. Our AI is specifically computational AI,” says Field Garthwaite, co-founder and CEO, IRIS.TV. This means the platform learns from specific data and makes choices. IRIS.TV builds personalised streams for customers, similar to what YouTube does, and helps publishers understand viewers’ likes and dislikes.

This AI system analyses the content consumers prefer and searches out similar content based on similar interests, tone of story, popularity, and other business rules a publisher may put in place. “We have this API plugin in a video player on [the sites for] Sports Illustrated or Time magazine,” says Garthwaite. “You’re going to see things like Skip buttons or thumbs up/thumbs down or a little pop-up showing what’s next.”
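As a rough illustration of the kind of scoring such a system performs, the toy ranking function below blends topical overlap, tone match, and popularity, then applies a publisher business rule; the weights, fields, and rule are invented for the example and do not represent IRIS.TV's model.

```python
from dataclasses import dataclass

@dataclass
class Video:
    title: str
    topics: set        # e.g. {"nba", "highlights"}
    tone: str          # e.g. "upbeat"
    popularity: float  # normalised 0..1

def score(candidate, watched, weights=(0.5, 0.2, 0.3)):
    """Blend topical overlap, tone match, and popularity into one score."""
    w_topic, w_tone, w_pop = weights
    topic_overlap = len(candidate.topics & watched.topics) / max(len(watched.topics), 1)
    tone_match = 1.0 if candidate.tone == watched.tone else 0.0
    return w_topic * topic_overlap + w_tone * tone_match + w_pop * candidate.popularity

def next_up(watched, pool, blocked_topics=frozenset()):
    """Rank candidate videos, applying a publisher business rule that
    drops anything touching a blocked topic."""
    eligible = [v for v in pool if not (v.topics & blocked_topics)]
    return sorted(eligible, key=lambda v: score(v, watched), reverse=True)
```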

IRIS.TV’s platform AI is used to display the right editorial or to insert ads into contextually relevant editorial. Its latest product improves stream yield by determining the best frequency for serving ads or placing branded content into the stream. A campaign IRIS.TV ran for Bud Light is driving 20 percent completion rates on a 100-second piece of branded, in-stream content by targeting the right audience with the right content.

Use Case: Quality of Service

“The internet is not really architected to deliver high-quality video at scale, and therefore this trend toward more video consumption is going to require a major unique approach to really work well,” says Ed Haslam, CMO of Conviva. Conviva installs sensors, or SDKs, to measure video quality of service; to date it has almost 3 billion sensors deployed globally on behalf of its publisher base, and those sensors measured around 14 billion viewing hours of video content in 2017. “As far as we know that’s the largest installed base of sensors (outside of walled-garden environments), especially for a multi-publisher solution provider,” says Haslam.

The company’s new product is Video AI Alerts, developed for technical operations teams to discover any video viewing anomalies like rebuffering, slow start times, and poor bitrates. Conviva’s AI is calculating not only what is normal, but also sensing anomalies and, more importantly, correlating those anomalies to potential causes. “It could be the asset or it could be your whole CDN provider,” says Haslam.

HBO has thousands of assets, and so setting up alerts to track every single one of those would be onerous. In one case, HBO said it never would have found the issue that was occurring with a specific asset. A provider can create sensitivities to trigger an alert when anomalies are a specific percentage off the mean for a predetermined percent of its audience. This can be based on audience size, because HBO likely cares more about the season finale of Game of Thrones than about viewers watching old episodes of The Sopranos in the middle of the night.
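A simple version of that sensitivity rule, deviation from a baseline mean gated by affected-audience share, might look like the following sketch; the metric, thresholds, and numbers are hypothetical and this is not Conviva's algorithm.

```python
from statistics import mean

def should_alert(recent_values, baseline, deviation_pct,
                 affected_viewers, total_viewers, audience_pct):
    """Fire an alert when the metric drifts more than deviation_pct off
    its baseline mean AND the anomaly touches at least audience_pct of
    the current audience."""
    if baseline == 0 or total_viewers == 0:
        return False
    drift = abs(mean(recent_values) - baseline) / baseline * 100
    audience_share = affected_viewers / total_viewers * 100
    return drift >= deviation_pct and audience_share >= audience_pct

# e.g. rebuffering ratio normally averages 0.5%; alert if the latest
# windows are 20% off that mean and at least 5% of viewers are affected
alert = should_alert(
    recent_values=[0.8, 0.9, 0.75],  # rebuffering %, last few windows
    baseline=0.5,
    deviation_pct=20.0,
    affected_viewers=12_000,
    total_viewers=150_000,
    audience_pct=5.0,
)
```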

“[The Video AI Alert] really reduces the level of effort the operations team have to expend doing the diligence themselves,” says Haslam. “[This] gives them the prescriptive diagnostic data out there and arms them to go solve the problem. Beforehand they were doing all that stuff that machine learning is doing by themselves, by doing multiple queries against the UI (user interface). They would restrict and say ‘Only show me CDN 1’s data.’ It could take them hours potentially running all those queries, whereas machine learning does it within seconds or fractions of seconds.”

Use Case: Ad Insertion

If emotions or content can be detected in video, then ad insertion is one use case that brand marketers will welcome with open arms. Video marketing platform Innovid will launch updates to its platform in Q1 2018 to add contextual intelligence, for example, to see how one ad creative performs against millions of different YouTube videos. The end result is the ability to determine which content will give a specific ad, say an ad for a white sneaker, the most impact.

“Our system collects different parameters about the video content and looks for the possible correlation between ad performance and video content. Based on this analysis, the brand can optimise its media buy or optimise which creative ad is delivered against what type of content,” says Zvika Netter, CEO and cofounder, Innovid.
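In spirit, that correlation step reduces to aggregating ad outcomes by content parameters. The sketch below computes completion rate per content topic; the field names and data shape are assumptions made for illustration, not Innovid's schema.

```python
from collections import defaultdict

def completion_by_topic(impressions):
    """Aggregate ad completion rate per content topic so a brand can
    see which contexts its creative performs best against."""
    totals = defaultdict(lambda: [0, 0])  # topic -> [completions, impressions]
    for imp in impressions:
        stats = totals[imp["content_topic"]]
        stats[1] += 1
        stats[0] += 1 if imp["completed"] else 0
    return {
        topic: completions / count
        for topic, (completions, count) in totals.items()
    }

# impressions look like {"content_topic": "sports", "completed": True};
# the resulting rates can inform media buys or creative selection
```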

These capabilities are especially relevant given industry challenges in delivering ads safely, at scale. “[Our] artificial intelligence analyses billions of online activities per day across social platforms and websites, powered by RSS feeds, to understand the topics and content driving the most engagement among digital audiences.” This is nirvana for brands: the ability to better understand where their engaged audiences are.

Data Boasting Rights

Remember that white sneaker? A well-trained system will have the exact product number for that white sneaker. Azimi says that right now he doesn’t have the ability to train the Google Cloud Vision API, and so Cantemo is planning on supporting other frameworks. An IBM feature he likes is that the company offers use of an isolated container, so while the file proxy is being sent to Watson, the data is not being used by IBM. “The customers that have sensitive content would probably want to train a machine-learning system from scratch and have that maintained on-prem or maybe in their own virtual private cloud so that their data is not leaving their VPC environment,” he says.

This brings up two good questions: if a company is starting to train an AI model, do they own the intelligence that they are providing, and where is that data stored? “We would love for you to share your training with the world, but most people don’t want to do that,” says Kulczar. The challenge is that systems become smart based on the aggregate of data being consumed. For anyone who has used a system that has returned horrible results for image recognition, the idea of pooling customer data is certainly appealing.

With Conviva, publishers own their raw data, but Conviva owns the analytics on that data, which it uses across its whole customer base. “We call [this] transfer learning, where a global data set trains an algorithm for application use on a local dataset,” says Haslam. “We have access to this huge amount of data on behalf of our publishers. That algorithm is smarter for ESPN because it got trained by ESPN data, HBO data, Sky data, and CBS data.” All that data is kept secure via a cloud-based, containerised architecture for each customer.
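A minimal sketch of that transfer-learning pattern: train on pooled global data, then continue training on one customer's local data, here using scikit-learn's incremental SGDRegressor. The random arrays stand in for real telemetry, and none of this reflects Conviva's actual pipeline.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)

# pooled, cross-publisher telemetry (features -> quality metric); shapes are illustrative
X_global, y_global = rng.random((10_000, 8)), rng.random(10_000)
# one publisher's much smaller local data set
X_local, y_local = rng.random((500, 8)), rng.random(500)

model = SGDRegressor()
model.fit(X_global, y_global)  # learn from the global data set first

# partial_fit continues from the fitted coefficients, adapting the
# shared algorithm to the local publisher's traffic
model.partial_fit(X_local, y_local)
```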

AI can be used to generate analytics that help brands determine which ads are most relevant to the audiences watching particular types of content. 

The Future

AI is now everywhere. It’s a prevalent technological paradigm, and the expectation is that technology will learn from data and become smarter. There are a number of other use cases we’ll leave for a future article, including image compression, content rating, denial-of-service filtering, and even automated video editing.

Moving forward, there are a number of questions to consider, especially for content owners and publishers looking to leverage AI: How much data does a system need to become reasonably accurate? What kind of software interface is there to work with a system, and what skills do staff members need to do this training? Is the data owned by the user? Will there be a larger pool of data training the system? Is the data transferable to another system? How long will it take to train a system? What is available now, and what’s on the roadmap for the next year? Is there any benchmarking information available?

We’re nowhere near the point where the fear that “the machines are taking over” is warranted. Common sense, curiosity, and abstract reasoning are still in the domain of human intelligence. Considering how much video content is being generated in the world today, AI is the only way content can be managed, parsed, and optimised in the future.

While AI can certainly help with labour-intensive processes, there’s no way AI can take over these tasks without human training for the systems and human operators to check the output. The biggest benefit is that companies can free up resources to focus on more interesting problems that need to be solved in order to deliver great streaming content at scale.

[This article appears in the Spring 2018 issue of Streaming Media European Edition.]
