AI For Content Discovery Flunks The Follow-Through

AI seems to be everywhere these days, promising to be a Swiss Army knife that helps with basically everything. That promise is broadly true, but the question needs to shift from a binary yes/no to a more nuanced “how good are the results?”

The media industry is a complex ecosystem of processes, APIs, metadata sets and more. Most of this is handled in the background, but these are key elements of a decent user experience: surfacing the right metadata so you can find something to watch and, going one step further, giving you the right link to the service where you can actually watch it.

We wanted to understand how good AI chatbots are today at supporting content discovery, especially in light of recent announcements that AI is being integrated into entertainment operating systems (Gemini will be part of Android TV, and Samsung’s Tizen will work with Perplexity).

We kept the scope of our research intentionally narrow: we excluded the whole “get me inspired” part and focused on the chatbots’ ability to identify the right streamer and provide the right link. Why? Doing step 1 without being able to execute on step 2 is pointless.

So we ran a detailed analysis of four leading chatbots (ChatGPT, Claude, Gemini and Perplexity) across six countries (the US, UK, France, Germany, Italy and Spain) and tested their ability to identify the right streamer for an exclusive show. We ran each prompt three times in a row to also test the consistency of the results.

When asked “Where to stream [show xyz]?”, the chatbots identified the correct streamer in roughly two-thirds of cases across markets. What stood out was the extremely wide spread of correct results from ChatGPT between the US/UK/Germany (c. 80%) and France/Italy/Spain (<30%). While the other chatbots also had differing results across markets, their deltas were around 10 percentage points.

Consistency, and therefore perceived reliability, seems to be an issue: the chatbots gave differing results when asked the same question again. A little less than half of the runs were inconsistent when comparing the three results from the same initial prompt. This assessment ignored how the chatbots presented the results and looked only at the facts they presented.
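The consistency check described above can be sketched in a few lines of Python. This is a hypothetical illustration, not veed’s actual methodology: it assumes each answer has already been reduced to the name of the streamer it recommended, and calls a run consistent only if all three answers agree.

```python
# Hypothetical sketch of the consistency check: each prompt is run three
# times, the named streamer is extracted from every answer, and a prompt
# counts as consistent only if all three answers name the same streamer.

def is_consistent(answers: list[str]) -> bool:
    """True if all runs named the same streamer (ignoring case/whitespace)."""
    normalized = {a.strip().lower() for a in answers}
    return len(normalized) == 1

# Illustrative data only -- not real test results.
runs = {
    "Where to stream Show A?": ["Netflix", "Netflix", "netflix"],
    "Where to stream Show B?": ["Hulu", "Disney+", "Hulu"],
}

inconsistent = [prompt for prompt, answers in runs.items()
                if not is_consistent(answers)]
print(f"{len(inconsistent)} of {len(runs)} prompts were inconsistent")
```

Note that this only compares the facts (which streamer was named), mirroring the study’s decision to ignore differences in how the chatbots phrased or formatted their answers.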

Result quality dropped further when we checked the links to the streamers provided by the chatbots. Less than a third of responses contained correct deep links; the rest contained no links at all, invalid links, and so on. This leaves a lot of room for improvement, and a lot of work needs to be done to coordinate the generative part of the chatbots with the deterministic data models of the streamers: a link is a link and cannot be predicted or made up by a chatbot.

Now is the time to develop better models (technical, product and commercial) to get streamers properly integrated, ensure better consistency in the results and, on top of that, improve the way results are presented. This will require effort on both ends to deliver a personalized experience with the right balance of deterministic data from the streamers and a conversational/generative component from the chatbots.

Bernd Riefler

Bernd is the founder and CEO of veed analytics, a market intelligence company for the video industry that provides unique insights into video products in Europe and the US. veed runs a showroom with access to the leading European and US streaming services and the corresponding hardware. Before founding veed, Bernd spearheaded the development of streaming products for MVPDs and broadcasters.
