
Generative AI and the Future of Media Technology


It would be hard to miss all of the ongoing conversations concerning generative AI and how it will impact media technology. So far, the conventional wisdom is that it promises to enable novices to complete repetitive tasks more easily and efficiently, to create better monetization opportunities, to foster new tool development, and, for better or worse, to drive disruption in many different ways.

The intelligence that generative AI models like ChatGPT have been trained on is vast, and, therefore, rather than relying on one person’s abilities, it can provide as many permutations as a machine drawing on the entire span of its learning can deliver. Ask the right questions, and you may be rewarded.

Incorporating generative AI into our workflows has the potential to impact almost everything in media technologies, and I’ll examine several of the possibilities in this article. Starting off with low-hanging fruit, I’ll cover production, then move on to monetization, search, the creation of new tools, and more. Many of the experts I spoke with consider the immediate by-products of integrating generative AI into our work to be an increase in productivity and a decrease in time spent on rote tasks. They also see the potential for some very creative problem-solving and, at the same time, a number of ethical questions that I won’t attempt to fully explore in this piece.

“In media entertainment, I tend to look at things from three perspectives,” says Anil Jain, global managing director, strategic consumer industries at Google Cloud. “The first is improving content creation, production, and management. The second is enhancing and personalizing audience experiences, and the third is improving monetization.”

Ultimately, Jain contends, monetization is “going to be the one thing everyone cares about because of the opportunity to streamline internal processes and address operational efficiencies, which is probably where the biggest impact will be in the short term. When you think about monetization with generative AI, then you start to look at the art of the possible.”

Datasets

It all starts with the data. Traditional AI systems recognize patterns based on training, then make a prediction. Generative AI uses that data to create some kind of output that’s new.

“Generative AI creates things based on a large amount of data,” says Jonas Ribeiro, digital products, platform, and ad tech manager at Globo. “We need to create models with this data so we can create things in the M&E industry.” This could include creating scripts or summaries for editors or images, audio, or video. “Basically,” Ribeiro explains, “you need a lot of data and a lot of models.”

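Both public and private data contribute to these models, Ribeiro says. “For the major initiatives, we are using open data, but we need to have a more cautious approach because the internet can influence information” (and, as we all know, the internet is full of inaccurate and questionable information). “We have a lot of people checking the results. Not everyone can afford private data, but for some specific workloads that I can’t disclose, we are trying to use private data.”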

When it comes to language models, opinions vary widely. One executive I spoke with talked about customers not needing their own training for their particular products, because they can supply the data in the search query. Others aren’t so convinced they want to put data out on the open internet.

“To me, it’s a garbage in, garbage out type of situation,” says Steve Vonder Haar, senior analyst at IntelliVid Research. “If you’re going to pull information from the web, you’re not going to really have a trusted source of information from which to draw. The real future for AI, at least in the business sense, is going to be in the development of limited datasets that are used to inform decision making within a specific corporate network.”

Analytics

Generative AI-driven analytics seems to be taking shape as a sort of business analysis tool on steroids. First Tube is a live-streaming platform that leverages AI-driven analytics. The company started with the idea that it wanted to simulate a project’s success in order to tweak delivery. “We are using generative services to create mock campaigns that we can then pull into our analytics platform,” notes David Clevinger, First Tube’s VP for products. “We’ve used generative AI to create test data that mimics the way a customer would see campaign elements and the outcomes of a campaign.”

Clevinger says this could be tied to identifying the best platform for posting content; what kind of content drives better engagement (for example, driving brand awareness or getting people to sign up for a sweepstakes); which social platform is better for that specific live stream; what kinds of measurements can be delivered in views, clicks, impressions, or social media comments; or even evaluating the ROI based on the result.

First Tube is planning to build its analytics platform in-house to fully deliver on this promise. “I’m never going to hand that to a third party,” Clevinger says. “But the intention then is to say this approach worked well for this brand in the past. How can we leverage the findings there to turn that into a campaign and now ask the generative AI service to draft a media plan based on what worked well last time?”

The next step in leveraging First Tube’s in-house platform, Clevinger states, is to use generative AI “to do optimization at the vertical level. What is the best tactic or the best kind of campaign or the best parameters around a campaign? Some of our workflow pieces have historically been in spreadsheets or disparate databases,” Clevinger concedes. “What we’re trying to do is build a more comprehensive, robust analytics platform for our customers based on performance metrics.”

Content Creation

Another organization that decided to put some of its production requirements into the hands of generative AI is Barrett-Jackson Auction Co. “Productivity-wise, we have to write tens of thousands of car descriptions every month for our listing service,” says Darcy Lorincz, Barrett-Jackson’s CTO. The company incorporated the automotive information it owns on every car sold in the past 50 years into its own language model. Barrett-Jackson shares some of this information because it wants people to know the results of auctions, but, otherwise, this is the company’s own proprietary data model.

“Training your own model isn’t for everyone, which is why these open models are good,” Lorincz explains. “Now we can generate editorial in seconds. We still need people to do some moderation, but as the machine learns more and more, we have less work to do, so we can scale,” he continues. This allows the company to say, “We want this car to be in this background with this person talking about it, and 5 minutes later, we have a meaningful 2-minute video describing something that would’ve taken a production team a monstrous amount of research.”

Captioning

Automated captioning is a feature that has become increasingly common in streaming, VOD, and videoconferencing. Not everyone is enamored with the captioning results AI produces. Thierry Fautier, managing director of Your Media Transformation, says, “I have a friend using Google speech-to-text for captioning at a French broadcaster. Does it work? No. In a lab with English speakers, it gets a certain percentage of recognition right. Then you move that to a French environment with noise in the room and with very strict regulation requirements, and it doesn’t work.”

In the live-streaming world, there’s a lot of use of AI for captions. “One of our big things is, if you have a political, legal, pharmaceutical, or healthcare client, there is no way you want to use AI for your captions because you are only going to lose at some point,” says Corey Behnke, lead producer and co-founder of LiveX. “Oversight is the key with AI. … I actually believe that we’re going to have more demand for producer oversight than we’ve ever had before in live streaming.”

I’ve seen live software demos that have a very high accuracy level, and I use systems in the course of work all the time that don’t. Later in this article, I’ll discuss an interesting use case that is somewhat based on the same technology, but with completely different results.

Advertising

“In the world of generative AI, you can actually increase the value of every impression, because now, the ad created is actually created specifically for you at the right moment based on all the context you’ve shared so that the CPM is much higher,” says Google Cloud’s Jain. This sentiment is echoed by others as something that will have immediate appeal.

While many people talked to me about targeted advertising, the attendant costs need as much clarification as the technical capabilities required to deliver targeted results. “I think you can make much more creative ads for a much lower cost,” says Fautier. Creating ads targeted at different groups now immediately runs up against budget limitations. “If I can now automate those 10 different subcategories, you can deliver a specialized ad to a specialized group. You don’t deal with more than 15 groups in general …, so you do 15 ads, and you’re done.”

Another area under consideration is digital product placement within content. “We identified some opportunities of putting a bottle on the table that could be water, beer, or soda,” says Globo’s Ribeiro. This would provide an opportunity to target a much wider audience. “We’re not doing it yet, but we’re looking into it.”

Advertising Analytics

All of the advertising data required to deliver useful analytics does exist: where it ran, how it ran, what it ran against, who it was delivered to, what errors occurred, what CPM was paid for it, and so forth. The problem is that these different pieces of information are currently sitting in different systems. “On the DSP side, there are different systems for combining CRM, delivery, and campaign creative datasets to see if an ad was better served on Crackle or the desktop site for NBC News,” according to C.J. Leonard, global media and ad tech consultant at MAD Leo Consulting. “For one person to sit there and look at an impression log from each of those systems and try to tie all of this together is far from practical.”

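Generative AI can be used to clean up this data in ways that humans never could. “Putting these different datasets together,” Leonard says, “we should be able to talk about better results. Instead of putting my finger in the air and saying, ‘Based on my gut …,’ I want to be able to say, ‘Based on my gut and this model that is out there …’”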
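To make the “tie all of this together” step concrete, here is a minimal Python sketch of joining two logs that live in different systems and comparing the effective CPM per platform. It shows only the joining step, written as an ordinary keyed merge rather than a generative model, and every field name and figure in it (`ad_id`, `platform`, `impressions`, `spend_usd`, the sample rows) is a hypothetical placeholder, not data from any of the companies mentioned.

```python
from collections import defaultdict

# Hypothetical rows from a delivery system (made-up numbers).
delivery_log = [
    {"ad_id": "a1", "platform": "Crackle", "impressions": 1000},
    {"ad_id": "a1", "platform": "NBC News desktop", "impressions": 4000},
]

# Hypothetical rows from a separate billing system (made-up numbers).
billing_log = [
    {"ad_id": "a1", "platform": "Crackle", "spend_usd": 20.0},
    {"ad_id": "a1", "platform": "NBC News desktop", "spend_usd": 40.0},
]


def effective_cpm_by_platform(delivery, billing):
    """Merge both logs on (ad_id, platform) and return cost per 1,000 impressions."""
    merged = defaultdict(lambda: {"impressions": 0, "spend_usd": 0.0})
    for row in delivery:
        merged[(row["ad_id"], row["platform"])]["impressions"] += row["impressions"]
    for row in billing:
        merged[(row["ad_id"], row["platform"])]["spend_usd"] += row["spend_usd"]
    # Skip keys with no recorded impressions to avoid dividing by zero.
    return {
        key: record["spend_usd"] * 1000 / record["impressions"]
        for key, record in merged.items()
        if record["impressions"] > 0
    }


cpm = effective_cpm_by_platform(delivery_log, billing_log)
for (ad_id, platform), value in sorted(cpm.items()):
    print(f"{ad_id} on {platform}: ${value:.2f} CPM")
```

Once the records share a key, the comparison Leonard describes (was the ad better served on one outlet or another?) becomes a lookup instead of a manual reading of several impression logs.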

Creative Assistance 

The most common theme around applications for generative AI thus far in the media world has been how to use these tools to reduce the time it takes in postproduction to finish content. But it could be just as useful at other stages in the workflow. How do we use it in initial ideation work? How do we use it to summarize content? And if it does play a meaningful role in those areas, what does that mean for the humans who traditionally did those jobs?

“Our customers are a lot more cautious about pure generative AI usage literally trying to do the same job that the creative would have done,” says Shailendra Mathur, VP of architecture and technology at Avid. It’s easy to see how this would make a lot of people uncomfortable, whether they are producers, editors, animators, writers, or actors.

“One of the philosophies that we believe in is creative assistance,” notes Mathur. “It’s automating the mundane.” There are so many mundane tasks in the post workflow that can be very repetitive and time consuming, he explains, such as logging the metadata, manual content checking, searching for specific B-roll, and doing research for a script. Another idea is to reduce the amount of lower-skilled work. “Our industry today has a labor and skills shortage, so part of it is actually leveraging some of the AI models as well as some of the automation that results from it to drive what we could not fill with humans’ skills.”

This automation can only go so far, however. “[ChatGPT] can only guess what you want,” says Mathur. “You need to know what you’re asking for, and you can’t blame ChatGPT for giving a wrong answer when you didn’t ask for the right things. If you’re asking the system to perform a job, you always need to be there to double check.”

While various levels of metadata search have been available previously, using generative AI means associated content can be surfaced that normally wouldn’t be found, Mathur explains. Large language models used in generative AI are based on a method of representation called semantic embeddings. “Embedding space models are used to convert text, video, or audio objects into a vector database,” Mathur says. This database can identify things using object data as well as semantic information.

“When we look at the semantic embeddings’ core technology underneath, that’s the association of multiple audio clips, video, etc., all coming together,” says Mathur. “You can say, ‘Is this written in this language?’ or ‘Show me a picture or audio about Nadine,’” and the system would return a list of every media object related to my name.

The result is that predictions about tens of thousands of image labels not observed during training are possible. This opens up source libraries to much speedier access and far greater detail than ever before. Avid has a research and advanced development lab showing many other concepts under consideration.
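For readers who want to see the idea behind a vector database in miniature, the following Python sketch ranks media objects by cosine similarity between a query vector and stored vectors. It is an illustration under stated assumptions, not a description of Avid’s system: the file names and the three-dimensional vectors are invented for the example, and a real pipeline would obtain much longer vectors from an embedding model.

```python
import math

# Hypothetical 3-dimensional embeddings, hand-written for illustration only.
# A real system would compute these with an embedding model.
MEDIA_INDEX = {
    "interview_audio.wav": [0.9, 0.1, 0.0],
    "conference_photo.jpg": [0.8, 0.2, 0.1],
    "car_auction_clip.mp4": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vector, index, top_k=2):
    """Return the names of the top_k media objects most similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), name)
        for name, vector in index.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Hypothetical query vector standing in for an embedded text prompt.
query = [1.0, 0.0, 0.0]
results = search(query, MEDIA_INDEX)
print(results)
```

Because audio, images, and video all land in the same vector space, a single text query can surface items of every media type, which is the cross-media association Mathur describes.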

Just-in-Time Advertising

Amid the excitement around ChatGPT in 2023, Google Cloud’s Jain says, “Everyone experienced the paradigm shift to direct to consumer. But now with generative AI, there’s a bright light that’s shining on the potential for the disruption on the upstream content creation and production side as well.”

“We’re running multiple FAST channels out of our facilities,” says Tulix CEO George Bokuchava. “Why not think about dynamic generation at the edge? Imagine you have a [brand], and you have a slot in a live stream. You can have an AI-generated ad dynamically inserted based on market conditions and whatever is going on in the world. We just need to be open-minded and think about things completely from a new angle. This is absolutely doable.”

User Experience

“I think if you look at publishers, they have this mix of excitement and fear,” notes Jain. “On the fear side, generative AI is going to reduce the amount of time that audience members spend on publisher sites, because it’s either been summarized somewhere, or it reduces the need for an individual to dig deep into what journalists produce.”

“The excitement is that can we actually create more kinds of content experiences for a consumer because we can summarize information,” Jain continues, “and we can build greater community because we can add conversational AI on top of our content to make the experience more of a dialogue, more dynamic and interactive.”

Using voice could also have a very exciting outcome for any and all software; consider conversational interfaces or using speech to control how a software product will work. “You’re actually conversing with the applications in your own natural language,” says Mathur.

Looking at the area of content creation, it’s easy to see how efficiencies are created in production. Google built an OTT proof of concept that uses Bard to help viewers pick what to watch. This feature replaces traditional streaming navigation. A viewer can interact to find content, based on whatever conversation inputs they have. It’s essentially a recommendation engine with a memory and the ability to parse interactions.

And then, if the user wants to (and has granted permission), they can have fries with that too (or any other type of food they can order through a delivery service) by continuing their conversation with “the voice of OTT POC.” This is an interesting approach to the overwhelming consumer content dilemma.

Lorincz says Barrett-Jackson is also looking at how to bring more interactivity to its offerings. “Now we’re doing video backgrounds with avatars that actually talk without being scripted,” he notes. “We also have a customer service robot, not an avatar, and I can ask it any question about our company, and it can answer my questions in real time.”

Cautionary Tales

I’ve encountered many people who asked ChatGPT to write something for them and were absolutely thrilled with the results. So, to test the idea that ChatGPT is useful for creative assistance—and in the interest of research—I decided I needed to try this. I attempted to have ChatGPT write a paragraph on the same topic as this article in my style.

A few questions come to mind. Is there enough data to generate and simulate my voice? Do I really write like this? Where are the direct quotes from real people that accompany all of my articles, including this one? I provided the same exact research to ChatGPT that I drew on here, and out popped the following:

Title: Revolutionizing Media Operations Through AI: Bridging Creativity and Efficiency
In an era where technological advances shape industries at an unprecedented pace, the media sector stands as a remarkable testament to the transformative potential of Artificial Intelligence [AI]. Over the past few years, the media industry has become a crucible of AI innovation, ushering in a new era of creativity and operational efficiency. From streaming platforms to advertising campaigns, the convergence of AI and media operations is yielding remarkable outcomes.

I suppose the ultimate question is this: Can you tell this paragraph was generated by AI, or could it pass for being written by a human (if not necessarily one particular human)? And how much longer will we place value on that distinction?

“A lot of the conversations I’ve had over the last few months have led me to believe that we’re going to see an enhanced premium placed on trust and authenticity,” says Google Cloud’s Jain. “In a world where so much more content can be created with far less toil, people are going to want to know: Is this AI-generated, or is this something a human put together?”

This article has been fully researched and written by a human.
