Why Our Feeds are Filled with AI Slop—and What to Do about It – Terms of Service with Clare Duffy

If you’ve spent any time on the internet recently, you have almost certainly seen an uptick in video content generated by AI. Whether it’s an AI video of a cat riding a motorcycle, like the one I came across this morning.

Look at him go. Little orange daredevil holding that line like he was born to ride. Wind in his whiskers, eyes on the horizon.

Or a clip of OpenAI CEO Sam Altman on top of Mount Everest.

How does it feel to be on top of the world? Experience it with Sora 2.

What does it mean for all of us when more of our feeds are filled with this artificial content? And how can we distinguish between what’s real and what’s fake? To help us sort this out, I have Henry Ajder here with me today. Henry is an expert on AI and deepfakes. He’s the co-creator of the University of Cambridge’s AI in Business program, and he has served as an advisor on AI for organizations ranging from Meta to the World Economic Forum. This is Terms of Service. I’m CNN tech reporter Clare Duffy. My conversation with Henry after a short break. Henry, thank you so much for being here.

Thanks so much for having me.

So, the biggest news in the AI-generated content space in the last few weeks has been OpenAI launching its AI video app, Sora. It’s sort of like a TikTok feed, but with AI-generated videos. Are you on Sora?

So it’s a little bit difficult to get on Sora here in the UK. It’s currently US exclusive, but I have been tinkering with some VPNs to, let’s say, cross the pond virtually. So I’ve used it a little bit, not as much as I’d like to really get under the bonnet, but I’ve actually called on all of my friends, saying, look, I know you guys are coming across a huge amount of this kind of AI slop content, particularly stuff generated by Sora, and I want you to share it with me as you come across it in your brain rot sessions, as they refer to them. And so I’ve seen the whole gamut, from really quite awful stuff, very racist, homophobic, all the way through to stuff which is just plain ridiculous, and some stuff which can be quite engaging, quite fun. But I think it’s safe to say that Sora 2 has really changed the landscape of generative video, how this content is being circulated, and critically, how it’s also being created.

And Sora videos are obviously what many people are talking about right now, but that’s certainly not the only kind of AI-generated content we’ve seen taking over the internet. Give us a highlight reel, if you will, of where and how we’re seeing AI-generated content show up right now.

So since early 2018, I’ve been doing what I refer to as AI and deepfake cartography: I have been mapping the landscape, and it’s safe to say that over the last two or three years there has been an absolutely monumental shift in the kinds of content we’re seeing go out into the world, created with AI-generated tool sets. We’re seeing people creating text-to-speech voice clones. We’re seeing people using voice skinning, or voice-to-voice, tool sets. We’re seeing people generating highly realistic AI-generated image sets, with tools like Midjourney, or with some of the other tools like Imagen from Google. And then, of course, we’re seeing the video space really changing. And that’s not just whole-cloth generation of wacky, weird videos. There are also really interesting tools that use existing videos to drive animations, as it’s called. So you could take a video of us right now and skin it to make it look like we’re in a Tudor court or in a cyberpunk dystopia, right? The number of tools out there is endless. Now, it used to be the case that I could spend a couple of hours a day looking at the new research papers and the new tools being released, and I could be confidently on top of the generative landscape. Now there are teams of hundreds of people working on this, trying to get on top of it, and I don’t think anyone truly can. And the danger right now is that people are saying, oh my God, ha ha, this slop content, it’s so low effort. And they’re not realizing that maybe one in 20 of the videos they’re watching that they don’t think are slop content are still actually AI generated.

Yeah, this was something I was thinking about: you’ve been able to create AI-generated photos and AI-generated content for some time now; they just weren’t very good until recently. But I was pondering why this takeover of the internet is happening now. Is it because AI-generated content becomes more valuable or meaningful the more realistic it gets?

It’s a really good question. And I don’t think we can say it’s purely realism, because some of the videos that are going viral are clearly ridiculous, right? For example, the one that went quite big a few months ago was the ASMR video of glass fruit being cut with a knife. People knew that wasn’t authentic, but it was still, you know, satisfying what good ASMR should do. It’s something that catches people’s imagination. And the realism of the outputs doesn’t necessarily mean that they’re photorealistic, that you think it’s real and actually happened. It’s maybe about the physics being consistent. That’s why I think, when Demis Hassabis showed off Veo 3, which was Google’s model, one of the examples he showed off was onions sizzling in a pan. The sound was consistent; the physics of what the pan and the onions and the oil did were consistent. So even if you’re creating ridiculous things, you’re creating ridiculous things that make sense within our understanding of how the world works. I think the fact that we can now generate semantically consistent audio at the same time as video makes it pop. It makes it more visceral in a way that previously just wasn’t the case.

For people who haven’t tried it, explain the process of making an AI-generated video now.

So if we rewind to late 2017, early 2018, when the term deepfakes was first coined and generative video was really starting to enter the hobbyist dimension, I guess we can say, that was for face swapping. So you would specifically be taking a video and swapping one face into that piece of content. At that time, it was mostly non-consensual image abuse against women, a problem that persists to this day. Those tools were incredibly clunky to use. You would have to gather tens, if not hundreds, of images. You’d then have to clean that up, align the faces in those screenshots or those frames, and then run it through a model, which wasn’t easy to navigate; you had to have some proficiency with software. Now those tool sets for face swapping are pretty accessible. They’re much easier than they were, but a lot of people just don’t bother with face swapping when you can use tools like Veo 3.1 or Sora 2 or Flux or Kling or some of these other models out there. What you do is, as if you were generating text with ChatGPT, you purely prompt for it. You are given the option to provide some direction, depending on how much you want to go into the details. And off you go: you press a button and your output is generated, with the top models now often within minutes. One of the extra things that also comes in, particularly with Sora 2, is the cameo feature, which allows you to swap real people who have consented to their likeness being used into videos, as well as yourself, which is notable compared to some of the previous models that have come before.

On the back end, what goes into improving the quality of this content? Like, how have these tools gotten so much better so quickly?

So I don’t really have the inside baseball on exactly what they’ve done. That’s the secret sauce, which hasn’t been made widely available, or at least the details haven’t been disclosed by the companies that are really pushing it forward. But compared to three or four years ago, the ability to generate that photorealistic output clearly suggests more data being put into the models, much higher quality data than was previously available, perhaps. I think we’re also probably seeing a lot more computational resource being dedicated than in prior efforts, to just brute-force it. That’s certainly how we’ve seen a lot of the AI revolution of the last six or seven years being pushed: just throwing more data and more compute at these kinds of problems, at, you know, generating better video or better audio.

Henry says that while the quality of this AI-generated content is rapidly improving, it’s still challenging to generate specific details in videos, like characters coming in and out of frame. It’s also difficult to create realistic replicas of people with less of an internet presence. For instance, Sora 2 is great at creating depictions of OpenAI CEO Sam Altman, but it can be less accurate for people with less of an online footprint.

One of the points that actually came out when I was speaking to a correspondent at the BBC who was doing a report on this, and who has brown skin, was that it really struggled to recreate his cameo well. And it particularly struggled to give him an English accent that sounded like his own, because it almost kept failing to match together those two parts of his identity. There is a concern that seems to be emerging, on a small scale here, but it’s part of a broader concern: that in an AI-first world where avatars are potentially a big part of how we communicate online, we might have an almost second-class citizen category, where white people are hyper-realistically represented with fine-tuned details, and people of ethnic minority backgrounds don’t get the same quality of avatar. That’s something that I think a lot of these companies are certainly aware of, but need to keep working on.

Yeah, it’s such a good point. I mean, we’ve talked about how bias can show up in AI systems in a number of ways. But if the companies don’t make a concerted effort to train these models with data that is representative of diverse groups of people, then the outputs of these models might also struggle to be representative of the world. I’m curious: Meta released a standalone scrolling feed of AI-generated videos even before OpenAI launched the Sora app, and people kind of mocked it. It did not take off in the way that Sora did. Why do you think Sora worked so much better?

Mmm. I think it was Meta Vibes, right? That was the name of the app. Simply put, with Sora 2 the quality of the model was better. I also think that on Vibes you weren’t able to clone yourself in the same way, and critically, you weren’t able to clone someone like Sam Altman, who opened up his likeness for anyone to use as they wished, within the bounds of the model’s safety measures. For Sora, that was just the perfect viral concoction. Whereas the kind of stuff that was coming out on Vibes was a little bit more typically anime, not hyper-realistic, not featuring well-known individuals; it doesn’t catch your attention in the same way. So there’s the ability to use it on yourself, and indeed on celebrities, and critically on deceased individuals, which is a huge ethical can of worms. But I think it was just perhaps a slightly better executed launch of a more powerful tool with broader functionality.

You mentioned deceased individuals being depicted in this AI-generated content, which I want to talk about a little bit more, in part because it raises this question of who can consent to their likeness being used. Last week, OpenAI announced that it was pulling back on the ability to create AI depictions of Martin Luther King Jr. on Sora, and that came after MLK Jr.’s daughter, Bernice King, called on people to stop sending her AI-generated videos of her late father. How do we work through this issue? It also strikes me that even if OpenAI decides to pull back on this, they’ve shown that it’s possible, and so almost certainly we’re going to see other companies with lower standards continue to make this possible, right?

Yeah, this is, in my view, one of the most devilish ethical challenges that we’re dealing with right now. I talk about this use of AI to generate deceased individuals in two ways. One is synthetic resurrection. I think this is closer to a format where something is done with the estate or the family members of a deceased individual, in consultation with them, and really puts emphasis on respecting the individual who’s deceased. It’s done thoughtfully. The alternative to this is what I call necromancy: this idea that we almost wantonly puppeteer the dead and make them dance for us. And a lot of what I see coming out on Sora 2 is closer to necromancy than to respectful synthetic resurrection. The problem is we cannot get informed consent for the synthetic generation of each deceased individual in every instance, right? They’re dead; we can’t ask them. We can, like Robin Williams, who was very ahead of his time, put in our wills that we don’t want people to bring us back in some form after we die. But there are many people who haven’t had that choice. So my attitude is, I don’t think we’re going to be able to stop this happening completely, but we need to make sure that there are clear procedures and protocols for how it should be done.

Zooming out a little bit, what’s in it for the companies that are making this technology? Many of these AI-generated content tools, Sora, Vibes, are currently free. Why do they want us creating this AI content?

Yeah, it’s a great question. And I think it’s worth saying that I think Sora 2 in particular is costing OpenAI a lot of money.

Lots of money, billions of dollars in data center investments.

But in terms of why they’re doing it, I have a couple of reactions. One is that the idea of the attention economy is pretty well established now: by gaining attention, you are ultimately gaining power, gaining political currency, commercial currency. And with the format that OpenAI has taken here, they will no doubt have seen the popularity of some of these AI-generated videos on mainstream platforms, right? For some people, one in three videos, if not even more, are now AI generated. And the way these algorithms work, particularly for TikTok, is that it knows what you want. So the reason people are getting so many of these videos is that they’re actually engaging with them. So I think these companies would say, well, look, why should we just control the means of production and not also the means of distribution? So I think it’s that combination: the attention economy, drawing people to this content on your terms, on your platform, and recognizing that with this kind of video, despite the fact that it’s controversial, despite the fact that it’s polarizing, the numbers speak for themselves. This is clearly gaining traction among some people. The key challenge that these businesses and organizations are going to need to figure out, though, is how much of this is novelty, how much of this has staying power, and will people ultimately get bored or just become apathetic towards this sort of stuff? And will it actually lead to a growing backlash in the long term, even if it produces short-term gains?

After the break, Henry and I discuss what our brain rot scroll sessions through AI slop content mean for all of us. And I ask him: is it even possible to distinguish the real from the artificial anymore? We’ll be right back. What does it mean for all of us that more and more of what we’re seeing online is artificial? Even if it isn’t the sort of extreme example we’ve talked about on this show, like fake war footage that’s meant to mislead people, even if it’s just the more generic slop, what does it mean for us that more of the media we’re consuming is AI-generated content?

I think one of the big things here is a kind of reality apathy. And this is something that is by no means brand new, but I think it’s being accelerated by AI-generated content. Now, we’ve lived in a synthetic world for a very long time. Really, since media has existed, we’ve tried to manipulate it. In the digital age, obviously, tools like Photoshop and others have been around for a long time. Many people don’t recognize that computational photography is ubiquitous in most flagship smartphones: when you take a picture and say no filter, there’s really still a huge amount of algorithmic work going on to shape the image you take. Synthetic media is everywhere; it’s already part of our daily lives. I think what has changed is how whole-cloth generative many of the tools now available are. So that flood of content, with a much higher level of sophistication than what was previously available, has led to a kind of awareness, right? There is this moment of people going, oh, I really can’t trust what I see and hear anymore, because look at what I’m coming across in my feed on a daily basis. That’s why I think you’ve seen such a huge spike in people saying, hey, Grok, is this real? on X, or Twitter. As if Grok is a good judge of what’s real. Musk has said that Grok will be able to do this, and perhaps they are working on making Grok a better digital forensic classifier, but at the moment it’s not designed to do that. And it worries me, precisely as you just indicated, Clare, that people are relying on this. But it shows they want to know, right? They are asking because they want to know. And right now, with the landscape as it stands, there aren’t a huge number of ways I can categorically give someone confidence in the authenticity of what they’re looking at. Sometimes there are still clear tells, but across the board there aren’t clear enough tells anymore that we can rely on. And what’s the reality in that situation? Well, it’s people saying, oh, I can’t tell, so I’m going to go with my gut. Not that they would ever say this out loud, you know.

Whoever I like is real, whoever I don’t like, maybe that video is AI generated.

Exactly, though no one would ever actually say, I can’t tell if this is AI or not, right? And it’s interesting, because we’ve seen this across the political spectrum, from the left and the right: people clearly wanting to believe that certain videos which suit their political perspective are real. This was quite well encapsulated on Joe Rogan’s podcast recently, where he reacted to a video of Tim Walz, an AI video of him, I think, dancing kind of strangely. And I’m paraphrasing here, but I think Joe Rogan’s response was something like, oh, he’s kind of creepy, he’s kind of weird. And someone on his show said, I think that’s actually AI generated. He was like, oh, yeah, you’re right. But you know what, that’s the kind of thing he would do, though.

And this is the worry I have with reality apathy. There’s a sense of a kind of revelatory truth that AI can provide, where you can know it’s fake and still find it impacting your view of the world, or of political candidates, or celebrities, or your family, whoever it might be. That’s what is concerning to me: if we lose the ability to know what’s real or not, it’s the same as giving our biases free rein.

You serve as an advisor to some of the big companies making and using this technology. What are you telling them in terms of how they should be doing this responsibly? And I wonder if there are certain things that you think this technology just shouldn’t be able to generate? Like, one of the most popular sort of formats we’ve seen on Sora since it launched is these fake videos of someone being pulled over by cops or CCTV video showing people stealing things. Like, are there clear lines that need to be drawn here in your mind?

There are challenges, because sometimes we want to allow certain kinds of content which in some contexts are allowable. Maybe we want to be able to show a video of someone stepping out of a fast car or, you know, in fancy dress or something like this. At the same time, we really clearly don’t want people creating content which is, you know, potentially incriminating for the subject being targeted, and we definitely don’t want that to involve people who haven’t consented to their likeness being used in that way. The good news is that a lot of these models have gone through some fairly robust safety testing to avoid things like pornographic content, certain kinds of violence, certain kinds of child abuse content, and certain kinds of phrases, racial slurs, and things like this. But that’s really the bare minimum. I don’t think we should be applauding companies for just doing that, and there is an expectation in my mind to do more. Now, the problem, particularly when these tools inevitably reach a global audience, is that there are so many different contexts at the local level to account for that it’s really difficult to do without human moderation. So it’s a very classic problem for a lot of these companies: how do we get the balance right between freedom of speech, what people should and should not be able to generate, and how do we then moderate it, using automated classifiers and keeping humans in the loop? I think a lot of these companies need to be doing a bit better on this, to be honest. We need a more iterative and more thoughtful approach, which, with the current arms race dynamic, is challenging. There’s no two ways about it.

Yeah. Do you have tips for people on how to navigate this new world of AI-generated content filling our feeds, how to tell what’s real and what’s not? Or do we need to just sort of let go of trying to do that by eye, and instead go about verifying whether events really happened or people really said these things?

I was worried you’d ask me this, Clare. This is a really tricky question to answer, because, as I said, there is a kind of nihilism or apathy that can be created if we don’t give people a sense of empowerment or a way out of the problem. But right now, I think there is an overestimation of the ability of free deepfake detection tools online in particular. The ones that are typically most accessible are the ones that are least reliable, and I have concerns about the widespread use of unreliable detection systems. There are things like digital nutrition labels, being able to provide provenance for a piece of media: how it’s been created, what tools have been used, when that was, and perhaps how it has been changed since. But right now adoption is low. For example, Sora 2 content had this metadata attached, but because so many platforms don’t support it right now, many people didn’t even know it was there. I think most people still don’t, right? So at the moment, what I have to tell people is, look, you can’t be a digital Sherlock. It’s not fair for me to say it is your responsibility to learn to do something that I struggle with now, as someone who’s been working on this for almost eight years. Asking your mom, your grocer, your lawyer, your best friend to become that digital Sherlock is not going to happen. And it can do more harm than good, again, as the signs I tell you to look for get trained out of these models. Back in 2018, it was: deepfakes don’t blink. And then, within a few months, deepfakes were blinking. But all of those articles and all of those podcasts where people were told what to look for stayed online, which is why I’m not going to give you a list, Clare. So what we need is more energy from the companies, from government, from civil society, from stakeholders, and from everyday people, demanding more of this digital trust infrastructure to help them navigate this new synthetic world. In a way where, yeah, they can view AI content; it’s not that it’s inherently bad, it’s not a guilty admission. But it gives them an informed position to make a judgment themselves about how they feel about the content they’re viewing, from a place of actually having that knowledge, not having to guess based on their gut.

Well, Henry, thank you so much for doing this. This was such an important conversation, and I’m sure that we could check in in six months and things will have changed again, but I really appreciate your time.

Absolutely. We remain on the slop front lines. I’m sure we can speak again in a year and it’ll be very different.

So, for better or worse, we’re all going to have to get used to seeing more AI-generated content all over the internet. And that means that, going forward, you have to take things you see online with a grain of salt. As Henry said, given how advanced this technology has become, there’s no foolproof way of detecting AI-generated videos across the board. So if you see a video of a public figure that seems a little too wild to be true, take a beat before sharing it, and check trusted sources to see whether what the video shows is real. That’s it for this week’s episode of Terms of Service. I’m Clare Duffy. Talk to you next week.


