
The Black Box: In AI we trust?

Publish Date: 2023/7/19

Unexplainable


This episode is brought to you by Shopify. Whether you're selling a little or a lot, Shopify helps you do your thing, however you cha-ching. From the launch your online shop stage, all the way to the we just hit a million orders stage. No matter what stage you're in, Shopify's there to help you grow. Sign up for a $1 per month trial period at shopify.com slash special offer, all lowercase. That's shopify.com slash special offer.

Hey, I'm Sean Illing. For more than 70 years, people from all political backgrounds have been using the word Orwellian to mean whatever they want it to mean.

But what did George Orwell actually stand for? Orwell was not just an advocate for free speech, even though he was that. But he was an advocate for truth in speech. He's someone who argues that you should be able to say that two plus two equals four. We'll meet the real George Orwell, a man who was prescient and flawed, this week on The Gray Area.

I went to see the latest Mission Impossible movie this weekend.

And it had a bad guy that felt very 2023. "The Entity has since become sentient." An AI becoming super intelligent and turning on us. "You're telling me this thing has a mind of its own?" And it's just the latest entry in a long line of super smart AI villains. "Open the pod bay doors, HAL."

Like in 2001: A Space Odyssey. "I'm sorry Dave, I'm afraid I can't do that." Or Ex Machina. "Ava, go back to your room!" Or maybe the most famous example: Terminator. "They say it got smart. A new order of intelligence decided our fate in a microsecond." But AI doesn't need to be super intelligent in order to pose some pretty major risks.

Last week on the first episode of our Black Box series, we talked about the unknowns at the center of modern AI, how even the experts often don't understand how these systems work or what they might be able to do. And it's true that understanding isn't necessary for technology. Engineers don't always understand exactly how their inventions work when they first design them. But the difference here is that researchers using AI often can't predict what outcome they're going to get.

They can't really steer these systems all that well. And that's what keeps a lot of researchers up at night. It's not Terminator. It's a much likelier and maybe even stranger scenario. It's the story of a little boat. Specifically, a boat in this retro-looking online video game.

It's called Coast Runners, and it's a pretty straightforward racing game. There are these power-ups that give you points if your boat hits them. There are obstacles to dodge. There are these kind of lagoons where your boat can get all turned around. And a couple of years ago, the research company OpenAI wanted to see if they could get an AI to teach itself how to get a high score on the game, without being explicitly told how.

We are supposed to train a boat to complete a course from start to finish. This is Dario Amodei. He used to be a researcher at OpenAI. Now he's the CEO of another AI company called Anthropic. And he gave a talk about this boat at a think tank called the Center for a New American Security. I remember setting it running one day, just telling it to teach itself. And I figured that it would learn to complete the course.

Dario had the AI run tons of simulated races over and over. But when he came back to check on it, the boat hadn't even come close to the end of the track. What it does instead, this thing that's been looping is it finds this isolated lagoon and it goes backwards in the course. The boat wasn't just going backwards in this lagoon. It was on fire, covered in pixelated flames, crashing into docks and other boats, and just spinning around in circles.

But somehow the AI's score was going up. Turns out that by spinning around in this isolated lagoon in exactly the right way, it can get more points than it could possibly ever have gotten by completing the race in the most straightforward way. When he looked into it, Dario realized that the game didn't award points for finishing first.

For some reason, it gave them out for picking up power-ups. Every time you get one, you increase your score and they're kind of laid out mostly linearly along the course. But this one lagoon was just full of these power-ups. And the power-ups would regenerate after a couple seconds. So the AI learned to time its movement to get these power-ups over and over by spinning around and exploiting this weird game design.

There's nothing wrong with this in the sense that we asked it to find a solution to a mathematical problem, how do you get the most points, and this is how it did it. But, you know, if this was a passenger ferry or something, you wouldn't want it spinning around, setting itself on fire, crashing into everything. This boat game might seem like a small, glitchy example, but it illustrates one of the most concerning aspects of AI. It's called the alignment problem.
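To make the scorekeeping concrete, here's a minimal sketch in Python of why a points-maximizing agent would prefer the lagoon loop. This is not OpenAI's actual training code, and every number in it (power-up counts, respawn times, point values) is invented for illustration; the only claim is that a reward defined purely as "points collected" can rank the spinning-in-circles policy above the intended finish-the-race policy.

```python
# Toy illustration of reward hacking, inspired by the Coast Runners story.
# All numbers here are made up for illustration; they are not from the real game.

RACE_DURATION = 120          # seconds of simulated play
COURSE_POWERUPS = 30         # power-ups laid out along the course (hit once each)
POINTS_PER_POWERUP = 100

LAGOON_POWERUPS = 3          # power-ups clustered in one lagoon
RESPAWN_SECONDS = 4          # they reappear a few seconds after being collected
LOOP_SECONDS = 4             # one full spin through the lagoon takes about this long

def finish_the_race() -> int:
    """Score for the 'intended' policy: drive the course and hit each power-up once."""
    return COURSE_POWERUPS * POINTS_PER_POWERUP

def loop_the_lagoon() -> int:
    """Score for the 'hacked' policy: spin in circles, re-collecting respawning power-ups."""
    loops = RACE_DURATION // max(LOOP_SECONDS, RESPAWN_SECONDS)
    return loops * LAGOON_POWERUPS * POINTS_PER_POWERUP

if __name__ == "__main__":
    print("finish the race:", finish_the_race())   # 3000 points
    print("loop the lagoon:", loop_the_lagoon())   # 9000 points -- the unintended behavior wins
```

Under these assumed numbers, looping scores three times as many points as finishing the course, which is exactly the kind of gap a reinforcement learner will find and exploit if the score is all it's told to care about.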

Essentially, an AI's solution to a problem isn't always aligned with its designer's values, how they might want it to solve the problem. And like this game, our world isn't perfectly designed. So if scientists don't account for every small detail in our society when they train an AI, it can solve problems in unexpected ways, sometimes even harmful ways.

Something like this can happen without us even knowing that it's happening, where our system has found a way to do the thing we think we want in a way that we really don't want. The problem here isn't with the AI itself. It's with our expectations of it. Given what AIs can do, it's tempting to just give them a task and assume the whole thing won't end up in flames.

But despite this risk, more and more institutions, companies, and even militaries are considering how AI might be useful to make important real-world decisions. Hiring, self-driving cars, even battlefield judgment calls. Using AI like this can almost feel like making a wish with a super annoying, super literal genie.

You got real potential for a wish, but you need to be extremely careful. This reminds me of the tale of the man who wished to be the richest man in the world, but was then crushed.

I'm Noam Hassenfeld, and this is the second episode of The Black Box, an Unexplainable series on the unknowns at the heart of AI. If there's so much we still don't understand about AI, how can we make sure it does what we want, the way we want? And what happens if we can't?

Thinking intelligent thoughts is a mysterious activity. The future of the computer is just hard moment. I just have to admit, I don't really know. You're confused, Doctor. How do you think I feel? Artificial intelligence. Can the computer think? No!

So given the risks here, that AI can solve problems in ways its designers don't intend, it's easy to wonder why anyone would want to use AI to make decisions in the first place.

It's because of all this promise, the positive side of this potential genie. Here's just a couple examples. Last year, an AI built by Google predicted almost all known protein structures. It was a problem that had frustrated scientists for decades, and this development has already started accelerating drug discovery.

AI has helped astronomers detect undiscovered stars. It's allowed scientists to make progress on decoding animal communication. And like we talked about last week, it was able to beat humans at Go, arguably the most complicated game ever made.

In all of these situations, AI has given humans access to knowledge we just didn't have before. So the powerful and compelling thing about AI when it's playing Go is sometimes it will tell you a brilliant Go move that you would never have thought of, that no Go master would ever have thought of, that does advance your goal of winning the game. This is Kelsey Piper. She's a reporter for Vox who we heard from last episode.

And she says this kind of innovation is really useful, at least in the context of a game. But when you're operating in a very complicated context like the world, then those brilliant moves that advance your goals might do it by having a bunch of side effects or inviting a bunch of risks that you don't know, don't understand, and aren't evaluating. Essentially, there's always that risk of the boat on fire.

We've already seen this kind of thing happen outside a video game. Just take the example of Amazon back in 2014. So Amazon tried to use an AI hiring algorithm that looked at candidates and then recommended which ones would proceed to the interview process.

Amazon fed this hiring AI 10 years' worth of submitted resumes, and they told it to find patterns that were associated with stronger candidates. And then an analysis came out finding that the AI was biased. It had learned, you know, that Amazon generally preferred to hire men, so it was more likely to recommend that Amazon hire men. Amazon never actually used this AI in the real world. They only tested it. But a report by Reuters found exactly which patterns the AI might have internalized.

"The technology thought, oh, Amazon doesn't like any resume that has the word 'women's' in it. An all-women's university, captain of a women's chess club, captain of a women's soccer team." Essentially, when they were training their AI, Amazon hadn't accounted for their own flaws in how they'd been measuring success internally. Kind of like how OpenAI hadn't accounted for the way the boat game gave out points based on power-ups, not based on who finished first.

And of course, when Amazon realized that, they took the AI out of their process. But it seems like they might be getting back in the AI hiring game. According to an internal document obtained by former Vox reporter Jason Del Rey, Amazon's been working on a new AI system for recruitment. At the same time, they've been extending buyout offers to hundreds of human recruiters.

And these flaws aren't unique to hiring AIs. The way AIs are trained has led to all kinds of problems. Take what happened with Uber in 2018, when they didn't include jaywalkers in the training data for their self-driving cars. And then a car killed a pedestrian. Tempe, Arizona police say 49-year-old Elaine Herzberg was walking a bicycle across a busy thoroughfare frequented by pedestrians Sunday night. She was not in a crosswalk.

And a similar thing happened a few years ago with a self-training AI Google used in its Photos app. The company's automatic image recognition feature in its photo application identified two black persons as gorillas and in fact even tagged them as so.

According to some former Google employees, this may have happened because Google had a biased dataset. They may just not have included enough Black people. The worrying thing is if you're using AIs to make decisions and the data they have reflects our own biased processes, like a biased justice system that sends some people to prison for crimes where it lets other people off with a slap on the wrist, or a biased hiring process, then the AI is going to learn the same thing.
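Here's a deliberately tiny sketch of that kind of coverage gap, using invented categories and numbers rather than anything from Uber's or Google's real systems: a classifier can only answer with labels it saw during training, so a situation missing from the data gets forced into the nearest category it does know.

```python
# Toy illustration of a coverage gap: a classifier can only choose among the
# categories it saw during training, so anything outside that data gets mapped
# onto the nearest known bucket. Categories and numbers here are invented.

TRAINING_DATA = {
    # label: list of 2-D feature vectors the model was trained on
    "cyclist_in_bike_lane": [(1.0, 5.0), (1.2, 4.8), (0.9, 5.1)],
    "pedestrian_in_crosswalk": [(6.0, 1.0), (5.8, 1.2), (6.1, 0.9)],
}

def centroid(points):
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

CENTROIDS = {label: centroid(pts) for label, pts in TRAINING_DATA.items()}

def classify(x, y):
    """Nearest-centroid classifier: always answers with some *trained* label."""
    def dist2(c):
        return (x - c[0]) ** 2 + (y - c[1]) ** 2
    return min(CENTROIDS, key=lambda label: dist2(CENTROIDS[label]))

# A situation the training set never covered, e.g. someone walking a bike
# mid-block: the model still returns one of its known labels rather than
# saying "I don't know".
print(classify(3.0, 3.5))   # -> "cyclist_in_bike_lane"
```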

But despite these risks, more companies are using AI to guide them in making important decisions. This is changing very fast. Like, there are a lot more companies doing this now than there were even a year ago. And there will be a lot more in a couple more years. Companies see a lot of benefits here. First, on a simple level, AI is cheap. Systems like ChatGPT are currently being heavily subsidized by investors.

But at least for now, AI is way cheaper than hiring real people. If you want to look over thousands of job applicants, AI is cheaper than having humans screen those thousands of job applicants. If you want to make salary decisions, AI is cheaper than having a human whose job is to think about and make those salary decisions. If you want to make firing decisions, those get done by algorithm because it's easier to fire who the algorithm spits out than to have human judgment and human analysis in the picture.

And even if companies know that AI decision-making can lead to boat-on-fire situations...

Kelsey says they might be okay with that risk. It's so much cheaper that that's like a good business trade-off. And so we hand off more and more decision-making to AI systems for financial reasons. The second reason behind this push to use AI to make decisions is because it could offer a competitive advantage. Companies that are employing AI in a very winner-take-all capitalist market, they might outperform the companies that are still relying on expensive human labor.

And the companies that aren't are much more expensive. So fewer people want to work with them and they're a smaller share of the economy.

And you might have huge, like, economic behemoths that are making decisions almost entirely with AI systems. But it's not just companies. Kelsey says competitive pressure is even leading the military to look into using AI to make decisions. I think there is a lot of fear that the first country to successfully integrate AI into its decision-making will have a major battlefield advantage over anyone still relying on slow humans.

And that's the driver of a lot in the military, right? If we don't do it, somebody else will, and maybe it will be a huge advantage. This kind of thing may have already happened in actual battlefields. In 2021, a UN panel determined that an autonomous Turkish drone may have killed Libyan soldiers without a human controlling it or even ordering it to fire.

And lots of other countries, including the U.S., are actively researching AI-controlled weapons. You don't want to be the people still fighting on horses when someone else has invented fighting with guns. And you don't want to be the people who don't have AI when the other side has AI. So I think there's this very powerful pressure not just to figure this out, but to have it ready to go. And finally, the third reason behind the push toward AI decision-making is because of the promise we talked about at the top.

AI can provide novel solutions for problems humans might not be able to solve on their own. Just look at the Department of Defense. They're hoping to build AI systems that, quote, function more as colleagues than as tools.

And they're studying how to use AI to help soldiers make extremely difficult battlefield decisions, specifically when it comes to medical triage. I'm going to talk about how we can build AI-based systems that we would be willing to bet our lives with and not be foolish to do so. AI has already shown an ability to beat humans in war game scenarios, like with the board game Diplomacy. And researchers think this ability could be used to advise militaries on bigger decisions, like strategic planning.

Cybersecurity expert Matt Devost talked about this on a recent episode of On the Media. I think it'll probably get really good at threat assessment. I think analysts might also use it to help them through their thinking, right? They might come up with an assessment and say, tell me how I'm wrong. So I think there'll be a lot of unique ways in which the technology is used in the intelligence community. But this whole time, that boat on fire possibility is just lurking.

One of the things that makes AI so promising, the novelty of its solutions, it's also the thing that makes it so hard to predict. Kelsey imagines a situation where AI recommendations are initially successful, which leads humans to start relying on them uncritically, even when the recommendations seem counterintuitive. Humans might just assume the AI sees something they don't, so they follow the recommendation anyway.

We've already seen something like this happen in a game context with AlphaGo, like we talked about last week. So the next step is just imagining it happening in the world. And we know that AI can have fundamental flaws. Things like biased training data or strange loopholes engineers haven't noticed.

But powerful actors relying on AI for decision-making might not notice these faults until it's too late. And this is before we get into the AI, like, being deliberately adversarial. This isn't the Terminator scenario with AI becoming super intelligent and wanting to kill us.

The problem is more about humans and our temptation to rely on AI uncritically. This isn't the AI trying to trick you. It's just the AI exploring options that no one would have thought of that get us into weird territory that no one has been in before. And since they're so untransparent, we can't even ask the AI, hey, what are the risks of doing this? So if it's hard to make sure that AI operates in the way its users intend...

and more institutions feel like the benefits of using AI to make decisions might outweigh the risks... what do we do? What can we do? There's a lot that we don't know, but I think we should be changing the policy and regulatory incentives so that we don't have to learn from a horrible disaster and so that we understand the problem better and can start making progress on solving it. How to start solving a problem that you don't understand, after the break.


The Walt Disney Company is a sprawling business. It's got movie studios, theme parks, cable networks, a streaming service. It's a lot. So it can be hard to find just the right person to lead it all. When you have a leader with the singularly creative mind and leadership that Walt Disney had, it like goes away and disappears. I mean, you can expect what will happen. The problem is Disney CEOs have trouble letting go.

After 15 years, Bob Iger finally handed off the reins in 2020. His retirement did not last long. He now has a big black mark on his legacy because after pushing back his retirement over and over again, when he finally did choose a successor, it didn't go well for anybody involved.

And of course, now there's a sort of a bake-off going on. Everybody watching, who could it be? I don't think there's anyone where it's like the obvious no-brainer. That's not the case. I'm Joe Adalian. Vulture and the Vox Media Podcast Network present Land of the Giants, The Disney Dilemma. Follow wherever you listen to hear new episodes every Wednesday.

So here's what we know.

Number one, engineers often struggle to account for all the details in the world when they program an AI. They might want it to complete a boat race and end up with a boat on fire. A company might want to use it to recommend a set of layoffs, only to realize that the AI has built-in biases. Number two, like we talked about in the first episode of this series, it isn't always possible to explain why modern AI makes the decisions it does, which makes it difficult to predict what it'll do.

And finally, number three, we've got more and more companies, financial institutions, even the military considering how to integrate these AIs into their decision-making. There's essentially a race to deploy this tech into important situations, which only makes the potential risks here more unpredictable. Unknowns on unknowns on unknowns. So what do we do? I would say at this point, it's sort of unclear.

Sigal Samuel writes about AI and ethics for Vox, and she's about as confused as the rest of us here. But she says there's a few different things we can work on. The first one is interpretability, just trying to understand how these AIs work.

But like we talked about last week, interpreting modern AI systems is a huge challenge. Part of how they're so powerful and they're able to give us info that we can't just drum up easily ourselves is that they're so complex. So there might be something almost inherent about lack of interpretability being an important feature of AI systems that are going to be much more powerful than my human brain. So interpretability may not be an easy way forward.

But some researchers have put forward another idea: monitoring AIs by using more AIs. At the very least, just to alert users if AIs seem to be behaving kind of erratically. But it's a little bit circular because then you have to ask, well, how would we be sure that our helper AI is not tricking us in the same way that we're worried our original AI is doing?

So if these kinds of tech-centric solutions aren't the way forward, the best path could be political, just trying to reduce the power and ubiquity of certain kinds of AI. A great model for this is the EU, which recently put forward some promising AI regulation. The European Union is now trying to put forward these regulations that would basically require companies that are offering AI products in especially high-risk areas

to prove that these products are safe. This could mean doing assessments for bias, requiring humans to be involved in the process of creating and monitoring these systems, or even just trying to reasonably demonstrate that the AI won't cause harm. We've unwittingly bought this premise that they can just bring anything to market when we would never do that for other similarly impactful technologies. Like think about medication. You gotta get your FDA approval. You've gotta jump through these hoops. Why not with AI?

Why not with AI? Well, there's a couple reasons regulation might be pretty hard here. First, AI is different from something like a medication that the FDA would approve. The FDA has clear, agreed-upon hoops to jump through. Clinical testing. That's how they assess the dangers of a medicine before it goes out into the world. But with AI, researchers often don't know what it can do until it's been made public. And if even the experts are often in the dark, it may not be possible to prove to regulators that AI is safe.

The second problem here is that even aside from AI, big tech regulation doesn't exactly have the greatest track record of really holding companies accountable,

which might explain why some of the biggest AI companies like OpenAI have actually been publicly calling for more regulation. The cynical read is that this is very much a repeat of what we saw with a company like Facebook, now Meta, where people like Mark Zuckerberg were going to Washington, D.C. and saying, oh, yes, we're all in favor of regulation. We'll help you. We want to regulate, too. When they heard this, a lot of politicians said they thought Zuckerberg's proposed changes were vague and essentially self-serving,

that he just wanted to be seen supporting the rules, rules which he never really thought would hold them accountable. Allowing them to regulate in certain ways, but where really they maintain control of their data sets. They're not being super transparent and having external auditors. So really, they're getting to continue to drive the ship and make profits while creating the semblance that society or politicians are really driving the ship.

Regulation with real teeth seems like such a huge challenge that one major AI researcher even wrote an op-ed in Time magazine calling for an indefinite ban on AI research. Just shutting it all down.

But Sigal isn't sure that's such a good idea. I mean, I think we would lose all the potential benefits it stands to bring. So drug discovery, you know, cures for certain diseases, potentially huge economic growth that if it's managed wisely, big if, could help alleviate some kinds of poverty. I mean, at least potentially it could do a lot of good. And so you don't necessarily want to throw that baby out with the bathwater.

At the very least, Sigal does want to turn down the faucet. I think the problem is we are rushing at breakneck speed towards more and more advanced forms of AI when the AIs that we already currently have, we don't even know how they're working. When ChatGPT launched, it was the fastest publicly deployed technology in history.

Twitter took two years to reach a million users. Instagram took two and a half months. ChatGPT took five days. And there are so many things researchers learned ChatGPT could do only after it was released to the public. There's so much we still don't understand about them. So what I would argue for is just slowing down.

Slowing down AI could happen in a whole bunch of different ways. So you could say, you know, we're going to stop working on making AI more powerful for the next few years, right? We're just not going to try to develop AI that's got even more capabilities than it already has. AI isn't just software. It runs on huge, powerful computers. It requires lots of human labor. It costs tons of money to make and operate,

even if those costs are currently being subsidized by investors. So the government could make it harder to get the types of computer chips necessary for huge processing power.

Or it could give more resources to researchers in academia who don't have the same profit incentive as researchers in industry. You could also say, all right, we understand researchers are going to keep doing the development and trying to make these systems more powerful. But we're going to really halt or slow down deployment and release to commercial actors or whoever.

Slowing down the development of a transformative technology like this is a pretty big ask, especially when there's so much money to be made. It would mean major cooperation, major regulation, major complicated discussions with stakeholders that definitely don't all agree. But Sigal isn't hopeless. I'm actually reasonably optimistic.

I'm very worried about the direction AI is going in. I think it's going way too fast. But I also try to look at things with a bit of a historical perspective.

Sigal says that even though tech progress can seem inevitable, there is precedent for real global cooperation. We know historically there are a lot of technological innovations that we could be doing that we're not, because societally it just seems like a bad idea. Human cloning or like certain kinds of genetic experiments. Like humanity has shown that we are capable of putting a stop or at least a slowdown on things that we think are dangerous.

But even if guardrails are possible, our society hasn't always been good about building them when we should. The fear is that sometimes society is not prepared to design those guardrails until there's been some huge catastrophe.

like Hiroshima, Nagasaki, just horrific things that happen. And then we pause and we say, "Okay, maybe we need to go to the drawing board." That's what I don't want to have happen with AI.

We've seen this story play out before. Tech companies or technologists essentially run mass experiments on society that we're not prepared for. Huge harms happen. And then afterwards, we start to catch up and we say, oh, we shouldn't let that catastrophe happen again. I want us to get out in front of the catastrophe. Hopefully that will be by slowing down the whole AI race.

If people are not willing to slow down, at least let's get in front by trying to think really hard about what the possible harms are and how we can use regulation to really prevent harm as much as we possibly can. Right now, the likeliest potential catastrophe might have a lot less to do with the sci-fi Terminator scenario than it does with us and how we could end up using AI, relying on it in more and more ways.

Because it's easy to look at AI and just see all the new things it can let us do. AIs are already helping enable new technologies. They've shown potential to help companies and militaries with strategy. They're even helping advance scientific and medical research. But we know they still have these blind spots that we might not be able to predict. So despite how tempting it can seem to rely on AI, we should be honest about what we don't know here.

So hopefully the powerful actors who are actually shaping this future, companies, research institutions, governments, will at the very least stay skeptical of all of this potential. Because if we're really open about how little we know, we can start to wrestle with the biggest question here. Are all of these risks worth it? That's it for our Black Box series.

This episode was reported and produced by me, Noam Hassenfeld. We had editing from Brian Resnick and Katherine Wells, with help from Meredith Hodnot, who also manages our team. Mixing and sound design from Vince Fairchild, with help from Christian Ayala. Music from me, fact-checking from Tien Nguyen. Mandy Nguyen is a potential werewolf, we're not sure. And Byrd Pinkerton sat in the dark room at the Octopus Hospital, listening to this prophecy.

Three thousand years ago, we were told that one day there would be an octopocalypse, and that only a bird would be able to ensure the survival of our species. You are that bird, Pinkerton. Special thanks this week to Pawan Jain, José Hernández-Orallo, Samir Rawashteh, and Eric Aldridge. If you have thoughts about the show, email us at unexplainable@vox.com, or you could leave us a review or a rating, which we'd also love.

Unexplainable is part of the Vox Media Podcast Network, and we'll be back in your feed next week.