
Bing’s Revenge + Google’s AI Faceplant

Publish Date: 2023/2/10

Hard Fork


This podcast is supported by KPMG. Your task as a visionary leader is simple. Harness the power of AI. Shape the future of business. Oh, and do it before anyone else does without leaving people behind or running into unforeseen risks. Simple, right? KPMG's got you. Helping you lead a people-powered transformation that accelerates AI's value with confidence. How's that for a vision? Learn more at www.kpmg.us.ai.

Casey, we've seen two big AI announcements this week, one from Bing and one from Google called Bard. I'm just curious, in this podcast of the two of us, who would you say is more of a Bard and who is more of a Bing?

Hmm. Well, you know, I've drawn a great deal of inspiration in my career from the immortal bard William Shakespeare, and so I would probably say the former for me. I would also say the bard because you are frequently late and often underwhelming. Oh, boy! Hey! Watch out! But I would say I am the Bing because I was laughed at for many years, and now no one's laughing. Yeah.

To be clear, I'm still laughing. I'm Kevin Roose. I'm a tech columnist at The New York Times. And I'm Casey Newton from Platformer. This week on the show, OpenAI CEO Sam Altman and Microsoft CTO Kevin Scott on the new Bing, Google's AI faceplant, and then Nothing, Forever. ♪

All right, so Casey, we had a field trip this week, our first joint hard fork field trip. It was a true whirlwind 24 hours up and back to Redmond, Washington. Yes, we went to Redmond just outside of Seattle. I got very excited on the way there because you got there a little earlier than I did, and you texted me that there was a cheesecake factory near our hotel, and I was very excited to go to the cheesecake factory. And then you pulled a little switcheroo, and you said, actually, let's go to this nicer restaurant,

where they do not have 27-page menus, and I was a little sad. Yeah, you couldn't get both sushi and spaghetti there, unlike at the Cheesecake Factory, but we had a good time anyway. And more importantly, we saw what I think, looking back, we may see as maybe one of the more important days in tech in 2023. Yeah, so to back up, last week...

A bunch of reporters, including both of us, get an email from Microsoft. And it's a pretty cryptic email. It doesn't have a lot of details in it, but it says, we're going to be doing this announcement at our campus in Redmond, Washington on Tuesday. And you're invited, but we're not going to tell you what it is.

Right. And this kind of thing, I think it's safe to say if I had gotten this email in, you know, 2018, I would have archived it. You would have said, I don't want to hear about Microsoft Word version 36. Right. I don't want to hear about the new capabilities of the relaunched Excel. You and I have both been on a lot of these over the years. I've gotten pretty skeptical about this genre of email.

sort of show and tell because, you know, the details always leak beforehand. There's never, you know, it's never quite as exciting as you want it to be. You're in kind of a scrum of other reporters. It's just not my favorite type of reporting trip. And there's also just like so much hyperbole, right? You know, it's like any company that wants to put on an event is going to tell you that everything is about to change. But the number of times that turns out to be true is generally pretty small.

Right. But this one, actually, I was very impressed by because what it turns out Microsoft was announcing is that they are relaunching Bing, their sort of famously mocked search engine. Mm-hmm.

with OpenAI's AI built into it. So it's basically, it's an updated model of GPT. We think it's GPT-4, Microsoft won't say for sure, but built right into the Bing search experience. Yeah, so two months ago, ChatGPT comes out. We start talking about it a lot. And one of the very first things that I think to myself is, when can I just do this in Google, right? And I think it's fair to say it was a real surprise that you were actually able to do it in Bing

before you could do it in the most dominant search engine on the planet. And I can't honestly even imagine what went into re-architecting their entire search engine so that it is powered by AI and then also building that into their web browser, Edge, with a fairly, you know, decent set of 1.0 features. And it seems like they did that all within just a couple of months. Yeah.

It's really impressive. And we should say that a lot of the new Bing features are similar to what's in ChatGPT, right? Like you can have it write you an essay. You can talk to it like a therapist. You can even describe something you're trying to code and it will actually spit out the lines of code from what you type.

But it feels really different to have this stuff built directly into a search engine. Yeah. So, you know, the first thing to say is that Bing's box that you can search things in is now just much bigger. You can enter up to a thousand characters of text.

and you can sign up for that waitlist now and eventually you'll be let in and then you can essentially do anything on Bing that you have been doing on ChatGPT. Microsoft has said they are going to limit the number of queries that you can do, but they have not said what those limits are going to be. I assume people are going to push this thing to the absolute limit, just like they've been doing with ChatGPT.

So that's sort of the first thing. The second thing is in this Microsoft Edge browser, which I think gets more interesting. So you install this browser, and right now you have to use the sort of like a developer version of the browser, which has the beta features. But basically, you surf the web normally, but there is now a Bing button in the top right-hand corner. And if you open that, you can do a couple of different things. One is there is a chat tab where...

which is, again, like ChatGPT, except you can now put up to 2,000 characters in there. So you could write a lot of characters. You could ask it a very, very complicated set of questions, and then it will go ahead and spit back answers. It will remember the context of your conversation. So if you ask a follow-up question, you don't have to explain it all over again. So this can become a pretty powerful research tool, particularly because, and I actually think this is huge, when...

The new Bing gives you an answer to your question. It adds a footnote and you can click the footnote and go to the web page where the information originally appeared. Right. It cites its sources in a way that ChatGPT did not. And we've talked about why that is a problem with ChatGPT, right? Like so much of the discussion is like, OK, but is what this thing telling me, is it real?

And there hasn't really been any way to know. This thing is just predicting words and sentences, right? But now with Bing, you can actually see where it's getting its information. That doesn't mean you're not going to have to hunt for it on the webpage, right? It's not perfectly cited, but it's way closer. So that's sort of the first thing that you can do in this browser. The other thing is that they have this compose feature. And now this actually does get a little bit funny. Okay, so you can...

tell the browser to write about whatever you like and there's a box where you can put something in. But you can specify the tone that you want it to write in. There are five available tones now for human conversation. Those tones are professional, casual, enthusiastic, informational, or funny.

Which one are you going to use when you write emails to me? Oh, I mean, it's going to be casual for sure. Yeah. I'm going to write to you informal, professional prose. Dear sir or madam. And then you can also choose the format so you can have it write a paragraph, an email, a blog post, or ideas. You can tell it how long you want it to write, and then you can generate a draft. So, you know, think about

all the people who are writing, you know, many of the same kinds of emails over and over again. And they probably have some templates that they're using, but now they can do this in a pretty automated way. It's pretty easy to customize. And one of the demos that they did on stage at Microsoft this week that just made me laugh was they showed somebody using the tool to write a LinkedIn post. I was just like,

If you're at Microsoft, do you really want to be encouraging people to, like, automate the creation of posts that were already half spam to begin with? I gotta say, I felt a little scooped by that because I have also been using ChatGPT to write my LinkedIn posts. That hasn't really been you posting about your wins and your grindset. Yeah.

No, it actually, I used this a couple times to write LinkedIn posts and it generated my most popular LinkedIn post ever. And what was that about? It was about AI. Okay. But it also ruined my life because it was like, it wrote me this LinkedIn post that solicited pitches for AI startups.

So for the next, like, three weeks, I got buried in, like, thousands of pitches from tiny AI, like, insurance startups. Well, that serves you right for violating the trust of your LinkedIn followers. Yeah, but it did make me an influencer, so I'll give it that. So we should just say what the search engine part of this looks like, because in some ways, Bing now...

looks like kind of two products in one, right? So if you search for something on Bing, the left side of the screen is basically the same search experience we're used to. It's ads, it's blue links, it's snippets, it's the classic search experience. And then on the right side of the screen, for certain kinds of queries, it starts writing this AI-generated thing. So for example...

I searched yesterday, just doing a demo, I searched, you know, write me a menu plan for a vegetarian dinner party. And the left side of the screen popped up a bunch of recipe sites and articles from food blogs about dinner plans for vegetarians. The right side of the screen actually did the sort of GPT-style presentation.

bullet point list where it gave me a menu of things that I could serve at my next vegetarian dinner party. I then was able to click a little chat button next to that and ask a follow-up question where I said, you know, write a grocery list for this menu sorted by aisle of the grocery store, including the amounts I need to make enough food for eight people.

Which I was like, I don't know if it's going to get that. That's a pretty hard query, and it did it. Yeah, and I mean, this is complex stuff. This would have taken you a really long time to do, and I'm guessing everything that you just described took no longer than 60 seconds to type out and get the answers. It was immediate. And I should say, like, it didn't get everything right that I tried, so I asked it for a list of kid-friendly activities happening in Oakland this weekend because I was curious, like, how it's going to handle current events. Yeah.

and it gave me a list, but it was like all things that had already happened. It was like something that happened last weekend. So it still needs some work, and I think, you know, Microsoft executives who were there at the event were very sort of

clear about the fact that this is not a perfect product, that they're going to keep iterating and improving it over time. But it still does have some of these drawbacks we've been talking about. I think their basic feeling is like the stuff that it can do well is so cool that it makes up for the fact that, yes, it is broken in a lot of ways.

So we're very high on this technology, but what are the actual risks here? What should we be watching out for as this stuff actually enters into the mainstream? So one obvious risk is just the risk of inaccurate information. That seems...

pretty glaring and it's going to be a while before it's sort of the main part of the search engine. Right now it's sort of a sidebar and I think it's going to stay that way because it's not super reliable yet. Another thing I've been thinking about is the

what this does to the whole industry of search engine optimized content. Because you have right now just tons and tons of websites that make all their money by, you know, gaming Google's algorithms, by putting up, you know, recipes or gift guides. What time is the Super Bowl? What time is the Super Bowl? And that all...

risks being destroyed if people are no longer clicking the blue links on the left side of their search screen, if they're just relying on the information that's extracted from this training data and put into this natural language response on the right side. So I think publishers...

are going to be really upset about this. I think there are going to be lawsuits. I think there's going to be some attempt to get some more prominent attribution. As you mentioned, Bing right now gives you little, like, annotations on some of the answers that says, you know, this answer came from Wikipedia, this answer came from the BBC or whatever. But they're very small. They're kind of hard to find. My guess is that not a lot of people are actually going to click through to the links. And so this is going to result in

a big traffic drop for some publishers. - Yeah, and you know, I mean like look, I'm a publisher, like presumably, you know, my articles are gonna get scraped and used to feed these things, and you know, part of me feels like, well yeah, that sucks, it would be cool if I could be compensated for that.

But I don't know if I feel that strongly enough that I don't want tools like this to exist, right? Like, it's just nice to be able to quickly have a robot do really accurate research on my behalf. Yeah, I've been trying to replace you for years, so it's very exciting. Yeah.

Well, listen, we have a lot of questions. And so as part of our visit to Microsoft, we were able to put these questions to Microsoft CTO Kevin Scott and Sam Altman, the CEO of OpenAI. And we have that conversation coming up for you right now. Well, thanks for sitting down with us. Really appreciate it. Very exciting demo.

Sam, when I saw the new Bing search result page with the blue links on the left side and the AI generated answer on the right side, my thought was like, A, that's very cool. B, publishers are going to lose their minds over this because, you know, why would I click a blue link as a consumer when I can get a natural language answer right there on the other side?

there's a whole industry in search engine optimization and how to appear higher in the results. So how are you both thinking about the reaction that this might get from the companies that depend on people clicking the blue links? I think Microsoft did a really nice job of putting links into that generated thing on the right, as you see. Like one of the criticisms, a fair criticism, of ChatGPT is you're not sure if it's accurate or not.

And so by adding all the citations that Microsoft has done, I think that will drive traffic out. And also if, you know, in the example of like buying a TV, if the click that happens at the end of that is, you know, to the top TV review and then you go buy from them, and your intent or likelihood to purchase through that is much higher, I think the quality of referred traffic should get super high, even if there's less volume. And...

Kevin, what do you think? Look, for everything that's different about this, like there are new opportunities for publishers and advertisers and users to use the technology in a different way. So I think to Sam's point, like there's going to be plenty of value there, like plenty of distribution for publishers, like plenty of ways for publishers

people to participate in like what this new marketplace is. So I'm also curious about like how the experience is going to be different in search versus what I've been doing in ChatGPT. With ChatGPT, I think your training data stops in 2021. With search, people want results that are a lot fresher, right? Might be about something that is sort of happening right now. So like, can you sort of quickly update the models, or is this stuff going to be on like a bit of a lag? No, no, no. So the way the technology actually works is we use the model to, uh,

go from the prompt to a set of queries that actually get executed on a live index. We pull that information back and then we give it to the model again to summarize and to complete whatever it is that the user wanted from the initial prompt. So it's always up-to-date with the latest information.
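The flow Scott describes here is what's now often called retrieval-augmented generation: the model rewrites the prompt as search queries, a live index answers them, and the model summarizes the results with citations. A rough sketch of that loop, purely for illustration (all three helper functions are hypothetical stand-ins for the model and index calls, not Microsoft's actual API):

```python
# Minimal sketch of the retrieval flow described above. Every function
# here is a stub: a real system would call a large language model and
# a live search index instead of returning canned values.

def generate_queries(prompt: str) -> list[str]:
    """Step 1: the model turns the user's prompt into search queries."""
    # A real implementation would ask the LLM for query rewrites.
    return [prompt, f"{prompt} latest"]

def search_index(query: str) -> list[dict]:
    """Step 2: each query runs against a live, up-to-date index."""
    # Stubbed result; a real index returns fresh documents with URLs.
    return [{"url": f"https://example.com/{abs(hash(query)) % 100}",
             "snippet": f"Result for: {query}"}]

def summarize(prompt: str, documents: list[dict]) -> str:
    """Step 3: the model summarizes the retrieved documents,
    attaching a numbered footnote for each source."""
    citations = " ".join(f"[{i + 1}]" for i in range(len(documents)))
    return f"Answer to '{prompt}' {citations}"

def answer(prompt: str) -> tuple[str, list[dict]]:
    """Full loop: prompt -> queries -> live index -> grounded summary."""
    documents = []
    for query in generate_queries(prompt):
        documents.extend(search_index(query))
    return summarize(prompt, documents), documents

response, sources = answer("kid-friendly activities in Oakland this weekend")
print(response)      # summary text with [1] [2] style footnotes
print(len(sources))  # each footnote maps back to a source document
```

Because the index, not the model's training data, supplies the facts, answers can stay current, and each footnote can link back to the page it came from, which is the citation behavior described earlier in the episode.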

Sam, one thing we've seen with ChatGPT since it's been out is that it's not infallible. It makes lots of mistakes, including, you know, math mistakes, just inventing things out of thin air. You've said that, you know, ChatGPT shouldn't be used for anything important. How are you thinking about that and plugging that into what could be a, you know, giant popular search engine that's used by tons of people every day? Are you worried about the mistakes that this AI could make?

I mean, first of all, it's a better model that makes less mistakes, but there's still a lot of mistakes that come out of the base model. But as Microsoft went through today and all of the layers they've put on top of that and all the technology they've built around it, there will still be mistakes. But the accuracy and utility, I mean, check it out yourself. Don't take my word for it. But I think it is significantly improved over the ChatGPT experience, and it will get rapidly better over time.

Are there categories of search or types of search that people should be more cautious about doing on Bing that may be more prone to errors than others? I don't know the answer to that yet. I mean, like, obviously, there are whole categories of queries that we're going to try to

really moderate just because you can imagine some of the harms that could come from them. I think we're going to learn pretty quickly where the actual mistakes are that the model makes. Every time we launch something, we learn so much just from launching it and seeing how users are interacting with it. We learned a ton from Copilot. There's an incredible amount of learning that's come from ChatGPT.

One thing I would point out is just the newness of all of this technology. You know, three years ago, GPT-3 hadn't launched. I think very few people believed that meaningful AI progress was going to happen. It's really only been...

a little bit over two months that I think this has had mainstream attention and interest, and trying to, like, figure out how we're going to use all this, how things will adapt. And with any new technology, you don't perfectly forecast all of the issues and mitigations. But if you run a very tight feedback loop, at the rate things are evolving, I think we can get to very solid products very fast.

Presumably you guys have been using this technology or using the new Bing for at least a few weeks, I would imagine. I'm curious, how has it changed the way you search? What are you using it to do that maybe felt inaccessible with previous search tools? It very quickly becomes indispensable. I miss it when I don't have access to it.

And it's everything from whimsical queries. Like I have a 14-year-old daughter who, like, says things like rizz and bussin, and like I have no idea what she's talking about. And you can, like, type this in, you know, like, hey, I've got a teenager. She's saying these words, like, I don't understand it. You have a Gen Z translator? You automated the cool dad. Yeah, and it's like really, really good at stuff like that. Just a nerdy thing that I did yesterday is like I'm a maker and I was looking for...

industrial bartack sewing machines, which I have previously heavily researched. And like using the new Bing, like it showed me something that I'd never seen before in any of my other searches. And like, I actually identified the machine that I think I'm going to go buy. Wow.

I mean, I'll say it a different way, which is I've always, given how close you are with Microsoft, I've always felt a little bit bad that I use Google and not Bing. But now I don't think I'll use Google anymore. Really? You just really don't want to go back. What about Chrome? Are you going to switch over to Edge?

So much of my, like, workflow is built into Chrome. Oh yeah, I would like to get there. Look, I'm happy for people to use whatever technology makes sense to them. This isn't an attempt to Jedi mind trick the whole world into using something that's not the best possible choice for them. Like, people will choose what's doing useful things for them. Yeah.

I mean, just as sort of a quick aside, once ChatGPT launched, I talked to the Google folks who were clearly very jealous. And they would say to me, like, well, you have to understand, ChatGPT is just a demo. And the sort of implication was like, we're building the real stuff back here, and you'll sort of see it someday. But then you looked like two months later, and it's like nobody was treating ChatGPT like a demo. People were just absolutely using it as part of their everyday lives. And I imagine we're going to see something similar with Bing. Yeah, I mean, ChatGPT is a horrible product. Yeah.

And it was really not designed to be used in people's... Why would you say that, though, Sam? Because it's like, it was a box you could type in, you could get almost anything you wanted. For sure. But if you think about the way people want to integrate it into their workflow, the way that you want it to be able to use your own data, integrate with your other applications and services, people are really just like,

going to a site that sometimes works and sometimes it's down, they're like typing in something, they're trying again until they get it right. And they're like copying the answer and going to paste it somewhere else and then going back and, you know, trying to like integrate that with search results or their other workflows, like

It is cool for sure, and people really love it, which makes us very happy. But no one would say this was a great, well-integrated product yet. It's not the 1.0 version. Yeah, it's like the 0.7 or something like that. But there's so much value here that people are willing to put up with it. And this is like a real product now. Kevin, I'm thinking a lot today as I'm hearing Microsoft executives talk about AI, about Tay, which was the infamous chatbot.

That's the name I have not heard in a long, long time. This chatbot, it came out. It was almost immediately seized on by trolls who taught it awful things and turned it into like a racist Holocaust-denying disaster, and it got pulled down. And I think that became a cautionary tale for a lot of people in the AI industry. And I think actually, if you talk to people at these big AI companies, it's a lot of the reason they haven't launched stuff as quickly as they have, because they remember that as sort of a moment of, oh,

oh, God, what have we done? Was it hard to kind of rally support internally at Microsoft for this? Were there people who said, like, wait a minute, didn't we learn our lesson about releasing an AI chatbot last time? Is there still scar tissue from that? Well, so I suspect there is scar tissue. I mean, the interesting thing about Tay is that was, like, I think,

one year before I started here at Microsoft. So I don't know the full history. One of the interesting things though that we learned from it wasn't like, "Oh my God, we should never do this again." It's like, "Oh my God, we need to really get serious about responsible AI safety and ethics." It was one of the catalyzing events because we were still very, very serious about AI and had high conviction that it was going to be one of the most important, if not the most important technology that

any of us have ever seen that like we had to go solve all of these problems to avoid making similar sort of mistakes again, which was not to say that we won't make mistakes, but we have spent the past five and a half years trying to really not just articulate a set of principles so that we can push them out as marketing materials, but like how do we put them into practice every day in the work that we're doing? And so in that sense, when this new

breakthrough came from OpenAI, we didn't have as much of that resistance as you would imagine. They knew what muscle they were going to go exercise to ensure that we could launch it with confidence that we were being safe and responsible.

Yeah, also probably helps that like this technology is useful, whereas like Tay was like kind of a novelty, you know, chatbot, I think. A lot of things in that era got launched by research teams just because they were super excited about like, hey, I've got this thing that technologically is interesting, and they weren't really thinking about the product implications.

Right. Well, and on that front, like, I'm sure you saw the story this week. There's like a Reddit community that is now devoted to jailbreaking ChatGPT, and sort of they figure out a way to do it and then you guys quickly patch it. And do you think that that's just going to be, like, an ongoing cat-and-mouse dynamic forever? No, because I think where we are right now is not where we want to be.

The way this should work is that there are extremely wide bounds of what these systems can do that are decided by not Microsoft or OpenAI, but society, governments, something like that. Some version of that, people, direct democracy, we'll see what happens. And then within those bounds, users should have a huge amount of control over how the AI behaves for them. Because different users find very different things offensive or acceptable or whatever. And no company, certainly not either of ours, I think is in a place where they...

They want to or should want to sort of say, you know, here are the rules. Like right now, we just, again, it's very new technology. We don't know how to handle it. And so we're being conservative. But the right answer here is very broad bounds set by society that are difficult to break and then user choice. And also having a vibrant community of people who are trying to

press on the limits of the technology to find where it breaks, and like where it does unusual, interesting things, is good. Super good. Yeah, it's really, I mean, it's the way the security community works. Like, I would rather have criticism and, you know, people finding bugs in my software than having those latent bugs in software get silently exploited and causing harm. So, you know, it's painful sometimes to

hear these things that are coming at us, but it's not something that I actually am wishing that it goes away. Sam, there are people, including...

Some at OpenAI who are worried about the pace of all of this deployment of AI into tools that are used by billions of people, people who worry that maybe it's going too fast, that corners may be getting cut, that some of the safety work is not as robust as maybe it should be. So what do you say to those people who worry that this is all going too fast for sort of society to adjust or for the necessary guardrails to be put in?

I also share a concern about the speed of this and the pace. We make a lot of decisions to hold things back, slow them down. You can believe whatever you want or not believe about rumors, but maybe we've had some powerful models ready for a long time that for these reasons we have not yet released. But I feel two things very strongly. Number one, everyone has got to have the opportunity to understand these tools,

the pluses and minuses, the upsides and downsides, how they're going to be used, decide what this future looks like, co-create it together. And the idea that this technology should be kept in a narrow slice of the tech industry because those are the people who we can trust and the other people just aren't ready for it

You hear different versions of this in corners of the AI community, but I deeply reject that. That is not a world that I think we should be excited about. And given how strongly I believe this is going to change many, maybe the great majority of, aspects of society, people need to be included early, and they need to see it, imperfections and all, as we get better, and participate in the conversation about where it should go, what we should change, what we should improve, what we shouldn't do.

And keeping it like hidden in a lab bench and only showing it to like, you know, the people that like we think are ready for it or whatever, that feels wrong. The second is in all the history of technology I have seen, you cannot predict all of the wonderful things that will happen and the misuse without contact with reality. And so by deploying these systems and by learning and by, you know, getting the, the

the human feedback to improve, we have made models that are much, much better. And what I hope is that everything we deploy gets to a higher and higher level of alignment. We are not,

Microsoft and OpenAI, we are not the companies that are rushing these things out. We've been working on this and studying this for years and years. And we have, I think, a very responsible approach. But we do believe society has got to be brought into the tent early. Are you worried about the kind of arms race quality of this? That, you know, Google is having an event this week. They're reportedly rolling out a conversational AI called Bard and doing some updates to search.

Every other company I talk to in Silicon Valley is racing ahead to kind of catch up with ChatGPT. Does that quality worry you? I mean, we aren't the ones like, you know, rushing out announcements here. Again, the benefits of this technology are so powerful that I think a lot of people do want to build these things.

But there's totally going to need to be industry norms and regulation about what standards we need to meet on these systems. I think it's really important.

I'm curious if you could describe, even in just a vague way, what sorts of capabilities you're exploring that feel a little bit too powerful to share or haven't been properly tested. Because this stuff feels... You're just really talking out of both sides of your mouth. Well, how do you mean? Well, you're saying, are you nervous about accelerating things? And can you tell us something crazy? Well, I...

Well, the reason I ask is because I imagine some people are going to listen to this and they've used ChatGPT and they think this is really powerful and they probably think it's really cool. And they know that y'all are also working on some more stuff. But it does sound kind of, I don't know, mystical or spooky that there's these other technologies being worked on that are even more powerful. No, I don't. Yeah. Another downside of not putting this stuff out, by the way, is...

Right away, I think people assume that we have a full AGI ready to push the button on, which is nowhere close. We have somewhat more powerful versions of everything you've seen and some new things that are broadly, I think, in line with what you would expect. And when we are ready, when we think we have completed our alignment work and all of our safety thinking and worked with...

external auditors, other AGI labs, then we'll release those things. But there are definitely some major missing ideas between here and super powerful systems. We don't have those figured out yet, nor as far as I know does anybody else.

We've been hitting you pretty hard on the safety and the responsibility questions, but I wonder if you want to sketch out a little bit more of a utopian vision here for like sort of once you get this stuff into the hands of hundreds of millions of people and this does become part of their everyday life, like what is the brighter future that you're hoping to see this stuff create? I think Kevin and I both very deeply believe that if you give people better tools, if you make them more creative, if you help them

think better, faster, be able to do more, like build technology that extends human will, people will change the world in unbelievably positive ways. And there will be a big handful of advanced AI efforts in the world. We will contribute one of those. Other people will contribute one. Microsoft will deploy it in all sorts of ways. And that

tool, I think, will be as big of a deal as any of the great technological revolutions that have come before it in terms of what it means for enabling human potential, the economic empowerment, the kind of creative fulfillment and

empowerment that will happen. I think it could be jaw-droppingly positive. We could hit a wall in the technology. You know, I don't want to promise we've got everything figured out. We certainly don't. But the trajectory looks really good. And the trajectory is towards more accessibility. Like the thing that I come back to over and over again is the first

machine learning code that I wrote 20 years ago took a graduate degree and a bunch of grad textbooks and a bunch of research papers and six months' worth of work. And like that same effect that I produced back then, a motivated high school student could do in a few hours on a weekend. And so like the tools are putting

more power in the hands of people. So now it's not even that you have to have a computer science degree to do very complicated things with technology. Like, you can sort of approach it, you can program in natural language, which is super cool. Yeah, Andrej Karpathy said, I think, that the next big programming language is English. Yes. Yeah, which is very cool. Just think about it. Like, that's

a thing I think that is holding us back so much in terms of unlocking people's creativity and potential. And, I mean, like, I grew up in rural central Virginia. People are smart and entrepreneurial and ambitious. You give them powerful tools, they're going to go do interesting things. Yeah. That's very cool. I know we're out of time, but thank you guys for doing this. Really appreciate it. Thank you guys. Thank you so much. We appreciate it.

Coming up after the break, Google puts the AI in fail. That's really good.

Welcome to the new era of PCs, supercharged by Snapdragon X Elite processors. Are you and your team overwhelmed by deadlines and deliverables? Copilot Plus PCs powered by Snapdragon will revolutionize your workflow. Experience best-in-class performance and efficiency with the new powerful NPU and two times the CPU cores, ensuring your team can not only do more, but achieve more. Enjoy groundbreaking multi-day battery life, built-in AI for next-level experiences, and enterprise chip-to-cloud security.

Give your team the power of limitless potential with Snapdragon. To learn more, visit qualcomm.com/snapdragonhardfork. Hello, this is Yewande Komolafe from New York Times Cooking, and I'm sitting on a blanket with Melissa Clark. And we're having a picnic using recipes that feature some of our favorite summer produce. Yewande, what'd you bring? So this is a cucumber agua fresca. It's made with fresh cucumbers, ginger, and lime.

How did you get it so green? I kept the cucumber skins on and pureed the entire thing. It's really easy to put together and it's something that you can do in advance. Oh, it is so refreshing. What'd you bring, Melissa?

Well, strawberries are extra delicious this time of year, so I brought my little strawberry almond cakes. Oh, yum. I roast the strawberries before I mix them into the batter. It helps condense the berries' juices and stops them from leaking all over and getting the crumb too soft. Mmm. You get little pockets of concentrated strawberry flavor. It tastes amazing. Oh, thanks. New York Times Cooking has so many easy recipes to fit your summer plans. Find them all at NYTCooking.com. I have sticky strawberry juice all over my fingers.

So, okay, we have this big Microsoft event. We get on a plane, we come back to San Francisco, and then on Wednesday, Google...

joins the party. Yeah, and this was, I would say, the second time this week that Google tried to upstage Microsoft with AI-related announcements. Right, so on Monday, Google announced its AI chatbot, its ChatGPT competitor, called Bard. And Bard is based on a version of LaMDA, which is Google's language model. And Bard did not go so well in its first demo. In fact, in its first demo...

Wait, wait, wait, wait, wait. We actually can't call it a demo because unlike Microsoft, unlike OpenAI, Google still has barely let anyone actually try this thing. All that Bard is at this point is a blog post. And so in order to have a successful, like, pseudo-launch of Bard, Google really only had one job. And that was to not screw up the one screenshot of Bard answering a question. So that was the task ahead of them, Kevin. How'd they do?

They did not do so well. So Google shared a sample question that was asked of Bard in an animated GIF.

And do you say GIF or JIF? I say GIF. Yeah, GIF. This is a GIF podcast. Yeah, it's a GIF podcast. The G in GIF stands for graphics. Settled. So this GIF showed Bard answering the question, what new discoveries from the James Webb Space Telescope can I tell my nine-year-old about? Bard offered three answers, including one that stated that the telescope took the very first pictures of a planet outside our solar system.

So pretty quickly, a bunch of astronomers on Twitter pointed out that this is wrong, that the first image of an exoplanet, which is what a planet outside of our solar system is called, was taken in 2004, not by the James Webb Space Telescope. Now, there's subsequently been some debate, with people saying, well, maybe Bard was actually right, because there is this one specific exoplanet that the James Webb Space Telescope was the first telescope to take a picture of. So if you read it...

one way, it actually was right, and we made the mistake as humans in interpreting it. But I think the way that most people interpreted it is that Bard was wrong. So in its first demo, not even a demo, in its very first screenshot, Google's

AI chatbot made an error. There's lots to say about this, but when I read this headline, I immediately flashed back to a conversation I had with a former Googler recently, and we were talking about this code red inside the organization to try to really sort of galvanize energy around AI stuff. And I was sort of asking this person how they thought it was going to go. And this person said, the thing about Google is it doesn't

do well when it is panicked. And this person reminded me of the last time Google panicked, which was when Facebook was really sort of encroaching on their territory and they announced Google+. We all know how Google+ went. And so, look, I think it's probably going to go better for Google with AI, but oh man, this is not the way to start. Yeah, an inauspicious debut for sure. And then on Wednesday, Google held an event, which we did not attend in person, in part because it was in Paris. Yeah. We would have loved to have attended. Yeah.

We could be strolling down the banks of the Seine right now, enjoying a nice cup of coffee and some macarons. A croissant. A pain au chocolat. Name a better morning, I dare you. But we were not invited and couldn't have made it anyway because we were en route back from Redmond. But this event, which was live-streamed from Paris...

featured Google executives showing off some of Bard's capabilities and talking about AI improvements that they're making to a number of other Google products, including Maps and Google Lens.

And while the Microsoft event, I think, generally raised investors' hopes for Microsoft, this event on Wednesday by Google seemed to have the opposite effect. No, it literally had the opposite effect. Yes. So after the event, shares in Alphabet, Google's parent company, fell about 7%. Which is like...

Do you know how bad your tech demo has to go for your stock to drop 7%? I mean, like, again, this is probably a hiccup. I think it's going to be fine. But it is really remarkable in this moment how not great this stuff is going for them. Yeah, and it's just such a stark contrast to last summer, if you can remember, when we were all talking about Google's

mysterious new AI technology, and the engineer who got fired from the company because he claimed it was sentient. So we've gone from sentient chatbot mysteries to shares falling 7% because... It can't answer a telescope question. It can't answer a telescope question, in less than a year. And I just think that's a remarkable turn of events. It is. I mean, also, I would say, like, looking through these announcements, and I should say, I have not had a chance to try any of these features, but none of this stuff is really making me...

super excited, right? Like, the stuff we've been talking about today is stuff that will legitimately change work for millions, potentially billions of people. And the stuff Google announced this week is just not playing in the same league. Yeah, it feels like incremental small ball, some of it. I will say, I think it's a mistake to underestimate Google here. They've got some of the best AI researchers and engineers in the world. They've got a huge...

dominant lead in search and other categories, and they have all the data and processing power they need. So I think they're going to get up this hill rather quickly. But I think one effect that ChatGPT has had on the tech industry that I think is actually good is

that it has forced companies to put up or shut up, right? We are not going to be impressed by blog posts or research papers. Unless I can use a thing, it is not real to me. You know what I read today that I had forgotten? In 2016, Sundar Pichai stands up on stage at Google I/O and he says, as of today,

Google is an AI-first company. That was almost seven years ago that he said that. Look, there are trade-offs to launching more quickly. We have a lot of concerns about what letting this stuff loose on the masses is going to mean for everyone.

and yet we also do think it's cool, and we're using it to do stuff, and that stuff is making our lives better and easier, right? And it is just shocking that the company that invented the T in GPT is still this far behind. And actually, my question to you is, if you are Sundar Pichai, and you have what you have, do you do a blog post and an event this week knowing what Microsoft is about to show off? Interesting. I think I would probably...

I think I would probably wait until my thing is ready for primetime for the masses. Because again, like if it's not, if I can't log on and try it myself, it's not real to me.

You know, this is one thing that Apple has always really understood: there's no sense in previewing something. Even if you feel like all your competitors are six months or a year ahead of you, stick to your knitting, finish the project, say nothing about it until the moment when you're ready to put it on sale, make it available for consumption. And if you do that and the product is good, then you win. Totally. What would you do if you were Sundar Pichai?

Here's what I would do. I would say nothing until Bard was ready to be shown to the public, right? It's one thing if they just sort of want to, I don't know, give a reporter a quote saying, yes, we do have something. You'll hear about it soon. I think that's fine, but I don't want to see another blog post. I don't want there to be another event until the stuff is ready to be shown, right? In exactly the same way that we saw this week. Because at the end of the day, what users really care about is which of these services

is better? Which can I use now? If it's going to cost me something, what is it going to cost me? Those are the relevant questions. The relevant question is not, is Google working on something that will be available someday? We know that, right? So say nothing, and it'll go better for you in the end. And I think it's also just quite important that companies are doing all of the necessary safety work on these powerful language models before throwing them open to the public. I am really worried, as I know a lot of people in the AI industry are, about this kind of

arms race quality, you know, that I asked Sam Altman about, where it is the case that Google is sprinting to get this stuff out the door. And whether it comes this week or next week or six months from now, my priority as a human being is for them to do the necessary work of making sure that these things are not going to be misused and weaponized, because I think that is a real danger, especially as these systems get better and better. Awesome.

All right. Now let's go look and see what that space telescope has been up to since we've been talking. If nothing else, I learned what an exoplanet is today. Coming up after the break, what is the deal with bias in AI?

Indeed believes that better work begins with better hiring, and better hiring begins with finding candidates with the right skills. But if you're like most hiring managers, those skills are harder to find than you thought. Using AI and its matching technology, Indeed is helping employers hire faster and more confidently. By featuring job seeker skills, employers can use Indeed's AI matching technology to pinpoint candidates perfect for the role. That leaves hiring managers more time to focus on what's really important, connecting with candidates at a human level.

Learn more at indeed.com slash hire.

Christine, have you ever bought something and thought, wow, this product actually made my life better? Totally. And usually I find those products through Wirecutter. Yeah, but you work here. We both do. We're the hosts of The Wirecutter Show from The New York Times. It's our job to research, test, and vet products and then recommend our favorites. We'll talk to members of our team of 140 journalists to bring you the very best product recommendations in every category that will actually make your life better. The Wirecutter Show, available wherever you get podcasts.

All right, Kevin. So have you been watching this Twitch stream called Nothing Forever? I have not. Okay. So it's an art project from a couple of guys who had this idea. What if you could take generative AI, like the kind that powers everything we've been talking about today, and instead of using it to answer questions about telescopes and planets, what if you could make an episode of Seinfeld that just went on forever? So it's a never-ending...

procedurally generated episode of Seinfeld where all of the lines are being written by AI, read by an AI voice, and there's AI animation happening in the background. That's right. So both the dialogue and the video are being created in real time by AI. There's no human animating this stuff. Yeah. Okay. It's not technically Seinfeld because that would be a copyright violation. And so instead of calling the character Jerry Seinfeld...

the lead comedian in this show is called Larry Feinberg. Everyone else has different names. Cosmo Kramer is Zoltan Kackler. No, come on. Anyway, so it's sort of very low-quality, pixelated scenes, with bad MIDI music.

And the action just kind of shifts from place to place, from like Jerry slash Larry's apartment to all the other locations in the Seinfeld universe. And the characters just talk to each other. There's a laugh track. There are some attempts at jokes. It will like cut to the comedy club where, you know, Larry is doing stand-up. And people are just really struck by the weirdness of this.

It is occasionally funny. It went viral because the characters appeared to become self-aware for like a minute, like asking what they were all doing here. Actually, let me just play that for you. Did you ever stop and think that this might all be one big cosmic joke? Well, I don't think it's necessarily all, you know. Yeah, I know. I just mean like, why are we here? To tell jokes, obviously. I mean, why are we here together?

Maybe fate put us in the same place for a reason? To make the world a funnier place? Yeah, that could be it. Alright, I'm in. Let's do it!

Okay, so the vibe is sort of Simpsons meets Minecraft, but it's got some elements of Seinfeld. Like, it cuts between, like, you know, exterior shots of an apartment building and, like, a sort of Minecraft-looking replica of the apartment from Seinfeld. So clearly it's trained on...

some Seinfeld episodes, but it's also like the dialogue is weird. The voices are all robotic. The graphics are very bad. So it's not like, like I wouldn't watch this for more than a few minutes, but I do think it,

It's really interesting as a sort of an art experiment. Yeah, and we should say, like, there is a genius to this, right? So much of people's TV consumption is just, like, watching The Office from, like, the start of season one until the very end and then restarting it. If you could take the humor of that show and generate endless riffs on it in a never-ending episode of The Office, like, you could probably make a lot of money. Yeah.

Someone just pitched that to Peacock. Yeah. So, anyway, unfortunately, you know, as does sometimes happen with these AI stories, this one does not have a happy ending. Recently, the Twitch stream was suspended after one of these comedy stand-up segments when Larry Feinberg, the virtual comedian, said some things that were homophobic and transphobic. And they got yanked. Yeah. Yeah.

And it does seem like this is a place we've been before. It seems like it's often the case that you let an AI chatbot have a little bit of room to run, and the next thing you know, it's reflecting all of the biases in our society. Do we know why it became transphobic and homophobic? We do, and it's super interesting.

The dialogue in Nothing Forever was running off of GPT-3, okay? And there are different flavors of GPT-3. The more sophisticated one, which Nothing Forever was using most of the time, is called DaVinci. But it was causing some problems, and so the creators switched to a less sophisticated model of GPT-3 known as Curie.

And the creators think that they accidentally stopped using OpenAI's content moderation and safety tools when they switched over to Curie. So it was after they switched models that the characters started saying the bad things. That's really interesting. How long had this thing been running before it broke bad? For a couple of months. A couple of months? Yeah, it started up in December. And, you know, I think...

I'll speak only to the gay part of it as a gay man. So one of the things that it said was sort of Larry Feinberg is musing about some comedy routines he might do in the future. And he proposes one that, quote, all liberals are secretly gay and want to impose their will on everyone. Yeah.

And we should say: don't be doing conspiracy theories about the gays. That's bad. Um, but that's not, like, the worst thing I've ever heard about gay people. And I'm not saying that to let the AI off the hook, right? Like, we do want these things to have controls, and we don't want them to be running wild. Um, but I guess what I'm saying is, what I'm struck by is not, like, oh my gosh, this AI just said the absolute most offensive thing that you've ever heard. Um,

I think what's interesting is it's showing us how important it is to know what model you are using. And here's where I actually think that this story goes from a silly thing that happens online to something that has implications for anyone working on this stuff. One of the stories I've been most fascinated by, we've talked about on the show, is how CNET started to use this AI to generate articles to provide SEO bait, right?

And one of my frustrations in trying to understand this story is CNET will not tell us which AI tools it is using, right? And so it becomes very difficult to evaluate how much we should trust this AI in the future because we literally don't know what we're using. You know, the way all of these tools work is they're trained on some

body of material, and then they just predict answers based on that body of material. Because that's the case, it's really important to know what the body of material is. If you want to create an episode of Seinfeld that goes on in perpetuity, it would be really good to just train it on, I don't know, every episode of Seinfeld, right? And maybe some like

extra supplemental material. And if you did that, you would probably have a pretty good chance of it not saying something super transphobic or homophobic. Right. And it's also going to be a test for platforms that host

generated content. I mean, just to kind of link this with the Microsoft visit: the Microsoft executive who used Bing to write a LinkedIn post, there's not that much difference between doing that and creating an endless Seinfeld episode. Like, we are entering a world where

a lot of the content that gets submitted to social media sites and posted online that gets streamed on Twitch and YouTube, a lot of it is going to be algorithmically generated. And maybe there will be a human sort of overseeing it and editing it on the back end before it goes up. But

Maybe there won't. Maybe we're headed into a world of never-ending Seinfeld episodes. Well, you said to me yesterday that you assume that going forward, any email you receive will not have been written by a person. Oh, yeah. I'm assuming that we have reached the singularity on this already. It's going to become increasingly hard to tell...

whether the post you see on Instagram or the email in your inbox or even the video that you're watching on Twitch or YouTube was generated by a human or not. And, you know, as Microsoft was demoing this new sort of writing assistant that it's built into its web browser now, I was thinking like,

I may never get a PR email from a human again. All right. So what do we do about all of this? I think that one of the answers here is just making people disclose what models they are using, right? And

I'm going to say it, if you're sending me email that was written by an AI, I actually want to know that. I don't think I'm going to get that in Gmail, but I'd like it. If I'm watching a stream on Twitch that's generated by an AI, I do think it should disclose this show is running off of this model. I want to be able to click on that model. I want to know where the training data came from. I want to know if it has moderation guidelines. We already have these community standards in place for all of these tech platforms. That's been really good for their businesses.

I think these large language models should have similar kinds of disclosures because, as you pointed out, this stuff really could damage trust for a lot of reasons. You never know who you're talking to. We could rebuild that if we knew what technologies were baked into all these new tools we're going to be using. Well, but to take the other side of that, like when you watch a movie that has CGI graphics in it,

they don't tell you what program they used to create the CGI, right? We don't expect that the things that we see on screen are all generated by human animators and directors. We sort of accept that there's some computer wizardry in there, too. So how is that any different from getting an email from a PR firm that was written

by a robot. Well, like when, you know, when Marvel uses a computer program to make Iron Man fly, that's just a piece of art that is designed to stimulate and entertain me. When you are using it to write emails, the assumption that everyone is trading on is that a person wrote this. And if that's not going to be true, then I just think that we should have some basic questions answered for us. Yeah. Yeah, I can see there being sort of, you know, when your email program, like,

puts a little thing at the bottom that says, like, sent from my iPhone. Please excuse typos or whatever. I think we will start to see some programs that put little disclaimers in there, like, this email was drafted by Bing or this email was drafted by GPT.

Yeah, because you know what else is going to happen is just in the same way that fake Seinfeld broke bad, these AIs that people are using to write emails are going to say stupid or offensive things. And then people are going to scramble and be like, ah, it wasn't me. It was the AI. If you actually have a disclosure in there, you'll have more plausible deniability. Exactly. I can say all kinds of rude and offensive things over email now. And if people question me about it, I can just say it was Bing's fault.

Thanks, Bing. All right. Well, I'd love to end this on a Seinfeld joke, but I really haven't watched it much over the past 20 years.

BP added more than $130 billion to the U.S. economy over the past two years by making investments from coast to coast. Investments like acquiring America's largest biogas producer, Archaea Energy, and starting up new infrastructure in the Gulf of Mexico. It's and, not or. See what doing both means for energy nationwide at bp.com slash investing in America.

Before we go, I just want to thank our listeners. We've been getting so many great emails, including a number that urged us to check out this endless Seinfeld episode that we just discussed. So thank you to everyone who wrote in suggesting that we take a look at that. Please keep sending us your ideas. We really are listening to you. I also wanted to shout out our reader, Robin, who wrote, in one of your latest episodes, you mentioned CNET's false AI article on compound interest.

I was struggling in my high school econ class and stumbled across the article before it was corrected. As you can probably guess, I didn't do so hot on my latest test. I feel kind of silly for not spotting the inaccuracies. Robin, don't be so hard on yourself. You were misled by a rogue AI, but hopefully you've now learned to be on alert. Yeah, and please don't use Bard for your next astronomy quiz.

Hard Fork is produced by Davis Land. We were edited this week by Jen Poyant and Paula Szuchman. This episode was fact-checked by Caitlin Love. Today's show was engineered by Alyssa Moxley. Original music by Dan Powell, Elisheba Ittoop, Marion Lozano, and Rowan Niemisto. Special thanks to Hanna Ingber, Nell Gallogly, Kate LoPresti, and Jeffrey Miranda. That's all for this week. The solution for Wednesday's Wordle was flail.

Imagine earning a degree that prepares you with real skills for the real world. Capella University's programs teach skills relevant to your career so you can apply what you learn right away. Learn how Capella can make a difference in your life at capella.edu.