
Nvidia Part III: The Dawn of the AI Era (2022-2023)

Publish Date: 2023/9/6

Acquired


Transcript

You like my Buck's T-shirt? I love your Buck's T-shirt. I went for the first time, what, two weeks ago when I was down for a meeting at Benchmark, and the nostalgia in there is just unbelievable. I can't believe you hadn't been before. I know Jensen is a Denny's guy, but I feel like he would meet us at Buck's if we asked him, or at the very least, we should figure out some Nvidia memorabilia to get on the wall at Buck's. Totally, it'd fit right in. All right, let's do it. Let's do it. Welcome to Season 13, Episode 3 of Acquired, the podcast about great technology companies and the stories and playbooks behind them. I'm Ben Gilbert, I'm David Rosenthal, and we are your hosts. Today we tell a story that we thought we had already finished: Nvidia. But the last eighteen months have been so insane, listeners, that it warranted an entire episode on its own. So today is a part three for us with Nvidia, telling the story of the AI revolution: how we got here and why it's happening now, starting all the way down at the level of atoms and silicon. So here's something crazy that I did a transcript search on to see if it was true: in our April 2022 episodes, we never once said the word "generative." That is how fast things have changed. Unbelievable, totally crazy. And the timing of all of this AI stuff in the world is unbelievably coincidental, and very favorable. So recall back 18 months ago: throughout 2022, we all watched financial markets, from public equities to early-stage startups to real estate, just fall off a cliff due to the rapid rise in interest rates. The crypto and web3 bubbles burst, banks failed. It seemed like the whole tech economy, and potentially a lot with it, was heading into a long winter!

Including Nvidia!

Including Nvidia, who had that massive inventory write-off for what they thought was over-ordering. Yep. Wow!

How things have changed. Yeah!

But by the fall of 2022, right when everything looked the absolute bleakest, a breakthrough technology finally became useful after years in research labs: large language models, or LLMs, built on the innovative transformer machine learning mechanism, burst onto the scene, first with OpenAI's ChatGPT, which became the fastest app in history to a hundred million active users, and then quickly followed by Microsoft, Google, and seemingly every other company. In November of 2022, AI definitively had its Netscape moment, and, time will tell, but it may have even been its iPhone moment. That is definitely what Jensen believes. Yeah. Well, today we'll explore exactly how this breakthrough came to be, the individuals behind it, and of course why the entire thing has happened on top of Nvidia hardware and software. If you want to make sure you know every time there's a new episode, go sign up at acquired.fm/email. You'll also get access to things that we aren't putting anywhere else: one, a clue as to what the next episode will be, and two, follow-ups from previous episodes from things that we learned after release. You can come talk about this episode with us after listening at acquired.fm/slack. If you want more of David and I, check out our interview show ACQ2. Our next few episodes are about AI, with CEOs leading the way in this world we are talking about today, and a great interview with Doug DeMuro, where, uh, we wanted to talk about a lot more than just Porsche with him, but, uh, you know, we only had eleven hours or whatever we had in Doug's garage. So a lot of the, uh, car industry chat and learning about Doug and his journey and his business, we saved for ACQ2, so go check it out. One final announcement: many of you have been wondering, and we've been getting a lot of emails, when will those hats be back in stock? Well, they're back for a limited time. You can get an ACQ embroidered hat at acquired.fm/store. Go put your order in before they, uh, go back into the Disney vault forever.

This is great, I can finally get Jenny one of her own so she stops stealing mine. Yes!

Well, with that, this show is not investment advice. David and I may have investments in the companies we discuss, and this show is for informational and entertainment purposes only. David: history and facts. Oh!

Man, well, on the one hand, we only have eighteen months to talk about, except that I know you're not gonna start eighteen months ago. On the other hand, we have decades and decades of foundational research to cover. So when I was starting my research, I went to the natural first place, which was our old episodes from April 2022, and I was listening to them, and I got to the end of the second one, and, man, I had forgotten about this. I think Jensen maybe wishes we all had forgotten about this. In one of Nvidia's earnings slides in 2021, they put up their total addressable market, and they said they had a one trillion dollar TAM. And the way that they calculated this was that they were gonna serve customers who provided a hundred trillion dollars' worth of industry, and they were gonna capture just one percent of it. And there was some stuff on the slide that was fairly speculative, you know, like autonomous vehicles and the Omniverse, and I think robotics were a big part of it. And the argument is basically like, well, cars plus!

Factories, plus all these things, added together, is a hundred trillion, and we can just take one percent of that, cause compute will amount to one percent of that. Which I'm not arguing is wrong, but it is a very blunt way to analyze that market. Yes!

Usually not the right way to, um, think about starting a startup, you know: oh, we can just get one percent of this big market. Bop. It's the top-down-iest way I can think of to size a market. So you, Ben, rightly so, called this out at the end of Nvidia Part II, and you're like, you know, I think to justify where Nvidia's trading at the moment, you kind of actually gotta believe that all of this is gonna happen, and happen soon: autonomous cars, robotics, everything. Yeah, importantly, I felt like the way for them to become worth what they were worth at that time literally had to be to power all of this hardware in the physical world. Yep. I can't believe that I said this, because it was unintentional and uninformed, but I was kind of grasping at straws, trying to play devil's advocate for you. And we just spent most of that whole episode talking about how machine learning powered by Nvidia ended up having this incredibly valuable use case, which was powering social media feed recommenders, and that Facebook and Google had grown bigger than anyone ever imagined on the internet with those feed recommendations, and Nvidia was powering all of it. And so I just sort of idly proposed, well, maybe, but what if you don't actually need to believe any of that to still think that Nvidia could be worth a trillion dollars? What if, maybe, just maybe, the internet and software and the digital world are gonna keep growing, and there will be a new foundational layer that Nvidia can power? Is that possible? And I think we were both like, yeah, I don't know, let's end the episode. Yeah, sure. We shrugged it off and were like, all right, carve-outs. But the crazy thing is that, of course, at least in this timeframe, most things on Jensen's trillion-dollar TAM slide have not come to pass, but that crazy question just might have come to pass, and from Nvidia's revenue and earnings standpoint, definitely has. It's just wild. Right, so how do we get here? Let's rewind and tell the story. So back in 2012, there was the Big Bang moment of artificial intelligence, or as it was more humbly referred to back then, machine learning, and that was AlexNet. We talked a lot about this on the last episode. It was three researchers from the University of Toronto who submitted the AlexNet algorithm to the ImageNet computer science competition. Now, ImageNet was a competition where you would look at a set of 14 million images that had been hand-labeled with what the pictures were of, like of a strawberry or a cat or a dog or whatever.

And David, you were telling me it was the largest-ever use of Mechanical Turk up to that point, to label the ImageNet dataset. Yeah!

Well, even until this competition and until AlexNet, there was no machine learning algorithm that could accurately label images. So thousands of people on Mechanical Turk got paid, however much, two bucks an hour, to label these images. Yeah!

And if I'm remembering from our episode, basically what happened is the AlexNet team did way better than anybody else had ever done. A complete step-change better. I think the error rate went from mislabeling images 25 percent of the time to suddenly only mislabeling them 15 percent of the time, and that was like a huge leap over the tiny incremental progress that had been made along the way. You're spot on. And the way that they did it, and what completely changed the fortunes of the internet, of Google,

of Facebook, and certainly of Nvidia, was they actually used old algorithms, a branch of computer science and artificial intelligence called neural networks, specifically convolutional neural networks, which had been around since the sixties. But they were really computationally intensive to train, and so nobody thought it would be practical to actually train and use these things, at least not anytime soon or in our lifetimes. And what these guys from Toronto did is they went out, probably to their local Best Buy or equivalent in Canada, and they bought two GeForce GTX 580s,

which were the top-of-the-line cards at the time. They wrote their algorithm, their convolutional neural network, in CUDA, Nvidia's software development platform for GPUs, and by God, they trained this thing on like a thousand dollars' worth of consumer-grade hardware. And basically, the algorithm that other people had been trying over the years just wasn't massively parallel the way that a graphics card sort of enables. So if you actually can consume the full compute of a graphics card, then perhaps you could run some unique, novel algorithm and do it in, you know, a fraction of the time and expense that it would take in these supercomputer laboratories. Yeah!

Everybody before was trying to run these things on CPUs. CPUs are awesome, but they only execute one instruction at a time. GPUs, on the other hand, execute hundreds or thousands of instructions at a time. So GPUs, Nvidia graphics cards, accelerated computing, as Jensen and the company like to call it: you can really think of it like a giant Archimedes lever. Whatever advances are happening in Moore's Law and the number of transistors on a chip, if you have an algorithm that can run in parallel, which is not all problem spaces,

but many can, then you can basically lever up Moore's Law by hundreds of times, or thousands of times, or today tens of thousands of times, and execute something a lot faster than you otherwise could. And it's so interesting that there was this first market called graphics that was obviously parallel, where every pixel on a screen is not sequentially dependent on the pixel next to it. It literally can be computed independently and output to the screen. So you have, however many, tens of thousands or now hundreds of thousands of pixels on a screen that can all actually be done in parallel. And little did Nvidia realize, of course, that AI and crypto and all these other linear algebra, matrix math based things that turned into accelerated computing, pulling things off the CPU and putting them on GPUs and other parallel processors, was an entirely new frontier of other applications that could use the very same technology they had pioneered for graphics. Yeah!
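To make that parallelism point concrete, here is a minimal NumPy sketch (our own illustration, not anything from the episode or from Nvidia): the same per-pixel operation written first as a sequential, CPU-style loop, and then as a single data-parallel operation of the kind a GPU could spread across thousands of cores.

```python
# A toy illustration of data parallelism: each output pixel depends only on
# its own input, so the work can be split across as many cores as you have.
import numpy as np

pixels = np.random.rand(1920 * 1080)  # one brightness value per screen pixel

# Sequential, CPU-style: one pixel per loop iteration, one after another.
brightened_seq = np.empty_like(pixels)
for i in range(len(pixels)):
    brightened_seq[i] = min(pixels[i] * 1.2, 1.0)

# Data-parallel style: one vectorized operation over all pixels at once.
# On a GPU, each element could be handled by a separate core simultaneously.
brightened_par = np.minimum(pixels * 1.2, 1.0)

assert np.allclose(brightened_seq, brightened_par)  # same result, either way
```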

It was pretty useful stuff. And this AlexNet moment and these three researchers from Toronto kicked off, you know, Jensen calls it, and he's absolutely right, the Big Bang moment for AI. So David, the last time we told this story in full, we talked about this team from Toronto. We did not follow what this team of three went on to do afterwards. Yeah, so basically what we said was, it turned out that a natural consequence of what these guys were doing was, oh, actually, you can use this to surface the next post in a social media feed, like an Instagram feed or the YouTube feed or something like that, and that unlocked billions and billions of dollars of value, and those guys and everybody else working in the field all got scooped up by Google and Facebook. Well, that's true, and then as a consequence of that, Google and Facebook started buying a lot of Nvidia GPUs. But it turns out there's also another chapter to that story that we completely skipped over, and it starts with the question you asked, Ben:

Who are these people?

Yes!

So the three people who made up the AlexNet team were, of course, Alex Krizhevsky, who was a PhD student under his faculty advisor, the legendary computer science professor Geoff Hinton. I have an amazing piece of trivia about Geoff Hinton. Do you know who his great-great-grandparents were?

No, I have no idea. He is the great-great-grandson of George and Mary Boole. You know, like Boolean algebra and Boolean logic. This guy was born to be a computer science researcher. Oh!

My God. Right, foundational stuff for computation and computer science. I also didn't know there were people named Boole, that that's where that came from. That's hilarious. Yeah!

We get the AND, OR, XOR, NOR operators, that comes from George and Mary. Wild. So he's the faculty advisor, and then there was a third person on the team, Alex's fellow PhD student in this lab, one Ilya Sutskever. And if you know where we're going with this, you are probably jumping up and down right now in your seat: Ilya is the cofounder and current chief scientist of OpenAI.

Yes!

So after AlexNet, Alex, Geoff, and Ilya do the very natural thing: they start a company. I don't know what they were doing in the company, but, uh, it made sense to start one, and whatever they did, it was gonna get acquired real fast, by Google, within six months. So they get scooped up by Google, and they join a bunch of other academics and researchers that Google has been monopolizing, really, in the field. Three specifically: Greg Corrado, Jeff Dean, and Andrew Ng, the famous Stanford professor. The three of them had just formed the Google Brain team within Google to turbocharge all of this AI work that had been unleashed by AlexNet, and of course to turn it into a huge amount of profit for Google. Turns out individually serving advertising that's perfectly targeted on the internet, through Facebook or Google or YouTube, is an enormously profitable business, and one that consumes a whole lot of Nvidia GPUs. Yes. So about a year later, Google also acquires DeepMind, famously, and then right around the same time, Facebook scoops up computer science professor Yann LeCun, who also is a legend in the field, and the two of them basically establish a duopoly on leading AI researchers. Now, at this point, nobody is mistaking what these companies and these people are doing for true human-level intelligence or anything close to it. This is AI that is very good at narrow tasks, like we talked about: social media feed recommendations. So the Google Brain team, and Jeff and Alex and Ilya, one of the big projects they work on is redoing the YouTube algorithm, and this is when YouTube goes from, like, money-losing, you know, crazy thing that Google acquired, to the just absolute juggernaut that it is today. I mean, back then, in like 2013, 2014, we did our YouTube episode not that long after, the majority of views of YouTube videos were embeds on other webpages. This is when they build it into a social media site: they start the feed, they start autoplay. All this stuff is coming out of AI research. Some of the other stuff that happens at Google: famously, after they acquired DeepMind, DeepMind built a bunch of algorithms to save on cooling costs. And Facebook, of course, they probably had the last big lab in this generation, because they're using all this work, and Yann LeCun is doing his thing and hiring lots of researchers there. This is just a couple years after they acquired Instagram. Man, we need to, like, go back and redo that episode, because Instagram would have been a great acquisition anyway, but it was AI-powered recommendations in the feed that made that into a hundred, 250 billion dollar asset for Facebook. And I don't think you're exaggerating. I think that is literally what Instagram is worth to Meta now. By the way,

I bought a lot of things on Instagram ads,

so the targeting works. It sure does. There's this amazing quote from Astro Teller, who ran Google X at the time, and still does, in a New York Times piece, where he says that the gains from Google Brain during this period, and I don't think this even includes DeepMind, just the gains from the Google Brain team alone, in terms of profits to Google, more than funded everything they were doing in Google X.

Which, has there ever been anything profitable out of Google X? Google Brain. Yeah, I mean, yeah. We'll

leave it at that. So this takes us to 2015, when a few people in Silicon Valley start to realize that this Google-Facebook AI duopoly is actually

a really, really big problem. And most people had no idea about this; this was really visionary of these two people. And it's not just a problem for, like, the other big tech companies, cause you could make the argument it's a problem because, like, Siri's terrible, and all the other companies that have lots of consumer touchpoints have pretty bad AI at the time,

but the concern is for a much greater reason. I think there are three levels of concern here. One, obviously, the other tech companies. Then there's the problem of startups: this is terrible for startups. How are you gonna compete with Google and Facebook when this is the primary value driver of this generation of technology? I mean, there really is another lens to view what happened with Snap, what happened with Musical.ly and having to sell themselves to ByteDance and becoming TikTok and going to the Chinese. Maybe it was business decisions, maybe it was execution or whatever that prevented those platforms from getting to independent scale. Snap's a public company now, but, like, it's no Facebook. Maybe it was that they didn't have access to the same AI researchers that Facebook and Google had. Hmm, that feels like an interesting question. It's probably a couple steps too far in the conclusion,

but still sort of a fun straw man to think about. Nonetheless, this is definitely a problem. The third layer of the problem is just, like, this sucks for the world, that all these people are locked up in Google and Facebook. This is probably a good time to mention: the founding of OpenAI was motivated by the desire to find AGI, or artificial general intelligence, first, before the big tech companies did. And DeepMind was the same thing. It was going to be this winding and circuitous path at the time, since really nobody knew then, or knows now, the best path to get to AGI. But the big idea of OpenAI's founding was: whoever figures out and finds AGI first will be so big and so powerful so quickly, they'll have an immense amount of control, and that is best in the open. So these two people who were quite concerned about this convene a very fateful dinner in 2015 at, of all places... is it the Rosewood? The Rosewood Hotel on Sand Hill Road, naturally!

It would have been way better if it were a Denny's, or a Buck's in Woodside, or something like that,

but it does actually just show, like, where the seeds of OpenAI come from. It is very different than the sort of organic, scrappy way that the Nvidias of the world got started. You know, this is powers on high and existing money saying, no, we need to will something into existence. Yep. So, of course,

those two shadowy figures are Elon Musk and Sam Altman, who at the time was president of Y Combinator. So they get this dinner together, and they invite basically all of the top AI researchers at Google and Facebook, and they're like, yeah, what is it going to take for you to leave and to break up this duopoly? And the answer from almost all of them is: nothing. You can't. Why would we ever leave? We're happy as clams here. We get to hire the people that we want, we've built these great teams, there's a money spigot pointed at our faces. Right, not only are we getting paid just ungodly amounts of money, but we get to work directly with the best AI researchers in the field. If we were still at academic institutions, you know, say you're at the University of Washington, amazing academic institution for computer science, one of the top in the world, or the University of Toronto, where these guys came from, you're still in a little fragmented market. If you go to Google or you go to Facebook, you're with everybody. Yep. So the answer is no from basically everybody, except there's one person who's intrigued by Elon and Sam's pitch. To quote an amazing Wired article from the time by Cade Metz that we will link to in our sources: "The trouble was, so many of the people most qualified to solve all these AI problems were already working for Google and Facebook, and no one at the dinner was quite sure that these thinkers could be lured to a new startup, even if Musk and Altman were behind it. But one key player was at least open to the idea of jumping ship." And then they have a quote from that key player: "I felt there were risks involved, but I also felt it would be a very interesting thing to try." And that key player was Ilya Sutskever. Yep. So after the dinner, Ilya leaves Google and signs up to become, as we said, cofounder and chief scientist of a new independent AI nonprofit research lab backed by Elon and Sam: OpenAI. Yes!

And before we talk about what OpenAI would go on to do in its first chapter, which is quite different than today, this is a great time to tell you about one of our very favorite companies, and actually the perfect fit for this episode: Crusoe. So Crusoe, as listeners know by now,

is a clean compute cloud provider specifically built for AI workloads. Nvidia is one of their major partners, and literally, Crusoe's data centers are nothing but racks and racks of A100s and H100s. And because Crusoe's cloud is purpose-built for AI and runs on wasted, stranded, or clean energy, they can provide significantly better performance per dollar than traditional cloud providers. Just as one example, Crusoe was one of the first clouds to deploy the 3,200 gigabit InfiniBand, which we will be talking about a lot more later in the episode, to dramatically accelerate the performance of large AI training clusters in their data centers. Yes!

We talked about that on our ACQ2 episode with Crusoe CEO Chase Lochmiller, which, David, actually ended up being very helpful for my research for this episode. InfiniBand is wild, and it's just one example of how Nvidia has built such a dominant position in AI, which we'll talk about later in this episode. We'll link to the interview in the show notes; you can hear what it's like for Crusoe actually deploying that technology in their data centers, and the lengths that Crusoe goes to to maximize performance. The other element that makes Crusoe special is the environmental angle. Crusoe, of course,

locates their data centers at stranded energy sites, so think oil flares, wind farms that can't use all the energy they generate,

etc., and uses that power that would otherwise be wasted to run your AI workloads instead. Yep. Obviously, it's a huge benefit for the environment, and for customers on cost, since Crusoe doesn't rely on the energy grid. Energy is the second-largest cost of running AI, after, of course,

the price you pay Nvidia for the chips, and these lower energy costs get passed on to customers. It's super cool that they can put their data centers out there in these remote locations where, quote-unquote, "energy happens," as opposed to the other hyperscalers, such as AWS, Google, and Azure,

who need to build their data centers close to major traffic hubs, where the internet happens, because they are doing everything in their clouds. Yup. If you, your company, or your portfolio companies would like to use the lower-cost and more performant infrastructure for your AI workloads, go to crusoecloud.com/acquired, that's c-r-u-s-o-e cloud dot com slash acquired, or click the link in the show notes. OK.

So, David, OpenAI is formed. It's 2015. Here we are 8 years later, and we have ChatGPT. Super linear path from there to here, right? Turns out, uh, no. So, as we were talking about a little bit, AI at this point in time was super good for narrow use cases, but looked nothing like GPT-4 today. The capabilities that it had were pretty limited, and one of the big reasons was that the amount of data that you could practically train these models on was pretty limited. So the AlexNet example:

you're talking about 14 million images. In the grand scheme of the internet, 14 million images is a drop in the bucket. And this was both a hardware and a software constraint. On the software side, we just didn't actually have the algorithms, to even suppose that we could be so bold as to train one single foundational model on the whole internet. Like, it wasn't a thing. Yeah, that was a crazy idea, right. People were excited about the concept of language models, but we actually didn't know how we could algorithmically get it done. So in 2015, Andrej Karpathy, who was then at OpenAI and went on to lead AI for Tesla and is actually now back at OpenAI, writes this seminal blog post called The Unreasonable Effectiveness of Recurrent Neural Networks. And David, we won't go deep into it on this episode, but note that recurrent neural networks are a little bit of a different thing than convolutional neural networks, which was the 2012 paper; the state of the art had evolved. Yes. And right around that same time, there is also a video that hits YouTube, a little bit later, in 2016, that is actually on Nvidia's channel, and it has two people in this very short one-minute-and-45-second video: one is a young Ilya Sutskever, and two is Andrej Karpathy. And here is a quote from Andrej from that YouTube video: "One algorithm I'm excited about is a language model: the idea that you can take a large amount of data, and you feed it into the network, and it figures out the pattern in how words follow each other in sentences. So, for example, you could take a large amount of data on how people talk to each other on the internet, and you can train, basically, a chatbot, but you can do it in a way that the computer learns how language works and how people interact. Eventually we'll use that to talk to computers, just like we talk to each other." Wow!

This is 2015!

This is two years before the transformer. Well, Karpathy is at OpenAI. He both comes up with, or espouses, the idea of a chatbot, so that sort of had already been discussed, even before we had the transformer, the method to actually pull this off. But there's an important part here: "it figures out the pattern in how words

follow each other in sentences." So there's this idea that the very structure of language, and the way to interpret knowledge, is actually embedded in the training data itself, rather than requiring labeling. This is so cool. So at Spring GTC this year, Jensen did a fireside chat with Ilya, and it's amazing, you should go watch the whole thing, but in it, this question comes up. Jensen kind of poses it as a straw man, like, hey, some people say that GPT-3, GPT-4, ChatGPT, everything going on, all these LLMs, they're just probabilistically predicting the next word in a sentence, they don't actually have knowledge. And Ilya has this amazing response to that. He says, OK, well, consider a detective novel. Yes!

At the end of the novel,

the detective gathers everyone together in a room and says, "I am now going to tell you all the name of the person who committed the crime, and that person's name is: blank." The more accurately an LLM predicts that next word, i.e. the name of the criminal, ipso facto, the greater its understanding, not only of the novel, but of all general human-level knowledge and intelligence, because you need all of your experience in the world, and as a human, to be able to guess who the criminal is. And the LLMs that are out there today, GPT-3, GPT-4, LLaMA, Bard, these others, they can guess who the criminal is. Oh!

Yeah, that puts a pin in the understanding-versus-predicting hot topic du jour. So David, is now a good time to fast-forward two years, to 2017, to the transformer paper? Absolutely. Tell us about the transformer. OK, so, Google, 2017, the transformer paper comes out. It's called Attention Is All You Need, and it's from the Google Brain team, right?

Yes!

That Ilya had just left, just left, two years before, to start OpenAI. So, machine learning on natural language, just to set the table here, had long been used for things like autocorrect or foreign-language translation, but in 2017, Google came out with this paper and discovered a new model that would change everything for these fields and unlock another one. So here is the scenario: you're translating a sentence from English to French. You could imagine that a way to do this would be one word at a time, in order, but for anyone who's ever traveled abroad and tried to do this, you know that words are sometimes rearranged in different languages, so that's a terrible way to do it. You know, "United States" in Spanish is "Estados Unidos," so, failure on the very first word in that example. So, enter this concept of attention, which is a key part of this research paper. This attention, this fairly magical component of the transformer paper, it literally is what it sounds like: it is a way for the model to attend to different areas of the input text at different times. You can look at a large amount of context while considering what word to pick next in your translation. So for every single word that you're about to output in French, you can look over the entire set of inputted words to figure out which words you should weight heavily in your decision for what to do next.

This is why AI, machine learning, was so narrowly applicable before. If you anthropomorphize it and you think of it like a human, it was like a human with a very, very short attention span. Yes!

Now, here's the magical part. While it does look at the whole input text to consider what the next word should be, it doesn't mean that it throws away the notion of position entirely. It uses a technique called positional encoding, so it doesn't forget the position of the words altogether. So it's got this cool thing where it weights the important parts relevant to your particular word, and it still understands position. So, remember I said the attention mechanism looks over the entire input every time it's picking what word to output? That sounds very computationally hard. Yes. In computer science terms, this means that the attention mechanism is O(n squared). Oh!

That's giving me the heebie-jeebies, back to my intro CS classes in college. Oh!

Just wait till we get through this episode, it gets deeper. So, obviously, yes, traditionally you'd say this is very, very inefficient, and it actually means that the larger your context window, aka token limit, aka prompt length, gets, the more computationally expensive it gets, on a quadratic basis. So doubling your input means quadrupling the cost to compute an output, or tripling your input means 9 times the cost. It gets real gnarly. Yeah, it gets real expensive, real fast. But GPUs to the rescue! The amazing news for us here is that these transformer comparisons can be done in parallel. So even though there are lots of them to do, if you have big GPU chips with tons of cores, you can do them all at exactly the same time. And previous technologies to accomplish this, like recurrent neural networks, or LSTMs, long short-term memory networks, which is a type of recurrent neural network, etc., those required knowing the output of each step before beginning the next one, before you picked the next word. So, in other words, they were sequential, since they depended on the previous word. Now, with transformers, even if your string of text that you're inputting is a thousand words long, it can happen just as quickly, in human-measurable time, as if it were ten words long, supposing that there are enough cores in that big GPU. So the big innovation here is: you could now train sequence-based models in a parallel way. You couldn't train models of this size at all before, let alone cost-effectively. Yeah!
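To make the attention discussion concrete, here is a toy NumPy sketch of scaled dot-product attention (our own illustration with made-up sizes, not code from the paper): comparing every position against every other position is exactly the O(n squared) cost just described, but each output row is independent, which is what lets a GPU compute them all in parallel rather than word by word like an RNN.

```python
import numpy as np

def attention(Q, K, V):
    n, d = Q.shape
    # scores is an (n x n) matrix: every position attends to every other
    # position; this all-pairs comparison is where the O(n^2) cost lives.
    scores = Q @ K.T / np.sqrt(d)
    # Softmax over each row turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Every output row is independent of the others, so all n outputs can
    # be computed at the same time; an RNN must go word by word instead.
    return weights @ V

n, d = 1000, 64            # 1,000 input tokens, 64-dimensional embeddings
x = np.random.randn(n, d)  # stand-in for the embedded input text
out = attention(x, x, x)   # self-attention: Q, K, V all come from the input
print(out.shape)           # (1000, 64): one context-aware vector per token
```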

This is huge. All listeners out there, this is starting to sound very familiar to the world that we live in today. Yeah!

I sort of did a sleight of hand there, moving from translation to using words like context

window and token length. You can kind of see where this is going. Yup. So this transformer paper comes out in 2017. The significance is huge, but for whatever reason, there's a window of time where the rest of the world doesn't quite realize it. So Google obviously knows how important this is, and there's like a year where Google's AI work, even though Ilya has left and OpenAI is a thing now, accelerates again beyond everyone else in the field. So this is when Google comes out with Smart Compose in Gmail, and they do that thing where they have an AI bot that will call local businesses for you. Remember that demo from I/O that they did? Did that ever ship? I don't know, maybe it did. Maybe this is Google here, like, the capabilities are there, the product sense, not so much. This is when they really start investing in Waymo,

but again, where it really manifests is just back to serving ads in search and recommending YouTube videos. Like, they're just crushing it in this period of time. OpenAI and everyone else, though, they haven't adopted transformers yet. They're kind of stuck in the past, and they're still doing these really researchy computer vision projects. So, like, this is when they build a bot to play Dota 2, Defense of the Ancients 2, the video game. And super impressive stuff: like, they beat the best Dota players in the world at Dota by literally just consuming computer vision, like, consuming screenshots and inferring from there. And that's a really hard problem, because Dota 2 is not a game where you get to see the whole board at once, so it has to do a lot of, like, really intelligent construction of the rest of the game based on just a single player's worth of input. So it's unbelievably cutting-edge research... for the past generation. It's a faster horse, basically. Maybe, yeah. I mean, they were also doing stuff like Universe, which was the 3D modeled world to train self-driving cars. You don't really hear anything about that anymore, but they built this whole thing. I think it was using Grand Theft Auto as the environment, and then it was doing computer vision training for cars using the GTA world. I mean, crazy stuff, but it was kind of scattershot. Yeah!

It was scattershot. I guess what I'm saying is, it was still in this narrow use case world. They weren't doing anything approaching GPT at this point in time. Meanwhile, Google had kind of moved on. Yep. Now, one thing I do want to say in defense of OpenAI and everybody else in the field at the time: they didn't just have their heads in the sand. To do what transformers enabled you to do, which, Ben, you're gonna talk about in a sec, cost a lot in computing power, GPUs and Nvidia. The transformer made it possible, but to work with the size of models you're talking about, you're talking about spending an amount of money that, certainly for a nonprofit, and anybody really except Google, was untenable. Right. It's funny, David,

you made this leap to expensive and large models. All we were doing before was merely talking about translating one sentence to another. The application of a transformer does not necessarily require you to go and consume the whole internet and create a foundational model. But let's talk about this. Transformers lend themselves quite well, as we now know, to a different type of task: for a given input sentence, instead of translating to a target language, they can also be used as next-word predictors, to figure out what word should come next in the sequence. You could even do this idea of pretraining with some corpus of text, to help the model understand how it should go about predicting that next word. So, backing up a little bit, let's go back to the recurrent neural networks, the state of the art before transformers. They had this problem, in addition to the fact that they were sequential rather than parallel: they also had a very short context window. So you could do a next-word predictor, but it wasn't that useful, because it didn't know what you were saying more than a few words ago. By the time you get to the end of the paragraph, it would forget what was happening at the beginning; it couldn't sort of hold on to all that information at the same time. So this idea of a next-word predictor that was pretrained with a transformer could really start to do something pretty powerful, which is: consume large amounts of text, and then complete the next word based on a huge amount of context. Yep. We're starting to come up on this idea of a large language model, and we're going to flash forward here just for a moment to do some illustration, then we'll come back to the story. GPT-1, the first OpenAI model, this generative pretrained transformer model, GPT, used unsupervised pretraining, which basically meant that as it was consuming this corpus of language, it was unlabeled data. The model was inferring the structure and meaning of language merely by reading it, which is a very new concept in machine learning. The canonical wisdom was that you need extremely structured data to train your smallish model on, because how else are you going to learn what the data

actually means? This was a new thing: you can learn what the data means from the data itself. It's like how a child consumes the world, where only occasionally does their parent say, no, no, no, you have that wrong, that's actually the color red, but most of the time they're just self-teaching by observing the world. As a parent of a two-year-old, I can confirm. And then a second thing happens after this unsupervised pretraining step, where you

then have supervised fine-tuning. The unsupervised pretraining used a large corpus of text to learn general language, and then it was fine-tuned on labeled datasets for specific tasks that you sort of really want the model to actually be useful for. So, to give people a sense of why we're saying that the idea of training on very, very, very large amounts of data here is crazy expensive: GPT-1 had roughly 120 million parameters, GPT-2 had 1.5 billion, GPT-3 had 175 billion, and GPT-4, OpenAI hasn't announced, but it's rumored that it has about 1.7 trillion parameters. This is a long way from AlexNet here. It's scaling like Nvidia's market cap.
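To make the "data is its own label" idea concrete, here is a deliberately tiny Python sketch (our own illustration; real GPTs are transformers with billions of parameters, not counting tables): the only training signal is the raw text itself, because each word's label is simply whatever word follows it.

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the food".split()

# "Pretraining": for each word, count which words follow it in the raw text.
# No human labeling needed; the next word is the label.
next_word_counts = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    next_word_counts[word][nxt] += 1

def predict_next(word):
    # Predict the most frequently observed next word.
    return next_word_counts[word].most_common(1)[0][0]

print(predict_next("the"))  # -> "cat", learned from the data alone
```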

There is this interesting discovery, basically, that the more parameters you have, the more correctly you can predict the next word. These models were basically bad sub ten billion parameters, I mean, maybe even sub a hundred billion parameters: they would just hallucinate, or they would be nonsensical. It's funny, when you look at some of the, like, one-billion-parameter models, you're like, there is no chance this turns into anything useful, ever. But by merely adding more training data and more parameters, it just gets way, way better. There's this weirdly emergent property where transformer-based models scale really well, due to the parallelism: as you throw huge amounts of data at training them, you can also throw huge amounts of Nvidia GPUs at processing that, exactly, and the output sort of unexpectedly gets magically better. I mean, I know I keep saying that, but it is like, wait, so we don't change anything about the structure, we just give it way more data, run these models for a long time, and make the parameters of the model way bigger, and, like, no researchers expected them to reason about the world as well as they do, but it just kind of happened as they were exploring larger and larger models. So, in defense of OpenAI, they knew all this,

but the amount of money that you would have to spend to buy GPUs, or rent GPUs in the cloud, to train these models was prohibitively expensive. And, you know, even Google at this point in time, this is when they start building their own chips, TPUs, because, yes, they're still buying tons of hardware from Nvidia,

but they're also starting to source their own here. Yeah, and importantly, they at this point were getting ready to release TensorFlow to the public. So they have a framework where people can develop, first off, and they're like, look, if people are developing using our software, then maybe it should run on our hardware that's optimized to work with that software. So they actually do have this very plausible story around their hardware,

their software framework. It was kind of a surprising move when they open-sourced it, because people were like, gasp, you know, why is Google giving away the farm for free here? But this was three, four years early, and a very prescient move to really get a lot of people using Google-architecture compute at scale. Yep, all within Google Cloud. Yep. So with this, it starts to look like maybe this whole OpenAI boondoggle didn't actually accomplish anything, and the world's AI researchers are more than ever just locked back into Google. So, in 2018,

Elon gets super frustrated by all this. Basically, there's a hissy fit, and he quits and peaces out of OpenAI. There's a lot of drama around this that we're not going to cover now. He may or may not have given an ultimatum to the rest of the team that he would either take over and run things, or leave. Who knows. But whatever happened, this turns out to be a major catalyst for the rest of the OpenAI team. Truly a history-turning-on-a-knife-point moment. It was also probably a super bad decision by Elon, but again, story for another day. So there's this great explanation of what happened in a Semafor piece

that we'll link to in our sources. The author says: "That fall, it became even more apparent to some people at OpenAI that the costs of becoming a cutting-edge AI company were going to go up. Google Brain's transformer had blown open a new frontier where AI could improve endlessly, but that meant feeding it endless data to train it, a costly endeavor. OpenAI made a big decision to pivot toward these transformer models. On March 11th, 2019, OpenAI announced it was creating a for-profit entity so it could raise enough money to pay for all the compute power necessary to pursue the most ambitious AI models. 'We want to increase our ability to raise capital while still serving our mission, and no pre-existing legal structure we know of strikes the right balance,' the company wrote at the time. OpenAI said it was capping profit for investors, with any excess going back to the original nonprofit. Less than six months later, OpenAI"

took a one billion dollar investment from Microsoft. Yeah, and I believe this is mostly, if not all, due to Sam Altman's influence and taking over here. So, you know, on the one hand, you can look at this sort of skeptically and say, OK, Sam, you took your nonprofit and you converted it into an entity worth 30 billion dollars today. On the other hand, knowing this history now, this was kind of the only path they had. They had to raise money to get the computing resources to compete with Google, and Sam goes out and does these landmark deals with Microsoft. Yeah, truly amazing. And their opinion at the time of why they're doing this is basically: this is going to be super expensive. We still have the same mission, to ensure that artificial general intelligence benefits all of humanity,

but it's going to be ludicrously expensive to get there, and so we need to basically be a for-profit enterprise and a going concern, and have a business that funds our research, eventually, to pursue that mission. Yep. So, 2019,

they do the conversion to a for-profit company. Microsoft invests a billion dollars, as you say, and becomes the exclusive cloud provider for OpenAI, which is going to become highly relevant here for Nvidia, more on that in a minute. June of 2020, GPT-3 comes out. In September of 2020, Microsoft licenses exclusive commercial use of the underlying model for Microsoft products. 2021, GitHub Copilot comes out. Microsoft invests another 2 billion dollars in OpenAI. And, of course, this all leads to November 30th, 2022, in Jensen's words, the AI heard around the world: OpenAI comes out with ChatGPT, as you said, Ben, the fastest product in history to reach a hundred million users. In January 2023, this year, Microsoft invests another ten billion dollars in OpenAI and announces they're integrating GPT into all of their products, and then in May of this year, GPT-4 comes out, and that basically catches us up to today. We eventually need to do a whole other episode about all the details here of OpenAI and Microsoft, but for today, the salient points are: one, thanks to all this, generative AI as a user-facing product emerges as this enormous opportunity; two, to facilitate that happening, you need enormous amounts of GPU compute, obviously benefiting Nvidia,

but, just as important, three:

it becomes obvious now that the predominant way that companies are going to access and provide that compute is through the cloud. And the combination of those three things turns out to be basically the single greatest moment that could ever happen for Nvidia. Yes!

So you're teeing all of this up, and so far I'm thinking, so this is like the OpenAI and Microsoft episode. Like, what does this have to do with Nvidia? And God, there's a great Nvidia story here to be told. So let's get to the Nvidia side of it. But first, we want to thank our friends at Statsig. So we've been talking this episode about names in AI that you know, Nvidia, Google, etc., but there's a name you probably don't know that's powering a lot of the AI wave behind the scenes: Statsig. A ton of big AI companies, Anthropic, Character.ai, rely on Statsig to test, deploy, and improve their models and applications. And how this happened is crazy, because Statsig did not start as an AI company. Yeah!

If you listened to our ACQ2 interview with founder and CEO Vijaye Raji, you know that Statsig is a platform that combines feature flagging, experimentation, and product analytics. They help teams run experiments in their products, automate analysis, launch new features, and analyze product performance. Their focus was on taking these pretty traditional product workflows and making them easier, by giving teams one connected tool to move fast and make data-driven product decisions. So how does that relate to AI? Well,

if you've built anything with these AI APIs, you know there are a ton of things to test, like the model version, the prompt, or the temperature, and adjusting these can have huge impacts on the AI application's performance. So these AI companies have started using Statsig to measure the impact of changes to their models and their customer-facing applications, using real user data. Even non-AI companies like Notion and Figma have been using Statsig to launch their AI features, ensuring that these new features drive successful outcomes for their businesses. In today's generative AI world, product decisions aren't just the product features anymore, they're literally, like, the weights and temperatures of the models underlying the products. Yep. So whether you're building with AI or not, Statsig can help your team ship faster and make data-driven product decisions. If you're a startup,

they have a super generous free tier and a special program for venture-backed companies. If you're a large enterprise, they have clear, transparent pricing with no seat-based fees. Acquired community members can take advantage of a special offer, including five million free events a month and white-glove onboarding support. Just visit statsig.com/acquired, that's statsig.com/acquired, to get started on your data-driven journey. OK, so, Nvidia. OK, so we just said these three things that we've painted the picture of in the first part of the episode here: that (a) generative AI is, like, possible, and a thing, and it's now getting traction; (b) it requires an unbelievably massive amount of GPU compute to train; and (c) it looks like the predominant way that companies are going to use that compute is going to be in the cloud. The combination of these three things is, I think, the most perfect example we've ever covered on this show of the old saying about luck being what happens when preparation meets opportunity. For Nvidia here, obviously the opportunity is generative AI, but the preparation: Nvidia has literally just spent the past five years working insanely hard to build a new computing platform for the data center, a GPU-accelerated computing platform, to, in their minds, replace the old CPU-led, Intel-dominated x86 architecture in the data center. And for many years, I mean, they were getting some traction, right? The data center segment was growing for Nvidia, but people were like, OK,

you want this to happen, but, like, why is it gonna happen? Right, there's these little workloads here and there that we'll toss you, Jensen, that we think can be accelerated by your cool GPUs. And then, you know, crazy things like crypto happened, and there's, like, AI researchers in academic labs using it as, you know, supercomputers. But for the longest time, with the data center segment of Nvidia, it just wasn't clear that organizations had enormous parts of their software stack that they were going to shift to GPUs. Like, why? What's driving this? And now we know what could be driving it, and that is AI. Uh, not only could be, but if you look at their most recent quarter, absolutely freaking is. OK, so now it begs the question: why is it driving it? And David, are you open to me giving a little computer science lecture on computer architecture? Oh, please do. All right, I need to do my best professor impression here. God, I loved computer science in college, those were my favorite classes. I will say, doing these episodes, this one, TSMC, it really does bring back the thrill of being in a CS lecture and being like, oh, that's how that works. Like, it's just really fun. So let's take a step back and consider the classic computer architecture, the von Neumann architecture. Now, the von Neumann architecture is what most computers are based on today, where they can store a program in the computer's memory and run that program. You can imagine why this is the dominant architecture: otherwise, we'd need a computer that is specialized for every single task. The key thing to know is that the memory of the computer can store two different things: the data that the program uses, and the instructions of the program itself, the literal lines of code. And in this example we're about to paint, all of this is wildly simplified, because I don't want to get into caching and speeds of memory and, you know, where memory is located or not located, so let's just keep it simple. So the processor in the von Neumann architecture executes this program written in assembly language, which is the language that translates down to the machine code that the processor itself can speak. So it's written in an instruction set architecture, an ISA, from Arm, for example, or Intel before that. Yes. And each line of the program is very simplistic, so we're going to consider this example where I'm going to use some assembly language pseudocode to add the numbers two and three to equal five. Ben, are you about to write

a program live on Acquired? Well,

it's pseudo assembly language code. So the first line is, we're going to load the number two from memory. We've got to fetch it out of memory, and we're going to load it into a register on the processor. So now we've got the number two actually sitting right there on our CPU, ready to do something with. That's line of code number one. Two, we're gonna load the number three, in exactly the same fashion, into a second register. So now we've got two CPU registers with two different numbers. The third line, we're going to perform an add operation, which performs the arithmetic to add the two registers together on the CPU and store the value in either some third register or into one of those registers. So that's a more complex instruction, since it's arithmetic that we actually have to perform, but these are the things that CPUs are very good at: doing math operations on data fetched from memory. And then the fourth and final line of code in our example is, we are going to take that five that has just been computed, and is currently held temporarily in a register on the CPU, and we are going to write that back to an address in memory. So the four lines of code are: load, load, add, store. This is all sounding familiar to me. So you can see, each of those four steps is capable of performing one and only one operation at a time, and each of these happens with one cycle of the CPU. So if you've heard of gigahertz, that's the number of cycles per second, so a one gigahertz computer could handle the simple program that we just wrote 250 million times in a single second. But you can see something going on here: three of our four clock cycles are taken up by loading and storing data to memory. Now, this is known as the von Neumann bottleneck, and it is one of the central constraints of AI, or at least it has been historically. Each step must happen in order, and only one at a time. So in this simple example, it actually would not be helpful for us to add a bunch more memory to this computer; I can't do anything with it. It's also only incrementally helpful to increase the clock speed: if I double the clock speed, I can only execute the program twice as fast. If I need, like, a million-X speedup for some AI workload I'm doing, I'm not going to get there with just a faster clock speed. That's not going to do it. And it would, of course, be helpful to increase the speed at which I can read and write to memory, but I'm kind of bound by the laws of physics there: there's only so fast that I can transmit data over a wire. Now, the great irony of all of this is that the bottleneck actually gets worse over time, not better, because the CPUs get faster and the memory size increases, but the architecture is still limited to this one pesky single channel, known as a bus. I don't actually get to enjoy the performance gains nearly as much as I should, because I'm jamming everything through that one channel, and that only gets to sort of be used one time per every clock cycle. So the magical unlock, of course, is to make a computer that is not a von Neumann architecture: to make programs executable in parallel, and massively increase the number of processors, or cores. And that is exactly what Nvidia did on the hardware side, and what all these AI researchers figured out how to leverage on the software side. But interestingly, now that we've done that, David, the constraint is not the clock speed or the number of cores anymore, not for these absolutely enormous language models.
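Here is that four-instruction program as a runnable Python toy (our own illustration): one dictionary plays the role of memory, another plays the registers, and the loop executes exactly one instruction per simulated clock cycle, which makes the bottleneck visible, since three of the four cycles are pure data movement.

```python
memory = {0: 2, 1: 3, 2: None}  # shared memory: the 2, the 3, a result slot
registers = {}

program = [
    ("LOAD",  "r1", 0),           # cycle 1: fetch 2 from address 0 into r1
    ("LOAD",  "r2", 1),           # cycle 2: fetch 3 from address 1 into r2
    ("ADD",   "r3", "r1", "r2"),  # cycle 3: r3 = r1 + r2 (the only real math)
    ("STORE", "r3", 2),           # cycle 4: write the 5 back to address 2
]

# One instruction per cycle, strictly in order: this serialization through a
# single channel between CPU and memory is the von Neumann bottleneck.
for instr in program:
    op = instr[0]
    if op == "LOAD":
        registers[instr[1]] = memory[instr[2]]
    elif op == "ADD":
        registers[instr[1]] = registers[instr[2]] + registers[instr[3]]
    elif op == "STORE":
        memory[instr[2]] = registers[instr[1]]

print(memory[2])  # -> 5
```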

It's actually the amount of on-chip memory that constrains us. I see where you're going, and this is why the data center, and what Nvidia has been doing, is so important. Yes!

There's this amazing video that we'll link to, on the Asianometry YouTube channel, which we also linked to on the TSMC episode, but the constraint today is actually in how much high-performance memory is available on the chip. These models need to be in memory, all at the same time, and they take up hundreds of gigabytes. So while memory has scaled up, I mean, we're gonna flash all the way forward, the H100's on-chip RAM is like eighty gigabytes, the memory hasn't scaled up nearly as fast as the models have scaled in size. The memory requirements for training AI are just obscene. So it becomes imperative to network multiple chips, and multiple servers of chips, and multiple racks of servers of chips, together into one single "computer," and I'm putting computer in air quotes there, in order to actually train these models. It's also worth noting, we can't make the memory chips any bigger, due to a quirk of the extreme ultraviolet photolithography, the EUV that we talked about on the TSMC episode. Chips are already the full size of the reticle; it's a physics and wavelength constraint. You really can't etch chips larger without some new inventions that we don't have commercially viable yet. So what it ends up meaning is, you need huge amounts of memory, very close to the processors, all running in parallel, with the fastest possible data transfer. And again, this is a vast oversimplification, but you can get the idea of why all of this becomes so important. OK!
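For a rough sense of the scale involved, here is the back-of-the-envelope arithmetic in Python (our own math, using the rumored GPT-4 parameter count from earlier and the H100's roughly 80 GB of on-board memory):

```python
params = 1.7e12           # rumored GPT-4 parameter count (unconfirmed)
bytes_per_param = 2       # 16-bit (2-byte) precision, a common choice
model_bytes = params * bytes_per_param

h100_memory_bytes = 80e9  # roughly 80 GB of high-bandwidth memory per H100

print(f"Model size: {model_bytes / 1e12:.1f} TB")  # ~3.4 TB of weights
print(f"GPUs just to hold the weights: {model_bytes / h100_memory_bytes:.0f}")
# ~43 GPUs merely to fit the weights in memory, before counting the extra
# memory training needs (gradients, optimizer state, activations), which
# multiplies the requirement several times over. Hence networking many
# racks of GPUs together as "one computer."
```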

So, back to the data center, and here's what Nvidia is doing that I don't think anybody else out there is doing, and why it's so important for them that all of this new generative AI world, this new computing era, as Jensen dubs it, runs in the data center. So Nvidia has done three things over the last five years. One, probably most importantly, related to what you're talking about, Ben: they made one of the best acquisitions of all time, back in 2020, and nobody had any idea. They bought a quirky little networking company out of Israel called Mellanox. Wow!

Well, "little": they paid seven billion dollars. OK!

Yeah, and it was already a public company, right?

It was, yeah. Yeah!

But it was definitely quirky. Now, what was Mellanox?

Mellanox's primary product was something called InfiniBand, which we talked about a lot with Chase Lochmiller on our ACQ2 episode with him from Crusoe. And actually,

InfiniBand was an open standard, managed by a consortium. There were a bunch of players in it, but the traditional wisdom was, well, InfiniBand is way faster, way higher bandwidth, a much more efficient way to transfer data around a data center, but at the end of the day, Ethernet is the lowest common denominator, and so everyone had to implement Ethernet anyway. And so most companies actually exited the market, and Mellanox was kind of the only InfiniBand-spec provider left. Yeah!

And you might be saying: wait, what is InfiniBand?

It is a competing standard to Ethernet; it is a way to move data between racks in a data center. Back in 2020, everybody was like: Ethernet's fine. Why do you need more bandwidth than Ethernet between racks in a data center?

What could ever require 3,200 gigabits a second of bandwidth running down a wire in a data center? Well, it turns out, if you're trying to address hundreds, maybe more than hundreds, of GPUs as one single compute cluster to train a massive AI model... Yeah!

You want really fast interconnects between them. Right. People thought: oh sure, for supercomputers, for these academic purposes. But what the enterprise market needs in my shared cloud computing data center is Ethernet; that's fine. Most workloads are going to happen right there on one rack, and maybe, maybe, things will expand to multiple computers on that rack, but certainly they won't need to network multiple racks together. And Nvidia steps in, and you get Jensen saying: hey, dummies, the data center is the computer. Listen to me when I tell you: the whole data center needs to be one computer. And when you start thinking that way, you start thinking: jeez, we're really going to be cramming huge amounts of data through wires running between these racks. How can we think about all of that as if it were on-chip memory, or as close as we can get to on-chip memory, even though it's in a box located three feet away? Yep!
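Here's a rough sketch of why the interconnect matters so much, with illustrative link speeds that are our assumptions rather than figures from the episode: moving a few hundred gigabytes of model state looks completely different at commodity-Ethernet speeds versus InfiniBand-fabric speeds.

```cuda
#include <cstdio>

int main() {
    const double data_gbits = 350.0 * 8.0;  // 350 GB of model state, in gigabits
    const double ethernet_gbps = 10.0;      // a commodity Ethernet link (assumed)
    const double fabric_gbps = 3200.0;      // an aggregate InfiniBand fabric (assumed)

    printf("Over Ethernet:   %.0f seconds\n", data_gbits / ethernet_gbps);  // ~280 s
    printf("Over InfiniBand: %.1f seconds\n", data_gbits / fabric_gbps);    // ~0.9 s
    return 0;
}
```

If the cluster has to exchange that state on every training step, the slow link is the whole ballgame.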

So that's piece number one of Nvidia's grand data center plan over the last five years. Piece number two: in September 2022, Nvidia makes a quite surprising announcement of a new chip!

Not just a new chip: an entirely new class of chips they're making, called the Grace CPU processor. Nvidia is making a CPU! This is, like, heretical. But Jensen, I thought all computing was going to be accelerated. What are we doing here with these Arm CPUs?

yeah!

These Grace CPUs are not for putting in your laptop. They are for being the CPU component of your entire data center solution, designed from the ground up specifically to orchestrate with these massive GPU clusters. This is the endgame of a ballet that has been in motion for thirty years. Remember when the graphics card was subservient to the PCIe slot in Intel's motherboard?

And then eventually, you know, we fast-forward to the future, and Nvidia makes these GPUs that are these beautiful standalone boxes in your data center, or perhaps these little workstations that sit next to you while you're doing graphics programming, while you're directly programming your GPU. And then, of course, they need some CPU to put in that box, so they're using AMD or Intel, or they're licensing some CPU. And now they're saying: you know what, we're actually just going to do the CPU too. So now we make a box, and it's a fully integrated Nvidia solution, with our GPUs, our CPUs...

...our NVLink between them, and our InfiniBand to network it to other boxes. And, you know, welcome to the show. One more piece to talk about, the third leg of the stool in their strategy, before we get to what it all means, which I think is where you're about to go. Spoiler alert: you say solution, I hear gross margin. The third part of it is the GPUs themselves. Up until Nvidia's current GPU generation, the Hopper generation for the data center, there was only one GPU architecture at Nvidia. That same architecture, those same chips from the same wafers made at TSMC: some of them went to consumer gaming graphics cards, and some of those dies went to A100 GPUs in the data center. It was all the same architecture. Starting in September of 2022, they broke the two business lines out into different architectures. So there's the Hopper architecture, named after the great computer scientist Grace Hopper!

Rear admiral in the US Navy, I believe, Grace Hopper!

Get it? Grace CPU, Hopper GPU: Grace Hopper. The H100s, that was for the data center. And then on the consumer side, they started a whole new architecture called Lovelace, after Ada Lovelace, and that's the RTX 40-series. So if you buy, you know, a top-of-the-line RTX 40-whatever gaming card right now, that is no longer the same architecture as the H100s that are powering ChatGPT; it's a different architecture. This is a really big deal, because what they do with the Hopper architecture is start using what's called chip-on-wafer-on-substrate, CoWoS. When you start talking to the real semi nerds, that's when they bust out the CoWoS conversation. This is when a certain segment of our listeners is going to get really excited. Essentially what this is, back to this whole concept of memory being so important for GPUs and AI workloads: it's a way to stack more memory onto the GPU chips themselves, essentially by going vertical in how you build the chips. This is the absolute bleeding-edge technology coming out of TSMC. And by bifurcating their chip architectures into a gaming segment that does not have this latest CoWoS technology, Nvidia gets to monopolize a huge amount of TSMC's capacity to make the CoWoS chips specifically for these H100s, which lets them have way more memory than other GPUs on the market. Yes!

So this gets to the point of: why can't they seem to make enough chips right now? Well, it's literally a TSMC capacity problem. There are these two components, and they're extremely related, that you're talking about: the CoWoS, chip-on-wafer-on-substrate, and the high-bandwidth memory. There's this great post from SemiAnalysis where the author points out that a 2.5D chip is basically how you assemble this CoWoS stuff to get the memory really close to the processor. And of course, 2.5D is literally 3D, but 3D means something else, something even more 3D, so they came up with this 2.5D nomenclature. Anyway, the 2.5D chip packaging technology from TSMC is where you take multiple active silicon dies, like the logic chips and the stack of high-bandwidth memory, and stack them on one piece of silicon. There's more complexity here, but the important thing is: CoWoS is the most popular technology for packaging these GPU and AI accelerator chips, and it's the primary method to co-package high-bandwidth memory. Again, think back to the thing that's most important right now: get as much high-bandwidth memory as you can, as close as you can to the logic, to get the most performance for training and inference. So CoWoS represents right now about ten to fifteen percent of TSMC's capacity, and many of the facilities are custom-built for exactly these types of chips. So when Nvidia needs to reserve more capacity, there's a pretty good chance they've already reserved some large part of that ten to fifteen percent of TSMC's total footprint, and TSMC needs to, like, go build more fabs in order for Nvidia to get access to more CoWoS-capable capacity. Yeah!

Which, as we know, takes TSMC years to do. Yep!

There are more experimental things happening too. I would be remiss not to mention there are actual experiments in doing compute in memory. As we shift away from von Neumann, sort of all bets are off, and now that we're open to new computing architectures, there are people exploring: well, what if we just process the data where it is, in memory, instead of doing the very lossy, expensive, energy-intensive thing of moving data over a copper wire to get it to the CPU? All sorts of trade-offs in there.

But it is very fun to dive into the academic computer science world right now, where they really are rethinking what a computer is. So, these three things Nvidia has been building: the dedicated Hopper data center GPU architecture, the Grace CPU platform, and the Mellanox-powered networking stack. They now have a full-suite solution for generative AI data centers. And when I say solution...

I hear margins. But let's be clear: you don't need to offer some sort of solution to get high margins if you're Nvidia. Prices are set where supply meets demand, and they're adding as much supply as they possibly can right now. Believe me, for all sorts of reasons, Nvidia wants everyone who wants H100s to have H100s. But for now, the pricing is kind of: all right, here's a blank check, and Nvidia, you write whatever you want on it. Their margins are crazy right now literally just because there's way more demand than supply for these things. Yes. OK!

So let's break down what they're actually selling. You might be thinking: of course, you can, and lots of people do, just go buy H100s. Like: I don't care about the Grace CPU, I don't care about this Mellanox stuff, I run my own data center, I'm really good at it. And the people most likely to do this are the hyperscalers, or as Nvidia refers to them, the CSPs, the cloud service providers. This is AWS, this is Azure, this is Google, this is Facebook for their internal use. Like: Nvidia, don't give me one of these DGX servers that you assemble, just give me the chips, and I will integrate them the way I want them integrated. I am a world-class data center architect and operator; I don't want your solution, I just want your chips. So they sell a lot of those. Now, Nvidia has of course also been seeding new cloud providers out there in the ecosystem, like our friends at Crusoe, and also CoreWeave and Lambda Labs, if you've heard of them. These are all new GPU-dedicated clouds that Nvidia is working closely with. So they're selling H100s, and A100s before that, to all these cloud providers.

But let's say you're an arbitrary company in the Fortune 500 that is not a technology company, and my god do you not want to miss the boat on generative AI, and you've got a data center of your own. Well, Nvidia has a DGX for you. Yes!

They do: a full GPU-based supercomputer solution in a box that you can just plug right into your data center, and it just works. There is nothing else on the market like this. And it all runs CUDA. It all speaks the exact language of the entire ecosystem of developers who know exactly how to write software for this thing, which means whatever developers you already had working on AI, or anything else, everything they were working on is just going to come right over and run on your brand-new shiny AI supercomputer, because it all runs CUDA. Amazing. More on CUDA in a minute. But as we said: you say solution, I hear gross margin. Nvidia sells these DGX systems for something like a hundred and fifty to three hundred thousand dollars a box. That's wild. And now, with all three of these new legs of the stool, Hopper, Grace, and Mellanox, these systems are getting way more integrated, way more proprietary, and way better. So if you want to buy a new top-of-the-line DGX H100 system, the price starts at five hundred thousand dollars for one box. And if you want the DGX GH200 SuperPod, this is the AI wall that Jensen recently unveiled, the huge room full of AI, something like twenty racks wide; imagine an entire row at a data center. This is 256 Grace Hopper superchips all connected together as one. They're billing it as the first turnkey AI data center that you can just buy and train a trillion-parameter, GPT-4-class model on. The pricing on that is "call us," of course.

But I'm imagining, like, hundreds of millions of dollars. I doubt it's a billion, but hundreds of millions, easily. Wild. Let's do the baseball card on the H100, this insane thing they've built. They launched it in September 2022, and it's the successor to the A100. One H100 costs forty thousand dollars, so that's how you get to that price point you were talking about; that's what they're selling to Amazon and Google and Facebook. Right, and you mentioned that five-hundred-thousand-dollar price point: the five hundred thousand dollars is eight of the forty-thousand-dollar H100s in a box with the Grace CPU and, you know, the nice bow around it. Yep. Do the math on that: 8 times $40,000 is $320,000, so that's essentially an extra $180,000 of margin Nvidia is getting out of selling the solution. It's an Arm CPU; it doesn't cost that much to make. And those eight H100s have margin of their own. So every time they bundle more, there's more margin in the fully assembled system. I mean, that's literally bundle economics: you're entitled to margin when you bundle more things together and provide more value for customers. But just to illustrate the way this pricing works. So, the reason you want an H100: it's thirty times faster than an A100, which, mind you, is only about two and a half years older. It is nine times faster for AI training. The H100 is literally purpose-built for training LLMs and, like, the full-self-driving video stuff. It's super easy to scale up. It's got eighteen and a half thousand CUDA cores. Remember the von Neumann example earlier: that was one computing core handling, you know, those four assembly language instructions. This one H100, which they're calling a GPU, has eighteen and a half thousand cores capable of running CUDA software. It's got 640 Tensor Cores, which are highly specialized for matrix multiplication. And it's got the streaming multiprocessors on top of that. So what are we up to here?

Close to twenty thousand unique cores on this thing. It's got meaningfully higher energy usage than the A100. I mean, a big takeaway here is that Nvidia is massively increasing the power requirement every time they come out with a next generation. They're figuring out how to push the edge of physics, but they're also constrained by physics; some of this stuff is only possible with way more energy. This thing weighs 70 pounds. This is one H100, and Jensen makes a big deal about this in every keynote he gives: like, oh, I can't lift it. It's got a quarter of a trillion transistors across 35,000 parts. It requires robots to assemble it. And not only does it require physical robots to assemble, it requires AI to design it; they're actually using AI to design the chips themselves now. I mean, they have completely reinvented the notion of what a computer is. Totally. And this is all part of Jensen's pitch here to customers. Yes!
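As an aside: if you ever do get your hands on one of these, CUDA will happily introduce the hardware to you. This is a minimal sketch using the CUDA runtime's standard device-properties query; the printed numbers depend entirely on whatever card is actually installed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);  // describe GPU number 0

    printf("Name: %s\n", prop.name);
    printf("Streaming multiprocessors: %d\n", prop.multiProcessorCount);
    printf("Onboard memory: %.0f GB\n", prop.totalGlobalMem / 1e9);
    printf("Compute capability: %d.%d\n", prop.major, prop.minor);
    return 0;
}
```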

Our solutions are very expensive. However, and he uses this line that he loves: the more you buy, the more you save. If you could get your hands on some, right. But what he means by that is: OK, say you're McDonald's, and you're trying to build a generative AI so that, I don't know, customers can order something; you're using it in your business. If you were going to try to build and run that on your existing data center infrastructure, it would take so much time, and cost you so much more in compute over the long run, than if you just went and bought my SuperPod here.

You can plug and play and have it up and running in a month. Yep. And by virtue of this all being accelerated computing, the things you're doing on it you literally wouldn't be able to do otherwise, or they might take a lot more energy, a lot more time, a lot more cost. There is a very valid story to buying this and running your workloads on it, or renting from any of the cloud service providers and running your workloads there, because the results just happen much faster!

Much cheaper, or at all. Yep. And don't forget energy here; this is also Jensen's argument. He's like: yes, these things take a ton of energy, but the alternative takes even more energy, so we are actually saving energy, if you assume this stuff is going to happen. Now, there's a bit of a caveat in that it can't happen except on these types of machines, so he enabled this whole thing. But he has a point. Oh!

I totally buy it, though. I think there's a very real case around: look, you only have to train a model once, and then you can do inference on it over and over and over again. The analogy I think makes a lot of sense for model training is to think about it as a form of compression. LLMs are turning the entire internet of text into a much smaller set of model weights. This has the benefit of storing a huge amount of usefulness in a small footprint, but also of enabling a very inexpensive amount of compute, again relatively speaking, in the inference step, every time you need to prompt that model for an answer. Of course, the trade-off you're making is that once you encode all the training data into the model, it is very expensive to redo it. So you'd better do it right the first time, or figure out little ways to modify it later, which a lot of ML researchers are working on. But I at least think a reasonable comparison here is compressing a zillion-layer Photoshop file. Anybody that's ever dealt with, oh, I've got a three-gigabyte Photoshop file: well, that's not a thing you're going to send to a client. You've got to compress it into a JPEG, and you send that. And the JPEG is in many ways more useful, as a compressed facsimile of the original layers comprising the Photoshop file.

But the trade-off is you can never get from that compressed little JPEG back to the original thing. So I think the analogy is: you're saving everyone from needing the full PSD every time, because you can just use the JPEG the vast, vast majority of the time. So, hopefully we've now painted a relatively coherent picture of both the advances that made the generative AI opportunity possible, such that it has truly become a real opportunity, and why Nvidia, even above the obvious reasons, was so well positioned here, particularly because of the data-center-centric nature of these workloads, and because they had been working so hard for the past five years to fundamentally re-architect the data center. Yep. So, on top of all this, Nvidia recently announced yet another pretty incredible piece of their cloud strategy. Today, like we've been saying, if you want to use H100s and A100s, say you're an AI startup, the way you're probably going to do that is to go to a cloud, either a hyperscaler or a dedicated GPU cloud like Crusoe or CoreWeave or Lambda Labs and the like, and rent your GPUs. And Ben, you did some research on this. What does that cost? Oh!

I just looked at the pricing pages on public clouds today, I think Azure and AWS are where I looked. You can get access to a DGX server, that's eight A100s, for about thirty bucks an hour. Or you can go over to AWS and get a p5.48xlarge instance, which is eight H100s, which I believe is an HGX server, for about a hundred dollars an hour, so about three times as much. And when I say you can get access, I don't actually mean you can get access; I mean that's the price. If you could get access, that's what you would pay for it. Correct. OK?

That's just renting the GPUs. But say you buy everything we were talking about a minute ago: you're McDonald's or UPS or whoever, and you're like, you know, I really like Jensen, I buy what you're selling, I want this whole integrated package, I want the AI supercomputer in a box that I can plug into my wall and have it run. But I'm all in on the cloud; I don't run my own data centers anymore. Nvidia has now introduced DGX Cloud. Yeah!

And of course, you could rent these instances from Amazon, Microsoft, Google, Oracle, but you're not getting that full integrated solution, right? You're getting some integration, the way the cloud service provider wants to create it, using their proprietary services. And to be honest, you might not have the right people on staff to deal with this stuff in a pseudo-bare-metal way. Even if it's not in your data center and you're renting it from the cloud, you might actually need, based on your workforce, to just use a web browser: a real nice, easy web interface to load some models in from a trusted source, easily pair them with your data, and just click run, without having to worry about any of the complexity of managing a cloud application in Amazon or Microsoft...

Or something a little bit scarier and closer to the metal. Yup. So Nvidia has introduced DGX Cloud, which is a virtualized DGX system, provided to you right now through other clouds: Azure, Oracle, and Google. Right, the boxes are sitting in the data centers of these other CSPs. Right, they're sitting in the other cloud service providers, but as a customer...

It looks like you have your own box that you're renting. You log into the DGX Cloud website through Nvidia, and it's all nice WYSIWYG stuff. There's an integration with Hugging Face, where you can easily deploy models right off of Hugging Face. You can upload your data. Everything is just really easy!

This is unbelievable: Nvidia launched their own cloud service through other clouds. And Nvidia does have, I think, six data centers, but I don't believe that's what they're actually using to back DGX Cloud. No. So, the starting price for DGX Cloud is $37,000 a month?

Which will get you an A100-based system...

Not an H100-based system. So the margins on this are insane for Nvidia and their partners. A listener helped us out and estimated that the cost to actually build an equivalent A100 DGX system today would be something like a hundred and twenty thousand dollars. Remember, this is the previous generation; this is not H100s. And you can rent it for $37,000 a month, so that's a three-month payback on the capex for this stuff, for Nvidia and their cloud partners together. And even more important for Nvidia longer term: for the enterprises that buy this, Nvidia now has a direct sales relationship with those companies, not necessarily intermediated by sales through Azure or Google or AWS, even though the compute is sitting in their clouds.
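The payback math they're gesturing at is simple enough to sketch. The $120K build cost is the listener's estimate quoted above; everything else is just division:

```cuda
#include <cstdio>

int main() {
    const double build_cost = 120000.0;     // listener-estimated capex per A100 system
    const double rent_per_month = 37000.0;  // DGX Cloud starting price

    printf("Months to pay back the capex: %.1f\n",
           build_cost / rent_per_month);    // ~3.2 months
    printf("Revenue per system per year: $%.0f\n",
           rent_per_month * 12);            // $444,000
    return 0;
}
```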

And that direct relationship is crucially important, because at this point, the CFO, Colette Kress, said on their last earnings call that about half of the revenue from the data center business unit is CSPs, and then I believe after that it's the consumer internet companies, and after that it's enterprises. So there are a few interesting things in there. One: oh my god, their revenue for this is concentrated among, like, five to eight companies, these CSPs. Two: they don't necessarily own the customer relationship. They own the developer relationship through CUDA; you know, they've got this unbelievable ecosystem of Nvidia developers right now that's stronger than ever. But in terms of the actual customer, half of their revenue is intermediated by cloud providers. The second interesting thing is that even today, in this AI explosion, the second-biggest segment of data center revenue is still the consumer internet companies. It's still all that stuff we were talking about before, the uses of machine learning to figure out what should show up in your social media algorithms and match ads to you.

That's actually bigger than all of the direct enterprises buying from Nvidia. So the DGX Cloud play is a way to shift some of that CSP revenue into direct-relationship revenue. So, all of this brings us to 2023. In May of this year, Nvidia reported their Q1 fiscal 2024 earnings. Nvidia is on this weird January fiscal year thing, so Q1 fiscal '24 is essentially Q1 of calendar 2023. But anyway: revenue was up 19 percent quarter-over-quarter to $7.2 billion, which is great, because remember, they'd had a terrible end of 2022, with the write-offs and crypto falling off a cliff and all that. Yes!

It's amazing: in that Stratechery interview, when was that, March of 2023, Jensen said last year was unquestionably a disappointing year. This is the year ChatGPT was released! And, well, the roller coaster this company has been on; the timeframe is so compressed here. Part of that, of course, is Ethereum moving to proof-of-stake, the end of the crypto thing for Nvidia, which I'm sure they're actually thrilled about. But part of it was that they also put in a ton of pre-orders for capacity with TSMC that they then thought they weren't going to need, so they had to write it down. From an accounting perspective, it looks like a big loss, a really big blemish on their financials last year. But now, oh!

My god, are they glad they reserved all that capacity. Yup. It's actually going to be quite valuable. So, speaking of: this Q1 earnings is great, up nineteen percent quarter-over-quarter. But then they drop the bombshell. Due to unprecedented demand for generative AI compute in data centers, Nvidia forecast Q2 revenue of eleven billion dollars, which would be up another 53 percent quarter-over-quarter off of Q1, and 65 percent year-over-year. The stock goes up 25 percent in after-hours trading. Yep. This is a trillion-dollar company, or at least this made them a trillion-dollar company.

But, like, a company that was previously valued at around $800 billion popped 25 percent after earnings. Well...

It is even crazier than that. Back when we did our episodes last April, Nvidia was the eighth-largest company in the world by market cap, at about a $660 billion market cap. That was down slightly off the highs, but that was the order of magnitude back then. It crashed down below $300 billion, and then, within a matter of months, it's now back up over a trillion. Just wild. And then all of this culminates last week, at the time of this recording, when Nvidia reports Q2 fiscal 2024 earnings. And this earnings release... we usually don't talk about individual earnings releases on Acquired, because in the long arc of time, who cares? This was a historic event. I think this was one of, if not the, most incredible earnings releases by any scaled public company ever. Seriously. No matter what happens going forward, last week was a historic moment. The thing that blows my mind the most is that their data center segment alone did ten billion dollars in the quarter. That's more than doubling off of the previous quarter, in three months.

They grew from four-ish billion to ten billion of revenue in that segment. And revenue only happens when they deliver products to customers. This isn't pre-orders; this isn't clicks!

This isn't wave-your-hands-around stuff. This is: we delivered stuff to customers, and they paid us six billion dollars more this quarter than they did last quarter. So here are the full numbers for the quarter: total company revenue of $13.5 billion, up 88 percent from the previous quarter and over a hundred percent from a year ago. And then, Ben, like you said, data center segment revenue of $10.3 billion. That's $10.3 billion out of $13.5 billion, from a segment that basically didn't exist for the company five years ago, and it's up 141 percent from Q1 and 171 percent from a year ago. This is ten billion dollars! That kind of growth, at this scale: I've never seen anything like it. No. Neither has the market. That's right. And so, and this is the first time I noticed it, Jensen talked about this in Q1 earnings, so it wasn't literally the first time, but he brings back the trillion-dollar TAM. Not in a slide; I think this time he just talks about it. No, but in a new way that I think is a better way to slice it. This time it's different. We're going to spend a while here talking about what we think of this.

But this is very different. This time, he frames Nvidia's trillion-dollar opportunity as the data center. And this is what he says: there is one trillion dollars' worth of hard assets sitting in data centers around the world right now, growing at $250 billion a year. Annual spend on data centers, to update and add to that capex, is $250 billion a year. And Nvidia has certainly the most cohesive, fulsome, and coherent platform to be the future of what those data centers are going to look like, for a large number of computing workloads. This is a very different story from, like, we're going to get one percent of this hundred-trillion-dollar industry out there. And the thing you have to believe now, because whenever someone paints a picture you say, OK, what do I have to believe: the thing you have to believe is that there is real user value being created by these AI workloads and the applications they're creating. And there's pretty good evidence. ChatGPT made it so OpenAI is rumored to be doing over a billion dollars of run rate right now, maybe multiple single-digit billions, and still growing meaningfully. So that is the shining example; again, that's the Netscape Navigator of this whole boom. But the bet, especially with all these Fortune 500s, is that there are going to be GPT-like experiences in everyone's private applications and a zillion other public interfaces. Jensen frames it as: in the future, every application will have a GPT front end, and it will be a way that you decide you want to interact with computers that is more natural. I don't think he just means versus clicking buttons; I think he means everyone can kind of become a programmer, but the programming language is English. And so when you ask, well, why is everyone spending all of this money: it is that the world's executives, with the purchasing power to go write a ten-billion-dollar check to Nvidia last quarter for all this stuff, wholeheartedly believe, from the data they've seen so far, that this technology is going to change the world enough for them to make these huge bets. And the thing we don't know yet is: is that true? Are GPT-like experiences going to be an enduring thing for the far future, or not? There's pretty good evidence so far that people like this stuff, that it's quite useful and is transforming the way everyone lives their lives and goes about the day-to-day and does their jobs and goes through school, and on and on. But that is the thing you have to believe. So we have a lot to talk about with regard to that in analysis.

But before we move to analysis, I think we should talk about another one of our very favorite companies here at Acquired: Blinkist, from Go1. Yes!

Absolutely. And listeners, you know we're doing something very cool with them this season. As you know, Blinkist takes books and condenses them into the most important points, so you can read or listen to the summaries. It's almost like large-language-model compression, for books. There you go. So, a couple of cool things we're doing. One: David and I have made a Blinkist page that represents our bookshelf, so if you want to read the books that influenced us, you can go to blinkist.com/acquired. And for this particular Nvidia episode, Blinkist has made a special collection for us. Amazingly, there are not really books about the history of Nvidia itself, at least not yet. Which is unbelievable. Yeah, but there are plenty on AI through the years. So if you go to blinkist.com/nvidia, you can find books by Garry Kasparov, Kai-Fu Lee, and Cade Metz, who we already mentioned earlier on the show, who wrote the amazing Wired article. Yep. You'll of course get free access to that Nvidia Blinkist collection, and anyone who signs up through that link or uses the coupon code NVIDIA will get fifty percent off a premium subscription to their entire library. And for those of you...

...who are leaders at companies: check out Blinkist for Business. It gives your whole team the power to tap into world-class knowledge right from their phones, anytime they need it, available at blinkist.com/business. Blinkist is a great way for your team to master soft skills, which, if you believe this new AI world is coming, are going to be even more important for the humans in your workforce.

Yes. Our huge thanks to Blinkist and their parent company, Go1, where David and I are both huge fans and angel investors. Go1 and Blinkist are both amazing ways for your company to get access to the most engaging and compelling content in the world. Our thanks to both of them; links in the show notes. OK, so, David: analysis. We've got to talk about CUDA before we analyze anything else here. We've talked about a lot of hardware so far this episode, but there's this huge piece of the Nvidia puzzle that we haven't talked about since part two. CUDA, as folks know, was the initiative started in 2006 by Jensen and Ian Buck and a bunch of other folks on the Nvidia team, to really make a bet on scientific computing: that people could use graphics cards for more than just graphics, and that they would need great software tools to help them do that. It was also the glimmer in Jensen's eye of: oh, maybe I can build my own relationship with developers. You know, there can be this notion not of a Microsoft or Intel developer who happens to have a standard interface to my chip; I can have my own developer ecosystem. Which has been huge for the company. So CUDA has become the foundation that everything we've talked about, all the AI applications, is written on top of today. You hear Jensen in these keynotes reference the CUDA platform, the CUDA language, and I spent some time, watching developer sessions and literally learning some CUDA programming, trying to figure out: what is the right way to characterize it?

Today, because it has evolved a lot. Yes!

So today, CUDA is, starting from the bottom and going up: a compiler, a runtime, a set of development tools like a debugger and a profiler. It is its own programming language, CUDA C++. It has industry-specific libraries. It works on every card they ship, and have shipped, since 2006, which is a really important thing to know: if you're a CUDA developer, your stuff works on anything Nvidia; it's this unified interface. It has many layers of abstraction, with existing libraries that are optimized, so you can call these libraries of code to keep your development work short and simple instead of reinventing the wheel. There are things you can decide to just write in C++ and rely on their compiler to make run well on Nvidia hardware for you, or you can write stuff in their native language and implement things yourself in CUDA C++. The answer is: it's incredibly flexible, it is very well supported, and there's a huge community of people developing with you and building stuff for you to build on top of. If you look at the number of CUDA developers over time: it was released in 2006, and it took four years to get the first hundred thousand people. Then, thirteen years in, by 2019, they got to a million developers, and just two years later they got to two million. So thirteen years to add their first million, two years to add their second. In 2022 they had three million developers, and then just one year later, in May of 2023, CUDA has four million registered developers. So at this point, there's a huge moat for Nvidia. And I think when you talk to folks there, and frankly, when we did talk to folks there, they don't describe it this way. They don't think of it as: CUDA is our moat versus competitors. It's more like: well, look, we envisioned a world of accelerated computing in the future, we thought there were way more workloads that should be parallelized, made more efficient, that we want people to run on our hardware, and we need to make it as easy as possible for them to do that. And we're going to go to great lengths, with one or two thousand people at our company, full-time software engineers, building this programming language and compiler and foundation and framework and everything on top of it, to let the maximum number of people build on our stuff. That is how you build a developer ecosystem. It's different language, but the bottom line is they have a huge reverence for the power that it gives the company. This is something we touched on in our last episode.

But it has really crystallized for me in doing this one: Nvidia thinks of themselves as, and I believe is, a platform company. Especially this week, after the blowout earnings and everything that happened this quarter with the stock and whatnot, a popular take out there that you've been seeing a lot is: oh, we've seen this movie before. This happened with Cisco. You could say, over a longer timescale, this happened with Intel. These hardware providers, these semiconductor companies: they're hot when they're hot and people want to spend capex, and then when they're not hot, they're not hot. But I don't think that's quite the right way to characterize Nvidia. They do make semiconductors, and they do make data center gear, but really, they are a platform company. The right analogy for Nvidia is also Microsoft: they make the operating system, they make the programming environment, they make many of the applications. Cisco doesn't really have developers. Intel never had developers. Microsoft had developers; Intel had Microsoft, but Intel didn't have developers. Nvidia has developers. I mean, they've built a new architecture that is not a von Neumann computer. They've bucked fifty years of progress, and instead every GPU is packed with stream processors, and as you'd imagine, you need a whole new type of programming language and compiler and everything to deal with this new computing model. That's CUDA, and it freaking works, and there are all these people who make their living developing on it. You talk to Jensen, and you talk to other people at the company, and they will tell you: we are a foundational computer science company; we're not just slinging hardware here. Yeah!
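To give a flavor of the library layer Ben described, where you lean on Nvidia's pre-optimized code instead of reinventing the wheel, here's a minimal sketch using cuBLAS, Nvidia's linear algebra library, to do the kind of matrix multiply that underlies essentially all of this AI work. It's an illustration with tiny matrices and no error handling, not production code:

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cublas_v2.h>

// Multiply two 2x2 matrices with cuBLAS instead of hand-writing a kernel.
int main() {
    const int n = 2;  // matrices stored column-major, as cuBLAS expects
    float hA[] = {1, 2, 3, 4};
    float hB[] = {5, 6, 7, 8};
    float hC[4] = {0};

    float *dA, *dB, *dC;
    cudaMalloc(&dA, sizeof(hA));
    cudaMalloc(&dB, sizeof(hB));
    cudaMalloc(&dC, sizeof(hC));
    cudaMemcpy(dA, hA, sizeof(hA), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, sizeof(hB), cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    cublasCreate(&handle);
    const float alpha = 1.0f, beta = 0.0f;
    // C = alpha * A * B + beta * C
    cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, n, n, n,
                &alpha, dA, n, dB, n, &beta, dC, n);

    cudaMemcpy(hC, dC, sizeof(hC), cudaMemcpyDeviceToHost);
    printf("C = [%g %g; %g %g]\n", hC[0], hC[2], hC[1], hC[3]);

    cublasDestroy(handle);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return 0;
}
```

That's the platform point: the one cublasSgemm call dispatches to routines Nvidia's engineers have hand-tuned for every generation of their hardware, and the same program runs on a gaming card or an H100.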

I mean, it's interesting: they are a platform company, for sure. They are also a systems company. They're effectively selling mainframes. It's not that different from IBM way back when, if they're trying to sell you, you know, a hundred-million-dollar wall that goes in your data center and is all fully integrated!

And it all just works. Yeah, and maybe IBM actually is a really good analogy: old-school IBM. They make the underlying technology, they make the hardware, they make the silicon, they make the operating system for the silicon, they make the solutions for customers...

They make everything, and they sell it as a solution. Yep. OK, so, a couple of other things to catch us up as we start the analysis. One big point I want to make: let's look at a timeline, because I didn't discover this until, like, two hours before we started recording. In March of 2019, Nvidia announced they were acquiring Mellanox for $7 billion in cash, and I think Intel was considering the purchase, and Nvidia came in and kind of blew them out of the water. It is fair to say nobody really understood what Nvidia was going to do there and why it was so important. But the question is: why? Well, Nvidia knew these new models coming out would need to run across multiple servers and multiple racks, and they put a huge level of importance on the bandwidth between the machines. And how did they know that? Well, in August of 2019, Nvidia released what was at the time the largest transformer-based language model, called Megatron: 8.3 billion parameters, trained on 512 GPUs for nine days, which at retail would have cost something like half a million dollars to train. At the time, that was a huge amount of money to spend on model training, and that was only four years ago; today, that's quaint. Nvidia did that because they do a huge amount of research at the company, and they work with every other company doing AI research, and they were like: oh yes, this stuff is going to work, and this stuff is going to require the fastest networking available. And I think that has a lot to do with why no one else saw how valuable the Mellanox technology could be. Another thing I want to talk about for Nvidia's business today is this notion of the data center is the computer. Jensen did a great interview with Ben Thompson last year where he talks about the idea that they build their systems full-stack; their dream is that you own and operate a DGX SuperPod. He says: we build our systems full-stack, but we go to market in a disaggregated way, integrating into the compute fabric of the industry. I think that's his way of saying: look, customers need to use us in a bunch of different ways, so we need to be flexible on that. But we want to build each of our components such that, if you do assemble them all together, it is an unbelievable experience, and we'll figure out how to provide the right experience if you only want to use them in piecemeal ways, or you want to use us in the cloud, or the cloud providers want to use us. Again: build the product as a system, build the system full-stack, but go to market in a disaggregated way. And, if I remember right, in that interview Ben picked up on this and was like, so are you building your own cloud? And Jensen was like: well, maybe, we'll see. And of course...

Then they launched DGX Cloud, in a well-maybe-we'll-see sort of way. Yeah!

You could imagine there are more Nvidia data centers likely on the way that are fully owned-and-operated. Speaking of all this, we've got to talk some numbers on margin. This last quarter, they had a gross margin of 70 percent, and they forecast 72 percent gross margin for next quarter. If you go back pre-CUDA, when they were a commoditized graphics card manufacturer, it was twenty-four percent. So they've gone from twenty-four to 70 on gross margin, and with the exception of a few quarters along the way, for these strange one-time events, that's basically been a linear climb, quarter over quarter, as they've deepened their moat and deepened their differentiation in the industry. We're definitely in a place right now that I think is temporary, due to the supply shortage: the world's enterprises, and in some cases even governments (look at the UK, or some of the Middle Eastern countries), are like, blank check, I just need access to Nvidia hardware. That's going to go away. But I don't think this very high, you know, sixty-five-percent-plus margin is going to erode too much. Yes!

I think two things here. One, I really do believe what we were talking about a minute ago: Nvidia is not just a hardware company, not just a chips company. They are a platform company, and there is a lot of differentiation baked into what they do. If you want to train GPT, or a GPT-class model, there's one option: you're doing it on Nvidia. There's one option. And yes, we should talk about the fact that there's lots of less-than-GPT-class stuff out there you can do, especially inference, which is more of a wide-open market versus training, that you can run on other platforms. But they're the best. And they're not just the best because of their hardware, not just the best because of their data center solutions, not just the best because of CUDA. They're the best because of all of those. The other illustrative thing for me that shows how wide their lead is: we haven't talked about China yet. The land of the A800s. Yes. So, what's going on: last year, sales to mainland China were twenty-five percent of Nvidia's revenue, and a lot of that was selling to the hyperscalers, the cloud providers in China: Baidu, Alibaba, Tencent, and others. And by the way...

Baidu has potentially the largest model of anyone; their GPT competitor is over a trillion parameters, and may actually be larger than GPT-4. Wow!

I didn't know that. Yep. Huh. So then, I believe also in September of 2022, the Biden administration announced pretty sweeping regulations on sales of advanced computing infrastructure. David, they're export controls; don't say bans. I mean, yes, that's a fine line, and this is pretty close to bans. With what the administration introduced, Nvidia can no longer sell their top-of-the-line H100s or A100s to anybody in China. So they created nerfed SKUs, essentially, that meet the performance regulations: the A800 and H800?

Which, I think, basically just crank down the NVLink data transfer speeds. So it's like buying a top-of-the-line A100, but without data connections as fast as you'd need...

Which basically makes it so you can't train large models. Right, or you can't train them as well, or as fast, as you could with the latest stuff. The incredibly telling thing is that those chips, in those machines, are still selling like hotcakes in China. They're still the best hardware and platform you can get in China, even in a crippled version, and I think that would be true anywhere in the world. And there's been an even more recent spike in them, because a lot of Chinese companies are reading the tea leaves and saying, ooh, export controls might get even more severe, so I should get them while I still can, these 800s. Yep. I can't think of a better illustration of just how wide their lead is. Yeah, that's a great point. Talking about the rest of Nvidia for just a moment; I mean, this episode is about the data center segment.

But, oh, you mean they still make gaming cards too? It is worth talking about this idea that Omniverse is starting to look really interesting. As of their conference six months ago, they had seven hundred enterprises signed up as customers. The reason this is interesting is that it could be where their two different worlds collide: 3D graphics with ray tracing, which is new and amazing (the demos are mind-blowing), and AI. They have been playing in both of these markets since the workloads are both massively parallelizable; that is the original reason for them to be in the AI market at all. If you recall, way back in our part one episode, the original mission of Nvidia was to make graphics a storytelling medium, and then the mission expanded as they realized, my god, our hardware is really good at other stuff that needs to be parallelized, too. But fascinatingly, with Omniverse, the future could actually look like applications where you need both amazing graphical capability and AI capability in the same application. And for all the other amazing uniqueness about Nvidia that we've been talking about, and how well positioned they are, add this on top: they're the number one provider of graphics hardware and software, and of AI hardware and software. Oh, and by the way, there's this huge application emerging where you actually do need both. They're going to knock it out of the park if that comes true.

There was a super cool demo at a recent keynote, it might have been at SIGGRAPH, where Nvidia created a game environment. A fully ray-traced game environment, and it looks like a triple-A game; it looks amazing, basically indistinguishable from reality. Like, you really have to look hard to tell that this isn't real, and that this is not a real human you're talking to. So there's a non-playable character you're talking to, an NPC, who's giving you a mission. They show this demo, it looks amazing, and then they say: the script, the words that character was saying to you, was not scripted. That was all generated with AI, dynamically. Wow. And you're like: holy crap, you know!

You think about it: you play a video game, the characters are scripted. But in this world you're describing, you can have generative-AI-controlled avatars that are unscripted, that have their own intelligences that drive the story. Totally. Or, you know, an airplane in a simulation, not just of a wind tunnel, but simulating millions of hours of flying time, using the real-time weather that's actually going on in the world, and using AI to project that weather into the future, so you can know the real-world things your aircraft could potentially encounter, all in a generated graphical AI simulation. There's going to be a lot more of this stuff to come. Yup, totally. Another thing to know about Nvidia that we really didn't talk about on the last episode: they're pretty employee-efficient. They have twenty-six thousand employees, and that sounds like a big number, but for comparison, Microsoft, whose market cap is only twice as big, has two hundred and twenty thousand. So that is five times the number of employees per dollar of market cap over at Microsoft. And this is a little bit farcical, since, you know, Nvidia has only recently had such a massive market cap.

But the scale of the platform Nvidia is building is on the order of magnitude of Microsoft's scale. Right: they have $46 million of market cap per employee. Wild. Crazy. Which I think translates into the culture there. We've gotten to know some folks there, and it really is a very unique kind of culture. It is a big-tech-scale company, but you never hear about the same kind of silly big-tech stuff that you hear about at other companies. At Nvidia, as far as I know (I could be wrong on this), there is no, like, work-from-home or return-to-the-office policy. Nvidia is like: no, you just do the job, and, you know, nobody's forcing anybody to come into the office here. And they've accelerated their ship cycles. Well!

I also get the sense that it's a little bit of a do-your-life's-work-or-don't-be-here situation. Jensen is said to have forty direct reports, and his office is basically just an empty conference room, because he just bounces around so much, and he's on his phone, and he's talking to this person and that person. And, like, you can't manage forty people directly if you're worrying about someone's career ambitions. Yeah!

He's talked about this. He's like: I have forty direct reports. They are the best in the world at what they do. This is their life's work. I don't talk to them about their career ambitions; I don't need to. You know, for recent college grads, we do mentor. But if you are a senior employee, you've been here for twenty years, you're the best in the world at what you do, and we're hyper-efficient, and I start my day at 5 a.m., seven days a week. And you do, too! Crazy. Yeah. There's actually this amazing quote from Jensen, in an interview with him I was listening to, where towards the end of the conversation the interviewer is like: Jensen, you and Nvidia do these just amazing things. What do you do to...

...relax? And this one's really good; this is a direct quote: "I relax all the time. I enjoy relaxing at work, because work is relaxing for me. Solving problems is relaxing for me. Achieving something is relaxing for me." And he's a hundred percent serious. Like, a thousand percent serious. That's so Jensen. The dude is sixty years old. It kind of feels like all of his peers have either decided to retire and relax, or are, you know, relaxing while running their companies; there's another crop of people doing that. And that is just not at all interesting to him, or what he's doing. And I get the sense he's got another thirty years in him, and he's architected the company in such a way that that's the plan. I don't think there's anyone else there where they're getting ready for that person to take over. I think the company is an extension of Jensen's thoughts, and will, and drive, and beliefs about the future, and that's kind of what happens. I don't know if there is...

...or isn't a Jensen and Lori Huang Foundation, but if there is, he's not spending his time on it. He's not buying sports franchises, he's not buying mega-yachts; or if he is, he isn't talking about them, and he's working from them. Yeah!

He's not buying social media platforms and newspapers. Yeah!

Totally!

I mean, it is quite telling that when you watch one of their keynotes, it's Jensen on stage, and there are some customer demos, but it's not like the Apple keynotes, where Tim Cook is calling up another Apple employee. It's the Jensen show. Right. Nobody would accuse Tim Cook of not working hard, I don't think, but you go to those keynotes, and it's like: Tim does the welcome, then the handoff, and, you know, a parade of other executives talk about stuff. Good morning, Tim Apple. I love it. Love Tim. We've got to have Tim on the show sometime. That would be amazing. Yeah, text him. Who's got his number? All right: power. Let's talk power. All right. For listeners new to the show, this is the section where we talk about what it is about the company that enables it to achieve persistent differential returns, or in other words, to be more profitable than its closest competitor, and to do so sustainably. And Nvidia is fascinating, because they sort of have a direct competitor, but that's not the most interesting form of competition for them; it's disintermediation. Sure, ostensibly there's Nvidia versus AMD, but AMD doesn't have all this capacity reserved at TSMC, at least not for the 2.5D packaging process for the high-bandwidth memory, and AMD doesn't have the developer ecosystem from CUDA. They're the closest direct comp. But really it's Amazon building Trainium and Inferentia; it's Microsoft, if they decide to go build their own chip, as they're rumored to be doing with AMD; it's Google with the TPU; it's Facebook developing PyTorch and then leveraging their foothold with the developer community to extend underneath PyTorch. There are a lot of competitive vectors coming at Nvidia, just not direct ones. Not to mention all the data center hardware providers that are their direct competitors now, too. Yep: Intel, et cetera, on down the line. Yep. Now, all that said, they've got a lot of powers. So as we move through these one by one, let's just name them all, and we can decide if there's something to talk about. Counter-positioning is the one where I actually don't think there's anything here. I don't think there's anything Nvidia does where another company is actively choosing not to do the same thing, because any company would want to be Nvidia right now. I would have agreed with you...

But I actually think there is strong counter-positioning in the data center world right now. Nvidia and Jensen put a flag in the ground several years ago, when they said: we are going to re-architect the data center. And all the existing data center hardware and compute providers had strong incentives not to do that. But, like, right now: what do you think the other data center hardware providers...

what are they not doing?

Yeah, fair point. They're all trying to put GPUs in the data center, too. Everyone's just going to chase exactly what Nvidia is doing, years behind them. That's the market right now. Yep. Fair enough. And the question is: will Nvidia be able to stay ahead in ways that matter? That, I think, is the entire analysis on the company right now: in what ways that matter to customers, at large scale and in large markets, will they be able to sustainably stay ahead of people who are just chasing them and trying to copy what they're doing, because the margin profile is so fat and juicy that people don't want to keep paying it? So, the second one: scale economies. This has CUDA written all over it. You can make massive fixed-cost investments when you have the scale to amortize that cost, and when you have four million developers who want to develop on your platform, you can justify, whatever it is, the 1,600 people on LinkedIn today who have the word CUDA in their job title at Nvidia. And I'm sure it's actually even more than that, counting the ones who just say software or something like that. Thousands of people, an investment they don't make any money on. Software, I mean; they make a de minimis amount on software.

But that is amortized across the entire developer base. I think it's worth saying a bit more on this, which we also talked about on our last episode. To me, the dynamics here are a lot like Apple and iOS versus Android. Apple has thousands and thousands of developers working on iOS. Android also has thousands of developers working on it, across a widespread ecosystem, but at Apple it's all tightly controlled and coupled with the hardware, and at Android it's not. And, like, as a user, maybe you'll get the latest operating system update...

Maybe you won't. I think this is exactly the right framing: Nvidia is the Apple of AI, and PyTorch is sort of the Android, because it's open source and has a bunch of different companies that care about it. OpenCL is the Android as it pertains to graphics, but it's pretty bad and pretty far behind. ROCm is the CUDA competitor made by AMD for their hardware, but again, not a lot of adoption. They're working on it, but they've open-sourced it, because they realize they can't go directly head-to-head with Nvidia; they need some different strategy. But yes, Nvidia is absolutely running the Apple playbook here!

Yup. And I think, in the current state of things, it's even more favorable to Nvidia than iOS versus Android, because Nvidia has had first dozens, then hundreds, and now thousands of engineers working on CUDA for sixteen years. Meanwhile, the Android equivalent out there in the open-source ecosystem has only just been getting going. If you think about the delta in the timeline between iOS and Android, it was a year and a half, two years. Here, there's probably at least a ten, probably closer to a fifteen, year lead that Nvidia has. So we talked to a few people about this, like: what's going on in the open-source ecosystem? Is there an Android equivalent? And even the most bullish people we talked to were like: oh yeah, you know, now that Facebook has moved PyTorch into a foundation, outside of Facebook, other companies can now contribute, you know, a couple dozen engineers to work on it. And, like, cool: so AMD is going to contribute a couple dozen, maybe a hundred, engineers to work on PyTorch, and so will Google, and so will Facebook, and so will everybody else. Nvidia has thousands of engineers working on CUDA, ten years ahead.

I sent you this graph, David, of my estimated number of employees working on CUDA per year since inception in 2006, and if you look at the area under the curve and take the integral, it's approximately ten thousand person-years that have gone into CUDA. Like: good luck. Now, again, open source is a very powerful thing, and the market incentives are absolutely there for this to happen. Right, and that is the interesting point: a moat only works if the castle is sufficiently small. If the prize at the finish line becomes sufficiently large, you're going to need a bigger moat, and you need to figure out, you know, how to defend the castle harder. I'm mixing so many metaphors here, but you get the idea. Yeah, I love it. This was a perfectly fine moat when the addressable market was a hundred billion dollars. Is it, at a trillion-dollar market opportunity? Probably not. Basically, it means margins come down and competition gets more fierce over time. And I think Nvidia totally gets this, because part of this, as I was alluding to, is COVID-related.

Part of this, as I was alluding to, is COVID related. We talked way back in part one about how Nvidia, to save the company, ended up moving to a six month shipping cycle for their graphics cards when their competitors were on a one-to-two-year shipping cycle. That persisted for several years, and then they relaxed back to an annual shipping cycle and annual GTCs. Since COVID, Nvidia has reaccelerated to a six month shipping cycle; they've been doing two GTCs a year most years since COVID, which is insane for the level of technology complexity of what they're doing. Yep, imagine Apple doing two WWDCs a year. Yeah, that's what's happening at Nvidia. It's crazy. So on the one hand, that's a culture thing; on the other hand, it's an acknowledgement of, like, we need to be pedal to the floor right now to outrun competition. We've built some structural ways to defend the business,

but we need to continue running as fast as we've ever run to stay ahead, because it's such an attractive race that we're in. Yep. Alright,

so that's scale economies. Let's move to switching costs now. So far, everything of consequence,

especially model training, and especially on LLMs, has been built on Nvidia, and that alone is just a big pile of code and a big amount of organizational momentum. So switching away from that, even just from the software perspective, is going to be hard. But there are companies today in 2023, both at the hyperscalers and in Fortune 500 data centers, making data center purchase and rollout decisions that will last at least the next five years, because these data center rearchitectures don't happen very often. And so you better believe that Nvidia is trying as hard as they can to ship as much product as they can while they have the lead, in order to lock in that data center architecture for the next N years. Yeah!

We talked to many people in preparation for this episode, but one of the most interesting conversations was with some of our favorite public market investors out there, the NZS Capital guys, who I stole many insights from for this episode. Ah, they're just so great, and obviously they have been following Nvidia and the space for a long time. They made the point that data center revenue and data center capex is some of the stickiest revenue known to humankind. Just the organizational switching costs involved in data center procurement and data center architecture standardization decisions (god, that's a mouthful even to say) at Fortune 500 companies and the like: they're not changing that more than once a decade, at most. So even if we're sort of in this bubbly moment around the excitement of generative AI, before we necessarily know the full set of applications, Nvidia is leveraging this excitement to go get the lock-in. I've seen some people on the internet being like, they love how supply constrained they are. I don't think so. I think they're looking for capacity in every way they can get it, to exploit this opportunity while it exists. I completely agree with that. Yeah, and again, we didn't talk to Colette, Nvidia's CFO, about this, but I strongly suspect, if I were them, I would be happy to trade some of this gross margin right now for increased

throughput on sales. Yep. But there's only one TSMC, and there are only so many fabs they have that can do the, what do they call it, the 2.5D architecture. So, should we talk cornered resource?

Yeah, this is probably the textbook cornered resource. Nvidia has access to a huge amount of capacity at TSMC that none of their competitors can get their hands on. I mean, they did luck into this cornered resource a little bit; they reserved all that wafer supply for a different purpose, partially crypto mining. But AMD doesn't have it. AMD does have a ton of capacity at TSMC, it's worth saying, for their other products, data center CPUs, which they've actually been doing very well in. But Nvidia did end up with this wide open lane all to themselves on CoWoS capacity at TSMC, and they've got to make the most of that for as long as they have it. And I guess to say a little more: it's not like this is a commodity, as we talked about on our TSMC episode. Oh!

TSMC, as a contract manufacturer, is the opposite of a commodity!

Especially at the highest, leading edge, it's like an invention delivered by aliens that very few humans know how to actually do. Yes. It is worth acknowledging it's kind of a two horse race for LLM training. I know we've been harping on Nvidia, but Google TPUs are also manufactured at volume. You can just only get them through Google Cloud, and, I think, you have to use the TensorFlow framework, which has been waning in popularity relative to PyTorch.

But it's certainly not an industry standard to use TPUs the way that it is to use Nvidia hardware. I suspect a lot of the volume of the TPUs is being used internally by Google, for Bard, for doing stuff in Google Search. Like, I know they've added a lot of the generative AI capability to Search. Yep, totally. Two points on this, just sticking to the scope of this business and market discussion: this is a major casualty of a strategy conflict for Google. Obviously, the way you want to do this is the way Nvidia is doing it: your customers want to buy through the cloud, so you want to be in every cloud. But obviously Google is not going to be in AWS and Azure and Oracle and all of the new cloud providers; they're only going to be in GCP. Maybe. David, I was going to say, though, to expand the lens:

I think this makes sense for Google, because their primary business is their own products, right? And they run among the most profitable businesses the world has ever seen, so anything they can do to further advantage and extend that runway, they probably should do. Nothing has changed through all of this with respect to the fact that what the previous generation of AI enabled, with machine learning applied to social media and internet applications, is the most profitable cash flow geysers known to man. None of that has changed; it's still true in this current world, and still true for Google. Yep. The last one that I had highlighted is network economies. They have a large number of developers out there and a large number of customers that they can amortize these technology investments across, and who all benefit from each other. I mean, remember, there are people building libraries on top of CUDA, and you can use the building blocks other people built to build your code. You can write amazing CUDA programs that just don't have that many lines of code, because they're calling other preexisting stuff. And Nvidia made a decision in 2006 that at the time was very costly, like, a big investment decision, but looks genius in hindsight: to make sure that every GPU that went out the door was fully CUDA capable. And today there are five million CUDA capable GPUs for developers to target. This is very attractive. I'm putting this in network economies; I think it's probably more a scale economy than a network economy. But you could imagine a lot of people humming around Nvidia in 2006 saying, why do I have to make it so that my software fits on this tiny little footprint, and we include CUDA taking up a huge amount of space on this thing, and make all these tradeoffs in our hardware, so that we can run CUDA? Why? Are people even going to use CUDA? And today it just looks so genius. Yeah!
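To make that "few lines of code" point concrete, here's a minimal sketch in Python using PyTorch, one of many stacks built on top of Nvidia's libraries. It assumes a CUDA-capable GPU and a CUDA build of PyTorch. The handful of user-written lines dispatch into Nvidia-authored kernels (cuBLAS for the matrix multiply, cuFFT for the FFT, cuDNN for the convolution), which is why so little of your own code is needed.

```python
import torch
import torch.nn.functional as F

# A few lines of user code; the heavy lifting runs in Nvidia's
# prebuilt CUDA libraries rather than anything we wrote ourselves.
x = torch.randn(4096, 4096, device="cuda")

y = x @ x.T            # matrix multiply -> cuBLAS kernel
s = torch.fft.fft2(x)  # 2D FFT          -> cuFFT kernel

img = torch.randn(1, 3, 224, 224, device="cuda")
kern = torch.randn(8, 3, 3, 3, device="cuda")
feat = F.conv2d(img, kern)  # convolution -> cuDNN kernel

print(y.shape, s.shape, feat.shape)
```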

I mean, we've talked about this many times on the show, including with Hamilton Helmer and Chenyi themselves, but for platform companies, like Nvidia clearly is, there is this special brand of power that is a combination of scale economies and network economies. That's what you're getting at. Yep. They do have branding power too, for sure. Yeah, I actually think (we were talking about this a little bit) this is the "nobody gets fired for buying IBM." I mean, Nvidia is the modern IBM in the AI era. Yep. Look, I don't feel confident enough to pound the table on this, but given the nature of how the company started, how long they've been around, and the fact that they also have the market leading product in a totally different business, in graphics (yeah, which is both consumer but also professional graphics), I think that probably does lend some brand power to them, especially when the CIO and the C-suite of McDonald's is making a buying decision here. Like, everybody knows Nvidia. Hmm, so your thing is that they carried their consumer brand into their enterprise posture. This is way, way, way down the stack of powers, but I don't think it's hurt them. They've always been known as a technology leader, and the whole world has known for decades at this point that the stuff they can enable is magical. Yeah!

There's a big strength-leads-to-strength thing here too, where I bet the revenue results from last quarter massively dwarf any brand benefit they ever got from the consumer side. I think it's just the fact of, hey, look, everyone else is buying Nvidia, I'd be an idiot not to. Nobody is getting fired for buying Nvidia anytime soon. Yep. Right, or for taking a big dependency on them, or targeting that development platform. It's just, like, if you're innovating in your business, you don't want to take risk on the platform you're building on top of. You want to be the only risk in the value chain, right?

Then the last one, right, is process power. Yeah, and this is probably the weakest one, even though I'm sure you could make some argument that they have process power; it's just that all the other powers are so much more valuable. It's always so tricky to tease out. Yeah. You know, I think the argument here would just be Nvidia's culture and their six month shipping cycle, which clearly they had in the past, then didn't have for a while, and now have again. I don't know, I think you can make an argument here. Is it feasible (let's do a thought exercise) for any of their competitors, really in any domain, to move to a six month ship cycle? Wouldn't that be really hard? Yeah.

Do you think an Apple-sized company could do two WWDCs a year? Like, no. The question is, does that actually matter? There are so many people using A100s right now, and in fact, most workloads can be run on A100s unless you're doing model training of GPT-4. I just don't know that it actually matters that much, or as much as other factors. And I'll give you an example: AMD does have 3D packaging on one of their latest GPUs. It's a more sophisticated way of doing copper-to-copper direct connection without a silicon interposer. I'm getting a little bit into the details, but basically it's more sophisticated than the 2.5D process that the H100 is using to make sure that memory is extremely close to compute. And does that matter? Not really. What matters is everything else we've been talking about; nobody's going to make a purchase decision on this thing because it's, you know, a little bit of a better mousetrap. Yeah. Thinking about this more, I think, actually, brand is a really important power for Nvidia right now. Yeah, and in a strength-leads-to-strength way, so you can see why they're trying to sort of seize this moment. Yep. Playbook? Alright, let's move to playbook. So one thing that I want to point out is Jensen keeps referring to this as the iPhone moment for AI, and when he says it, the common understanding is that he means a new mainstream method for interacting with computers. But there's another way to interpret it. Does this sound familiar, David, when I say: a hardware company, differentiated by software, that then expanded into services? Hahaha, yes!

Yes!

It does. It's tongue in cheek to be referring to the iPhone moment of AI when referring to oneself, Nvidia, as the Apple, but I really think the parallels are uncanny. They have this vertically integrated hardware and software stack provided by Nvidia; you use their tools to develop for it; they've shipped the most units, so developers have a big incentive to target that market; and it's the best set of individual buyers to target, because they're the least cost sensitive and they appreciate you building the best experiences for them. I mean, it's the iPhone, but in many ways it's better, because the target is a B2B target instead of consumers. Yeah, the only way in which it's different is that Apple has always had a market cap that sort of lagged its proven value to users, whereas Nvidia right now is, um, maybe exactly over their skis. Well, let's save that for bull and bear at the end. Great. The second one is that they've moved on from being a hardware company to truly being a systems company. While Nvidia chips are typically ahead, it really doesn't matter on a chip-to-chip comparison; that is not the playing field. It is all about how well multiple GPUs and multiple racks of GPUs work together as one system, with all the hardware, networking, and software that enables that. They have just entirely changed the vector of competition, which I think lots of companies can learn from. My third one here is this quote that Jensen had, again from the same Stratechery interview, which is: you build a great company by doing things that other people can't do; you don't build a company by fighting other people to do things that everyone can do. And I think it's so salient. It comes out in all these interesting ways, one of which is that Nvidia never dedicated resources to building a CPU until there was a differentiated way and a real reason for them to build their own CPU, which is now. And the way they're doing it, by the way, is not terribly differentiated: it's an off-the-shelf Arm architecture that they're putting some of their own secret sauce on. It's not like they're doing Apple-style M3 creation of a chip from scratch; it's not the hero product. Right. There are many ways that Nvidia sort of applies this, where (I think we talked about this on the last episode) if they think it's going to be a low margin opportunity, they don't go after it. But the nicer way to say that is: we don't want to compete for things that anybody can do; we want to do things that only we can do, and by the way, we will fully realize the value of those things when we do them.

Yep!

I think this may be a related playbook theme here for Nvidia: strike when the timing is right. I suspect that a lot of the inner competitive drive and motivation for Jensen and the company over the past 10 to 15 years has been to really fight against Intel. Intel tried to kill them, as we've talked about many times in the previous episodes. We talked to somebody who framed it as: Intel was the country club, and Nvidia is the fight club. And back in the day, the Intel country club didn't want to let Nvidia in. Intel controlled the motherboard. Intel controlled the most important chip, the CPU. Intel would integrate and commoditize all other chips into the motherboard eventually, and if they couldn't do that, well, then they'd try and make the chips themselves. And they tried to run all these playbooks on Nvidia, and Nvidia just barely survived. And then in the data center: Intel controlled the data center for so long. PCI Express, that was the interconnect in the data center for so long, and Nvidia had to live in there, and I'm sure they hated every single minute of it. But they didn't turn around ten years ago and just be like, guess what, we're making a CPU too. They waited until the time was right. It's crazy. They used to have to plug into other people's servers,

and then they started making servers that plug into other people's racks and rows and architectures, and then they started making their own entire rows and walls, and at some point here they're gonna start running their own buildings full of servers too, and they're gonna say, we don't have to plug into anything. Yep. But I think for a lot of other leaders it would have been hard to have the patience that they've had. Totally. You only get to do the stuff they're doing if you invested ten years ahead of the industry, were wildly inventive and innovative in creating these true breakthrough innovations, and were really, really right about huge markets. None of this stuff applies unless you're doing those three things. Yeah, Fortune 500 CIOs aren't making buying decisions if what

you just said isn't true, right?

So there's this interesting conversation I wanted to have with you ahead of winding it up with the bull and bear case. Think back to our AWS episode: we talked a lot about how AWS is just locked in; the databases are a ridiculously durable advantage. Once your data has been shipped to a particular cloud (often literally in semi trucks full of hard drives; Snowball, yeah), it's hard to move off of it. There's a sort of interesting question of, for all these Googles, Microsofts, Amazons: will winning cloud 1.0, will that toehold, actually enable them to win in the cloud AI era? On the one hand, you'd think yes, absolutely, because I want to train my AI models right next to where my data is; it's really expensive to move my data somewhere else to do that. Case in point: Microsoft is the exclusive cloud infrastructure provider for OpenAI,

which runs, as far as we know, solely on Nvidia infrastructure,

but they buy it all through Microsoft. Right. On the other hand, the experience that customers are demanding is the full stack Nvidia experience, not this, oh, you found the cheapest possible cost-of-goods-sold way to offer me something that's sort of like the experience that I want. And sometimes the cloud providers have to offer me A100s, H100s, because my code is way too complicated to ever rearchitect for whatever accelerated computing devices they're offering me that are first party and cheaper for them. I don't know, I just think for the first time in the last five years or so, I've sort of cocked my head a little bit at the moat of these existing cloud providers and said, huh, maybe there really is a vector to compete with them, and cloud is not a settled frontier. Yeah, well,

for the majority of this conversation here, cloud is a euphemism for data centers. Right, there's so much more to the hyperscalers and public clouds than just data centers, right?

But physically, they're data centers. Yeah!

There is a mile of distance, metaphorically, between, like, an Equinix and an AWS. Yeah, but they're data centers, and there is a fundamental shift, at least according to Jensen,

a fundamental shift that is happening in data centers. So I think that probably does create some shifting sands that the cloud market is gonna have to navigate. Yep. I bet the way it plays out is that where you landed in cloud 1.0 strongly dictates where you land in this AI cloud era, because at the end of the day, if customers are demanding Nvidia stuff, then the cloud providers have every incentive in the world to make it so that you can run your applications great in their cloud. But also,

like, there's more of this to come. Crusoe exists, CoreWeave exists, Lambda Labs exists. These are well funded startups with billions of dollars that a lot of smart people think there's a major cloud-sized opportunity for. Yep, and that would not have happened a few years ago. Super true. All right, let's do the bull case

and bear case and bring this one home. Oh,

boy, we've been trying to delay this as long as possible. This is the crux of the question right now. Yeah!

I mean, part of it is: is their existing moat big enough if GPUs actually become a hundred billion dollar a year market? I mean, right now GPUs in the data center are like a thirty billion dollar a year market, going to like fifty billion next year. And if this actually goes the way that everyone seems to think it's gonna go, there are just too many margin dollars out there for these big companies not to invest heavily. Meta threw tens of billions of dollars at making the metaverse. I mean, Apple's put a rumored 15 billion dollars into their headset. Amazon's put tens of billions of dollars into devices,

which by all accounts was a terrible investment. How is Echo paying anything back? Uh, man, total sidebar: I'm so disappointed. I have standardized my house on the Echo ecosystem, and it keeps getting dumber. How, in this world of incredibly accelerating AI capabilities, are my Echos getting dumber? Oh,

they need to train and inference a little bit harder. Ah,

Jesus. OK, rant over. Yeah!

I mean, never doubt big tech's ability to throw tens of billions of dollars into something if the payoff could be big enough. These are ludicrously profitable monopolies (except you, Amazon, not that profitable; AWS is, yeah), but Google, Facebook, Apple... at some point here there is a game of chicken that ends, and some of these companies go all in and say, yeah, we have smart engineers too, like, we're gonna figure this out. Yeah!

But also, never underestimate the inability of big tech to actually execute on this stuff, especially with major strategy shifts.

Yeah!

Yeah. Alright, so let's actually do this. Bear cases?

Let's start with the bear case. So you just illustrated, I think, bear case number one, which is: literally everybody else in the technology ecosystem is now aligned and incentivized to say, I want to take a piece of Nvidia's pie. And these companies have untold resources. Yep!

And to put a finer point on that, let's look at PyTorch for a minute. Now that lots of developers are using PyTorch, it does enable PyTorch to aggregate customers, which gives them the opportunity to disintermediate; maybe you go write a lot of new stuff underneath and ship a lot of hardware. I mean, the cloud service providers have taken some steps here. It was originally developed by Meta, and while it's open source, it's still hard for all these companies to invest in it if it's really sort of owned and controlled by Meta. So now PyTorch has been moved out into a foundation that a lot of companies are contributing to. And again, it is an absolute false equivalence to be like, PyTorch versus Nvidia. But in Ben Thompson's aggregation theory parlance, if you aggregate the customers, you have the opportunity then to take more margin, to disintermediate, to direct where that attention is going, and PyTorch has that opportunity. That feels like the vector that a lot of these CSPs will try and compete on and say, look, if you're building for PyTorch,

it runs really well on our thing too. Yep, for sure, no doubt that that's gonna happen. Alright!

So that's bear case number two, kind of as part of bear case number one. The next one is, like, literally the market isn't as big as the market cap reflects. I think there's a pretty reasonable chance that there's some falter in the next twelve to eighteen months, where there's a crisis of confidence among investors. At some point, something will come out where we all observe, oh, maybe GPTs aren't as useful as we thought, maybe people don't want chat interfaces, and that crisis of confidence, that mini bubble bursting, will trickle out to America's CIOs and CEOs and make it harder to advocate in the boardroom for this big fundamental purchase and rearchitecture of the whole budget from this year, that we agreed on, that I'm trying to propose us changing. There's a crypto-like element to an excitement bubble bursting that will, for some companies, slow their spend. And the question is sort of when that happens, 'cause it's not an if, it's a when. I have a hard time believing, given all the hype around everything right now, that AI will be even more useful than everyone believes, and that it will continue in a linear fashion, without any drawdowns, where everyone's excitement only gets bigger from here. It may end up being way more useful than anyone thought, but at some point there will be some valley or trough, and there's a question about how Nvidia fares during that crisis of confidence. It's funny, you know, again, we talked to a lot of people for this episode, including some of the foremost

AI researchers and practitioners out there, and founders and C-suites of companies that are doing all this. And pretty much to a T, they all said the same thing when we asked them about this question. They all said: yeah, this is overhyped right now, of course, obviously. But on a ten year timescale, you haven't seen anything yet. Yeah!

The transformative change that we believe is coming, you can't even imagine. The most interesting thing about the overhype is that it's actually showing up in revenue. Everyone who is buying access to all this compute believes something, and for Nvidia, because it's showing up in the form of revenue, the belief is real. So they just need to make sure that they smooth the gap to customers actually realizing as much value as the CIOs of the world are currently investing ahead of. Yep.

So I think the sub-point to that that's worth discussing right now is, like: OK, generative AI, is it all it's cracked up to be? Well, David,

I've asked you about this, and like a month ago you were pounding the table insisting to me: I have no need for it, I've never used ChatGPT, I can't find it to be useful, it's hallucinating all the time, I never think to use it, it's not a part of my workflow. Like, where are you at? Still there?

Basically, I'm still there, including after forcing myself to try to use it a bunch in preparation for this episode. But also, as we talked to more people, I think I've realized that, like, David Rosenthal's use case doesn't really matter here at all. Right? A, because as a business, we are such a hyper-specialized, unique little unicorn thing, where accuracy and the depth of the work and thought that we ourselves put into episodes is the paramount thing. Well, and we have no coworkers. There are so many things about our business that are weird; like, we never have to prepare a brief for a meeting, right, all this stuff. Anything external that we prepare is a labor of love for us, and there is nothing we prepare internally. I know people who use ChatGPT to set their OKRs, and I'm like, OK, what's an OKR? And they're like, I wish my life were like that too; that's why I have ChatGPT do it. Right. Honestly, through doing this and talking to some folks and reading, I think there is a very compelling use case for it for writing code right now. No matter what level of software developer you are, from zero all the way up through elite software developer, you can get a lot more leverage out of this, like GitHub Copilot. So is that valuable? Like, for sure that's valuable. Yeah, the LLMs are unbelievably good at writing code and helping you write code. I'm a huge believer in that use case. Yeah. And then there's the slightly more speculative stuff, but you can actually sort of see it now, like that gaming demo that I mentioned recently from Nvidia, of, oh, you're talking to a non-playable character that wasn't scripted. We did an ACQ2 episode recently with Cris, founder of the company Runway, whose tools were used in Everything Everywhere All at Once, and he said that's just the tip of the iceberg. Like, the stuff that you can do, that is happening, that's out there today with generative AI in these domains is astounding. Yeah!

I think what you're saying is, one could be a bear based on your own experience: every time you try to use a generative AI application, it doesn't fit into your workflow, you don't find it useful, you're not sticky. But on the other hand, actually, what AI will be is a sum of a whole bunch of niches. There's a video game market, there's a writing market, there's a creative writing market, there's a software developer market, there's a marketing copy market; you know, there's a million of these things,

and you just may happen to not fall into one of the first few niches of it. Yeah. I think for me, at least (again, speaking personally), there was a very strong element of skepticism initially, because the timing was just too perfect. You know, it's like, yeah, all you VCs out there: you just told everybody how crypto is the future and whatever else you were talking about, and then interest rates went to, you know, five percent and your world fell off a cliff. Uh, the number of people who were out raising a fund and are like, the future is AI, yeah, right, this is the best time ever to be investing... And so there was a large part of me that was just like, come on, guys. Yeah, it's too perfect. Right, it's too perfect. But this most recent couple of months, and this quarter for Nvidia, has shown that, put all that aside, this stuff is real: CIOs are adopting the stuff Nvidia is selling with real dollars, and learning also about what it takes to train these models, and the step function of knowledge and utility going from a billion parameters to ten billion parameters to two hundred billion to a trillion parameter models. Uh, yeah, like, something's going on there for sure. So this leads to my next bear case,

which is: the models will get good enough, and then they'll all be trained, and the load will shift to inference, and most of the compute will be on inference, where Nvidia is less differentiated. There are a bunch of reasons I don't believe that; it is a popular narrative, though. One of the big reasons I don't believe it is that the transformer is not the end of the road. In a bunch of the research that we did, David, it's very clear that there are things beyond the transformer that are in the research phase right now, and the experiences are only gonna get more magical and only going to get more efficient. So there's sort of a second bear case there, which is: right now we throw a brute force kitchen sink at training these things, and all of that revenue accrues to Nvidia because they're the ones that make the kitchen sinks. And over time, like, you look at Google's Chinchilla, or Llama 2: they actually use fewer parameters than GPT-4 and have equivalent quality (or, you know, many other people can be the judge of that, but they're high quality models with fewer parameters). So there is this potential bear case around future models being more clever and not requiring as much compute. It's worth saying that even today, the vast majority of AI workloads don't look like LLMs. At least until very recently, LLMs are like the current maximum in human history of jobs to be done that require a ton of compute, and I guess the question is, will that continue? I mean, many other magical recent AI experiences have happened with far less expensive model training, like diffusion models and the entire genre of generative AI on images, which we really haven't talked about a lot on this episode because they're less compute intensive. Many tasks don't require an entire internet of training data and a trillion parameters to pull off. Yep, that makes sense to me.
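A rough worked example of that "clever models need less compute" point, using the widely cited rule of thumb that dense-transformer training compute is approximately 6 × parameters × training tokens. Chinchilla's 70B parameters / 1.4T tokens and Llama 2 70B's 2T tokens are published figures; the trillion-parameter row is a purely hypothetical brute-force model for contrast, since GPT-4's size and token count aren't public.

```python
# FLOPs ~= 6 * N_parameters * N_training_tokens (rule of thumb for
# dense transformers; ignores architecture and efficiency details).
def train_flops(params: float, tokens: float) -> float:
    return 6 * params * tokens

models = {
    "Chinchilla-70B (published)":   (70e9, 1.4e12),
    "Llama-2-70B (published)":      (70e9, 2.0e12),
    "Hypothetical 1T brute force":  (1e12, 2.0e12),
}
for name, (n, d) in models.items():
    print(f"{name:>30}: {train_flops(n, d):.2e} training FLOPs")
```

A model that matches quality with a tenth of the parameters needs correspondingly less compute, both in training and, notably for this bear case, at inference time.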

And I think there's also some merit to the idea that workloads are shifting to inference; that is happening. I agree with you; I don't think training is going anywhere. But until recently (you know, think back to the Google days) training was what everybody was spending money on, that's what everybody was focused on. As usage of this stuff scales, then inference (inference, of course, being the compute that has to happen to get outputs out of the models after they're already trained) becomes a bigger part of the pie. And, as you say, the infrastructure and ecosystems around doing that are less differentiated than training. Yep. OK, those are the bear cases. There's probably also a bear case around China, which is a legitimate one, because that's gonna be a problem for lots of people: a large market that they won't be able to address for the foreseeable future in a meaningful way, and just, what's gonna happen generally? Like, obviously China is racing to develop their own homegrown ecosystems and competitors, and that's gonna be a closed-off market. So what's gonna come out of there?

What's gonna happen? Yep, that's definitely one. Two, my last one is a bear case, but it ends up not being a bear case. For most companies, I would say that if they were trading at this very high multiple, and they just experienced this tremendous real growth in revenue and operating profit, that sort of spike to the system, when it goes away, will inevitably hurt the company. When things slow down, stock compensation is an issue, employee morale is an issue, customer perception is an issue. But this is Nvidia. Yeah, this is nothing new. The number of times that they've risen from the ashes after, you know, years of terrible sentiment, with something mind-blowingly innovative: they're probably the best positioned company, or the company with the best disposition, to handle that when it happens. Oh,

I love it, that's a great turn of phrase there. You're up to training models now? Oh good, you should see the number of parameters on it. Alright,

let's list the bull cases. One: Jensen is right about accelerated computing. The majority of workloads right now are not accelerated; they're bound to CPUs. They could be accelerated, and that shifts from some crazy low number like five or ten percent of workloads being accelerated today to fifty-plus percent in the future, and there's way more compute happening in parallel, and it mostly accrues to Nvidia. Oh,

I have one nuance I wanna add to that. On the surface, I think a lot of people look at that and they're like, yeah, come on. But I think there actually is a lot of merit to that argument in the generative AI world and everything we've talked about on this episode. I don't think Jensen and Nvidia are saying that traditional compute is going away or getting smaller. I think what he's saying is that AI compute will be added on to everything, and the amount of compute required for doing that will dwarf what's happening in general purpose compute. So it's not that people are gonna stop running SharePoint servers, or that whatever products you use are gonna stop using whatever interfaces they use. It's that generative AI will be added to all of those things, and new use cases will pop up, which will also use traditional general purpose CPU-based computing, but the amount of workload going into making those things magical is just going to be so much bigger. Yep. Also, just a general statement on software development: writing parallelizable code is really hard unless you have a framework to do it for you. Even writing code with multiple threads, like, if anybody remembers a CS college class where they hit a race condition, or they needed to write a semaphore: these are the hardest things to debug.
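A minimal sketch of that bug class in Python: the lost-update race that makes unsynchronized threading so hard to debug. The same failure mode exists in any threaded language, which is part of why frameworks that handle the parallelism for you matter so much.

```python
import threading

# Classic race: "counter += 1" is a read-modify-write, so two threads
# can read the same value and one increment gets lost.
counter = 0

def worker(n: int) -> None:
    global counter
    for _ in range(n):
        counter += 1  # unsafe: not atomic

threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Expected 400000; often prints less, and the shortfall is
# nondeterministic, which is what makes these bugs so hard to reproduce.
print(counter)

# The fix is synchronization (e.g. a threading.Lock around the increment),
# or, per the point above, a framework that handles parallelism for you.
```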

And I would argue that a lot of things that could happen in an accelerated way aren't, just because it's harder to develop for. And so if we live in some future where Nvidia has reinvented the notion of a computer, shifting away from the von Neumann architecture into this stream processor architecture that they've developed, and they have the full stack to make it just as easy to write applications and move existing applications (especially once all the hardware's been bought and paid for and sitting in data centers), there's probably a lot of workloads that actually do make sense to accelerate, if it's easy enough to do so. Yeah, that's right. So your point is that there's a lot of latent, accelerate-able, addressable computing out there that just hasn't been accelerated yet. Right, it's like, uh, this workload's not that expensive and I'm not gonna pay an engineer to go rearchitect the system, so it's fine how it is. Yep, I think there's a lot of that. So, bull case one: Jensen is right about accelerated computing. Bull case two: Jensen is right about generative AI. I mean, combined with accelerated computing, this will massively shift spend in the data center to Nvidia hardware. And as we mentioned, OpenAI is rumored to be doing over a billion dollars in recurring revenue on ChatGPT. Or, I think, let's call it three billion, because that's the most credible estimate that I've heard; maybe that was a forecast for next year. But they're not the only one. Google with Bard (which I found tremendously useful, actually, preparing for this episode) is not directly monetizing it, but they're sort of retaining me as a Google Search customer by doing it. There is a lot of real economic value even today, not nearly the amount that's sort of baked into the valuation, but I suppose the bear case of this is that everything has to go right for Nvidia, while the bull case is, the indications are that things are going right for Nvidia. Dope. Third bull case: Nvidia just moves so fast. Whatever the developments are, it's hard to believe that they're not gonna find a way to be really well positioned to capture it. It's just a cultural thing. Four is the point that you brought up earlier: that there's a trillion dollars installed in data centers, 250 billion more being spent every year to refresh and expand capacity, and Nvidia could take a meaningful share of that. I think today, what's their annual revenue, at like 30 billion or something? If you run-rate this current quarter, then it's at like fifty, fifty-plus. Yeah, so right now that puts them at like twenty percent of the current data center spend, and you could imagine that being much higher. Okay,

wait, that includes the gaming revenue. It's about 40, because the data center revenue is 40: it's ten a quarter, so 40 annualized. All right, so 15 percent. Yeah, but you could imagine that creeping up again if the accelerated computing and generative AI belief comes true; like, they'll expand that 250 number and they'll take a greater percent of it. Yep.
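Sanity-checking that run-rate math, using the rough figures quoted in the conversation (roughly $10B a quarter of Nvidia data center revenue against roughly $250B a year of data center spend; both inputs are the hosts' approximations, not precise reported numbers):

```python
# Back-of-the-envelope share of annual data center spend.
dc_rev_per_quarter = 10e9                 # ~Nvidia data center revenue/quarter
dc_rev_run_rate = 4 * dc_rev_per_quarter  # ~$40B annualized
annual_dc_spend = 250e9                   # ~annual data center capex, as cited

share = dc_rev_run_rate / annual_dc_spend
print(f"{share:.0%}")  # ~16%, the "about 15 percent" in the discussion
```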

An interesting way to do a sort of check on this math is to look at what other people in the ecosystem are reporting in their numbers. TSMC, in their last earnings, said that AI hardware currently only represents 6 percent of their revenue, but all indications over there are that they expect AI revenue to grow fifty percent per year for the next five years. Well, so we're trying to come at it from the customer workload side and say, is it useful there? But you can come at it from this other side of, what are Nvidia's suppliers forecasting? And they have to put their money where their mouth is, building these new wafer fabs to be able to facilitate that, and packaging and all the other things that go into chips. So it's expensive for TSMC to be wrong. Yep, that's another bull case. The last one that I have, before leaving you with one final thought... Are you saying you have one more thing? Yes... is that Nvidia isn't Intel. And I think that's the biggest realization that you helped me have. And it's not Cisco. Yeah, the comparison we were making in the last episode was wrong. They are Microsoft: they control the whole software stack, and they simultaneously can have relationships with the developer and customer ecosystems. And, I mean, it may be better than Microsoft, because they make all the hardware too. Yeah, maybe old school IBM, right? Imagine if IBM operated in the computing market of today's magnitude. Computing was a tiny little market back then, right?

I mean, it took the PC wave to disrupt IBM, and the personal computer is, in today's parlance, edge computing, you know, device-based computing. IBM dominated the B2B mainframe cycle of computing, and again, if you believe everything Jensen is saying, and how he's steered the company for the last five years,

we are going back into a centralized, data center, modern version of a mainframe-dominated computing cycle. Yep. I suspect a lot of inference will get done on the edge. You think about the insane amount of compute that's walking around in our pockets that is not fully leveraged right now; there's going to be a lot of machine learning done on phones that then, like, call up to cloud-based models for the hard stuff. No doubt. I don't think training is happening at the edge anytime soon. No, no, I certainly agree with that. All right, well, just like our TSMC episode, I wanted to end and leave you with a thought, David, of what it would take to compete with Nvidia. Because my big takeaway from the TSMC episode was like, wow, that's a lot of things you have to believe, about a government and all of that, and I was like, what's the equivalent for Nvidia? So here's what you would need to do to compete. Let's say you could design GPU chips that are just as good, which arguably AMD, Google, and Amazon are doing. You'd of course then need to build up the chip-to-chip networking capabilities, like NVLink, that very few have. And you'd of course need to build relationships with hardware assemblers like Foxconn to actually build these chips into servers like the DGX. And even if you did all that, you'd need to create server-to-server and rack-to-rack networking capabilities as good as Mellanox (who is the best on the market, with InfiniBand, and who Nvidia now fully owns and controls), which basically nobody has. And even if you did all that, you'd need to go convince all the customers to buy your thing, which means it would need to be either better or cheaper or both, not just equal to Nvidia, and by a wide margin, due to the brand.

You're not going to get fired for buying Nvidia anytime soon. Like, this is the canonical, you've gotta be 10x better than Nvidia on this stuff if you're gonna convince a CIO. Yep.

And even if you got the customer demand, you'd need to contract with TSMC to get the manufacturing capability of their newest cutting edge fabs to do this 2.5D CoWoS lithography and packaging, which there of course isn't any more of, so, you know, good luck getting that. And even if you figured out how to do that, you'd need to build software that is as good as or better than CUDA, and of course that's gonna take ten thousand person-years, which would of course cost you not only billions and billions of dollars but all that actual time. And even if you made all these investments and lined all of this up, you'd of course need to go and convince the developers to actually start using your thing instead of CUDA. Oh, and Nvidia also wouldn't be standing still, so you'd have to do all of this in record time to catch up to them and surpass whatever additional capabilities they've developed since you started this effort. So I think the bottom line here is, it's nearly impossible to compete with them head on. And if anybody's gonna unseat Nvidia in the future of AI and accelerated computing, it's either gonna be from some unknown flank attack that they don't see, or the future will turn out to just not be accelerated computing and AI, which seems very unlikely. Yeah, well,

when you put it that way, I think the conclusion we can come to is that Marc Andreessen was right. In, what year was this we were talking about? It was like 2015 or something. Yeah, like 2015, 2016. They should have put every dollar of every fund that a16z raised into Nvidia at the market price of the stock, every single day. Yeah, 'cause they were seeing all of these startups doing deep learning, machine learning at the time, early AI, and they were all building on Nvidia, and they should have just said, no thank you, we'll just put it all into Nvidia. Marc is right once again. Strength leads to strength. There you go. Listeners, we'll acknowledge that this episode generalized a lot of the details, especially for the technical listeners out there,

but also for the finance folks who are listening. Our goal was to make this more of a lasting, Nvidia part three, big picture episode than sort of a, how did they do last quarter, and what are the implications of that for the next three quarters?

So hopefully this holds up a little bit longer than just some current Nvidia commentary. But thank you so much for going on the journey with us. Yep!

We also, as we alluded to throughout the show, owe a bunch of thank-yous to lots of people who were so kind to help us out, including people who have way better things to do with their time, so we're very, very grateful. I mean, one: Ian Buck from Nvidia, who leads the data center effort and is one of the original team members that invented CUDA way back when. Really grateful to him for speaking with us to prep for this. Absolutely. Also a big shout out to friend and listener of the show Jeremy from ABC Data, who prepared four PDFs for us, completely unprompted, like an insane writeup for us about a lot of the technical detail behind this. A private blog post. Yeah, a private blog post. So, our Acquired community is just the best. You guys continue to blow us away, so thank you. Julien,

the CTO of Hugging Face; Oren Etzioni from AI2; Luis from OctoML; and of course our friends at NZS Capital: thank you all for helping us research this. And, dude, all right, carve-outs?

Let's shift gears. Carve-outs?

What have you got? My wife and I have been on an Alias binge. Oh,

Wow!

Yeah!

Jennifer Garner. Yes. I never saw it when it came out. It is, like, the perfect early-2000s junk food for when you have one more hour at the end of the day and you're splayed on the couch.

Man, I never have one more hour at the end of the day; I have a two year old. But I really appreciate it. For, uh, sixteen years from now when she goes to college, I'll keep that on my list. Oh,

you play games! Wow!

That's true!

But that's research! I'm just checking out the latest graphics technology. So, my review of Alias is: it's a little bit campy, and they repeat themselves pretty often. I mean, it's weird to observe how much TV has changed between now and then, because they make very similar shows today, but they're just much more subtle, they're much darker, they leave much more sort of to the imagination. And in the early 2000s, everything was just so, like, explicit and on the nose and restated three times. I'm just glad the show doesn't have a laugh track. But it's well worth the watch. Sometimes you have to imagine it has a different soundtrack, 'cause every episode has, like, a Matrix-type song: doo doo-doo, doo doo-doo. Yes!

That's right, this is like the TV version of the Matrix. Right. Yes, but it's great. We're having a lot of fun watching it. My carve-out is related, for my stage of life, and also something I missed and discovered recently: we just watched our first full movie, a full Disney movie, with our daughter. Wow, what a major milestone! And she freakin' loved it. I think we picked a great one: Moana, which neither Jenny nor I had seen before. And in reading just a little bit about it afterwards... You know how, super sadly, Pixar kind of fell off in recent years? Like, such a bummer. I mean, they're still Pixar, but, like, they're not Pixar. It's not the, uh, guaranteed hit every time that it used to be. Yep. So Moana came out in this kind of generation, with Tangled and some of the other stuff out of actual Disney Animation after the Pixar acquisition, that are just these return-to-form, Eisner-era Disney animated films that fire on all cylinders. And we loved it. We watched it with our brother and sister-in-law, who don't have kids and are thirty-somethings living in San Francisco: they loved it. Our daughter loved it. Highly recommend Moana, no matter what life phase you're in. Alright, great, I'll add it to my list. It's got The Rock; how can you complain? There you go. Well, listeners, a huge thank you to our good friends at Blinkist and Go1,

at Statsig, and at Crusoe. All the links to all of those phenomenal products and offerings are in the show notes. If you want to be notified every time we drop a new episode, and you want to make sure you don't miss it, and you want little hints to play a guessing game at our next episode, or you want follow-ups from our previous episodes in case we learn things from listeners... Hey,

here's a little piece of information that we wanted to pass along. We will exclusively be dropping those in the email: acquired.fm/email. It was so fun (I think you're about to talk about our Slack) watching people in the Slack talk about the hints for this episode. We wrote the little teaser, and I was like, oh, everybody's gonna know exactly what this is. No one got it! I was shocked. Yeah, eventually somebody did, but it took a couple of days. Yeah.

Uh, we have a hat. You should buy it. And this is not a thing that we make a lot of margin on; we just are excited about more people sporting the ACQ around. So participate in the movement, show it to your friends. Not our superpod, but, you know... yeah, the pod is the superpod. Uh, if you come on Acquired LP, you can, uh, come closer to the kitchen and help us pick an episode once a season, and we'll do a Zoom call every other month or so: acquired.fm/lp. Check out ACQ2 for more Acquired content in any podcast player, and come talk about this episode with us in the Slack: acquired.fm/slack. Listeners, we'll see you next time. We'll see you next time. Who got the truth? Is it you, is it you, is it you? Who got the truth now?