The Thinking Machine

by Stephen Witt · Finished October 21, 2025

Childhood

Famously, in a Stanford commencement speech, Jensen wished the graduates great pain. And for a while, everyone laughed and made memes. Some may have even seen through the memes to the larger point he was making: that resilience is the most important element in the formula of success. But almost nobody realized just how much pain he was wishing on the Stanford graduates.

“The way you described Chinese people back then was ‘Chinks,’” Huang told me, fifty years later in a sterile conference room during our first conversation. His face showed no emotion. “We were called that every day.” The bullies targeted Huang in and out of class, at every opportunity. They shoved him in the hallways and chased him on the playground. The bridge was their favorite location. Huang had to cross it alone, a hazardous proposition in the best of conditions. Sometimes, when Huang was in the middle, the bullies would emerge from hiding on either side of the river, then grab the ropes and begin to swing, attempting to dislodge him into the river below. “Somehow it never seemed to affect him,” Bays said. “Actually, it looked like he was having fun.”

Almost every immigrant to America has the same story: they go through adversity. They have trouble fitting in, they face racism or sexism or whatever -ism of the day, and simultaneously they are too Americanized to fit in with their native groups. They are homeless. And yet, Jensen breezes past this era of his childhood. Resilient people do not focus on the negatives. Everyone has their cross to bear, so what?

The uncle decided the boys belonged at a boarding school and searched for an institution willing to house two unsupervised Taiwanese children, ten and twelve years old, living thousands of miles from their parents. He selected the Oneida Baptist Institute in Kentucky, perhaps mistaking it for a prestigious college-preparatory school. In fact, OBI was a juvenile-reform academy located in a town of three hundred people. [sic] OBI was mostly known as a last-chance institution.

I bet most people didn’t know that he basically went to juvie! At ten years old (his older brother was twelve), he went to hang out with effectively career criminals!

“Every student smoked, and I think I was the only boy at the school without a pocket knife,” Huang said. Jensen, ten, was placed with a seventeen-year-old roommate; on their first night together, the older boy lifted his shirt to show Jensen the numerous places where he’d been stabbed in a recent fight. Huang’s roommate was illiterate; in exchange for teaching him to read, Huang said, “he taught me how to bench press. I ended up doing one hundred push-ups every night before bed.” Huang would stick to a daily push-up routine the rest of his life.

In case you were wondering just how bad it was, the answer was really bad. I didn’t include every passage in the book, but he is relentlessly bullied. Despite it all, he is resilient and has high agency. He figures out a rather clever solution. Even better, he soon earns a reputation for being someone who fights until the end. You could beat him up, but you wouldn’t do so without getting a few cuts and bruises yourself. Soon the bullies learned to choose easier targets.

“Back then, there wasn’t a counselor to talk to,” Huang said. “Back then, you just had to toughen up and move on.”

In all honesty this is for the best. I do not believe in modern counseling and modern therapy. It is nothing more than a glorified venting session. Therapy would merely rob him of his resilience and encourage him to focus on the wrong things in life.

When I asked Huang about the bamboo ceiling, he seemed dismissive—I got the sense that identity politics were not his thing. “I’m the only Chinese CEO of the time,” he said, “but it never occurred to me. And it doesn’t occur to me today.”

Exactly. Stop focusing on this bullshit. Stop looking for new ways in which the world is whatever new -ism you people keep coming up with. It does not help you overcome. You don’t change the world by naming and shaming everyone. You change the world by becoming great and being an obvious disproof of whatever -ism targets you.

“I find that I think best when I’m under adversity. When the world is just falling apart, I actually think my heart rate goes down,” he later said. “Maybe it’s Denny’s. As a waiter, you’ve got to deal with rush hour. Anyone who’s dealt with rush hour in a restaurant knows what I’m talking about.”

Elon has a very similar quote - “I’m built for war.” Resilient people usually are.

Early Career

His name is Jensen Huang, and his thirty-two-year tenure is the longest of any technology CEO in the S&P 500.

I didn’t realize he was the longest-tenured! It honestly makes me longer on NVIDIA - the compounding effect of having a founder in charge for so long is such a competitive advantage. Only a matter of time until Zuck takes that crown though.

This approach, known as “parallel computing,” was a radical gamble. “The success rate of parallel computing was zero percent before we came along,” Huang said, rattling off a list of forgotten start-ups. “Literally zero. Everyone who tried to make it into a business had failed.” Huang ignored this dismal record, pursuing his unconventional vision in open defiance of Wall Street for more than a decade.

People forget just how contrarian NVIDIA was when it started. Over and over it had to prove the doubters wrong.

When I asked Huang why he, the middle child, was alone motivated to perform well in school, he shrugged. “I don’t have an answer for you,” he said. “I try not to analyze myself in that way.”

This is very similar to Steve Jobs - many of the greats are not introspective at all. Somehow, they’re in tune with themselves enough to know which industries or ventures are in their nature. But not so in tune as to practice this modern, masturbatory introspection that every therapist seems to require of us.

“There was no notion of weekends,” Horstmann said. “We’d come in at seven a.m.; then our girlfriends would call us at nine p.m. asking us when we were going to come home.”

As you would expect, Jensen is extremely hard working.

He also began taking night classes at Stanford in pursuit of a master’s degree in electrical engineering, but he was so busy at work that it took him eight years to finish. [sic] By the time he was finally awarded the degree, in 1992, much of what he’d learned in his introductory sequence was obsolete.

This was just so funny. He’s obviously very hard working, but he has no notion of how much to push, and often signs himself up for far more than even he can reasonably achieve.

If a product was going to be late or if LSI couldn’t deliver on some promised function, Huang would immediately provide a detailed description of what had gone wrong, who was responsible, and what he was doing to fix it. “When he said he was going to do something, there was a reasonable likelihood that he would actually do it, y’know?”

There is only one rule to rising up in business: Do what you said you were going to do, when you said you were going to do it. That’s it. It’s amazing how simple this is but the vast majority of people cannot do this! I include myself among them. I constantly run late on things that I promised to do, and I’m still struggling to find a system that keeps me accountable.

Malachowsky, watching from outside, believed that Corrigan was grooming Huang to one day replace him as CEO. “They let this young, twentysomething-year-old kid start a whole division!”

Can you blame them? Like I said, this characteristic is so incredibly rare in people.

Near Death 1

The product was known as a “graphics accelerator,” and at least thirty-five competitors were trying to build one. Huang worried there was no space for a thirty-sixth. The leading expert in computer graphics was Jon Peddie, who had written several textbooks on the topic. Huang had reached out to Peddie to get a sense of the market, and the two soon became friends, with Huang calling incessantly, asking questions late into the night. Peddie advised Huang that the space was too crowded and that many of the best engineers were already working for other start-ups. “I told him not to do it,” Peddie said. “That was the best advice he never took.”

Imagine being so contrarian that the guy who wrote the book on graphics acceleration told you this was a terrible idea, but you did it anyway. Insane level of conviction.

Huang’s magic number was $50 million, which he had determined was the minimum revenue his start-up would have to produce each year to make it worth the effort.

In that Stanford speech, he famously said that he had low expectations - I knew that was nonsense. Ambitious people always have insane expectations of themselves.

The start-up didn’t have a name, so for a placeholder, Gaither wrote “NV”: new venture. This was a striking coincidence, as Priem and Malachowsky were already calling their prototype graphics chip the NV1, joking that it would, like the Sun GX, make competitors “green with envy.” Priem drew up a list of words that riffed on the “NV” concept, using dictionaries from a variety of different languages, including Latin. From this list, the three settled on “Nvision”—until a records search revealed that this name had already been taken by an environmentally friendly manufacturer of recycled toilet paper. The next selection from the list was “Nvidia,” from the Latin word invidia, for “envy.”

This part really impressed me. I’m sure in some ways it was just some meme that young engineers made, but they really did call their shot! They said we’re going to be so big that we’re going to make all of you green with envy, and then they went out and did just that!

After Huang hung up, he returned, exasperated. “Can you believe it?” he said. “My mom just told me to quit Nvidia and go back to work for a large company.”

Glad to know that even the great Jensen cannot escape the almighty Asian Tiger Mom.

The night before his presentation to Sequoia, Huang struggled to come up with a business plan for his company. “I spent all night on it, but at the end I didn’t have anything,” he said. “I still don’t.”
[sic]
The pitch went badly, with Huang fumbling over his presentation and Priem interrupting with irrelevant technical asides. After this uninspiring performance, Valentine took Huang aside. “Well, that wasn’t very good,” he said. “But Wilf Corrigan says I have to fund you, so you’re in business.”

This quotation doesn’t even come close to showing how bad it was. Jensen had never used a Mac before, so he literally went out and bought a Mac the previous day so that he could use ‘Persuasion’ (a slide-deck program of the time, long before Keynote existed)!¹

But remember - the slope of the CEO is everything, not the y-intercept. And as we’ll see later, nobody has the slope of Jensen.

“The average CEO will try to listen to the customer, but in computing, that’s a big mistake, because customers just don’t know what’s possible. They just don’t know what can be done!” Coxe observed that Intel and Microsoft had later struggled under more conventional management: “Jensen, from the beginning, was a world-class engineer who could see what was possible.”

I find the same thing to be happening with AI agents. Today we go around asking users who have no idea what’s possible “what would you like to automate?” And they just stare at us blankly. They can’t think in terms of something they have no experience with. You have to do that thinking for them, present an opinionated answer and, most importantly, be right.

Near Death 2

Nvidia had built its entire supply chain around future iterations of this same device. Huang’s master plan called not just for the rollout of Sega’s NV2 but also for the NV3, which was based on the same architecture.
[sic]
“We missed everything,” Huang said, of those early days. “Every single decision we made was wrong.”

This was referring to the architecture that Priem wrote for the NV1 - he was so confident in it that he declared it would last for 100 years. It lasted around 100 days, and immediately crashed and burned.

Kirk had invented the quadratic-mapping technique used in the NV1, but when he arrived at Nvidia, he advised the company to abandon it. “It was just an idea I had,” Kirk said. “I have lots of ideas.” But this only made Priem promote quadratic mapping more aggressively. Priem was a purist who dismissed technical compromises as spineless concessions to the money guys.

The architecture was so bad that the guy who invented it said that NVIDIA should abandon it. Yikes.

Sega had agreed to pay Nvidia $1 million upon receipt of working prototypes of the NV2. In the middle of 1996, Huang delivered these prototypes, the only chips of their kind ever made. With great deference, Huang then informed Sega that Nvidia would not help build the Dreamcast because the company was surrendering to Microsoft, but given that the delivery of the prototypes technically fulfilled the terms of the contract, he was hoping Sega would pay him anyway, or Nvidia would go bankrupt. “They took it pretty well, considering,” he said.

One of the things everyone remarks on is Jensen’s charisma. Imagine promising your most important customer not one but three generations of chips, only to find out all three have the same fatal flaw. Then imagine not fixing them, demanding the payment anyway… and the customer actually paying you!

This was the last of Nvidia’s money. Already the triage had begun, with the company’s bills organized in the order in which payment could be delayed. First, vendors would get stiffed, then utilities, then finally employees. Whatever else happened, Huang was determined to make payroll until the day the lights were turned off.

And obviously, as a result of this architecture fiasco, the company is almost dead. So how can they fix this problem? It takes years to tape out new chips - they’re almost out of money and definitely out of time.

Emulation was a wild gamble. If the transistors on the forthcoming NV3 chip were arranged in error, the busted real-world production run would ruin his company. But Huang was opening himself up to new frontiers of risk.

Time to emulate how the chip would work. No other way. Emulate thoroughly enough, go straight to tape-out and, well, mostly pray.

By the time the boards arrived in stores, in August 1997, Nvidia was running on fumes. “Vapors,” said Huang. “We had nothing left.” Nvidia, having shipped no previews to the gaming press, now had to beg for media coverage.
Nvidia sold a million Riva cards in the first four months.
[sic]
Following the Riva’s launch, Huang invested in emulators and gave up on physical prototypes. “To this day, we are the largest user of emulators in the world,” Huang said.

And it worked! They were literally 30 days away from going out of business (a mantra they repeat to this day to inspire people to act with urgency).

Cofounder Breakup

With the NV1, he had been given a blank sheet of paper to design exactly what he wanted—but when he saw his remaindered product piling up on retail shelves, his ego never really recovered.
[sic]
Shortly thereafter, Priem, in a childish attempt to retain influence, had locked a number of employees out of the production database, preventing them from submitting their work.

Incredibly immature. And the gall after making such a wrong technical bet that the company nearly died!

At the advice of the board, Nvidia hired a mediator. “The mediator had previously worked with John Sculley and Steve Jobs at Apple, which ended up with Jobs getting fired,” Priem told me. “She said Jensen and I were significantly worse.” Nvidia management had by this point developed a saying: “Never let Curtis talk to investors, and never let Curtis talk to customers.” As Priem later admitted, “Both of those things were true.”

Imagine how bad the relationship must have been for the mediator to say that Jobs and Sculley got along better!

Despite the humiliation, Priem retained his Nvidia shares. Both Huang and Malachowsky still considered him a friend, and when Priem got married in 1999, a year after his second demotion in three years, Malachowsky served as his best man.

You have to imagine that some part of Priem, despite everything, can recognize how special Jensen is.

Near Death 3

In early 1998, TSMC misapplied a chemical process at the end of the manufacturing process, introducing short circuits onto many of the chips. The mistake nearly ruined Nvidia, which had invested most of its working capital in the production run. More than half the chips needed to be discarded—Nvidia managed to save itself only by selling equity to some of its circuit-board partners. “We were close to bankruptcy that time, too,” Diercks said. “It’s not just a saying.”

Just another day in hardware - death is always around the corner. To get out of this one, they realize they need a much better go-to-market strategy and brand recognition, so they do one of the smartest things I’ve ever seen: narrow down the use case and just focus on games.

The delighted Carmack embraced his Stratocaster—he called it “the perfect card.” He tailored Quake III: Arena specifically for the twin-pipeline architecture and advised his legion of admirers that Quake games were best played on Nvidia hardware.

They work with Carmack, who was (as we all know!) the creator of Doom, then the most popular game. They build a special chip called TNT that would take advantage of all the features Carmack would want to program. And at its heart is the parallel computing engine that would unlock more compute than an Intel chip could ever dream of.

Nvidia never directly advertised the TNT’s parallel capabilities to customers—why confuse them?

Another piece of marketing brilliance - keep the value prop simple for the end customer. This chip makes your gameplay buttery smooth. That’s all you need to know. We’ll worry about the technical bits.

Neural Nets

This section is going to be an aside on the history of neural nets. Obviously, this is vital in understanding NVIDIA’s rise and their multi-trillion dollar valuation today. But if you’re solely interested in Jensen as a person, you should feel free to skip this section.

Tesauro worked on this niche project almost entirely by himself; as with neural nets, few AI researchers took backgammon seriously. He first trained his neural nets to mimic the best human players, but this approach produced little of value. Around 1990, Tesauro decided to try a different approach. He stripped all strategic advice about the game of backgammon out of the neural net, leaving only the rules and an initial set of randomly weighted neurons. Then he had the computer play hundreds of thousands of games against itself. The technique was known as “reinforcement learning,” and Tesauro was the first person ever to get it to work.

I didn’t realize this, but at the time neural nets were widely discredited as a legitimate method of achieving AI. Even more astonishing, this happened just months after Deep Blue beat Garry Kasparov! But since chess was so much more popular than backgammon, almost nobody paid any attention to it. (We would later find out that attention was all we needed.²)
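To make the self-play idea from the Tesauro passage concrete, here is a minimal sketch of my own. It is not Tesauro’s actual method: it swaps backgammon for a toy stone-taking game (take 1 to 3 stones, whoever takes the last stone wins) and swaps the neural net for a plain lookup table. The function name `train_self_play` and every parameter are made up for illustration. The only part it shares with the passage is the recipe: start from random values, give the program nothing but the rules, and let it play itself.

```python
import random


def train_self_play(max_pile=12, episodes=5000, alpha=0.2, epsilon=0.2, seed=0):
    """Learn a toy stone-taking game purely from self-play (illustrative only)."""
    rng = random.Random(seed)
    # value[n]: estimated chance that the player to move wins with n stones left.
    # Start from random guesses, with no strategic advice baked in.
    value = [rng.random() for _ in range(max_pile + 1)]
    value[0] = 0.0  # no stones left: the player to move has already lost

    for _ in range(episodes):
        pile = rng.randint(1, max_pile)
        while pile > 0:
            moves = list(range(1, min(3, pile) + 1))
            # The best move leaves the opponent in the worst position we know of.
            best = min(moves, key=lambda m: value[pile - m])
            # Nudge our estimate toward "1 minus the opponent's chances".
            target = 1.0 - value[pile - best]
            value[pile] += alpha * (target - value[pile])
            # Both "players" share the same table; explore occasionally.
            move = rng.choice(moves) if rng.random() < epsilon else best
            pile -= move
    return value


if __name__ == "__main__":
    learned = train_self_play()
    for stones, v in enumerate(learned):
        print(stones, round(v, 2))
```

With these settings the table should end up rating piles that are multiples of 4 as bad for the player to move and everything else as good, which matches the known solution of this toy game. Nobody told it that; it fell out of playing against itself.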

Deep Blue was an expensive supercomputer whose brute-force approach was not replicable by humans, and thus it did not fundamentally change the expert approach to chess. (In fact, Deep Blue was dismantled after its 1997 victory.) Jellyfish, by contrast, was affordable software that could run on any Windows machine, and it revolutionized the game.
[sic]
Jellyfish was the first neural net to surpass humans at any game.

That level of intelligence at that price point would have been insane for that era.

But IBM failed to commercialize Tesauro’s project—why would a vendor of business servers sell commercial backgammon software to a few hundred customers? Why, indeed.

It’s easy to hate on IBM for this. After all, this isn’t even the innovator’s dilemma - this just reads like unadulterated foolishness. You had a machine that could arbitrarily solve games and you couldn’t imagine an application for this beyond backgammon? But it’s important to remember that, at the time, every single neural net plateaued once it ran out of data. This is a property that’s still true today! And it’s not always obvious how to overcome that plateau once you reach it.

“I dismissed it,” said the man who sold the public its first neural net. “I dismissed it because I just didn’t have the data.” He saw no solution, and he tried everything. He just could not imagine what might make neural nets succeed.

This was Tesauro after he was unable to adapt his neural net to other games like poker. Yes, IBM arguably was too short-sighted. But given the technical limitations of the day it’s actually quite hard to argue that they were wrong. After all, the GPU hadn’t been invented yet.

IPO

In early 1999, fewer than six years after its founding, Nvidia went public with a $600 million valuation.
Huang was now a centimillionaire, but his newfound wealth did not distract from his objective of crushing and absorbing the competition until only his firm remained. Dwight Diercks recalled no parties, no champagne, no sense of relief, not even congratulations from the boss.
[sic]
Diercks shook his head in amazement. “He wrote that the day after the IPO,” he said.

Nothing could demonstrate Jensen’s focus better. He is an absolute execution machine. The IPO is just another day.

GPUs

Vivoli was a clever guy who viewed a limited budget as an opportunity. He had noticed that in making purchasing decisions, gamers relied on a half-dozen independent hardware reviewers. Vivoli reached out to the reviewers, informing them that the GeForce was the world’s first “graphics-processing unit,” or “GPU.” Vivoli’s team had, in fact, made this term up, but the reviewers began grouping products in the category. Soon, graphics accelerators were universally known as GPUs. “We invented the category so we could be the leader in it,” Vivoli said.

A brilliant go-to-market strategy yet again. I think people don’t appreciate just how good at GTM NVIDIA is.

No Mercy

At one point, one of 3dfx’s founders publicly speculated about declaring a truce between the two companies so that technical standards could be established before the next generation of products shipped. “That’s when I knew we had him,” Kirk said. “We were in a death struggle with 3dfx, and one of us had to die.”

This describes NVIDIA as a whole, obviously, but Jensen in particular has absolutely no mercy. When you battle against him, there is no truce, no ceasefire, and no surrender. They will go until you are wiped out.

In August 2000, during a disastrous earnings call, 3dfx CEO Alex Leupp told investors that 3dfx was on pace to lose more than $100 million in a single quarter. An hour later, Nvidia announced it was countersuing 3dfx, making rather questionable patent-infringement claims of its own. Many found the timing of the lawsuit cruel; some speculated that Huang had deliberately filed a nuisance lawsuit that he knew he would not win, just to run up cash-poor 3dfx’s legal bills. A month later, the judge in the case issued a preliminary ruling in 3dfx’s favor while rejecting Nvidia’s countersuit completely. 3dfx scrambled to collect, but Nvidia, through shrewd legal maneuvering, was able to stall the payout.

This is the most cold-blooded move I’ve ever seen. Thanks to this, 3dfx was forced to declare bankruptcy - and the only willing buyer was NVIDIA. Jensen swallowed up nearly all the good talent and strengthened NVIDIA as a result.

Nvidia had declined to purchase the entirety of 3dfx but had offered to buy specific assets for $70 million. Leupp had accepted, and the lawsuit was dropped. Internal documents later revealed that Huang valued 3dfx’s best engineers at $1 million per head. This estimate reflected both their worth to Nvidia—and the value of keeping them away from Nvidia’s rivals.

Reminds me of Zuck buying out AI researchers. It was so bad that at 3dfx Jensen was known as Darth Vader!

Death Marches

“At 3dfx, the motto was ‘Work hard, play hard,’ ” one former employee said. “At Nvidia, it was more just ‘Work hard.’ ” Long hours were the default, and the six-month release cycle created relentless pressure. “The end result was almost nonstop deadlines and a perpetual sense of being behind schedule,” another employee recalled.

Believe it or not, if you want to achieve outsized outcomes, this is the ideal work environment. The book Overdrive (covering Bill Gates) has a fantastic passage that describes working at Microsoft in its early days as a series of “death marches”. They rolled from one major release to the next without many breaks. It may sound insane, but this is extremely common for high-performing companies.

Tawni, Netflix’s former CHRO, said they tried to create an insane sense of urgency on a daily basis. This is the way great companies operate. This is the price of greatness.

Little had been traveling for weeks, away from his family, working late nights at the circuit simulator; feeling he had nothing more to give, he responded to the email by submitting his resignation. A few nights later, at around two in the morning, as Little was finishing one of his last shifts, Huang arrived and sat down at the simulator beside him. The glow from the monitor illuminating his exhausted face, Huang recalled his own career, the sacrifices he’d made, the many late nights he’d spent away from his family, often working the circuit simulator himself. He expressed, frankly, that he wasn’t sure it was all worth it. He offered Little his job back if he wanted it; when Little declined, Huang thanked him for his service to the company and left. “That was absolutely the high point of my employment there,” Little said.

Again, you can see the charisma of Jensen.

Missionaries

“Dennard scaling” had governed the miniaturization of electronics. Dennard scaling dictated that transistors would continue to efficiently process electricity as they got smaller—basically, it was the reason computers got faster every year. But Nickolls had calculated that sometime around 2005, the Dennard scaling relationship would collapse.

He was absolutely correct, by the way.³
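For anyone who wants the arithmetic behind Dennard scaling, here is a rough back-of-the-envelope sketch. It is my own simplification, not something from the book: it uses the textbook dynamic-power relation (power is proportional to capacitance × voltage² × frequency) and assumes that shrinking a transistor by a factor `kappa` cuts capacitance by `kappa`, raises frequency by `kappa`, and shrinks area by `kappa` squared. The function name and the `voltage_scales` switch are hypothetical, and leakage current is ignored entirely.

```python
def power_density_ratio(kappa, voltage_scales=True):
    """Relative power per unit area after shrinking transistors by `kappa`.

    A value of 1.0 means the chip runs as cool as before; anything above
    1.0 means more heat in the same area. Idealized model, for illustration.
    """
    capacitance = 1.0 / kappa
    # Classic Dennard scaling: supply voltage shrinks along with the transistor.
    # The breakdown case: voltage can no longer drop, so it stays at 1.0.
    voltage = 1.0 / kappa if voltage_scales else 1.0
    frequency = kappa
    power_per_transistor = capacitance * voltage ** 2 * frequency
    area_per_transistor = 1.0 / kappa ** 2
    return power_per_transistor / area_per_transistor


if __name__ == "__main__":
    print(power_density_ratio(2.0))                        # voltage still scaling
    print(power_density_ratio(2.0, voltage_scales=False))  # voltage stuck
```

In this toy model, as long as voltage keeps shrinking you get faster transistors at the same heat per square millimeter. Once voltage stops dropping, every shrink at full clock speed makes the chip run hotter, which is roughly the wall Nickolls saw coming.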

Nickolls could see that the industry was in denial about this problem—especially Intel, which was confidently predicting linear gains from shrinking transistors down to components a single atom in width. Nickolls believed this was impossible, and in early 2003 he sent an unsolicited letter to Huang outlining his heretical thoughts.

One of the beautiful things about having an insanely hard mission is that you attract insanely hardcore missionaries. NVIDIA literally would not be alive without Nickolls.

Even at Nvidia, Nickolls was considered intense. Two weeks after his first day of work, he was diagnosed with malignant melanoma. He continued working seventy-two hours a week while receiving cancer treatment, concealing from both family and colleagues the discomfort he was experiencing. Soon, Nickolls’s melanoma was in remission, and the earliest versions of the CUDA platform were live.

Wow. Say what you will, this is a man who believes with his entire being that this is the future of computing. These are the types of missionaries you want to attract.

Nickolls was obsessed with getting the CUDA platform to work. Friends sometimes asked him why he was working for a video game company when he didn’t play video games. Nickolls informed them he wasn’t working on video games; he was working on one of the most important technologies of all time. He was building a platform so fast it would make every other computer look like a calculator watch. “Few inventions will have the impact on the world that CUDA will ultimately have,” he would say. This was more of a statement of faith than anything else. By the late 2000s, computers were fast enough for most consumer purposes, and there were not obvious customers for what Nickolls was building.

Last one before I move on - I really can’t overstate how important Nickolls is for us as a species. His sacrifice gave us the AI revolution (spoiler alert, he does pass away from cancer).

Even when Nickolls was dying of cancer, he never stopped working. “I think some of his best, most productive years at Nvidia were during those times,” his son, Alec, said. Nvidia funded a scholarship at the University of Illinois at Urbana-Champaign in his honor. To his final breath, Nickolls insisted that CUDA would change the world, but he witnessed only a glimpse of what CUDA would become.

CUDA

Huang had long sought a way to differentiate Nvidia from its competitors. Hardware innovations wouldn’t get him there; they were too easily cloned. Online, silicon fetishists swapped “die shots” of Nvidia’s microchips obtained by ripping the chip out of a retail board, dissolving the case in boiling sulfuric acid, then scanning the circuitry with a metallurgical microscope. The enthusiasm of the hobbyists paralleled professional espionage efforts by reverse-engineering teams at chipmaker laboratories. The denuded silicon was technically patented, but the 3dfx experience had shown the futility of lawsuits. “Everyone takes a look at their competitors’ hardware and how it works,” Diercks said. “It’s not even black ops. We just do it.” To distinguish himself, Huang had to pursue a strategy that so defied conventional business logic that ATI wouldn’t follow. He had to build an exploratory product, like a $300 entry-level scientific supercomputer that not only didn’t have competitors but also didn’t even have obvious customers. The zero-billion-dollar market, by definition, was one that only he would participate in—one that only he would even see. Huang was going to build a baseball diamond in a cornfield and wait for the players to arrive.

It’s amazing to me that even a chip-maker would view themselves as just a commodity. The whole time, Jensen is facing pricing pressure. He’s on this grand quest throughout the book to find that source of differentiation so that he can finally demand premium prices. It’s why they invented the GPU market.

Brett Coon, one of the first CUDA engineers, recalled. “In my opinion, the ‘genius’ of CUDA is getting gamers to pay for the massive chip development costs.”

Of course, it’s pitched to gamers (who were for years NVIDIA’s largest source of revenue) as this next-gen graphics advancement, but in reality they’re trying to become a vertical player who can control the entire stack. It’s the only way they can survive the onslaught.

Huang encouraged Nickolls to embrace the scientific customers—to embrace them very tightly and not let go. The performance gains from CUDA had to be so great, and so obvious, that customers would voluntarily build whole new academic disciplines around the platform. “After that, you will never want to leave,” Aarts said. “It’s vendor lock. There is no out.”

This is the embedding strategy I outlined.

Note that even internally at NVIDIA, this was a very controversial decision. Most people (rightfully) decried building a brand new technology that didn’t even have a customer yet. A solution in search of a problem is exactly what they teach you not to do in business schools, and yet that’s precisely what Jensen was labeling as a “0 billion dollar market”. In the end, it works - and maybe that’s all that matters.

“They’d rather wait for the hardware than switch away from CUDA,” he said. It was all this code that made Nvidia hard to compete against. Upstarts might design a new chip, but that wasn’t enough—Dwight Diercks, Nvidia’s head of software engineering, had ten thousand programmers working for him. “We’re really a software company; that’s the thing people don’t understand,” Diercks said.

And externally, it was obviously viewed as outright insane.

Even other semiconductor executives, no strangers to risk, thought CUDA unwise. It was the bet that made Jensen Jensen; it was the gamble that set him apart.

He tried everything - even making chips for mobile phones. His own leadership team had to step in and desperately get him to abandon his mobile initiative. NVIDIA’s stock price remained flat for a decade. He was betting everything on CUDA, and for the better part of a decade, no customer existed that needed it!

Huang did not have a concrete vision of what the future of technology would look like. Some technologists did; for example, Elon Musk began with a vision of himself standing on the surface of Mars, then worked backward to build the technology he would need to get himself there. Huang went in the opposite direction; he started with the capabilities of the circuits sitting in front of him, then projected forward as far as logic would allow. Only there, at the frontier of reason, would he allow himself to take a single step forward into the nebulous realm of vibes.

This guy vibe-betted an entire company on this vision.

Obviously, all of that changes in 2012 with a little-known lab out of the University of Toronto, headed by one Geoffrey Hinton, who would soon be known as the godfather of AI.

AI

When Hinton’s colleague designed a neural net that outperformed state-of-the-art software for recognizing pedestrians, he couldn’t even get his paper admitted to a conference. “The reaction was well, that doesn’t count, because it doesn’t explain how the computation is done—it’s just not telling us anything,” Hinton said.
The AI community of the time didn’t want to mimic intelligence—they wanted to solve it.

Even as of 2010, neural nets were the black sheep of the AI family tree. It was just common wisdom that they would not work.

He then sent an email to Nvidia: “I just told 1,000 machine learning experts at this conference that they should all go buy Nvidia cards. Would you give me one for free?” Nvidia declined. [sic] Hinton sometimes couldn’t even get the CUDA group to return his emails. The bias against neural nets was long established…

Yikes!

Krizhevsky decided he would introduce SuperVision to the world by winning ImageNet’s 2012 competition. In the weeks leading up to the event, Sutskever and Hinton began to pace the Toronto laboratory in giddy anticipation. “We knew we were going to win it,” Hinton said. They were the first to experience what would soon become a common phenomenon: the uncontainable thrill of sneak previewing embargoed AI technology that would shock the world once unveiled.

Of course, all of that changes once Ilya Sutskever and Alex Krizhevsky train their neural net using GPUs.

When Fei-Fei Li first saw the SuperVision results, she wondered if they were in error. Li’s ImageNet contest had been her attempt to prove the value of her efforts to her advisers, but after attracting thirty-five entrants in 2010, participation had declined to just fifteen entrants in 2011. In 2012, there were only seven, and it was not clear the contest would survive another year. Now, one of those seven entrants was demonstrating a success rate above 80 percent—10 percent better than the state of the art in a field where improvement was typically measured in fractions of a percentage point. Stranger still, the winner was a neural network, a technology that Li considered to be a museum artifact. “It was like being told the land speed record had been broken by a margin of a hundred miles per hour in a Honda Civic,” Li recalled in her autobiography.

Incredible.

The accompanying academic paper for the SuperVision network, credited to Krizhevsky, Sutskever, and Hinton, has to date been cited more than 150,000 times, making it one of the single-most-important findings in the history of computer science.

And thus the AI revolution began. Google, Microsoft, and Baidu immediately engage in a bidding war for Hinton, Sutskever, and Krizhevsky. Eventually the three researchers agree to go to Google Brain for an astronomical sum, and the race begins in earnest.

If more data was going to lead to better results, then the underlying structure for processing information should be as simple as possible. His inspiration was biology—medical scans suggested that of the estimated hundred billion neurons in the human brain, fewer than 1 percent were dedicated to language processing.

Naval Ravikant has a wonderful maxim - complex structures don’t scale. In nature, scale is only achieved by repeating the simplest mechanisms and taking it to the extreme.

Previous neural-net architectures had tried to build sentences or even paragraphs. The transformer worked by predicting exactly one word at a time, based on the probabilistic relationships. Just one word—that was the furthest it ever looked ahead.

People don’t realize that the Transformer was a dramatic simplification of previous neural nets, and that’s precisely what allows it to scale. Stop trying to predict the next paragraph, sentence, or even word. Just one token at a time. That’s it.
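To make "one token at a time" concrete, here is a minimal sketch of the generation loop. It is not a transformer: the toy model below is a hypothetical bigram counter that only looks at the last token, whereas a real transformer conditions on the whole context. The function names and the tiny corpus are my own assumptions for illustration. The loop itself is the point: predict one token, append it, repeat.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each other token (toy stand-in for a model)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token(counts, context):
    """Return the single most likely next token, or None if nothing is known.

    A transformer would score the next token using the entire context;
    this toy only uses the final token.
    """
    followers = counts.get(context[-1])
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(counts, prompt, max_new_tokens=5):
    """Autoregressive loop: predict exactly one token, append it, and repeat."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        tok = next_token(counts, out)
        if tok is None:
            break
        out.append(tok)
    return out


corpus = "the cat sat on the mat and the cat sat on the rug".split()
model = train_bigram(corpus)
print(generate(model, ["the"], max_new_tokens=3))
```

The model never plans a sentence. Whatever structure appears in the output comes entirely from repeating the same one-step prediction.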

…the team ran “ablations,” deliberately disabling portions of the transformer code to understand what contributions they made. But the ablations had the unexpected result of making the core transformer function run even better. Shazeer removed so much of the surrounding code that in the end he was left with almost nothing. In its most primitive interpretation, the transformer was barely more than twenty lines of code.

In just twenty lines of code, Google achieved the impossible and solved an entire branch of computer science.
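For a sense of how little code that is, here is a rough sketch of scaled dot-product attention, the core operation of the transformer, in plain Python. This is not the team's actual code, and it leaves out everything else a real model needs (learned projections, multiple heads, masking, feed-forward layers). The function names and list-of-lists layout are assumptions made for readability.

```python
import math


def softmax(xs):
    """Turn raw scores into weights that sum to 1 (shifted by the max for stability)."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """softmax(Q K^T / sqrt(d_k)) V, computed one query row at a time."""
    d_k = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


# Two identical keys get equal weight, so the output is the mean of the values.
print(attention([[1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]], [[0.0, 2.0], [4.0, 6.0]]))
```

Each output row is just a weighted average of the values, with the weights set by how well the query matches each key. The rest is scale.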

The transformer team expected Google to turn the technology into consumer-facing products, but management somehow didn’t see the value in the tech. Team members felt that Google’s search monopoly had resulted in a bloated, bureaucratic company unwilling to take risks. “They were like, ‘Hey, we cannot launch anything that doesn’t fit into the search box,’ ” Polosukhin said. “Fifteen years earlier, we would have just launched something bad. Then we iterate, we learn, and we improve, improve, improve, improve, until it’s actually really good. At some point, we lost that mentality.” The transformer authors began defecting to start-ups; by 2023, every one of the eight original transformer researchers had left.

Now this is truly the innovator’s dilemma.

“Literally the next day, it was clear to me, to us, that the transformer addressed the limitations of recurrent neural networks,” Sutskever said. “We switched to transformers right away.” Altman, now in charge, agreed with the change in strategy.

Talk about Google fumbling.⁴

Missionaries 2

In 2001 he was hired as a summer intern at Intel, where he was asked, as an exercise, to design a microchip that could pulse at ten billion beats per second. Doing the math, Catanzaro concluded the question was a setup: such a chip could never be built. He presented his findings to a group of senior engineers. “You must have done your work wrong,” his supervisor said. “This is part of Intel’s road map.” Catanzaro was stunned. He double-checked his calculations but could find no error. The transistors were getting too small, the end of Moore’s Law was approaching, and Intel was ignoring it.

It’s amazing how many times Intel pretended that Moore’s Law was not going to end. By the way, even today there is still not a single chip, by Intel or otherwise, that can pulse at 10 billion beats per second.

It was Catanzaro who had compared interacting with Huang to sticking a finger in the electric socket—but it was also Catanzaro who emphasized that Huang was not a man selling soap. He was a man whose passion for computing was not to be surpassed, and if there was anyone Catanzaro could convince about the coming intersection between parallel computing and AI, it was Huang. After Catanzaro was awarded his PhD, in 2011, he chose Nvidia.

Again, the sheer scope of the mission incites passion. Find the missionaries.

To Catanzaro’s surprise, Huang was immediately intrigued. Following their first meeting, Huang cleared his schedule and spent an entire weekend reading about AI, a subject about which he knew little. Another meeting soon followed, where Catanzaro found that his boss now knew as much—perhaps more—about neural nets as he did.

This is what true first-principles thinking looks like. When he finds an idea that might be valuable, he immediately dives deep by reading everything he can. The ability to ramp up on new topics is absolutely vital.

He called Catanzaro into the conference room he was using as his office and told him that he considered cuDNN to be the single most important project in his company’s twenty-year history. “He told me to imagine he’d marched all eight thousand of Nvidia’s employees into the parking lot,” Catanzaro said. “Then he told me I was free to select anyone from the parking lot to join my team.”

And once he rigorously understands the fundamentals, he has the conviction to move extremely quickly.

Huang concluded that neural networks would revolutionize society and that he could use CUDA to corner the market on the necessary hardware. He announced that he was betting the company. “He sent out an email on Friday evening saying everything is going to deep learning, and that we were no longer a graphics company,” Greg Estes, a vice president at Nvidia, said. “By Monday morning, we were an AI company. Literally, it was that fast.”

#FounderMode.

But as Nvidia moved into AI, Huang abandoned his hobbies. His appetite for mischief diminished, he stopped practicing table tennis, and the teppan grill went cold. He even stopped returning texts. “He was just so, so focused on work,” Horstmann said. “It was all he could talk about.” The conviction that Huang had been given a once-in-a-lifetime opportunity seized him. The acronym “O.I.A.L.O.” was repeated at every meeting. From the day Huang had started his career, at twenty, he had worked relentlessly, putting in consecutive twelve-hour days, six days a week, for three decades straight. Now past fifty, and with his kids grown, he began to work even harder.

Credit to Jensen, he really did recognize the sheer magnitude of the opportunity in front of him.

Jensen vs Elon

There was also the topic of loyalty. Musk did not value it; he often fired people arbitrarily and without warning, in one case canning the entire Starlink engineering team almost at random on a Sunday afternoon. Huang almost never fired anyone, and when he did, it was only after multiple cautions and the offer of a performance-improvement plan. It took truly egregious behavior to get kicked out of Nvidia, and many employees worked there for decades, including boomerang hires like Catanzaro and Aarts. Even when operating economics forced Huang to shutter a division, he reassigned employees to other useful tasks. In 2019 Curtis Priem returned to Nvidia’s offices for the first time in sixteen years to join Huang and Malachowsky for a reunion of the company’s founders. “I was astounded at how many people were still there,” he said. “Jeff Fisher, his kids were working for Nvidia.”

The passage speaks for itself.

To Musk, advanced AI posed a potentially extinction-level threat. Moreover, this opinion was shared by a great number of technologists, including both Hinton and Sutskever, the coauthors of the original AlexNet paper. Huang didn’t see it that way. He saw no risk in AI at all. Zero.

This is probably the biggest difference. Jensen truly does not believe the sci-fi nonsense. He didn’t grow up reading those novels and the message that AI may pose a risk does not resonate with him at all. He actually yells at the author near the conclusion for insinuating that a calculator could cause humanity to go extinct.

Conclusion

“Speed of light” did not mean, as one might assume, to move quickly. Instead, Huang encouraged managers to identify the absolute fastest that something could conceivably be accomplished, given an unlimited budget, and assuming that every single thing went right.

Huang, like Musk, operates at an insane speed. He starts from first principles - what is the fastest we could possibly do this if we dedicated all our energy to it? That is the benchmark we grade ourselves against, not the speed at which the industry or competitors move.

“Once you understand the physical limits of what is possible, you understand the competition can’t go any faster either.”

The charming thing about Jensen is that he truly means this in a reassuring way.

Huang pursued this unattainable ideal every day of his life. “I should make sure that I’m sufficiently exhausted from working that no one can keep me up at night,” he later said. “That’s really the only thing I can control.”

Control what you can control. For Jensen, that means asking: did I pour every last scrap of energy into this?

Huang asked everyone at the company to submit a weekly list of the five most important things they were working on. Every Friday from that day forward, he received twenty thousand emails. Brevity was encouraged; Huang would randomly sample from this pool of correspondence late into the night. In turn, he communicated to his staff by writing hundreds of emails per day, often only a few words long. (One executive compared the emails to haiku. Another compared them to ransom notes.) His responsiveness was superhuman. “You’d email him at 2 a.m. and receive a reply at 2:05 a.m.,” Dally said. “Then you’d email him again at 6 a.m. and receive a reply at 6:05 a.m.”

He is relentless.

“I’ve often asked myself, how is it that we started in the same cubicle, you know, with a similar IQ, both working equally hard,” Horstmann said. “How is it that this person not only built this amazing company, but also a network around him of people that—that would just die for him if needed?” Huang, Horstmann believed, had changed himself many times. He recalled Huang at LSI, pushing the simulation software to its outer limits. “Now, he’s still doing the same thing, but what he’s engineering is himself. He was not born as a great CEO; he was not destined to be one. He transformed himself into one, just by abstracting! Just by problem-solving the inputs and outputs of what a good CEO should be.”

None of us are born to be great. We forge ourselves. We engineer ourselves. One of the ingredients, as Jensen so eloquently told Stanford graduates, is great pain and suffering. But another ingredient is an iron will. Jensen is a man who wakes up at 4am every single day. Even after three decades, he hasn’t taken a day off. He is always on. In the long run, slope is everything.

I could not recommend this book more. It’s a very nuanced look at the brilliance and flaws of Jensen Huang, and at least as of 2025 it could not be any more timely. It’s well written, well researched, and overall gives a wonderful historical primer on AI and the computing industry as a whole.

¹ An absolutely hilarious video.

² The legendary paper, of course. No, I will not apologize for the pun.

³ Dennard scaling broke down around 2005. If this type of thing interests you, also check out Bryan Cantrill’s incredible talk.

⁴ If you’re interested in this part of Google history, the recent Acquired podcast covers this well.