One day in March of 2000, six of Google’s best engineers gathered in a makeshift war room. The company was in the midst of an unprecedented emergency. In October, its core systems, which crawled the Web to build an “index” of it, had stopped working. Although users could still type in queries at google.com, the results they received were five months out of date. More was at stake than the engineers realized. Google’s co-founders, Larry Page and Sergey Brin, were negotiating a deal to power a search engine for Yahoo, and they’d promised to deliver an index ten times bigger than the one they had at the time—one capable of keeping up with the World Wide Web, which had doubled in size the previous year. If they failed, google.com would remain a time capsule, the Yahoo deal would likely collapse, and the company would risk burning through its funding into oblivion.
In a conference room by a set of stairs, the engineers laid doors across sawhorses and set up their computers. Craig Silverstein, a twenty-seven-year-old with a small frame and a high voice, sat by the far wall. Silverstein was Google’s first employee: he’d joined the company when its offices were in Brin’s living room and had rewritten much of its code himself. After four days and nights, he and a Romanian systems engineer named Bogdan Cocosel had got nowhere. “None of the analysis we were doing made any sense,” Silverstein recalled. “Everything was broken, and we didn’t know why.”
Silverstein had barely registered the presence, over his left shoulder, of Sanjay Ghemawat, a quiet thirty-three-year-old M.I.T. graduate with thick eyebrows and black hair graying at the temples. Sanjay had joined the company only a few months earlier, in December. He’d followed a colleague of his—a rangy, energetic thirty-one-year-old named Jeff Dean—from Digital Equipment Corporation. Jeff had left D.E.C. ten months before Sanjay. They were unusually close, and preferred to write code jointly. In the war room, Jeff rolled his chair over to Sanjay’s desk, leaving his own empty. Sanjay worked the keyboard while Jeff reclined beside him, correcting and cajoling like a producer in a news anchor’s ear.
Jeff and Sanjay began poring over the stalled index. They discovered that some words were missing—they’d search for “mailbox” and get no results—and that others were listed out of order. For days, they looked for flaws in the code, immersing themselves in its logic. Section by section, everything checked out. They couldn’t find the bug.
Programmers sometimes conceptualize their software as a structure of layers ranging from the user interface, at the top, down through increasingly fundamental strata. To venture into the bottom of this structure, where the software meets the hardware, is to turn away from the Platonic order of code and toward the elemental universe of electricity and silicon on which it depends. On their fifth day in the war room, Jeff and Sanjay began to suspect that the problem they were looking for was not logical but physical. They converted the jumbled index file to its rawest form of representation: binary code. They wanted to see what their machines were seeing.
On Sanjay’s monitor, a thick column of 1s and 0s appeared, each row representing an indexed word. Sanjay pointed: a digit that should have been a 0 was a 1. When Jeff and Sanjay put all the missorted words together, they saw a pattern—the same sort of glitch in every word. Their machines’ memory chips had somehow been corrupted.
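What Jeff and Sanjay saw on the monitor can be illustrated with a small sketch. It is not Google's code: the three-word index, the choice of which bit to flip, and the helper names (`flip_bit`, `differing_bits`, `to_binary`) are all assumptions made for the example. It shows only how a single 0 turned into a 1 can make a word both vanish from a lookup and fall out of sorted order.

```python
def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted (bit 0 = lowest bit of byte 0)."""
    byte_pos, bit_pos = divmod(bit_index, 8)
    corrupted = bytearray(data)
    corrupted[byte_pos] ^= 1 << bit_pos
    return bytes(corrupted)


def differing_bits(a: bytes, b: bytes) -> list[int]:
    """List the bit positions at which two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return [
        i * 8 + bit
        for i, (x, y) in enumerate(zip(a, b))
        for bit in range(8)
        if (x ^ y) >> bit & 1
    ]


def to_binary(data: bytes) -> str:
    """Render bytes as rows of 1s and 0s, one 8-bit group per byte."""
    return " ".join(f"{byte:08b}" for byte in data)


# A hypothetical, tiny sorted index of words.
index = sorted([b"apple", b"mailbox", b"zebra"])

# Simulate a faulty memory chip: one bit of "mailbox" changes from 0 to 1.
stored = [flip_bit(w, 4) if w == b"mailbox" else w for w in index]

print(to_binary(b"mailbox"))
print(to_binary(stored[1]))
print(differing_bits(b"mailbox", stored[1]))  # position of the flipped bit
print(b"mailbox" in stored)                   # the word can no longer be found
print(stored == sorted(stored))               # and the list is no longer in order
```

In this sketch the first letter, "m" (01101101), becomes "}" (01111101), so the damaged entry now belongs after "zebra" rather than before it.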
Sanjay looked at Jeff. For months, Google had been experiencing an increasing number of hardware failures. The problem was that, as Google grew, its computing infrastructure also expanded. Computer hardware rarely failed, until you had enough of it—then it failed all the time. Wires wore down, hard drives fell apart, motherboards overheated. Many machines never worked in the first place; some would unaccountably grow slower. Strange environmental factors came into play. When a supernova explodes, the blast wave creates high-energy particles that scatter in every direction; scientists believe there is a minute chance that one of the errant particles, known as a cosmic ray, can hit a computer chip on Earth, flipping a 0 to a 1. The world’s most robust computer systems, at NASA, financial firms, and the like, used special hardware that could tolerate single bit-flips. But Google, which was still operating like a startup, bought cheaper computers that lacked that feature. The company had reached an inflection point. Its computing cluster had grown so big that even unlikely hardware failures were inevitable.
Together, Jeff and Sanjay wrote code to compensate for the offending machines. Shortly afterward, the new index was completed, and the war room disbanded. Silverstein was flummoxed. He was a good debugger; the key to finding bugs was getting to the bottom of things. Jeff and Sanjay had gone deeper.
Until the March index debacle, Google’s systems had been rooted in code that its founders had written in grad school, at Stanford. Page and Brin weren’t professional software engineers. They were academics conducting an experiment in search technology. When their Web crawler crashed, there was no informative diagnostic message—just the phrase “Whoa, horsey!” Early employees referred to BigFiles, a piece of software that Page and Brin had written, as BugFiles. Their all-important indexing code took days to finish, and if it encountered a problem it had to re-start from the beginning. In the parlance of Silicon Valley, Google wasn’t “scalable.”
We say that we “search the Web,” but we don’t, really; our search engines traverse an index of the Web—a map. When Google was still called BackRub, in 1996, its map was small enough to fit on computers installed in Page’s dorm room. In March of 2000, there was no supercomputer big enough to process it. The only way that Google could keep up was by buying consumer machines and wiring them together into a fleet. Because half the cost of these computers was in parts that Google considered junk—floppy drives, metal chassis—the company would order raw motherboards and hard drives and sandwich them together. Google had fifteen hundred of these devices stacked in towers six feet high, in a building in Santa Clara, California; because of hardware glitches, only twelve hundred worked. Failures, which occurred seemingly at random, kept breaking the system. To survive, Google would have to unite its computers into a seamless, resilient whole.
Side by side, Jeff and Sanjay took charge of this effort. Wayne Rosing, who had worked at Apple on the precursor to the Macintosh, joined Google in November, 2000, to run its hundred-person engineering team. “They were the leaders,” he said. Working ninety-hour weeks, they wrote code so that a single hard drive could fail without bringing down the entire system. They added checkpoints to the crawling process so that it could be re-started midstream. By developing new encoding and compression schemes, they effectively doubled the system’s capacity. They were relentless optimizers. When a car goes around a turn, more ground must be covered by the outside wheels; likewise, the outer edge of a spinning hard disk moves faster than the inner one. Google had moved the most frequently accessed data to the outside, so that bits could flow faster under the read-head, but had left the inner half empty; Jeff and Sanjay used the space to store preprocessed data for common search queries. Over four days in 2001, they proved that Google’s index could be stored using fast random-access memory instead of relatively slow hard drives; the discovery reshaped the company’s economics. Page and Brin knew that users would flock to a service that delivered answers instantly. The problem was that speed required computing power, and computing power cost money. Jeff and Sanjay threaded the needle with software.
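The idea of checkpointing a long job so that it can be re-started midstream can be sketched in a few lines. This is an illustration under stated assumptions, not a reconstruction of Google's crawler: the JSON checkpoint file, the `crawl`, `load_checkpoint`, and `save_checkpoint` functions, and the deliberately flaky `flaky_fetch` are all hypothetical.

```python
import json
import os
import tempfile


def load_checkpoint(path: str) -> int:
    """Return the index of the next item to process, or 0 if no checkpoint exists."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["next_index"]


def save_checkpoint(path: str, next_index: int) -> None:
    """Write the checkpoint to a temporary file, then swap it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"next_index": next_index}, f)
    os.replace(tmp, path)  # a crash mid-write leaves the previous checkpoint intact


def crawl(urls, path, fetch, every=2):
    """Process urls in order, recording progress after every `every` items."""
    done = []
    for i in range(load_checkpoint(path), len(urls)):
        fetch(urls[i])
        done.append(urls[i])
        if (i + 1) % every == 0:
            save_checkpoint(path, i + 1)
    save_checkpoint(path, len(urls))
    return done


# Demonstration: the fetch of "d" fails once, as if a machine had died.
urls = ["a", "b", "c", "d", "e"]
fetched = []
fail_once = {"d"}


def flaky_fetch(url):
    if url in fail_once:
        fail_once.discard(url)
        raise RuntimeError(f"simulated hardware failure on {url}")
    fetched.append(url)


with tempfile.TemporaryDirectory() as workdir:
    ckpt = os.path.join(workdir, "crawl.ckpt")
    try:
        crawl(urls, ckpt, flaky_fetch)
    except RuntimeError:
        pass  # the first run dies partway through
    resumed = crawl(urls, ckpt, flaky_fetch)  # the second run picks up at the checkpoint

print(resumed)
print(fetched)
```

The second run starts from the last saved position rather than from "a". The cost is that anything processed after the last checkpoint ("c", in this sketch) is done twice, which is the usual trade-off between how often progress is saved and how much work is repeated after a failure.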
Alan Eustace became the head of the engineering team after Rosing left, in 2005. “To solve problems at scale, paradoxically, you have to know the smallest details,” Eustace said. Jeff and Sanjay understood computers at the level of bits. Jeff once circulated a list of “Latency Numbers Every Programmer Should Know.” In fact, it’s a list of numbers that almost no programmer knows: that an L1 cache reference usually takes half a nanosecond, or that reading one megabyte sequentially from memory takes two hundred and fifty microseconds. These numbers are hardwired into Jeff’s and Sanjay’s brains. As they helped spearhead several rewritings of Google’s core software, the system’s capacity scaled by orders of magnitude. Meanwhile, in the company’s vast data centers, technicians now walked in serpentine routes, following software-generated instructions to replace hard drives, power supplies, and memory sticks. Even as its parts wore out and died, the system thrived.
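The two latency figures quoted above lend themselves to the kind of back-of-the-envelope arithmetic such a list is meant to support. The sketch below uses only those two numbers; the constant and function names are made up for the example, and the results are rough estimates, not measurements of any particular machine.

```python
# The two figures quoted from the list, expressed in nanoseconds.
L1_CACHE_REFERENCE_NS = 0.5        # half a nanosecond
READ_1MB_FROM_MEMORY_NS = 250_000  # two hundred and fifty microseconds


def l1_references_per_megabyte_read() -> float:
    """How many L1 cache references fit in the time it takes to read 1 MB from memory."""
    return READ_1MB_FROM_MEMORY_NS / L1_CACHE_REFERENCE_NS


def implied_memory_throughput_mb_per_s() -> float:
    """Sequential memory throughput implied by the 1 MB figure, in megabytes per second."""
    nanoseconds_per_second = 1e9
    return nanoseconds_per_second / READ_1MB_FROM_MEMORY_NS


print(l1_references_per_megabyte_read())     # 500000.0
print(implied_memory_throughput_mb_per_s())  # 4000.0
```

By these numbers, reading a single megabyte from memory costs as much time as half a million L1 cache references.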
Today, Google’s engineers exist in a Great Chain of Being that begins at Level 1. At the bottom are the I.T. support staff. Level 2s are fresh out of college; Level 3s often have master’s degrees. Getting to Level 4 takes several years, or a Ph.D. Most progression stops at Level 5. Level 6 engineers—the top ten per cent—are so capable that they could be said to be the reason a project succeeds; Level 7s are Level 6s with a long track record. Principal Engineers, the Level 8s, are associated with a major product or piece of infrastructure. Distinguished Engineers, the Level 9s, are spoken of with reverence. To become a Google Fellow, a Level 10, is to win an honor that will follow you for life. Google Fellows are usually the world’s leading experts in their fields. Jeff and Sanjay are Google Senior Fellows—the company’s first and only Level 11s.
The Google campus, set beside a highway a few minutes from downtown Mountain View, is a series of squat, unattractive buildings with tinted windows. One Monday last summer, after a morning of programming together, Jeff and Sanjay went to lunch at a campus cafeteria called Big Table, which was named for a system they’d helped develop, in 2005, for treating numberless computers as though they were a single database. Sanjay, who is tall and thin, wore an ancient maroon Henley, gray pants, and small wire-frame glasses. He spied a table outside and walked briskly to claim it, cranking open the umbrella and taking a seat in the shade. He moved another chair into the sun for Jeff, who arrived a minute later, broad-shouldered in a short-sleeved shirt and wearing stylish sneakers.
Like a couple, Jeff and Sanjay tell stories together by contributing pieces of the total picture. They began reminiscing about their early projects.
“We were writing things by hand,” Sanjay said. His glasses darkened in the sun. “We’d rewrite it, and it was, like, ‘Oh, that seems near to what we wrote last month.’ ”
“Or a slightly different pass in our indexing data,” Jeff added.
“Or slightly different,” Sanjay said. “And that’s how we figure out—”
“This is the essence,” Jeff said.
“—this is the common pattern,” Sanjay said, finishing their thought.
Jeff took a bite of the pizza he’d got. He has the fingers of a deckhand, knobby and leathery; Sanjay, who looks almost delicate in comparison, wondered how they ended up as a pair. “I don’t quite know how we decided that it would be better,” he said.
“We’ve been doing it since before Google,” Jeff said.
“But I don’t know why we decided it was better to do it in front of one computer instead of two,” Sanjay said.
“I would walk from my D.E.C. research lab two blocks away to his D.E.C. research lab,” Jeff said. “There was a gelato store in the middle.”
“So it’s the gelato store!” Sanjay said, delighted.
Sanjay, who is unmarried, joins Jeff, his two daughters, and his wife, Heidi, on vacations. Jeff’s daughters call him Uncle Sanjay, and the five of them often have dinner on Fridays. Sanjay and Victoria, Jeff’s eldest, have taken to baking. “I’ve seen his daughters grow up,” Sanjay said, proudly. After the Google I.P.O., in 2004, they moved into houses that are four miles apart. Sanjay lives in a modest three-bedroom in Old Mountain View; Jeff designed his house, near downtown Palo Alto, himself, installing a trampoline in the basement. While working on the house, he discovered that although he liked designing spaces, he didn’t have patience for what he calls the “Sanjay-oriented aspects” of architecture: the details of beams, bolts, and loads that keep the grand design from falling apart.
“I don’t know why more people don’t do it,” Sanjay said, of programming with a partner.
“You need to find someone that you’re gonna pair-program with who’s compatible with your way of thinking, so that the two of you together are a complementary force,” Jeff said.
They pushed back from the table and set out in search of soft-serve, strolling through Big Table and its drifting Googlers. Of the two, Jeff is more eager to expound, and while they walked he shared his soft-serve strategy. “I do the squish. I think the pushing-up approach adds stability,” he said. Sanjay, pleased and intent, swirled a chocolate-and-vanilla mix into his cone.
In his book “Collaborative Circles: Friendship Dynamics and Creative Work,” from 2001, the sociologist Michael P. Farrell made a study of close creative groups—the French Impressionists, Sigmund Freud and his contemporaries. “Most of the fragile insights that laid the foundation of a new vision emerged not when the whole group was together, and not when members worked alone, but when they collaborated and responded to one another in pairs,” he wrote. It took Monet and Renoir, working side by side in the summer of 1869, to develop the style that became Impressionism; during the six-year collaboration that gave rise to Cubism, Pablo Picasso and Georges Braque would often sign only the backs of their canvases, to obscure which of them had completed each painting. (“A canvas was not finished until both of us felt it was,” Picasso later recalled.) In “Powers of Two: Finding the Essence of Innovation in Creative Pairs,” the writer Joshua Wolf Shenk quotes from a 1971 interview in which John Lennon explained that either he or Paul McCartney would “write the good bit, the part that was easy, like ‘I read the news today’ or whatever it was.” One of them would get stuck until the other arrived—then, Lennon said, “I would sing half, and he would be inspired to write the next bit and vice versa.” Everyone falls into creative ruts, but two people rarely do so at the same time.
In the “theory building” phase of a new science or art, it’s important to explore widely without getting caught in dead ends. François Jacob, who, with Jacques Monod, pioneered the study of gene regulation, noted that by the mid-twentieth century most research in the growing field of molecular biology was the result of twosomes. “Two are better than one for dreaming up theories and constructing models,” Jacob wrote. “For with two minds working on a problem, ideas fly thicker and faster. They are bounced from partner to partner. They are grafted onto each other, like branches on a tree. And in the process, illusions are sooner nipped in the bud.” In the past thirty-five years, about half of the Nobel Prizes in Physiology or Medicine have gone to scientific partnerships.
After years of sharing their working lives, duos sometimes develop a private language, the way twins do. They imitate each other’s clothing and habits. A sense of humor osmoses from one to the other. Apportioning credit between them becomes impossible. But partnerships of this intensity are unusual in software development. Although developers sometimes talk about “pair programming”—two programmers sharing a single computer, one “driving” and the other “navigating”—they usually conceive of such partnerships in terms of redundancy, as though the pair were co-pilots on the same flight. Jeff and Sanjay, by contrast, sometimes seem to be two halves of a single mind. Some of their best-known papers have as many as a dozen co-authors. Still, Bill Coughran, one of their managers, recalled, “They were so prolific and so effective working as a pair that we often built teams around them.”
In 1966, researchers at the System Development Corporation discovered that the best programmers were more than ten times as effective as the worst. The existence of the so-called “10x programmer” has been controversial ever since. The idea venerates the individual, when software projects are often vast and collective. In programming, few achievements exist in isolation. Even so—and perhaps ironically—many coders see the work done by Jeff and Sanjay, together, as proof that the 10x programmer exists.
Jeff was born in Hawaii, in July of 1968. His father, Andy, was a tropical-disease researcher; his mother, Virginia Lee, was a medical anthropologist who spoke half a dozen languages. For fun, father and son programmed an IMSAI 8080 kit computer. They soldered upgrades onto the machine, learning every part of it.
Jeff and his parents moved often. At thirteen, he skipped the last three months of eighth grade to help them at a refugee camp in western Somalia. Later, in high school, he started writing a data-collection program for epidemiologists called Epi Info; it became a standard tool for field work and, eventually, hundreds of thousands of copies were distributed, in more than a dozen languages. (A Web site maintained by the Centers for Disease Control and Prevention, “The Epi Info Story,” includes a picture of Jeff at his high-school graduation.) Heidi, whom Jeff met in college, at the University of Minnesota, learned of the program’s significance only years later. “He didn’t brag about any of that stuff,” she said. “You had to pull it out of him.” Their first date was at a women’s basketball game; Jeff was in a gopher costume, cheerleading.
Jeff’s Ph.D. focussed on compilers, the software that turns code written by people into machine-language instructions optimized for computers. “In terms of sexiness, compilers are pretty much as boring as it gets,” Alan Eustace said; on the other hand, they get you “very close to the machine.” Describing Jeff, Sanjay twirled his index finger around his temple. “He has a model going on as you’re writing code,” he said. “ ‘What is the performance of this code going to be?’ He’ll think about all the corner cases almost semi-automatically.”