The 20-year journey to fully autonomous cars with Dmitri Dolgov of Waymo

Cheeky Pint · Patrick McKenzie with Dmitri Dolgov · March 24, 2026

Most important takeaway

Waymo’s path to 500,000+ fully autonomous rides per week was built on a foundation-model approach: a single large AI model is specialized into three “teachers” (driver, simulator, critic), which are then distilled into smaller, more efficient models. Dolgov emphasizes that full autonomy and driver assist are fundamentally different problems: you cannot incrementally work your way up from driver assist to full self-driving, which is why Waymo’s 20-year head start and Google’s long-term conviction were essential.

Chapter Summaries

Dmitri Dolgov’s Background

Dolgov grew up in the Soviet Union, went to Japan and then the US with his physicist father, returned to Russia in 1994 for his bachelor’s and master’s in physics and applied math, then came back to the US for graduate school in computer science. He joined Google’s self-driving car project in 2009 as one of its first engineers and became co-CEO in 2021.

How the Waymo Driver Works

The car uses three sensor types (cameras, LiDAR, and radar), each providing 360-degree coverage. Sensor data feeds into an AI system with a separate encoder for each modality. All inference runs locally on the vehicle; nothing real-time depends on the cloud. The cloud is used only for non-driving tasks such as detecting lost items or flagging dirty cars.

Foundation Model Architecture

Waymo starts with a large foundation model that understands the physical world, then specializes it into three “off-board teachers”: the Waymo Driver, the Simulator, and the Critic. These teachers are distilled into smaller models that run efficiently. The Driver powers the car, the Simulator generates synthetic environments for training and evaluation, and the Critic identifies interesting events and judges good vs. bad driving behavior.
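
The teacher-to-student step described above can be illustrated with a generic knowledge-distillation loss. This is a minimal sketch of the textbook technique (KL divergence between temperature-softened teacher and student outputs), not Waymo's training code; the function names, the temperature value, and the three-maneuver scores are all hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) computed on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical scores over three candidate maneuvers: keep lane, slow, stop.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]   # nearly agrees with the teacher
far_student = [0.5, 1.0, 4.0]     # prefers the opposite maneuver

loss_close = distillation_loss(teacher, close_student)
loss_far = distillation_loss(teacher, far_student)
print(loss_close < loss_far)  # the student that mimics the teacher scores lower
```

Training the smaller model to minimize a loss of this kind pushes it toward the large model's behavior, which is why improvements to the teacher can carry over to the student.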

End-to-End vs. Modular Debate

A pure “pixels in, trajectories out” approach works surprisingly well for nominal driving — even fine-tuning a VLM can produce decent results. But it is orders of magnitude away from the safety bar required for full autonomy. Intermediate representations (objects, roads, signs) are needed to enable efficient simulation, reinforcement learning, and safety validation layers.

Technology Evolution Over 20 Years

AI breakthroughs (especially transformers) and compute improvements were critical enablers. The journey was not about dead ends but iterative evolution. Gen 4 of the Waymo Driver used many small ML models; Gen 5 made the big bet on AI as the core backbone, trained on data from across the US, and deployed in the hardest parts of San Francisco and Phoenix. This was the discontinuous leap that enabled rapid scaling.

Sensor Complementarity: LiDAR vs. Radar

LiDAR provides extremely high-resolution 3D mapping. Radar has lower resolution but degrades gracefully in fog, snow, and heavy rain. The system fuses all sensors jointly rather than comparing separate estimates. A striking example: the AI detected a pedestrian hidden behind a bus by picking up faint LiDAR reflections of feet under the bus.
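
The difference between comparing separate estimates and fusing sensors jointly can be shown with a toy example. The sketch below is a hand-written probabilistic illustration under strong assumptions (a shared prior and conditionally independent sensors); it is not the learned fusion the episode describes, and every name and number in it is hypothetical.

```python
import math

def logit(p):
    """Probability -> log-odds."""
    return math.log(p / (1.0 - p))

def sigmoid(x):
    """Log-odds -> probability."""
    return 1.0 / (1.0 + math.exp(-x))

def late_fusion_detect(sensor_probs, threshold=0.5):
    """Each sensor decides on its own; fire if any single sensor is confident."""
    return any(p >= threshold for p in sensor_probs)

def joint_posterior(sensor_probs, prior=0.2):
    """Pool evidence from all sensors (naive Bayes) into one posterior.

    Assumes each input is that sensor's own posterior under the shared
    prior, and that the sensors are conditionally independent.
    """
    evidence = sum(logit(p) - logit(prior) for p in sensor_probs)
    return sigmoid(logit(prior) + evidence)

def joint_fusion_detect(sensor_probs, prior=0.2, threshold=0.5):
    """Make a single decision from the pooled posterior."""
    return joint_posterior(sensor_probs, prior) >= threshold

# Hypothetical weak readings from camera, LiDAR, and radar for one object.
weak_readings = [0.4, 0.4, 0.4]
print(late_fusion_detect(weak_readings))   # False: no single sensor crosses 0.5
print(joint_fusion_detect(weak_readings))  # True: the pooled evidence does
```

Each reading is below the decision threshold on its own, yet all three are above the prior, so together they add up to a confident detection. Faint signals that a per-sensor pipeline would discard can still matter once evidence is combined before the decision.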

Gen 6 Hardware and Custom Vehicle

The sixth-generation vehicle is purpose-built around passengers (sliding doors, flat floor, spacious interior) rather than the driver. The sensor hardware is simpler, more capable, and a fraction of the cost of Gen 5 — comparable to a high-end driver-assist system. The software stack remains largely the same, demonstrating its generalizability across vehicle platforms. Waymo plans to also put the Gen 6 driver on the Hyundai Ioniq.

Business Metrics and Expansion

Waymo operates about 3,000 cars and logs roughly 500,000 rides and over 4 million fully autonomous miles per week across 11 US cities (10 of them open to riders). It recently opened four new cities in a single day, a process that previously took eight years. London and Tokyo launches are planned for 2026.
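
For a sense of scale, the fleet figures above can be turned into per-car numbers. This is back-of-the-envelope arithmetic on the rounded figures quoted in the summary ("about", "roughly", "over"), so the results are rough estimates; the variable names are illustrative only.

```python
# Rounded figures as quoted in the summary; all are approximate.
cars = 3_000
rides_per_week = 500_000
miles_per_week = 4_000_000

rides_per_car_per_day = rides_per_week / cars / 7
miles_per_car_per_week = miles_per_week / cars
miles_per_ride = miles_per_week / rides_per_week

print(f"{rides_per_car_per_day:.1f} rides per car per day")    # 23.8
print(f"{miles_per_car_per_week:.0f} miles per car per week")  # 1333
print(f"{miles_per_ride:.1f} autonomous miles per ride")       # 8.0
```

The last figure is total autonomous mileage divided by rides, so it includes any driving without a passenger and should not be read as average trip length.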

Full Autonomy vs. Driver Assist

Dolgov views these as fundamentally different problems. He expects the two to converge from both directions, with consumer cars getting smarter and autonomous fleets getting cheaper sensors, but the hardest parts of building a fully autonomous system remain qualitatively different from driver-assist work. Eventually, the Waymo Driver may appear on personally owned vehicles, especially in low-density areas where ride-hailing fleets are impractical.

Operations and Fleet Management

Depot operations are increasingly automated: cars autonomously navigate to charging stalls and cleaning stations. Cleaning is still done by hand, but cars that need it are flagged automatically. Inductive charging (like a phone charger pad) is being explored. Riders generally treat cars well, though college towns on Saturday nights are a different story.

Google’s Role and Culture

Dolgov credits Google/Alphabet leadership for having the vision and stamina to invest through nearly two decades of development. The company’s culture of rejecting the status quo, investing in big technical bets, and nurturing technical talent was essential. There was no single “right moment” to start — the problem is deceptively easy to begin but requires iterating through many technology waves.

Summary

Actionable insights and key takeaways:

  • Waymo’s AI architecture is a masterclass in foundation model strategy. They train one large foundation model, specialize it into three teacher models (driver, simulator, critic), then distill each into a smaller, more efficient model. This “invest once, amplify everywhere” approach means improvements to the foundation model cascade through the entire system. Any company building AI for the physical world should study this pattern.

  • End-to-end AI is necessary but not sufficient for safety-critical applications. A vanilla VLM fine-tuned on driving data can handle normal driving surprisingly well, but is “orders of magnitude away” from what full autonomy requires. The lesson: augmenting learned representations with structured intermediate representations (objects, roads, rules) is how you get the reliability needed for production deployment. This applies broadly to any AI system where failure has serious consequences.

  • Driver-assist and full autonomy are fundamentally different problems, not points on a spectrum. Dolgov is direct: you cannot incrementally upgrade a Level 2 system into a Level 4 system. The hardest parts of full autonomy (long-tail edge cases, superhuman safety, closed-loop simulation) require qualitatively different engineering. Companies betting on a gradual path from driver-assist to full self-driving may be underestimating the gap.

  • Waymo is scaling rapidly and expanding internationally. With 500,000+ rides per week, 3,000 cars, 11 US cities, and London and Tokyo launches planned for 2026, Waymo has moved from R&D to global deployment. The Gen 6 hardware brings sensor costs down to driver-assist price points, which is a prerequisite for economic viability at scale.

  • Multi-sensor fusion provides capabilities no single sensor can match. The anecdote about detecting a pedestrian behind a bus through faint LiDAR returns of their feet is a compelling case for sensor diversity. Companies building perception systems should invest in true fusion architectures rather than comparing independent sensor outputs.

  • The personal Waymo is a frequently requested product. Dolgov acknowledged consumer demand for a personally owned autonomous vehicle — useful for areas without ride-hail density. This suggests Waymo sees a path beyond fleet-only operations to licensing its driver technology to consumers or automakers.

  • Career insight from Dolgov’s trajectory: A strong foundation in math and physics (Russian-style rigorous education), combined with US graduate school in computer science, positioned him well for a 20-year technical leadership arc. Google’s culture of investing in and promoting deep technical talent — not just managers — was key to retaining someone through such a long R&D journey.

  • On long-term R&D investment: There is no magic moment where a hard problem suddenly becomes easy. Every “nine” of reliability costs 10x more than the last. Google/Alphabet’s willingness to fund Waymo through nearly two decades of development, without a clear payoff timeline, is a rare and valuable organizational capability.
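
The "every nine costs 10x more" rule of thumb in the last bullet compounds quickly. The sketch below tabulates it under the assumption that the n-th nine costs exactly 10 times the one before it, with the first nine costing one arbitrary unit; the function name and cost units are hypothetical and not figures from the episode.

```python
def nines_table(max_nines, first_nine_cost=1.0, growth=10.0):
    """Return (nines, failure_rate, step_cost, cumulative_cost) per level."""
    rows = []
    cumulative = 0.0
    for n in range(1, max_nines + 1):
        step_cost = first_nine_cost * growth ** (n - 1)  # n-th nine costs 10x the last
        cumulative += step_cost
        failure_rate = 10.0 ** -n  # e.g. 3 nines = 99.9% reliable = 1e-3 failures
        rows.append((n, failure_rate, step_cost, cumulative))
    return rows

for nines, failure_rate, step_cost, cumulative in nines_table(5):
    print(f"{nines} nines: failure rate {failure_rate:.0e}, "
          f"step cost {step_cost:,.0f}, cumulative {cumulative:,.0f}")
```

Under this assumption, the fifth nine alone costs 10,000 units against a cumulative 11,111, so the most recent nine always accounts for roughly 90% of everything spent so far.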