20VC: Raising $500M To Compete in the Race for AGI | Will Scaling Laws Continue: Is Access to Compute Everything | Will Nvidia Continue To Dominate | The Biggest Bottlenecks in the Race for AGI with Eiso Kant, CTO @ Poolside
Most important takeaway
The race to AGI is won by mastering four ingredients — compute, data, proprietary applied research, and talent — and capital alone does not translate one-to-one into success outside of compute. For software development specifically, the missing data is not finished code (which is abundant) but the intermediate reasoning, attempts, and execution feedback that produced it, which is why Poolside invests heavily in reinforcement learning from code execution feedback across 130,000 real-world codebases.
Summary
Eiso Kant, co-founder and CTO of Poolside, lays out an actionable framework for thinking about frontier AI. His core thesis is that AGI will arrive unevenly — first in economically valuable domains where the gap between human and machine intelligence is closeable with sufficient data. For software development, public code (~3T tokens) represents only finished output; the truly valuable, missing dataset is the intermediate thinking, failed attempts, and reasoning traces. Poolside generates this synthetically using reinforcement learning from code execution feedback (RLCEF), exploiting the fact that code, unlike the real world, is near-deterministic and can be simulated at scale.
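To make the RLCEF idea concrete, here is a minimal Python sketch of an execution oracle: candidate programs are graded by actually running them against tests, and the score becomes the reward signal for an RL loop. Everything here (the `solve()` convention, the toy task, the sample candidates) is illustrative, not Poolside's actual pipeline.

```python
# Sketch of the "oracle of truth" behind RLCEF: candidate programs are graded
# by executing them against tests; pass/fail becomes the reward. All names
# here are illustrative stand-ins, not Poolside's implementation.

def run_candidate(source: str, tests: list[tuple[tuple, object]]) -> float:
    """Execute a candidate solution and return the fraction of tests passed."""
    namespace: dict = {}
    try:
        exec(source, namespace)          # compile and load the candidate
        solve = namespace["solve"]       # convention: candidates define solve()
    except Exception:
        return 0.0                       # code that doesn't run earns no reward
    passed = 0
    for args, expected in tests:
        try:
            if solve(*args) == expected:
                passed += 1
        except Exception:
            pass                         # runtime failures count as test failures
    return passed / len(tests)

# Two hypothetical model samples for the same toy task: "add two numbers".
candidates = [
    "def solve(a, b):\n    return a - b",   # plausible-looking but wrong
    "def solve(a, b):\n    return a + b",   # correct
]
tests = [((1, 2), 3), ((5, 5), 10), ((-1, 1), 0)]

# The execution oracle separates them; an RL loop would reinforce the winner.
rewards = [run_candidate(c, tests) for c in candidates]
print(rewards)  # [0.0, 1.0]
```

This grounding signal is exactly what is missing when models train on their own ungraded outputs — the "snake eating itself" problem discussed below.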
Actionable insights for builders and operators:
- Four ingredients of the capabilities race: compute, data, proprietary applied research, and talent. Dollars map one-to-one to compute, but NOT to the other three. This is why $6B is not necessarily “enough” for OpenAI and $500M can be sufficient for Poolside at this stage.
- Algorithmic and compute-efficiency gains are table stakes — they let you stay in the race; they do not differentiate you. Differentiation comes from data and applied research.
- Synthetic data only works when paired with an “oracle of truth” (e.g., code execution, unit tests). Without that grounding signal, models generating their own training data is “the snake eating itself.”
- The industry pattern: train very large (often MoE) models that are too expensive to serve, then distill down into smaller models you can run profitably. Expect this distillation pipeline to remain standard for years (a minimal sketch follows this list).
- Hardware margin matters at scale. Hyperscalers with vertically integrated silicon (Google TPUs, Amazon Trainium/Inferentia, Microsoft’s emerging chips) have structural cost advantages over anyone buying Nvidia at retail margin.
- Training requires tightly interconnected clusters in one location; inference does not. Interconnecting >32K GPUs is hard today, 100K is emerging, 1M+ is bounded by real physical/algorithmic limits — so unlimited cash cannot yet buy unlimited training scale.
- For early-stage AI startups, compute is currently easier to source than for enterprises because Nvidia and hyperscalers are incentivized to back winners. Decisions made today on compute partners affect you 12–18 months out.
- Blackwell delays helped incumbents already running H200s; new GPU generations roughly double training performance every two years, but Blackwell’s bigger unlock is on inference economics.
- The hyperscalers with their own silicon (Amazon, Google, Microsoft) plus Nvidia/AMD form the chip landscape. Few independent AI labs are left to acquire: Reka, Mistral, and xAI as an outlier.
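The train-large-then-distill pattern mentioned above reduces to a small student model matching a large teacher's softened output distribution. A minimal PyTorch sketch, assuming the classic Hinton-style KL distillation loss; the model sizes and data are toy stand-ins, not any lab's actual setup:

```python
# Sketch of train-large, distill-small: a small "student" learns to match the
# softened output distribution of a large "teacher". Toy models and data.
import torch
import torch.nn.functional as F

teacher = torch.nn.Linear(128, 1000)   # stand-in for a huge (e.g. MoE) model
student = torch.nn.Linear(128, 1000)   # smaller model you can serve profitably
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0                                # temperature softens the teacher's logits

x = torch.randn(32, 128)               # a batch of (toy) inputs
with torch.no_grad():
    teacher_logits = teacher(x)        # teacher runs once, offline in practice

student_logits = student(x)
# KL divergence between softened distributions is the classic distillation
# loss (Hinton et al., 2015); the T**2 factor keeps gradients comparably scaled.
loss = F.kl_div(
    F.log_softmax(student_logits / T, dim=-1),
    F.softmax(teacher_logits / T, dim=-1),
    reduction="batchmean",
) * T**2
opt.zero_grad()
loss.backward()
opt.step()
```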
Career and team-building advice:
- Talent is geographically distributed. Poolside mapped ~3,300 candidates globally — the Bay Area was the largest single cluster, but the majority lived elsewhere. Build where the talent wants to live.
- Europe has deep AI talent pools seeded by DeepMind (London), Meta (London/Paris), and Yandex diaspora. The stereotype that Europeans won’t work hard is wrong — race-grade operators exist everywhere; you just have to find them.
- Be explicit on the first call about the level of intensity. If you’re running a race (not just building a startup), say so — sacrifices come with it, and self-selection is a feature.
- This decade is “table-setting” like mobile internet 2007–2010. If you want to be in the AGI race, you have to give it everything now; you cannot afford to stumble on either capabilities or go-to-market.
Patterns and mental models worth stealing:
- Foundation models = compressing web-scale data into a neural net and forcing generalization. Where data is small, capability is weak. Where you can synthesize grounded data at scale, capability compounds.
- Three “mountains” of this century: AGI, energy, space. Each climb reveals the prior one was smaller than it looked.
- Regulate end-user applications of AI, not the inputs (compute thresholds). Compute-based regulation primarily harms small/young companies.
- The “why” of a founder matters more than the metrics. When evaluating peers and competitors, Kant’s first question is always why they’re doing it.
- On stuff: material things are not the journey. Kant lives on a sailboat with his partner and dog to keep life simple and stay immersed in the work.
Chapter Summaries
- What is Poolside: A frontier AI company focused on building the most capable AI for software development as a path toward AGI. They believe the gap between human and machine intelligence will close unevenly, fastest in domains with abundant data and economic value.
- The Missing Dataset: Public code (~3T tokens) is only the output; the missing data is the intermediate reasoning and trial-and-error that produced it. Poolside synthesizes this via reinforcement learning from code execution feedback (RLCEF) across 130K real codebases.
- Compute, Data, Algorithms — and Talent: Three commonly cited inputs, plus a fourth (talent) Kant insists on. Algorithm and compute-efficiency improvements are table stakes; data and proprietary applied research are where you differentiate.
- Synthetic Data and the Oracle of Truth: Synthetic data only generalizes when paired with a ground-truth signal (code execution, tests). Without it, models training on their own outputs degrade.
- Scaling Laws: Plenty of room left, both in model size and in inference-time compute (e.g., generating many candidate solutions and selecting the best; see the sketch at the end of these notes). Limits exist, but we are far from them.
- Cost vs Price of Models: Hyperscalers with their own silicon (Google TPU, Amazon Trainium, Microsoft) have a structural cost advantage as the price war on general-purpose models intensifies.
- Train Large, Distill Small: The standard pattern — train multi-trillion-parameter MoE models, then distill into smaller deployable models customers can afford to use.
- Closing the Human–Machine Gap: Where the gap is small and data is abundant (speech), commoditization is near. Where the gap is large but data can be synthesized with deterministic feedback (code), enormous opportunity exists.
- GitHub and Proprietary Data: GitHub has private code no one can train on; everyone else has the same public corpus. There is no inherent capabilities-race advantage to public data alone.
- Compute Capital Required: $500M lets Poolside enter the race with 10K H200 GPUs brought online summer 2024. Larry Ellison’s $100B “entry price” applies to hyperscalers, not labs. Real physical limits cap how much money can buy today.
- Nvidia Dominance and Alternatives: Nvidia still dominates, fast-followed by Google TPUs and Amazon. AMD lacks its own cloud. Blackwell may unlock major inference gains; training improvements remain ~2x every two years.
- GPT-5 and the Decade Ahead: The right frame is not what GPT-5 ships, but what we’ll see looking back from 10 years out — likely an exponential climb on AGI, energy, and space.
- Why No Hyperscaler in the Round: Poolside deliberately took no Google/Microsoft/Amazon money; only Nvidia, due to deep collaboration. Standalone strategy preserves optionality.
- Who’s Left to Acquire: Most independent labs have already been absorbed. Remaining: Reka, Mistral, xAI (unlikely sellers). OpenAI at $156B, Anthropic at $40B, xAI at $24B — each has distinct strengths (xAI: infra speed; OpenAI: distribution/revenue; Anthropic: rigorous research).
- Elon, Peter Thiel, and Crypto vs AI: Reflection on conviction, risk, and why crypto’s decentralization promise was overtaken by bad actors while AI is centralized by resource scarcity but with better stewards.
- Talent and the European Bet: Poolside mapped 3,300 candidates worldwide; talent is distributed. London (~15), Paris (~2). DeepMind, Meta, and Yandex seeded European AI talent. Race-grade operators exist everywhere; you have to find them.
- Work Ethic and the Race Mindset: The first 2–3 years post-ChatGPT are when “the table gets set.” Be honest with hires about the level of intensity required.
- Was Netscape Different: Kant pushes back on the historical analogy — the conditions, capital, and understanding are different now, and some of today’s frontier-model companies will become the giants enabling the next layer.
- China and Geopolitics: China is at the AI frontier and publishes openly to attract talent. Best Western response: attract Chinese talent rather than disengage.
- Quickfire: Biggest mind-change — importance of data scale. Worst regulation risk — bureaucratic overhead that crushes startups. Biggest progress risk — global conflict disrupting chip supply chains. Dream board member — Mark Zuckerberg for conviction.
- The Boat, Stuff, and Why: Kant lives on a sailboat; stripped of material distractions, the journey with people is what matters. The “why” is the most important question to ask any founder.
- Pre-mortem: The single greatest risk to Poolside is stumbling in either the capabilities race or the go-to-market race. They cannot afford either.
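Finally, the inference-time-compute point from the Scaling Laws chapter in code: a minimal best-of-n sketch, where `sample_candidate()` is a hypothetical stand-in for a stochastic model call and the verifier is a toy test runner — spending more samples (compute) at inference raises the odds that at least one candidate passes.

```python
# Sketch of inference-time scaling via best-of-n sampling: generate many
# candidates, keep the one a verifier scores highest. Purely illustrative.
import random

def sample_candidate() -> str:
    # Hypothetical stand-in for a stochastic model call on "add two numbers".
    body = random.choice(["a + b", "a - b", "a * b", "b - a"])
    return f"def solve(a, b):\n    return {body}"

def score(source: str, tests) -> float:
    # Toy verifier: execute the candidate and count passing tests.
    ns: dict = {}
    exec(source, ns)
    solve = ns["solve"]
    return sum(solve(*args) == expected for args, expected in tests) / len(tests)

tests = [((1, 2), 3), ((5, 5), 10)]
# More samples = more inference-time compute = better odds of a passing answer.
best = max((sample_candidate() for _ in range(16)), key=lambda c: score(c, tests))
print(best)
```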