
AlphaGo Architect Bets $1B That LLMs Cannot Reach AGI

David Silver leaves DeepMind to build Ineffable Intelligence, a startup pursuing superintelligence through reinforcement learning instead of language models.

Tags: Reinforcement Learning · AGI · DeepMind · AI Research

David Silver, one of the most influential AI researchers of the past decade, has left Google DeepMind to found his own startup. Ineffable Intelligence, based in London, is reportedly raising $1 billion in what would be Europe's largest seed round ever. The company's thesis is provocative: large language models cannot achieve superintelligence. To get there, we need reinforcement learning.

This is not a contrarian bet from someone on the sidelines. Silver was a founding researcher at DeepMind in 2010 and the principal architect behind AlphaGo, AlphaZero, MuZero, and AlphaStar. He also contributed to Google's Gemini models. When someone with this track record makes a major strategic bet against the dominant paradigm, it deserves serious attention.

David Silver presenting on reinforcement learning

The Case Against Language Models for AGI

Silver has been publicly critical of the current LLM trajectory. In an April 2025 podcast, he argued that language models remain "constrained by human knowledge" because they train on human-generated text and optimize for human feedback. The result is systems that can interpolate human knowledge extremely well but cannot fundamentally transcend it.

The limitation is architectural. LLMs learn patterns from static datasets representing what humans have already written. Even with RLHF (reinforcement learning from human feedback), the optimization target remains human preferences. Silver contends that achieving superintelligence requires AI systems that can independently discover novel concepts, strategies, and knowledge that humans have never conceived.

This is not a theoretical argument. AlphaGo demonstrated it empirically. The system made moves that professional Go players initially dismissed as mistakes, moves that turned out to be brilliant innovations no human had discovered in thousands of years of play. That capability emerged from reinforcement learning in a simulated environment, not from training on historical game records.

Reinforcement Learning as the Path Forward

Ineffable Intelligence will focus on what Silver calls "endlessly learning superintelligence that self-discovers the foundations of all knowledge." The technical approach centers on several key components:

Learning through experience: Rather than training on static datasets, AI systems interact with environments through trial and error. They develop strategies by experiencing consequences, not by memorizing patterns from human data.

World models: The systems build internal simulations that allow them to predict the consequences of actions before taking them. This enables planning and reasoning that extends beyond pattern matching.

Continuous adaptation: Unlike current models that are static after training, these systems continue learning over months and years, similar to how humans and animals develop expertise through ongoing experience.
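The first of these components, learning through experienced consequences rather than human-labeled data, can be illustrated with a classic tabular Q-learning loop. This is a toy sketch, not Ineffable's actual system; every name here (`N_CELLS`, `q_table`, the corridor environment) is invented for illustration. The agent starts at the left end of a corridor and discovers, purely by trial and error, that walking right reaches a reward.

```python
import random

N_CELLS = 5          # states 0..4; state 4 is the goal
ACTIONS = (-1, +1)   # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

# Value estimates for every (state, action) pair, learned from experience.
q_table = {(s, a): 0.0 for s in range(N_CELLS) for a in ACTIONS}

def step(state, action):
    """Environment dynamics: move, clamp to the corridor, reward at the goal."""
    nxt = max(0, min(N_CELLS - 1, state + action))
    reward = 1.0 if nxt == N_CELLS - 1 else 0.0
    return nxt, reward

def choose(state, rng):
    """Epsilon-greedy: mostly exploit current estimates, occasionally explore."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])

rng = random.Random(0)
for _ in range(500):                 # episodes of trial and error
    state = 0
    while state != N_CELLS - 1:
        action = choose(state, rng)
        nxt, reward = step(state, action)
        # Update from the experienced consequence, not from any human data.
        best_next = max(q_table[(nxt, a)] for a in ACTIONS)
        q_table[(state, action)] += ALPHA * (
            reward + GAMMA * best_next - q_table[(state, action)]
        )
        state = nxt

# After training, the greedy policy walks right toward the goal in every state.
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_CELLS - 1)}
print(policy)  # → {0: 1, 1: 1, 2: 1, 3: 1}
```

Nothing in this loop references what a human would do; the policy emerges entirely from interaction, which is the property Silver argues scales beyond human knowledge. World models and continuous adaptation extend the same idea: the agent additionally learns to simulate `step` internally and keeps updating long after deployment.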

Silver and his longtime collaborator Richard Sutton have argued that experience will eventually become the dominant source of AI improvement, surpassing human-generated data. The reasoning is straightforward: human data is finite and represents human limitations, while simulated experience can be generated infinitely and explore beyond human capabilities.

The Funding and Team

The $1 billion seed round, led by Sequoia Capital, would value Ineffable Intelligence at approximately $4 billion. Nvidia, Google, and Microsoft are reportedly in discussions to participate. For context, this would represent the largest seed investment in European AI history.

Silver holds a professorship at University College London alongside his role as CEO. His departure from DeepMind was supported by CEO Demis Hassabis, who publicly endorsed the venture. This suggests the split was amicable and may reflect deliberate bet-hedging across the AI research community.

The timing aligns with a broader trend of senior AI researchers leaving established labs to pursue alternative approaches. Ilya Sutskever left OpenAI to found Safe Superintelligence Inc. Jerry Tworek departed to explore different architectures. Yann LeCun is raising 500 million euros for AMI Labs, focused on world models.

What This Means for AI Practitioners

The debate between language model scaling and reinforcement learning approaches has significant implications for how we think about AI development and deployment.

Near-term versus long-term capabilities: LLMs will likely remain the dominant paradigm for practical applications over the next several years. The infrastructure, tooling, and integration patterns are mature. Organizations should continue building on this foundation while monitoring alternative approaches.

Hybrid architectures are already emerging: Current frontier models already incorporate reinforcement learning for capabilities like reasoning and tool use. The question is not whether RL will matter but how much weight it receives in future architectures. OpenAI's o3 series and Google's Gemini 3.1 Pro both use reinforcement learning to improve reasoning.

Talent signals matter: When researchers of Silver's caliber make bold moves, it indicates where the cutting edge of the field is heading. The concentration of talent in reinforcement learning and world model startups suggests these approaches will receive significant resources and attention.

Regional implications: Ineffable Intelligence choosing London reflects the UK's strengths in fundamental AI research. For the Gulf region, this reinforces the importance of developing local research talent and creating conditions that attract global researchers. Abu Dhabi's investments in AI research institutions position the UAE well to participate in these developments.

Looking Forward

Ineffable Intelligence will take years to validate its thesis. Building reinforcement learning systems that generalize across domains remains an unsolved challenge. The environments and reward structures that enabled AlphaGo's success do not translate directly to open-ended real-world problems.

However, the company's existence forces a productive tension in the field. If Silver is right, the current LLM scaling trajectory will hit fundamental limitations before reaching AGI. If he is wrong, his approach will still likely produce valuable insights and capabilities that feed back into hybrid architectures.

For those of us building AI applications today, the practical lesson is hedging. Architect systems that can incorporate different model types and reasoning approaches. Build abstraction layers that allow swapping underlying models as capabilities evolve. The foundation model landscape is fragmenting, and flexibility will be a competitive advantage.
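One way to build in that flexibility is a thin routing layer so application code never depends on a specific provider or paradigm. The sketch below is hypothetical: the names (`ChatModel`, `ModelRouter`, `EchoBackend`, `complete`) are invented for illustration, and a real adapter would wrap whichever SDKs you actually use, whether an LLM API or, someday, an RL-based planner.

```python
from typing import Optional, Protocol

class ChatModel(Protocol):
    """The only interface application code is allowed to depend on."""
    def complete(self, prompt: str) -> str: ...

class EchoBackend:
    """Stand-in backend; a real one would call a provider's SDK."""
    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"

class ModelRouter:
    """Swap or add backends without touching any calling code."""
    def __init__(self, backends: dict[str, ChatModel], default: str) -> None:
        self._backends = backends
        self._default = default

    def complete(self, prompt: str, model: Optional[str] = None) -> str:
        return self._backends[model or self._default].complete(prompt)

router = ModelRouter(
    {"fast": EchoBackend("fast"), "reasoning": EchoBackend("reasoning")},
    default="fast",
)
print(router.complete("summarize this report"))            # default backend
print(router.complete("plan a schedule", "reasoning"))     # explicit backend
```

The design choice is the `Protocol`: because backends are matched structurally rather than by inheritance, a future model type only needs a `complete` method to slot in, which is exactly the hedge the paragraph above recommends.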

The AlphaGo moment in 2016 reshaped how the world understood AI's potential. Whether Ineffable Intelligence produces a similar breakthrough remains to be seen, but the attempt itself signals that the frontier of AI research extends well beyond scaling language models.

