Google DeepMind is launching Gemini 2.5 Deep Think, a sophisticated new version of its Gemini model, designed to handle complex reasoning tasks by testing multiple ideas at once. According to Google, this is its most advanced publicly available AI system to date, capable of tackling intricate problems with a depth of logic and strategy rarely seen in consumer-facing models.
Gemini 2.5 Deep Think will be available starting Friday through the Gemini app, but only for subscribers to Google’s Ultra plan, which costs $250 per month. This high-performance model introduces a multi-agent approach, where several AI agents work simultaneously on a single query. While this method consumes more computing power than single-agent models, it often leads to more accurate and insightful results.
First introduced at Google I/O in May 2025, Gemini 2.5 Deep Think is Google’s first publicly available multi-agent model. Unlike traditional models that work through a problem sequentially, Deep Think evaluates several possible reasoning paths in parallel, which helps it reach conclusions that account for a broader context.
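As a rough illustration of what exploring reasoning paths in parallel can mean, here is a toy Python sketch. It is not Google’s implementation, which has not been published in detail; the function names, the hill-climbing "path," and the scoring rule are all hypothetical and chosen only to keep the example small. Each path starts from a different guess, works with a limited step budget, and the best-scoring candidate wins.

```python
from concurrent.futures import ThreadPoolExecutor


def explore_path(start, target, steps=10):
    """One hypothetical 'reasoning path': step from a starting guess
    toward the integer whose square is closest to target, within a
    limited budget of steps."""
    guess = start
    for _ in range(steps):
        best = min((guess - 1, guess, guess + 1),
                   key=lambda g: abs(g * g - target))
        if best == guess:  # no neighbor improves on the current guess
            break
        guess = best
    return guess


def score(candidate, target):
    """Hypothetical verifier: higher is better, 0 means an exact match."""
    return -abs(candidate * candidate - target)


def solve_in_parallel(target, starts):
    """Run one path per starting guess concurrently, then keep the
    candidate that the verifier scores highest."""
    with ThreadPoolExecutor(max_workers=len(starts)) as pool:
        candidates = list(pool.map(lambda s: explore_path(s, target), starts))
    return max(candidates, key=lambda c: score(c, target))


if __name__ == "__main__":
    # Three paths with different starting points; only one of them
    # reaches the exact answer within its step budget.
    print(solve_in_parallel(144, [1, 5, 40]))
```

The trade-off mirrors the one described in the article: running three paths costs roughly three times the compute of one, but the final answer no longer depends on a single starting point being a good one.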
A variation of this model recently earned Google a gold medal at the International Mathematical Olympiad (IMO), a significant achievement in AI-powered problem solving. To support research, Google is also releasing that IMO variant to select academic institutions and researchers. Unlike most commercial AI models, which respond in seconds, this specialized version may take hours to deliver an answer, a sign of how much reasoning it performs.
Google emphasizes that the current Deep Think model surpasses the one showcased at I/O earlier this year. The company claims to have implemented novel reinforcement learning strategies that enable the AI to make more efficient use of its multiple reasoning paths. In a statement, Google noted that Deep Think is particularly suited to tasks involving creativity, long-term planning, and iterative problem solving.
The model has already achieved impressive results in industry benchmarks. On the rigorous Humanity’s Last Exam (HLE)—a wide-ranging assessment covering math, science, and the humanities—Gemini 2.5 Deep Think scored 34.8% without using any external tools. That places it ahead of Grok 4 from Elon Musk’s xAI, which scored 25.4%, and OpenAI’s o3, which managed 20.3%.
On the competitive coding benchmark LiveCodeBench v6, Google’s new model also leads the pack with a score of 87.6%. For comparison, Grok 4 scored 79%, and OpenAI’s o3 scored 72%. Google attributes part of this performance to the model’s ability to work with tools such as code execution and Google Search, which lets it handle longer and more complex tasks than traditional AI models.
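For readers who want the gaps in percentage points, the short Python sketch below computes Deep Think’s lead over each competitor using only the scores quoted above. The dictionary layout and the `margins` helper are illustrative assumptions, not part of any benchmark tooling.

```python
# Scores as reported in this article, in percent.
HLE = {"Gemini 2.5 Deep Think": 34.8, "Grok 4": 25.4, "OpenAI o3": 20.3}
LIVECODEBENCH = {"Gemini 2.5 Deep Think": 87.6, "Grok 4": 79.0, "OpenAI o3": 72.0}


def margins(scores, leader="Gemini 2.5 Deep Think"):
    """Return the leader's advantage over every other model,
    in percentage points, rounded to one decimal place."""
    return {
        name: round(scores[leader] - value, 1)
        for name, value in scores.items()
        if name != leader
    }


if __name__ == "__main__":
    print("HLE:", margins(HLE))
    print("LiveCodeBench:", margins(LIVECODEBENCH))
```

On these figures, Deep Think leads Grok 4 by 9.4 points and o3 by 14.5 points on HLE, and by 8.6 and 15.6 points respectively on the coding benchmark.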
Beyond performance, Google highlights that Deep Think can generate more comprehensive and visually refined outputs for tasks like web development. The company suggests the model could be a valuable tool for academic research, product design, and even scientific discovery.
The move toward multi-agent systems is gaining traction across the AI landscape. Elon Musk’s xAI has recently introduced Grok 4 Heavy, a powerful multi-agent system with top-tier benchmark results. OpenAI, while not having released its own multi-agent system to the public, reportedly used one to win a gold medal at this year’s IMO. Anthropic has also entered the field with its Research agent, designed to produce in-depth research summaries using a similar approach.
However, the complexity and computational cost of running these systems mean that companies are currently limiting access to premium tiers. Both Google and xAI have locked their most capable models behind their most expensive subscriptions, a trend that could shape how widely available this kind of advanced AI will be in the near future.
In the coming weeks, Google plans to open up Gemini 2.5 Deep Think to a small group of developers and enterprises through the Gemini API. The goal is to gather insights into how different industries and developers might leverage the multi-agent architecture for real-world applications.
With Gemini 2.5 Deep Think, Google is not just pushing the boundaries of AI performance—it’s also signaling a broader shift toward more thoughtful, collaborative AI models that mimic the way humans approach complex challenges. Whether these systems will remain tools for the elite or become more widely accessible remains to be seen.