
Google just supercharged its Search experience. With the launch of Gemini 2.5 Pro and Deep Search in AI Mode, the company is taking another step toward making conversational, intelligent search the norm for end-users.
Both features, accessed via AI Mode, reflect the company's gradual shift toward more capable, task-oriented search tools, aimed especially at users with complex queries.
But what exactly can users expect from this latest update? And why does it matter, especially for power users, researchers, or content creators who rely on Google every day?
Gemini Now Powers AI Mode in Search
Starting this week, Google AI Pro and Ultra subscribers can access Gemini 2.5 Pro and Deep Search in Search, giving them more reasoning power and better performance on complex queries. The AI Mode experience, introduced in May as part of Google's I/O announcements, is where this all comes together.
Gemini 2.5 Pro, Google’s most advanced public model to date, is now available via a drop-down in the AI Mode tab. The model is designed for advanced reasoning tasks and performs particularly well in math, coding, and technical problem-solving, making it ideal for more demanding queries. The default model in AI Mode remains available for general, fast assistance.
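AI Mode itself is a consumer feature with no public API, but the model behind it can be queried directly. The sketch below is a minimal illustration, assuming the `google-genai` Python SDK and a `GEMINI_API_KEY` environment variable (both assumptions about a developer setup, not part of the AI Mode rollout):

```python
# Minimal sketch: sending a reasoning-heavy prompt to Gemini 2.5 Pro via the
# Gemini API. This illustrates the underlying model, not AI Mode itself.
import os
from google import genai

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="Refactor a recursive Fibonacci function into an iterative one "
             "and explain the change in time complexity.",
)
print(response.text)
```

The same call with a lighter model name would mirror the choice AI Mode presents between its fast default and the more deliberate 2.5 Pro option.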
Where Gemini 2.5 Pro stands out is in its integration with Deep Search, an advanced tool that automates in-depth research. With a single query, it can run hundreds of searches, synthesise diverse information, and generate a fully cited response within minutes. It's especially useful for research-heavy tasks—from academic work and financial analysis to major life decisions like buying a home.
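Google has not published how Deep Search is implemented, but the fan-out-and-synthesise pattern it describes is easy to picture. The sketch below is purely illustrative; `expand_query`, `run_search`, and `synthesise` are hypothetical stand-ins for an LLM planner, a search backend, and a summarisation step, not Google's pipeline:

```python
# Illustrative sketch of a generic "fan out, then synthesise" research loop.
# All three helpers are hypothetical placeholders, not Google's Deep Search.
from concurrent.futures import ThreadPoolExecutor

def expand_query(question: str) -> list[str]:
    # A real system would ask an LLM to break the question into sub-queries.
    return [f"{question} (angle {i})" for i in range(1, 6)]

def run_search(sub_query: str) -> dict:
    # A real system would hit a search index and return ranked results.
    return {"query": sub_query,
            "snippet": f"Placeholder result for {sub_query!r}",
            "source": f"https://example.com/{abs(hash(sub_query)) % 1000}"}

def synthesise(question: str, results: list[dict]) -> str:
    # A real system would have an LLM write a summary that cites each source.
    sources = "\n".join(f"- {r['source']}" for r in results)
    return f"Answer to: {question}\nBased on {len(results)} results:\n{sources}"

question = "Is now a good time to buy a home in my area?"
with ThreadPoolExecutor() as pool:  # sub-queries are searched in parallel
    results = list(pool.map(run_search, expand_query(question)))
print(synthesise(question, results))
```

At Deep Search's scale the same loop would run hundreds of sub-queries rather than five, with the synthesis step producing the cited report described above.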
Agentic Features Come to Search
Another new capability Google is introducing is AI-powered calling—an agentic feature that allows Search to contact local businesses on a user’s behalf. For example, searching “pet groomers near me” may now surface a prompt: “Have AI check pricing.” Once submitted, the system makes calls and compiles service and pricing data from multiple businesses, streamlining the user’s decision-making process.
This functionality is now rolling out across the U.S. to all users, but with higher usage limits for Google AI Pro and Ultra subscribers. Importantly, businesses remain in control via their Business Profile settings.
The company says these additions are part of a broader goal: making AI Mode the default interaction layer for its search platform.
What AI Mode Offers That Regular Search Doesn’t
AI Mode goes beyond traditional keyword-based search. Available within Search Labs, it offers multimodal interaction—users can type, speak, or upload images. The responses evolve as users refine their queries, creating an experience that resembles a real-time conversation with Google itself.
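There is no developer surface for AI Mode itself, but the text-plus-image half of that multimodal pattern can be reproduced against the Gemini API. A rough sketch, assuming the `google-genai` SDK, an API key in the environment, and a local `chart.png` (all assumptions for illustration):

```python
# Sketch: combining an image and a text question in one prompt, mirroring the
# upload-an-image interaction AI Mode offers end users.
import os
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

with open("chart.png", "rb") as f:
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/png"),
        "What trend does this chart show, and what should I look into next?",
    ],
)
print(response.text)
```

Follow-up questions would typically go through a chat session so the model keeps earlier context, which is the conversational refinement described above.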
For researchers, developers, and power users, this means fewer tabs, fewer source comparisons, and significantly faster workflows. From code troubleshooting to long-form research, the potential efficiency gains are considerable.
Benchmarking Gemini 2.5 Pro
In internal evaluations, Google compared Gemini 2.5 Pro with leading models such as Claude 3.7 Sonnet, OpenAI's o3-mini, and DeepSeek R1. The results indicate that Gemini 2.5 Pro delivers consistently strong performance, particularly in areas like reasoning, code generation, mathematical problem-solving, and handling extended context. Performance does vary by task, but Gemini 2.5 Pro holds its own across most benchmark categories.
| Category | Benchmark | Gemini 2.5 Pro | Claude 3.7 Sonnet | OpenAI's o3-mini | DeepSeek R1 |
| --- | --- | --- | --- | --- | --- |
| Reasoning & General Knowledge | Humanity's Last Exam (no tools) | 18.8 per cent | 8.9 per cent | 14 per cent | 8.6 per cent |
| Math & Logic | AIME 2025 (pass@1) | 86.7 per cent | 49.5 per cent | 86.5 per cent | 70 per cent |
| Coding | Aider Polyglot (whole-file editing) | 74.0 per cent | 64.9 per cent | 60 per cent | 56.9 per cent |
| Long Context & Multimodal | MMMU (multimodal understanding; pass@1) | 81.7 per cent | 61 per cent | - | - |
Source: Google
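Taking the reported figures at face value, a few lines of Python make the per-benchmark gaps easier to read. The scores below are copied from the table, with `None` marking the cells left blank in the source:

```python
# Scores (per cent) as reported in the table above; None = not reported.
scores = {
    "Humanity's Last Exam (no tools)": {
        "Gemini 2.5 Pro": 18.8, "Claude 3.7 Sonnet": 8.9,
        "o3-mini": 14.0, "DeepSeek R1": 8.6},
    "AIME 2025 (pass@1)": {
        "Gemini 2.5 Pro": 86.7, "Claude 3.7 Sonnet": 49.5,
        "o3-mini": 86.5, "DeepSeek R1": 70.0},
    "Aider Polyglot (whole-file editing)": {
        "Gemini 2.5 Pro": 74.0, "Claude 3.7 Sonnet": 64.9,
        "o3-mini": 60.0, "DeepSeek R1": 56.9},
    "MMMU (pass@1)": {
        "Gemini 2.5 Pro": 81.7, "Claude 3.7 Sonnet": 61.0,
        "o3-mini": None, "DeepSeek R1": None},
}

for benchmark, results in scores.items():
    reported = {m: s for m, s in results.items() if s is not None}
    leader, best = max(reported.items(), key=lambda kv: kv[1])
    runner_up = sorted(reported.values())[-2]
    print(f"{benchmark}: {leader} leads at {best} per cent "
          f"(+{best - runner_up:.1f} points)")
```

Run as-is, this shows Gemini 2.5 Pro leading every row, with the narrowest margin (0.2 points) on AIME 2025 against o3-mini.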
How Gemini 2.5 Pro Measures Up on Benchmarks
Google’s Gemini 2.5 Pro has undergone a range of evaluations designed to test its capabilities across reasoning, factual knowledge, math, coding, and long-context comprehension. While results vary by domain, the model performs reliably across most categories, often placing ahead of its peers.
- Reasoning and knowledge tasks
Gemini 2.5 Pro performs particularly well on tests that assess broad knowledge and cognitive reasoning. On Humanity's Last Exam, a benchmark drawn from over 100 expert-level subjects, it scored 18.8 per cent, outpacing o3-mini (14 per cent) and finishing well ahead of DeepSeek R1, which came in under nine per cent. On GPQA Diamond, a set of graduate-level science questions designed to resist simple lookup, Gemini led with 84.0 per cent on the first attempt.
- Mathematics and logic
The model also demonstrates strong capabilities in structured problem-solving. On the AIME 2025 benchmark, Gemini 2.5 Pro scored 86.7 per cent, narrowly ahead of o3-mini at 86.5 per cent and comfortably clear of the rest of the field.
- Programming and code reasoning
In code-focused benchmarks, Gemini 2.5 Pro holds its own but doesn't always take the lead. On Aider Polyglot, a multilingual code-editing task, it posted a respectable 74.0 per cent. On SWE-bench Verified, which assesses its ability to reason across codebases and make agentic edits, it scored 63.8 per cent, ahead of several competitors but behind Claude 3.7 Sonnet's 70.3 per cent.
- Handling long contexts and multimodal inputs
Gemini 2.5 Pro performs most impressively in areas requiring extended comprehension and multimodal understanding. On the MRCR benchmark, which tests reading comprehension over a 128,000-token context window, it scored 91.5 per cent, far surpassing o3-mini (36.3 per cent) and GPT-4.5 (48.8 per cent). On MMMU, which combines text and visual reasoning, Gemini again led with a score of 81.7 per cent.
The Impact on Content Discovery
For content creators and SEO professionals, however, these changes raise important considerations. With AI-generated overviews often appearing before traditional links, users may increasingly get answers without visiting source websites. This introduces a potential threat to organic traffic, especially for sites that rely on search visibility.
Google maintains that AI Overviews and summaries can increase engagement by encouraging users to explore related topics. It also states it is actively working to highlight sources within AI responses to support attribution and traffic flow.
To support publisher monetisation, Google has also introduced a new tool called Offerwall, designed to open up alternative revenue streams through actions such as micropayments, newsletter sign-ups, or ad views. According to the company, publishers who enable Offerwall can give audiences several ways to access content: watching a short ad, completing a quick survey, or paying via micropayments, and publishers can add their own options, such as newsletter sign-ups. The goal is to let audiences choose how they access publishers' sites while helping ensure diverse content remains available to everyone. Still, early reactions are mixed, given the historical difficulty of scaling micropayment models for digital content.
A Smarter, More Agentic Search Experience
Google is shifting toward a more intelligent, agentic, and assistive search environment. While AI Mode is still in experimental stages, its expanding capabilities—from Gemini’s deep reasoning to agent-driven calls—suggest a significant evolution in how users will navigate the web.
For everyday users, these updates simplify search and expand what can be done without leaving the interface. For content creators and marketers, it signals the need to rethink how visibility and value are delivered in a rapidly changing search ecosystem.
As Google rolls out these features more widely, AI Mode is poised to become more than an experimental tool—it may well be the future of how we search.