- What has changed: Google Research redefines text-to-SQL
- The BIRD benchmark and the significance of an accuracy score of 80%
- Immediate impact for Italian SMEs without a technical team
- How does it integrate with the Google Cloud ecosystem?
- The competitive landscape: OpenAI and Anthropic are falling behind on this specific task
- What to do now: three operational considerations for SMEs
- What the press releases don't say: real limitations to consider
- Prospects: where will this technology lead in 2027-2028?
Google Research has presented Gemini-SQL2, a system based on Gemini 3.1 Pro that converts natural language into executable SQL queries. The model achieved 80.04% accuracy on the BIRD benchmark, clearly surpassing equivalent systems from OpenAI and Anthropic. This is a significant result for text-to-SQL.
The practical implications are significant. SMEs that manage business databases could query their data without knowing SQL. Google has also stated its intention to integrate this technology into its data services, such as BigQuery and Looker. Broader, less technical access to data analysis could therefore become a reality in the short term.
We at SHM Studio are closely monitoring these developments. In particular, we are evaluating how tools of this type can be integrated into the workflows of Italian B2B and retail SMEs. It is important to understand not only what Gemini-SQL2 does, but also what it means in practice for those who manage data without a dedicated technical team. This article offers an operational and strategic reading of the change under way.
What has changed: Google Research redefines text-to-SQL
On June 13, 2026, Google Research announced Gemini-SQL2, an advanced text-to-SQL conversion system. The model is built on Gemini 3.1 Pro and represents a qualitative leap over previous generations. According to The Decoder, Gemini-SQL2 reached 80.04% accuracy on the BIRD benchmark, the most widely used in the industry for evaluating text-to-SQL systems.
This result clearly outperforms competing models from OpenAI and Anthropic, and it positions Google as a technical leader in this specific domain of AI applied to data. The BIRD benchmark measures a system's ability to generate correct SQL queries from natural language questions on real, complex databases.
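To make the idea of a "correct" query more tangible, here is a minimal sketch of an execution-match score: a generated query counts as correct when it returns the same rows as a reference query. This is a simplified illustration and not BIRD's official evaluator; the `orders` table, the sample questions, and all function names are assumptions made up for the example.

```python
import sqlite3

def run_query(conn, sql):
    """Return the query's rows as a sorted list, or None if the SQL fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    # Sort by repr so that row order does not affect the comparison.
    return sorted(rows, key=repr)

def execution_match(conn, predicted_sql, reference_sql):
    """True when the predicted query runs and returns the reference rows."""
    predicted = run_query(conn, predicted_sql)
    reference = run_query(conn, reference_sql)
    return predicted is not None and predicted == reference

def execution_accuracy(conn, pairs):
    """Share of (predicted, reference) pairs whose results match."""
    matches = sum(execution_match(conn, p, r) for p, r in pairs)
    return matches / len(pairs)

# Toy database (hypothetical schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL);
    INSERT INTO orders (customer, total) VALUES
        ('Rossi', 120.0), ('Bianchi', 80.0), ('Rossi', 45.5);
""")

pairs = [
    # Different SQL text, same result: counts as correct.
    ("SELECT COUNT(*) FROM orders WHERE customer = 'Rossi'",
     "SELECT COUNT(id) FROM orders WHERE customer = 'Rossi'"),
    # Runs, but answers a different question: counts as wrong.
    ("SELECT SUM(total) FROM orders",
     "SELECT SUM(total) FROM orders WHERE customer = 'Rossi'"),
    # Refers to a column that does not exist: counts as wrong.
    ("SELECT SUM(amount) FROM orders",
     "SELECT SUM(total) FROM orders"),
]
score = execution_accuracy(conn, pairs)
print(f"{score:.2f}")
```

The second pair is the case that matters most in practice: the query runs without errors and returns a plausible number, yet it answers the wrong question.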
Additionally, Google has stated that the technology behind Gemini-SQL2 will be integrated into its existing data services. These include BigQuery, Looker, and other tools within the Google Cloud suite. Consequently, companies already utilizing the Google ecosystem could benefit from these capabilities without adopting new tools.
The BIRD benchmark and the significance of an accuracy score of 80%
BIRD (Big Bench for Large-scale Database Grounded Text-to-SQL Evaluation) is the industry's reference benchmark. It evaluates models on real databases with complex schemas and ambiguous questions. Achieving 80% on this test is no small feat.
In fact, previous systems performed significantly worse. According to published data, the gap between Gemini-SQL2 and its competitors is more than a few percentage points. However, it is important to put this into context: an error rate of roughly 20% means that about one generated query in five may still be incorrect or incomplete in production environments. Human supervision therefore remains necessary in critical scenarios.
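One way to keep a generated query from doing damage while a person reviews its output is to run it through a read-only guard. The sketch below is a minimal illustration on SQLite, not a complete defense: it accepts only statements that look like reads and relies on SQLite's `PRAGMA query_only` to refuse writes. The `customers` table and the `run_generated_query` helper are hypothetical; on other databases, a read-only user or role would typically play the same part.

```python
import sqlite3

class QueryRejected(Exception):
    """Raised when a generated query is not allowed to run."""

def run_generated_query(conn, sql, max_rows=100):
    """Run a generated query in read-only mode and cap the rows returned."""
    words = sql.split()
    first_word = words[0].upper() if words else ""
    # First check: accept only statements that start like a read.
    if first_word not in {"SELECT", "WITH"}:
        raise QueryRejected(f"only read queries are allowed, got: {first_word or 'empty'}")
    # Second check: SQLite itself refuses writes while query_only is on.
    conn.execute("PRAGMA query_only = ON")
    try:
        return conn.execute(sql).fetchmany(max_rows)
    except sqlite3.Error as exc:
        raise QueryRejected(str(exc)) from exc

# Toy database (hypothetical schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    INSERT INTO customers (name) VALUES ('Rossi'), ('Bianchi');
""")

rows = run_generated_query(conn, "SELECT name FROM customers ORDER BY name")
print(rows)

try:
    run_generated_query(conn, "DELETE FROM customers")
except QueryRejected as exc:
    print("rejected:", exc)
```

A guard like this only limits side effects. It does not tell you whether a query that runs has answered the right question, which is where human review still comes in.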
For more on how AI benchmarks work, see the analyses published by MIT Technology Review, which has repeatedly covered the evaluation of language models on structured tasks.
Immediate impact for Italian SMEs without a technical team
For many Italian B2B and retail SMEs, access to company data is still mediated by technical personnel. A sales manager who wants to know which clients made two purchases in the last quarter has to wait for a developer to write the query. This slows down decisions.
Gemini-SQL2 could change this dynamic. A system that converts questions in Italian or English directly into executable SQL significantly lowers the technical barrier. Furthermore, the 80% accuracy on the BIRD benchmark suggests that the system performs well even on databases with non-trivial structures.
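To give a sense of what stands between the sales manager's question and the answer, here is the kind of SQL a developer writes today, and a text-to-SQL system would have to generate, for "which clients made two purchases in the last quarter". It is a sketch under stated assumptions: the `customers` and `orders` tables, the sample data, and the quarter dates are invented, and "two purchases" is read as "at least two".

```python
import sqlite3

# Toy database (hypothetical schema and data, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        order_date TEXT NOT NULL  -- ISO date, e.g. '2026-02-14'
    );
    INSERT INTO customers (id, name) VALUES
        (1, 'Rossi Srl'), (2, 'Bianchi SpA'), (3, 'Verdi Snc');
    INSERT INTO orders (customer_id, order_date) VALUES
        (1, '2026-01-10'), (1, '2026-03-02'),  -- two orders in the quarter
        (2, '2026-02-20'),                     -- one order in the quarter
        (3, '2025-12-15'), (3, '2026-01-05');  -- only one falls in the quarter
""")

# Customers with at least two orders between the two dates.
QUERY = """
    SELECT c.name, COUNT(*) AS purchases
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    WHERE o.order_date BETWEEN ? AND ?
    GROUP BY c.id, c.name
    HAVING COUNT(*) >= 2
    ORDER BY c.name
"""
rows = conn.execute(QUERY, ("2026-01-01", "2026-03-31")).fetchall()
print(rows)  # [('Rossi Srl', 2)]
```

Even this small case needs a join, a date filter, and a grouped count, and it hides a choice between "exactly two" and "at least two" that the person asking may not realize they are making.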
Tools like these fit the trends in AI applied to business that we at SHM Studio follow for our clients. In particular, automating data access is one of the most concrete and high-return use cases for medium-sized companies. For this reason, it is worth considering how this technology fits into existing operational flows.
How does it integrate with the Google Cloud ecosystem?
Google has announced that Gemini-SQL2 will be deployed within its cloud services. BigQuery, Google Cloud's data warehouse, is the most obvious candidate for an initial integration. Looker, the business intelligence platform acquired by Google, could benefit even more directly.
Consequently, SMEs that have already invested in the Google ecosystem could access these features through gradual updates of existing services. However, commercial release dates have not yet been officially announced. Therefore, it is premature to plan operational integrations in the short term without further announcements.
For those managing campaigns and data on Google platforms, it is also worth monitoring developments related to Google Ads and its automated reporting features, which could indirectly benefit from this technology. Likewise, anyone working in data-driven digital marketing will want to follow the evolution of these tools.
The competitive landscape: OpenAI and Anthropic are falling behind on this specific task
The result of Gemini-SQL2 is particularly significant because it arrives at a time when OpenAI and Anthropic dominate public perception in the AI industry. However, the BIRD benchmark tells a different story in this specific domain.
OpenAI's models, including GPT-4o and more recent versions, and Anthropic's models such as Claude 3.7 have not reached comparable levels on structured text-to-SQL. Google thus demonstrates that vertical specialization on a specific task can produce measurable competitive advantages. This is a relevant strategic signal for the market.
According to Gartner's analysis of technology trends, the specialization of AI models for vertical domains is one of the most promising directions for 2026-2027. Gemini-SQL2 thus fits into a broader trajectory of technical differentiation among major players.
What to do now: three operational considerations for SMEs
First, it's useful to map existing company databases and identify which queries are most frequently requested from technical staff. This exercise allows for an estimation of the potential time savings a text-to-SQL tool could generate.
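The estimate described above can start as simple arithmetic. The sketch below assumes you have logged which data requests recur, how often, and how long each takes a developer today. Every figure in it is a made-up placeholder, and `automatable_share` is an assumption to adjust, not a measured value.

```python
from dataclasses import dataclass

@dataclass
class RecurringRequest:
    name: str
    requests_per_month: int
    minutes_per_request: float  # developer time to write and run the query today

def estimated_hours_saved(requests, automatable_share):
    """Monthly developer hours freed if a share of requests no longer needs a developer."""
    total_minutes = sum(r.requests_per_month * r.minutes_per_request for r in requests)
    return total_minutes * automatable_share / 60

# Placeholder figures: replace them with your own request log.
requests = [
    RecurringRequest("customers with repeat purchases", 8, 30),
    RecurringRequest("monthly revenue by product line", 4, 45),
    RecurringRequest("open quotes older than 30 days", 12, 15),
]

# Assume, conservatively, that half of these requests could be self-served.
hours = estimated_hours_saved(requests, automatable_share=0.5)
print(f"Estimated developer time saved: {hours:.1f} hours per month")
```

With these placeholder numbers the requests add up to 600 minutes a month, so a 50% share corresponds to 5 hours. The value of the exercise is less the number itself than the ranked list of requests it forces you to write down.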
Second, it is worth assessing whether the company's data infrastructure is already on Google Cloud or whether there are migration plans. Integrating Gemini-SQL2 will likely be smoother for those already operating within the Google ecosystem. For those who use on-premise databases or other cloud providers, accessing this technology may require intermediate steps.
Finally, it is advisable not to wait for the commercial release to start structuring your data in a more accessible way. Good data architecture is a prerequisite for any natural language querying tool. We at SHM Studio can support SMEs in this assessment and planning phase through our AI consulting and data-driven digital marketing services.
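As one example of what "more accessible" can mean, a view with descriptive column names can sit on top of a legacy table without changing it. This is a minimal sketch on SQLite. The cryptic table `t_ord`, its columns, and the assumption that amounts are in euros are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Legacy table with cryptic names (hypothetical).
    CREATE TABLE t_ord (id INTEGER PRIMARY KEY, cli_cd TEXT, dt_ord TEXT, imp_tot REAL);
    INSERT INTO t_ord (cli_cd, dt_ord, imp_tot) VALUES
        ('C001', '2026-02-14', 250.0), ('C002', '2026-02-15', 90.0);

    -- Readable layer for people and for natural-language tools.
    CREATE VIEW orders AS
    SELECT id      AS order_id,
           cli_cd  AS customer_code,
           dt_ord  AS order_date,
           imp_tot AS total_amount_eur
    FROM t_ord;
""")

# The view exposes self-explanatory column names.
columns = [d[0] for d in conn.execute("SELECT * FROM orders").description]
print(columns)

# Queries against the view read close to the business question.
total = conn.execute("SELECT SUM(total_amount_eur) FROM orders").fetchone()[0]
print(total)
```

The underlying table stays as it is, so existing applications keep working while questions about "orders" and "total amount" map onto names a tool can plausibly match.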
What the press releases don't say: real limitations to consider
Benchmarks are useful, but they don’t tell the whole story. The 80.04% accuracy score on BIRD is a result obtained in a controlled environment. In real-world business contexts, databases have non-standard column names, implicit relationships, and undocumented internal conventions. Therefore, actual performance may be lower than that measured in the lab.
Furthermore, the quality of the generated queries depends heavily on the quality of the question asked. A non-technical user might formulate ambiguous requests, generating imprecise results even with an excellent model. Therefore, end-user training remains a critical element for the success of these tools.
Despite this, the direction is clear. The gap between natural language and structured data is rapidly shrinking. For Italian SMEs, this means that investing today in the quality and organization of their data, through services such as data-driven web development or SEO strategies based on structured analytics, is a strategic choice with an increasingly short return horizon.
Prospects: where will this technology lead in 2027-2028?
In the short term, the integration of Gemini-SQL2 into Google Cloud products represents the most likely evolution. In the medium term, similar functionalities are expected to become standard across major business intelligence platforms, from Tableau to Power BI.
Therefore, by 2027-2028, natural language querying of enterprise databases could become a foundational feature, no longer a differentiator. Consequently, the competitive advantage will shift from the ability to access data to the ability to interpret and act on it with speed.
To explore the topic of AI applied to Italian SMEs further, see the SHM Studio Blog, where we regularly publish analyses on these topics. Those who want to discuss how to integrate AI tools into their business processes can contact us directly. For those who manage digital content and communication, our SEO copywriting and LinkedIn campaign services can help communicate these changes to clients and stakeholders.