Google turns Gemini into an agent platform as 3.5 Flash moves into search, code and enterprise

Breaking

CIO CISOBoardGoogleGeminiAI AgentsEnterprise AIAI InfrastructureAI GovernanceVendor Risk

Google turns Gemini into an agent platform as 3.5 Flash moves into search, code and enterprise

Joachim Høgby

19. mai 202619. mai 20266 min lesingKilde: Google

Del

LinkedIn X Facebook E-post WhatsApp Telegram

Google used I/O 2026 to move Gemini from model announcement to operating layer.

On Tuesday, the company introduced Gemini 3.5 Flash, the first model in a new 3.5 series. The short version is blunt: Google wants the same model family to power search, coding, enterprise agents and long-running workflows. This is not just another chatbot release. It is Google’s attempt to make Gemini the layer where work is planned, executed and governed.

That is why the launch matters beyond the benchmark table. Google is putting agent capabilities into three surfaces at once: the Gemini app and AI Mode in Google Search for consumers, Antigravity, the Gemini API, AI Studio and Android Studio for developers, and Gemini Enterprise Agent Platform plus Gemini Enterprise for companies.

When a model moves into search, coding tools and enterprise platforms on the same day, procurement changes. CIOs are no longer just choosing the model with the best answer. They need to decide where agents are allowed to work, which systems they can call, how actions are logged and who owns the risk when an agent takes the wrong step.

What Google launched

Gemini 3.5 Flash is available from May 19, according to Google. Gemini 3.5 Pro is already used internally and is expected next month.

Google describes Flash as a model for long-horizon agentic tasks. It is meant to plan, build and iterate across multiple steps. The company points to coding, application development, codebase maintenance, document work and financial workflows.

The numbers Google gives are aggressive. It says the model beats Gemini 3.1 Pro on demanding coding and agent benchmarks, including Terminal-Bench 2.1 at 76.2 percent, GDPval-AA at 1656 Elo and MCP Atlas at 83.6 percent. It also claims 84.2 percent on CharXiv Reasoning for multimodal understanding. Google says 3.5 Flash is four times faster than other frontier models when measured by output tokens per second, and often costs less than half as much for comparable work.

These are vendor claims, not independent audit results. But the direction matters. The model race is moving from answer quality to cost per completed workflow. That is a different market. It hits IT budgets, architecture and governance directly.

Agents move into the enterprise stack

The most important part is not that Google has a faster model. It is where Google is placing it.

In Antigravity, 3.5 Flash can use subagents to solve larger tasks in parallel. Google shows examples where the model categorizes unstructured assets, migrates legacy code to Next.js and builds interactive interfaces. On the enterprise side, Google names Shopify, Macquarie Bank, Salesforce, Ramp, Xero and Databricks as examples or partners.

Those examples show where the market is going. Shopify is using subagents to analyse merchant data. Macquarie Bank is piloting onboarding by reasoning over documents of more than 100 pages. Salesforce is integrating 3.5 Flash into Agentforce for more complex enterprise tasks with multiple subagents and tool calls. Databricks is using agentic workflows to retrieve real-time information, reason across large datasets, diagnose issues and propose fixes.

For leadership teams, this is a preview of the 2026 AI budget. Agent projects will no longer arrive only as demos in isolated chat windows. They will be bundled into search, cloud platforms, CRM, developer environments, data platforms and office tools.

That makes governance harder. Each agent needs identity, permissions, data boundaries, logging, approvals and a kill switch. Without that, the productivity case becomes a security and compliance problem with a clean interface.

Google is also showing the scale behind the pressure

In Sundar Pichai’s I/O post, Google frames the launch as part of a wider AI operating system. The company says it now processes more than 3.2 quadrillion tokens per month across its surfaces, up from roughly 480 trillion a year earlier. More than 8.5 million developers build with Google’s models every month. Its model APIs process about 19 billion tokens per minute. Google also says more than 375 Google Cloud customers each processed more than one trillion tokens over the last 12 months.

This is no longer small-scale AI. This is production volume.

Pichai also said Google expects about $180 billion to $190 billion in capex this year, compared with $31 billion in 2022. A key part is Google’s own TPU stack. The company says TPU 8t targets training, TPU 8i targets inference, and its training system can scale across more than one million TPUs globally.

For executives, that means AI can no longer be treated as a simple software licence. Model choice is tied to infrastructure, capacity, price, data location and supplier power. The same pattern is visible across the AI compute market this week: compute is becoming a strategic procurement category, not just a technical resource.

The board-level read

Google is trying to own the full chain: model, agent platform, developer tools, search, Workspace, cloud and custom silicon. That gives speed and integration. It also creates lock-in.

CIOs and boards should read Gemini 3.5 as a governance story, not a product update. Three decisions should move up the agenda.

First, which workflows can receive agent support without giving the agent direct authority over money, customers, security or regulated obligations?

Second, which systems may agents touch? MCP, APIs, document stores, code repositories and finance systems must be treated as privileged surfaces. Agent access without proper IAM, logging and audit trails is a shortcut to trouble.

Third, how will the business measure value? Google is selling lower cost and higher speed. Boards should require hard metrics: time saved, fewer errors, shorter cycle times, better quality and documented control. Not just more tokens.

Gemini 3.5 Flash may prove to be a strong model. The more important question is whether companies can build a control model around agents before agents become default features inside the tools employees already use.

Sources and media

Primary source: Google, “Gemini 3.5: frontier intelligence with action”, published May 19, 2026 at 17:45 UTC: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/
Background/context: Google, “I/O 2026: Welcome to the agentic Gemini era”, published May 19, 2026 at 17:45 UTC: https://blog.google/innovation-and-ai/sundar-pichai-io-2026/
Google’s benchmark, cost and usage figures are vendor claims and are not independently verified in this article.
Thumbnail: OpenAI Image 2 / hogby.ai.

📬 Likte du denne?

AI-nyheter for ledere. Kuratert av en CIO som bygger det selv. Daglig i innboksen.

Relaterte saker

Anthropic gjør Claude Opus 5 til ny toppmodell for agentarbeid

Breaking

AI-modellerAnthropicClaude

Anthropic gjør Claude Opus 5 til ny toppmodell for agentarbeid

Claude Opus 5 flytter Anthropic-kampen fra ren intelligens til styrbar kost, fart og sikkerhet i agentarbeid. Det er en tydelig CIO-sak, ikke bare en modellnyhet.

24. juli 20265 min lesing

Anthropic

Åpne saken

CIOCISOCTO

GitHub ruller Claude Opus 5 inn i Copilot for agentisk koding

Claude Opus 5 er tilgjengelig i GitHub Copilot for Pro+, Max, Business og Enterprise. GitHub fremhever agentiske kodeflyter, egenverifisering og strengere cyber-sperrer. For IT-ledere blir modellvalg i Copilot et spørsmål om styring, kostnad og sikkerhet – ikke bare autocomplete.

24. juli 20265 min lesing

GitHub

Åpne saken

AI-modellerGoogle AIGemini

Google gjør Gemini Flash raskere for agentarbeid

Google lanserer Gemini 3.6 Flash og 3.5 Flash-Lite med tydeligere fokus på hastighet, token-effektivitet og produksjonsklare AI-agenter.

24. juli 20264 min lesing

Google AI

Åpne saken