The cheapest model in the lineup is now the agent.
Google opened I/O 2026 on 19 May with a sequencing decision rather than a flagship reveal. Gemini 3.5 Flash shipped generally available the same day, the first member of the Gemini 3.5 family, which Google describes as "frontier intelligence with action" built to "reliably execute complex, multi-step agentic workflows across your apps". And it did not arrive as an option buried in a model picker. Flash became the default model in the Gemini app and AI Mode in Search worldwide, Australia included, with simultaneous availability in Google Antigravity, the Gemini API in Google AI Studio and Android Studio, and the Gemini Enterprise Agent Platform (9to5Google).
What happened
The flagship is the absence. Gemini 3.5 Pro was not released. Google says it is "already being used internally" and expected to roll out next month, positioned as the orchestrator and planner working in tandem with Flash, which handles the sub-agent tasks. Two stablemates rounded out the agent story. Gemini Omni, announced alongside the 3.5 family, generates video from any combination of image, audio, video and text input. Gemini Spark, a personal AI agent built on 3.5 Flash, reached trusted testers on launch day, with a beta for Google AI Ultra subscribers in the US the following week.
The capability claims are unusually specific for a fast tier. Google's published numbers put Flash at 76.2 per cent on Terminal-Bench 2.1, 1656 Elo on GDPval-AA, 83.6 per cent on MCP Atlas and 84.2 per cent on CharXiv multimodal reasoning, and Google claims Flash beats the previous flagship, Gemini 3.1 Pro, on nearly all benchmarks, including coding and agentic tasks. TechCrunch reports the model produces output tokens roughly 4 times faster than other frontier models, with an optimised variant reportedly running 12 times faster. The same coverage describes Flash running autonomously for multiple hours, pausing only for decisions requiring human judgment, and at one point building an operating system independently.
Then the price. Reported API pricing lands at US$1.50 per million input tokens and US$9 per million output tokens, with cache hits at US$0.15 per million, which Google frames as "less than half the cost of comparable models" for long-horizon agentic work. Coverage of the API documentation puts the context window at 1,048,576 input tokens and 65,536 output tokens, with a January 2026 knowledge cutoff. Google also states the model was developed under its Frontier Safety Framework with strengthened cyber and CBRN (chemical, biological, radiological and nuclear) safeguards and new interpretability tooling.
What it actually means
The sequencing is the story. Frontier launches normally run top down. The flagship demos the future, then cheaper distilled tiers trickle out for production work. Google inverted it. The executor shipped first, worldwide, as the default. The planner ships next month. That order only makes sense if the product is no longer a single model but an architecture: an orchestrator that decomposes work, plus a fleet of fast, cheap sub-agents that execute it. Google has said as much by positioning Pro as the orchestrator working in tandem with Flash on sub-agent tasks. Orchestrator-plus-subagent systems have spent the past year as conference demos and developer-framework diagrams. As of 19 May they are the shipping shape of Google's consumer AI surface.
The benchmark and pricing evidence supports the substitution claim better than the usual fast-tier marketing does. A model at US$1.50 per million input tokens that matches or beats last generation's flagship on coding and agentic benchmarks does not just lower costs. It changes which workloads are economically viable as agents at all. Long-horizon agentic work is token-hungry by construction: a loop that plans, reads, retries and verifies multiplies token consumption well beyond a single completion. At cache-hit pricing of US$0.15 per million, the repetitive scaffolding inside those loops becomes nearly free. The output speed compounds the effect, because slow agents fail in a distinctive way: people stop waiting for them and do the task themselves.
The third meaning is distribution. The most consequential agentic deployment of 2026 so far did not happen through a procurement cycle, a pilot program or a migration project. It happened through a server-side default flip. Capability announcements get scrutiny. Default changes get release notes. That asymmetry is now a governance problem, and it lands hardest in the section below.
Who should care and why
Managers inherit the immediate version. The Gemini app a team member used on Friday answered questions. The one they use on Monday is built to execute multi-step workflows across apps. Nothing in the interface forces a conversation about which tasks are now being delegated rather than drafted. If your team norms only cover "checking AI outputs", they predate the tool your team now has.
Governance, Risk and Compliance (GRC) professionals have an inventory integrity problem. Most AI registers record tools and vendors. Far fewer record which model sits behind each tool, and almost none would capture a same-day worldwide default change to an agentic model. The risk profile of "staff use the Gemini app" changed materially on 19 May while the register entry stayed identical.
HR teams should note the policy drafting gap. Acceptable-use policies written for generative AI typically govern text: what staff may enter, how outputs are reviewed. An agent that can run for hours and act across applications is governed by neither clause. The question is no longer only what the model wrote. It is what the model did.
Workers Compensation (WC) and claims professionals get the compounding version of the privacy rule. De-identification obligations do not change because the model became more capable. But an agentic default raises the stakes of casual use: a model that can act on uploaded content is a worse place for an unredacted document than a model that can only summarise it. De-identify before anything touches a consumer tool. No exceptions.
The Australian angle
Because 3.5 Flash became the worldwide default in the Gemini app and AI Mode in Search, Australian workers received a substantially more capable, more autonomous model overnight with no procurement decision, no pilot and no policy update. That is the textbook shadow-AI scenario the Voluntary AI Safety Standard's guardrails on AI inventories and ongoing monitoring are designed to catch. An inventory entry reading "Gemini app, consumer, low risk" was arguably accurate on 18 May and arguably wrong on 19 May, and no internal change process fired in between.
Gemini Spark sharpens the point from the other direction. The personal agent built on 3.5 Flash rolled out to trusted testers on 19 May with a US-only beta the following week. Australia sits one rollout ring behind on agent features even when the underlying model is already the local default. Expect that pattern to repeat: the model arrives everywhere at once, the agent products arrive in rings, and the gap between the two is where unsanctioned workarounds breed.
For Workspace-heavy Australian organisations, the concrete review item is the Gemini Enterprise Agent Platform. Google Cloud's stated defaults are the right starting questions: agent activity runs within a secure cloud boundary by default, data remains under customer control, and the platform integrates with existing connectors including Microsoft SharePoint, OneDrive and ServiceNow. Connectors are the convenience and the exposure in a single feature. An agent with a ServiceNow connector is an agent that can raise, modify or close tickets. That is a permissions design exercise, not a chat policy.
The economics will force the conversation regardless. At roughly A$2.30 per million input tokens and A$13.80 per million output tokens, converting the reported US dollar pricing at current rates, agentic workloads on Flash undercut most incumbent options. The cost objections that protected the status quo six months ago no longer hold.
Hype check
Three things deserve scepticism. First, every benchmark number above is Google's own, published on launch day, with no independent verification yet. "Beats the previous flagship on nearly all benchmarks" is also a claim made about last generation's Pro while this generation's Pro remains unreleased. The comparison Google did not publish is the one that matters for buying decisions next month.
Second, the autonomy demonstrations. Running for multiple hours and building an operating system are launch-stage conditions: clean tasks, controlled environments, presumably many takes. Enterprise environments contribute ambiguous instructions, broken connectors, permission walls and data that lies. Hour-scale autonomy in a demo does not establish hour-scale reliability in a workflow that touches your ticketing system.
Third, the cost framing. "Less than half the cost of comparable models" depends entirely on which comparators Google selected, and the 4 times and 12 times speed multipliers are measured against unnamed "other frontier models". The pricing is genuinely aggressive. The superlatives wrapped around it are marketing.
What is undersold is the part this article led with. A default change across Google's consumer surfaces is a bigger event than any benchmark table, and it is the one element of the launch nobody can dispute.
What to do this week
Three actions, in priority order.
- Record the change in your AI inventory. Add an entry noting that the default model behind the Gemini app and AI Mode in Search changed to an agentic model on 19 May 2026. If your register has no field for "default model changed underneath an existing tool", that missing field is the lesson.
- If you run Google Workspace, review the Gemini Enterprise Agent Platform before your first enthusiast does. Confirm the secure boundary defaults, decide which connectors are in scope, and write down who approves an agent that can act inside SharePoint, OneDrive or ServiceNow.
- Re-run the cost model on any agent pilot you parked for budget reasons. The business case maths changed on 19 May, and a board paper priced on last quarter's rates is now wrong in your favour.
Gemini 3.5 Flash is a capable model. That is the least important fact about it. The capability arrived in Australian workplaces without anyone deciding it should. The governance does not get the same shortcut. Update the register first. Then let the agent work.
TheAICommand. Intelligence, At Your Command.





