Google / Gemini News & Releases Dev Tools & SDK Changelogs

Mariner retired at I/O 2026. The VM-backed successor is Antigravity.

Antigravity 2.0, Managed Agents API, Gemini 3.5 Flash GA, and the Mariner shutdown: I/O 2026 developer rundown.

Jun 10, 2026

Mariner retired at I/O 2026. The VM-backed successor is Antigravity.

Google opened I/O 2026 by reframing its entire product surface around software that acts on its own. The pitch was less about answering questions and more about agents that do the work — and the numbers Sundar Pichai put on screen were meant to prove the shift is already at scale.

What Sundar Pichai Announced for Autonomous Software at I/O 2026

At the Google I/O 2026 keynote on May 19, 2026 in Mountain View, Sundar Pichai declared Google to be "in the agentic Gemini era" — AI that plans, acts, monitors, codes, browses, and synthesizes on a user's behalf, often running in the background under human control . The framing matters because it set the agenda for every release that followed: the central launches were tools that run autonomous tasks, not chat features.

Quick Answer: At I/O 2026, Pichai declared the "agentic Gemini era" and shipped Antigravity 2.0 (desktop agent orchestrator), Managed Agents (a VM per API call), and Gemini 3.5 Flash (GA day one). The Gemini app now has 900M monthly active users, up from 400M a year earlier.

"We are in the agentic Gemini era," — Sundar Pichai, CEO, Google (source: blog.google, 2026-05).

The scale figures anchored the claim. The Gemini app surpassed 900 million monthly active users, up from 400 million a year earlier, with daily requests up more than 7× year over year . Monthly tokens processed across Google surfaces grew from roughly 480 trillion at I/O 2025 to more than 3.2 quadrillion in May 2026 — again about 7× year over year — with model APIs now handling roughly 19 billion tokens per minute . On the builder side, Google cited more than 8.5 million developers working monthly with its models and over 375 Cloud customers each processing more than 1 trillion tokens in the prior 12 months .

The releases that carried the agentic thesis were concrete. Antigravity 2.0, a standalone desktop application for orchestrating multiple autonomous agents in parallel, became the new center of gravity for builders; Managed Agents added a Gemini API call that provisions a remote Linux sandbox per task; and Gemini 3.5 Flash shipped generally available on day one across Google products and APIs . Gemini Spark, a 24/7 consumer agent, began a trusted-tester rollout, while Jules and Deep Research for Science received agentic updates of their own . The sections that follow trace where each piece came from and what it means to build on.

What Happened to Project Mariner and Where the Technology Went

Project Mariner was Google's browser and computer-use research prototype — a single agent that reasoned over your active browser tab and clicked, typed, and navigated on your behalf, pausing for human confirmation before sensitive actions. Introduced December 11, 2024, it posted an 83.5% result on the WebVoyager benchmark, a single-agent state-of-the-art score at the time . It is absent from every official I/O 2026 release page. The Verge reported Mariner was shut down on May 4, 2026 — two weeks before the keynote — with Google confirming the technology moved into other products .

So Mariner did not get a sequel under its own name. It was decomposed, and its capabilities resurfaced across three of the surfaces Google did announce. Read it as a precursor rather than a retired product:

Multi-agent orchestration → Antigravity 2.0. The desktop app for running multiple autonomous agents in parallel takes over the orchestration surface Mariner prototyped at single-agent scale .
Long-horizon background tasks → Gemini Spark. Spark inherits the "agent works while you don't watch" concept, running on dedicated Google Cloud VMs powered by Gemini 3.5 and the Antigravity harness .
Open browser communication → WebMCP. The standard for letting agents talk to web pages launched as a Chrome origin trial, picking up the browser-control thread Mariner explored inside a closed prototype .

The keynote framed this consolidation directly. "We are in the agentic Gemini era," said Sundar Pichai, CEO at Google (source: Google) — and Mariner's computer-use research is now plumbing for that era, not a standalone offering.

For builders, the practical change is blunt: if you had Mariner research access, those endpoints are gone. The documented replacements for browser-use and computer-use workflows are the Antigravity CLI — with terminal sandboxing, credential masking, and hardened Git policies — and the Gemini API's Managed Agents, where one call provisions a remote Linux sandbox that can browse the web for live data . Migrating means re-pointing at those surfaces, not waiting for a Mariner v2 that is not coming.

Antigravity 2.0: Desktop Orchestrator for Autonomous Work

Antigravity 2.0 is a standalone desktop application for coordinating multiple autonomous task chains in parallel, and it is the surface Google now positions at the center of agentic development . It is distinct from the Gemini web interface: rather than a single chat thread, you get a local supervision UI and a task ledger that tracks each agent's plan, actions, and current state on your own machine. Gemini 3.5 Flash is generally available inside it from day one, alongside the Gemini API in AI Studio and Android Studio .

The stack splits into three layers. The desktop app handles orchestration and human oversight. The Antigravity CLI adds cross-platform terminal sandboxing, credential masking, and hardened Git policies — the controls you want before letting an agent touch a real repository . The Antigravity SDK exposes programmatic control of the entire orchestrator, so teams can run the agent harness on their own infrastructure instead of inside Google's hosted UI . For developers migrating off browser-use prototypes, this is the practical landing spot: the CLI replaces the sandboxed execution Mariner used to do implicitly, and the SDK lets you wire that into existing CI or internal tooling.

AI Studio is the on-ramp. It now ships native Kotlin support, one-click Cloud Run deploy, Firebase support, Workspace integrations, and a direct export-to-Antigravity path . The intent is a short loop from prototype in AI Studio to a deployed service or a multi-agent run in Antigravity, without rebuilding scaffolding by hand.

Google also added evaluation and open-standard pieces around the proprietary core. Android Bench introduces an LLM leaderboard that includes the open-weight Gemma 4 model, giving a public reference point for Android-focused agent work . WebMCP, a browser-native agent communication standard, launches as a Chrome origin trial alongside Chrome DevTools for Agents . WebMCP is the open complement to the Antigravity stack: where Antigravity is Google's closed orchestrator, WebMCP defines how any browser-resident agent talks to a page, which matters if you would rather not bind your agent layer to a single vendor.

The trade-off is clear. Antigravity gives you a coherent, supervised multi-agent environment fast; WebMCP and the SDK are the escape hatches that keep some of that work portable.

Managed Linux Sandboxes: Isolated VM Provisioning Explained

Managed Agents turn that orchestration layer into a server-side primitive: a single Gemini API call provisions a remote Linux sandbox where an agent can reason, plan, execute code, manage files, and browse the web for live data, with Google handling the VM lifecycle end to end . You no longer stand up the runtime, mount a filesystem, or wire network egress yourself — the API returns an environment the agent already controls. For a developer, that collapses a multi-step infrastructure problem into one request.

This puts Google in direct competition with the do-it-yourself sandbox stack — E2B, Modal, and hand-rolled Docker isolation — that teams currently assemble to give agents a safe place to run untrusted code. The trade-off is the familiar one: convenience against lock-in. Managed Agents are not a portable runtime you can lift elsewhere; they require the full Gemini API stack throughout the task chain, so the model, the harness, and the sandbox are bought together . If your agent reasoning already runs on Gemini, the integration cost is near zero; if you wanted to swap models per task, you inherit Google's boundaries.

The lifecycle detail that matters most in practice is statelessness. The sandboxes are ephemeral and scoped to a single task session by default — when the session closes, the VM and everything written to its local disk go with it. Persistent state is your responsibility: an agent that needs to keep results must explicitly write to durable storage or call an external database from inside the sandbox before the session ends. Treat the VM as scratch space, not a server. This is the standard contract for task-scoped compute, but it is easy to miss when the provisioning is this frictionless, and it changes how you design any workflow that spans more than one run.

The first concrete use case Google shipped on this primitive is the Migration Agent, introduced at I/O 2026, which runs inside a Managed sandbox and converts React Native, web, and iOS code to native Kotlin automatically . It is a clean fit for the model: a bounded, code-heavy task that needs a real filesystem, a compiler, and tool access, but no long-lived state. That pattern — clone, transform, verify, emit, discard — is exactly what ephemeral VM provisioning is built for, and it is the shape most early Managed Agent workloads will take.

Validating the 4× Speed Assertion: What the Leaderboard Shows

Gemini 3.5 Flash is the headline model behind that pattern, and Google's performance claims are specific: it beats the prior Gemini 3.1 Pro on nearly all benchmarks, runs four times faster than other frontier models measured by output tokens per second, and can cost less than half of comparable frontier models . Those are the numbers that decide whether a migration is worth it — and at the time of writing (June 10, 2026) all of them are Google-reported.

Quick Answer: Google says Gemini 3.5 Flash runs 4× faster than other frontier models by output tokens/sec and costs under half as much, with self-reported scores of 76.2% on Terminal-Bench 2.1 and 83.6% on MCP Atlas. None of these are independently reproduced yet — benchmark your own workload before switching providers.

The self-reported scores from I/O materials place the model strongly: Terminal-Bench 2.1 at 76.2%, GDPval-AA at 1656 Elo, and MCP Atlas at 83.6%, which Google says puts 3.5 Flash in the "top-right quadrant" of the Artificial Analysis index — high capability, low cost . Google framed the model as "combining frontier intelligence with action" . The Artificial Analysis quadrant is a useful independent index to watch, but the I/O figures are vendor placements on it, not third-party reruns.

Claim	Source	Verification status (2026-06-10)
4× faster by output tokens/sec	Google I/O keynote	Not independently reproduced
< half the cost of comparable frontier models	Google I/O keynote	Varies by request size and concurrency
76.2% Terminal-Bench 2.1 · 83.6% MCP Atlas · 1656 Elo GDPval-AA	I/O developer materials	Self-reported scores
Beats Gemini 3.1 Pro on nearly all benchmarks	Google I/O keynote	Self-reported comparison

Treat the 4× and cost figures as directional, not contractual. Throughput and price-per-task vary significantly with request size, concurrency, and workload type — a streaming chat workload and a long-context agent loop with heavy tool calls will not see the same multiplier. The practical move is to run latency and cost tests against your specific traffic before committing to a migration away from an existing provider. A model that is four times faster on short completions can still bottleneck on the tool-execution and VM-provisioning overhead that dominates agentic work, which is where many of these workloads actually spend their wall-clock time.

One caveat narrows what you can even check today: Gemini 3.5 Pro, the higher-capability sibling, was internal-only at I/O, with external rollout planned for June 2026 . No third-party leaderboard data exists for it at the time of writing. If your migration decision hinges on Pro-tier quality rather than Flash throughput, the honest answer is to wait for the public release and the independent reruns that follow it.

Jules: Autonomous Pull Requests, CI Repair, Current State

Jules is Google's GitHub-native coding agent, and unlike most of what shared the I/O 2026 stage, it is a shipping product with a measurable track record rather than a roadmap promise. It left beta on August 6, 2025, after more than 140,000 public commits . The working model is narrow and well-defined: you assign a task on a connected GitHub repository, and Jules clones the code into an isolated VM, installs dependencies, proposes a plan for your approval, executes the changes, and returns a pull request for human review . The VM-backed, review-gated loop is the same isolation pattern that now underpins Managed Agents and Spark — Jules just got there first.

The capability additions since GA tell you where the product is headed:

CLI and API — added October 2025, moving Jules out of the web UI and into scripts and pipelines .
MCP support — February 2, 2026, letting Jules reach external tools through the Model Context Protocol .
CI auto-fixing — February 19, 2026, so Jules can react to failing checks rather than only opening new work .
Gemini 3.1 Pro access — March 9, 2026, upgrading the reasoning model behind the agent .

Worth being clear about the framing: Jules was not a centerpiece at I/O 2026. Antigravity 2.0 and Managed Agents took that role, and Jules stayed in its lane as a GitHub-native PR agent — not a general orchestration layer for parallel, multi-agent work . For developers, that division of labor is the practical signal. If your work is repo-scoped — fix a flaky test, bump a dependency, repair a broken CI run — Jules is the lower-friction tool, available now at jules.google.com with CLI and API access since October 2025 . One caveat before you standardize on it: there is no native GitLab or Bitbucket support documented as of I/O 2026, so teams off GitHub are still waiting.

Rollout Reality: U.S. and AI Ultra Lead, Global Timing Unknown

Availability is the deciding variable for most of what Google announced: nearly every agentic feature is U.S.-first and gated behind AI Pro or AI Ultra subscriptions, with no confirmed international dates . Only the model layer shipped broadly on day one. If you are deciding whether to build on this stack now or wait, the split is clean — APIs and the IDE-facing pieces are usable today, while the consumer and Search agents are staged through summer and late 2026.

Gemini Spark, the 24/7 background agent running on dedicated Google Cloud VMs, began a trusted-tester rollout the week of I/O, with a U.S. AI Ultra beta planned the following week . Its Chrome integration is slated for summer 2026, and Android Halo — the UI for live agent progress — is due later in 2026, with no international date attached . Search agents that monitor finance, shopping, sports and the open web are scheduled for summer 2026 for AI Pro and Ultra subscribers, while the Antigravity-backed Search generative UI is slated for summer 2026 globally at no charge — the one consumer feature without a paywall .

Component	Availability	Access tier
Gemini 3.5 Flash	GA day one (Gemini API, AI Studio, Antigravity, Android Studio)	Developers, broad
Gemini 3.5 Pro	June 2026 planned, no confirmed date	TBD
Gemini Spark	Trusted testers (week of I/O); U.S. Ultra beta next week; Chrome summer 2026; Halo late 2026	U.S., AI Ultra
Search autonomous agents	Summer 2026	AI Pro / Ultra
Search generative UI	Summer 2026, global	Free
Deep Research for Science	Labs prototypes only, not GA	Limited

The model layer is the exception to the gating. Gemini 3.5 Flash reached general availability on day one across the Gemini API, AI Studio, Google Antigravity and Android Studio, so you can test it immediately . Gemini 3.5 Pro was in internal use only, with a June 2026 rollout planned but no fixed date . Deep Research for Science remains three Labs prototypes — Hypothesis Generation, Computational Discovery, Literature Insights — not a shipping product .

The takeaway: treat the 3.5 Flash API and Antigravity as buildable today, and treat everything labeled "summer 2026" or "Spark" as a roadmap, not a dependency. Most agentic capabilities remain Google-reported and have not been independently reproduced , so run the speed and cost claims against your own workloads before you commit a production path to this stack. Prototype on the parts that exist; wait for the rest to clear beta and a region before you standardize.

Last updated: 2026-06-10. Rollout dates reflect Google I/O 2026 keynote announcements (May 19, 2026) and are subject to change.

Frequently asked questions

Is Project Mariner still available for developers?

No. Project Mariner was shut down on May 4, 2026, and its browser/computer-use technology was folded into Antigravity, Gemini Spark, and other Gemini products rather than retained as a standalone prototype. For browser and computer-use workflows, the documented replacement surfaces are the Antigravity CLI and the Managed Agents API, both introduced at I/O 2026 on May 19, 2026.

What exactly does a Managed Agents API call provision?

A single Managed Agents call provisions a remote Linux sandbox — a VM where the agent can reason, plan, execute code, manage files, and browse the web for live data, per the I/O 2026 developer keynote. Google manages the VM lifecycle, and the sandbox is ephemeral by default: any state you need to keep must be written out explicitly before the session ends, or it is discarded with the instance.

How does Antigravity 2.0 differ from using the Gemini API with tool calls?

Antigravity 2.0 is a standalone desktop application that adds an orchestration layer the raw API does not: multi-agent coordination, parallel task supervision, credential masking, and hardened Git policies, paired with a CLI and SDK (source: developers.googleblog.com, 2026-05). Calling the Gemini API with tool calls is a single-agent, largely stateless interaction — you handle process management, isolation, and supervision yourself. Choose Antigravity when you run several agents at once and need governance; choose the API when you want a thin, embeddable building block.

Is Gemini 3.5 Flash actually faster and cheaper than GPT-4o?

Those numbers are Google-reported, not independently verified. Google claims Gemini 3.5 Flash runs four times faster than other frontier models by output tokens per second and can cost less than half of comparable frontier models, alongside scores such as Terminal-Bench 2.1 at 76.2%. Cross-check on independent signals such as the Artificial Analysis index, and run your own latency and cost benchmarks before migrating production workloads.

Can Jules handle repositories outside GitHub?

No. As of I/O 2026, Jules is GitHub-native, with no documented GitLab or Bitbucket support; it clones a GitHub repo into an isolated VM, proposes a plan, executes changes, and returns a pull request (source: jules.google). A CLI and API have been available since October 2025 and extend programmatic access, but the underlying architecture remains GitHub-first, so non-GitHub repositories are not a supported workflow today.