Siri has spent more than a decade as a command line you talk to: one utterance, one action, no memory of what came before. At WWDC 2026 on June 8, 2026, Apple announced a ground-up rebuild called "Siri AI," and the change is structural rather than cosmetic.
What Siri AI Does That Prior Siri Couldn't
Siri AI converts the assistant from a single-utterance command layer into a conversational one that holds context across a session. Apple organizes the new capability around four pillars: on-screen awareness (answering questions about what's currently displayed), personal context retrieval (searching messages, emails, photos, and notes indexed on-device), broad world knowledge with live web lookup routed through Private Cloud Compute, and systemwide app actions. All four retain context across turns. The practical difference: you can ask a follow-up that depends on the previous answer, and Siri can return to earlier points in the conversation without you restating them.
That session memory is backed by something prior Siri lacked entirely: a standalone Siri app with conversation history and pinned chats, synced privately across devices via iCloud. Old Siri had no durable record of what you asked; each exchange vanished the moment it completed. Treating conversations as persistent, addressable objects is what lets the assistant behave less like a voice trigger and more like a chat surface you can scroll back through.
The systemwide actions are concrete. Apple shows Siri editing a just-sent message, creating reminders, adding songs to a playlist, sending email, rotating a photo, creating calendar events, and adding a photo to an album, all by acting inside Messages, Music, Reminders, Mail, Calendar, Photos, Podcasts, and Phone workflows. How third-party apps plug into this is the App Intents story covered later in the piece.
The most demo-friendly addition is a Siri mode in the Camera app: tap the shutter and Siri analyzes what the lens sees in real time (objects, food, receipts) without leaving Camera. WWDC's walkthrough led with receipt parsing and food identification, the kind of point-and-ask interaction that previously meant opening a separate app (video: CNET). It ships alongside a new invocation gesture on iPhone, a swipe down from the Dynamic Island, and Spotlight access with right-click file context on Mac. Whether any of this lands on schedule is a separate question, and one Apple has answered badly before.
AFM vs. Gemini: Apple's Wording vs. What the Press Traced

Here is the cleanest way to read the model story: Apple named one thing, and reporters traced another. Every Apple-published surface (the WWDC newsroom post and the iOS 27 and Apple Intelligence developer pages) says Apple Intelligence is "powered by next-generation Apple Foundation Models" and names no external provider. The Gemini linkage is real reporting, but it is press-sourced and unconfirmed by Apple, and the distinction matters if you are deciding how much to trust the stack underneath your App Intents.
Quick Answer: Apple's own materials credit only "next-generation Apple Foundation Models." Secondary reporting from Bloomberg and mlq.ai describes a Gemini licensing deal worth roughly $1 billion per year and a 1.2-trillion-parameter cloud tier — none of which appears in any Apple release.
What the press traced is a three-tier router. Bloomberg's pre-event reporting and mlq.ai's WWDC writeup describe a "System Orchestrator" that sends simple tasks to on-device AFM models, moderate requests to Apple's Private Cloud Compute, and the heaviest reasoning to a flagship "AFM Cloud Pro," reported at roughly 1.2 trillion parameters and running on Nvidia Blackwell B200 GPUs hosted in Google Cloud with confidential-compute encryption. The reported arrangement is a multi-year Gemini license. Apple has not acknowledged any of it publicly.
| Claim | Apple-published | Press-reported (unconfirmed) |
|---|---|---|
| Model provider | "Apple Foundation Models" — no vendor named | Google Gemini family under license |
| Heaviest tier | "Private Cloud Compute" on Apple silicon | "AFM Cloud Pro," ~1.2T params on Nvidia B200 in Google Cloud |
| Commercial terms | Not disclosed | ~$1B/year multi-year deal |
| Benchmarks | Apple-vs-Apple only; no external baselines | None published |
On the verification question, Apple gave you nothing to grade. There are no benchmarks against external baselines and no independent leaderboard comparisons in any Apple material. The concrete numbers Apple did ship are all Apple-vs-Apple, measured against an iOS 26.4.2 prerelease baseline in April–May 2026: app launches up to 30% faster and new photos reaching the library up to 70% faster. Those are device-performance figures, not model-quality figures. They tell you the system feels faster, not how Siri AI's reasoning compares to a standalone Gemini or Claude endpoint.
Apple's only on-record framing stays deliberately narrow: Apple Intelligence "runs on-device where possible, uses Private Cloud Compute on Apple silicon for larger server models," and cloud data "is not stored and is used only for the user's request." That is a privacy posture stated on Apple's developer pages, not a model attribution. Snazzy Labs read the gap the same way, framing the keynote as Apple quietly admitting it needed outside help to ship a credible assistant (video: Snazzy Labs). For a developer, the practical takeaway is to build against Apple's published surface (the on-device AFM and the Language Model protocol) and treat any "it's really Gemini underneath" assumption as a rumor you cannot yet design around.
Foundation Models Framework: the Swift Primitives Dropped at WWDC
The Foundation Models framework is a native Swift API that lets an app call language models directly, targeting on-device Apple Foundation Models, Private Cloud Compute, and any provider that conforms to Apple's Language Model protocol. That protocol surface is the part developers should read closely: it means a third-party LLM vendor can ship its own conforming type and be addressed through the same Swift types as Apple's models, rather than through a bespoke SDK. You write against the protocol, and the model behind it becomes a configuration detail instead of a rewrite.
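To make "write against the protocol" concrete, here is a minimal sketch in plain Swift. It does not use the Foundation Models framework: `TextModel`, both stub models, and `TitleGenerator` are hypothetical names invented for illustration, and Apple's actual Language Model protocol may look quite different. The only point is the shape, where feature code holds a protocol-typed value and the concrete model is swapped at construction time.

```swift
import Foundation

// Hypothetical stand-in for a provider protocol. This is NOT Apple's
// Language Model protocol; its real requirements are not shown in this article.
protocol TextModel {
    var name: String { get }
    // Synchronous for brevity; a real model call would almost certainly be async.
    func complete(_ prompt: String) throws -> String
}

// Stub "on-device" model: returns everything before the first period.
struct StubOnDeviceModel: TextModel {
    let name = "stub-on-device"
    func complete(_ prompt: String) throws -> String {
        String(prompt.prefix(while: { $0 != "." }))
    }
}

// Stub "hosted" model: uppercases the prompt so the two are easy to tell apart.
struct StubHostedModel: TextModel {
    let name = "stub-hosted"
    func complete(_ prompt: String) throws -> String {
        prompt.uppercased()
    }
}

// Feature code depends only on the protocol, never on a concrete model type.
struct TitleGenerator {
    let model: any TextModel

    func title(for body: String) throws -> String {
        try model.complete(body).trimmingCharacters(in: .whitespacesAndNewlines)
    }
}

let body = "Ship the beta. Then fix the bugs."
let onDeviceTitle = try TitleGenerator(model: StubOnDeviceModel()).title(for: body)
let hostedTitle = try TitleGenerator(model: StubHostedModel()).title(for: body)
print(onDeviceTitle)  // Ship the beta
print(hostedTitle)    // SHIP THE BETA. THEN FIX THE BUGS.
```

Swapping providers here changes one initializer argument and nothing inside `TitleGenerator`, which is the property the article attributes to the protocol surface.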
Three primitives do most of the work. Dynamic Profiles select a capability tier at request time, so an app can route a cheap classification to the smallest on-device model and escalate a harder prompt without maintaining separate code paths. Multimodal prompts are first-class rather than bolted on. And an Evaluations primitive is built in, so you score model output inside the framework instead of standing up a separate eval harness. For teams that currently glue together a prompt runner, a logging layer, and an offline grading script, that consolidation is the practical draw.
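The request-time tier selection described above can be sketched as a single pure function. Everything here is an assumption made for illustration: `Tier`, `Workload`, `ModelRequest`, `selectTier`, and the 4,000-character threshold are hypothetical and are not the Dynamic Profiles API, which this article does not show.

```swift
// Hypothetical names throughout; this is not the framework's Dynamic Profiles API.
enum Tier: String {
    case onDeviceSmall, onDeviceLarge, privateCloud
}

enum Workload {
    case classification, summarization, multiStepReasoning
}

struct ModelRequest {
    let workload: Workload
    let prompt: String
}

// One call site decides the tier, so feature code never branches on it.
// The default threshold is illustrative only; tune it against your own measurements.
func selectTier(for request: ModelRequest, longPromptCharacters: Int = 4_000) -> Tier {
    switch request.workload {
    case .classification:
        return .onDeviceSmall
    case .summarization:
        return request.prompt.count > longPromptCharacters ? .privateCloud : .onDeviceLarge
    case .multiStepReasoning:
        return .privateCloud
    }
}

let requests = [
    ModelRequest(workload: .classification, prompt: "Is this message spam?"),
    ModelRequest(workload: .summarization, prompt: "Summarize this short note."),
    // 1,000 repetitions of a 5-character string is 5,000 characters, above the threshold.
    ModelRequest(workload: .summarization, prompt: String(repeating: "word ", count: 1_000)),
    ModelRequest(workload: .multiStepReasoning, prompt: "Plan a three-city trip."),
]
let chosenTiers = requests.map { selectTier(for: $0).rawValue }
print(chosenTiers)
```

Keeping the policy in one function also makes it easy to test and to change once real latency and accuracy numbers are available.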
"The Foundation Models framework gives apps direct access to the on-device model at the core of Apple Intelligence, with support for Private Cloud Compute and third-party providers." — Apple, WWDC 2026 developer materials (source: MacRumors)
The cost structure is where this changes build decisions. On-device AFM runs fully offline at no per-request cost, which sets a concrete crossover point: for common classification, summarization, or entity-extraction tasks, a paid external API stops paying for itself once the on-device model is accurate enough. You trade a metered token bill for fixed device compute, and the network round trip disappears because inference runs locally.
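A quick worked example of that crossover, as a sketch. The traffic volume, token count, price per million tokens, and accuracy figures below are all made-up inputs chosen to keep the arithmetic easy to follow; they are not quotes from any provider or measurements of any Apple model.

```swift
// Hypothetical pricing and traffic figures, used only to make the arithmetic concrete.
struct HostedUsage {
    let requestsPerDay: Double
    let tokensPerRequest: Double
    let dollarsPerMillionTokens: Double
}

// Metered cost of a hosted API over a month of the given length.
func monthlyHostedCost(_ usage: HostedUsage, days: Double = 30) -> Double {
    let tokensPerMonth = usage.requestsPerDay * usage.tokensPerRequest * days
    return tokensPerMonth / 1_000_000 * usage.dollarsPerMillionTokens
}

// With a per-request cost of zero on-device, the decision reduces to an accuracy bar:
// once the local model clears it, the hosted bill buys nothing extra for this task.
func shouldMoveOnDevice(measuredAccuracy: Double, requiredAccuracy: Double) -> Bool {
    measuredAccuracy >= requiredAccuracy
}

let usage = HostedUsage(requestsPerDay: 50_000, tokensPerRequest: 800, dollarsPerMillionTokens: 0.50)
// 50,000 x 800 x 30 = 1.2 billion tokens; at $0.50 per million that is $600 per month.
let hostedBill = monthlyHostedCost(usage)
let moveOnDevice = shouldMoveOnDevice(measuredAccuracy: 0.93, requiredAccuracy: 0.90)
print(hostedBill, moveOnDevice)
```

The accuracy input has to come from your own evaluation set, since Apple has published no model-quality numbers to plug in.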
Apple extends that further for smaller shops. Developers in the App Store Small Business Program whose apps have fewer than 2 million cumulative first-time downloads can access next-generation models on Private Cloud Compute at no cloud API cost. For an early-stage product currently paying per token to a hosted provider, that removes a recurring line item for server-class inference, with the obvious tradeoff that the workload is locked to Apple's stack and gated to capable hardware.
The honest caveat: Apple published no benchmarks for these models alongside the framework. The primitives are real and shippable; the quality you get from the on-device tier is something you will have to measure yourself, which is, conveniently, what the Evaluations primitive is for.
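Measuring it yourself can start very small. The sketch below is a generic accuracy check in plain Swift, not the Evaluations primitive (whose API this article does not show). `LabeledCase`, `accuracy(of:on:)`, and the keyword-based `stubClassifier` are hypothetical; in practice the closure would wrap a call to whichever model you are grading.

```swift
import Foundation

// One hand-labeled example: an input and the label a correct model should return.
struct LabeledCase {
    let input: String
    let expected: String
}

// Fraction of cases where the classifier's output matches the label exactly.
func accuracy(of classify: (String) -> String, on cases: [LabeledCase]) -> Double {
    guard !cases.isEmpty else { return 0 }
    let correct = cases.filter { classify($0.input) == $0.expected }.count
    return Double(correct) / Double(cases.count)
}

// Stand-in for a model call: a naive keyword rule, deliberately imperfect.
func stubClassifier(_ text: String) -> String {
    text.lowercased().contains("refund") ? "billing" : "general"
}

let evalCases = [
    LabeledCase(input: "I want a refund", expected: "billing"),
    LabeledCase(input: "Refund status?", expected: "billing"),
    LabeledCase(input: "How do I export notes?", expected: "general"),
    LabeledCase(input: "Card was charged twice", expected: "billing"),  // the stub misses this one
]
let score = accuracy(of: stubClassifier, on: evalCases)
print(score)  // 0.75
```

Exact-match scoring suits classification; summarization or extraction would need a different comparison, which is a design choice this sketch leaves open.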
Siri Intents in Practice: Entities, Spotlight, and View Annotations

The App Intents framework is the seam where third-party apps actually plug into Siri AI, and at WWDC 2026 it changed shape in three concrete ways: schema-driven semantic matching, entity indexing for personal context, and view annotations for on-screen awareness. The headline shift for developers is that natural-language actions now fire without fixed trigger phrases. Apple's App Intents schemas let Siri match intent semantically, which removes the rigid utterance mapping that made prior SiriKit integrations painful to maintain at scale.
Mechanically, you describe what an action does and the entities it operates on, and Siri's orchestration layer resolves a user's phrasing to that action. You no longer hand-maintain a list of every way a user might ask for something — the long, brittle phrase tables that defined SiriKit integration are gone.
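The "describe the action and its entities" shape looks roughly like this with the App Intents API as it already exists in shipping SDKs (`AppIntent`, `AppEntity`, `EntityQuery`, `@Parameter`). This is a sketch: `NoteStore`, `NoteEntity`, `NoteQuery`, and `ArchiveNoteIntent` are hypothetical app types, the in-memory store stands in for real persistence, and nothing here is specific to the iOS 27 schemas or semantic matcher, which the article does not show in code. It needs an Apple platform SDK to compile.

```swift
import AppIntents
import Foundation

// Hypothetical in-memory store standing in for the app's real persistence layer.
actor NoteStore {
    static let shared = NoteStore()
    private var titles: [UUID: String] = [:]
    private var archived: Set<UUID> = []

    func add(title: String) -> UUID {
        let id = UUID()
        titles[id] = title
        return id
    }

    func notes(for ids: [UUID]) -> [NoteEntity] {
        ids.compactMap { id in titles[id].map { NoteEntity(id: id, title: $0) } }
    }

    func archive(_ id: UUID) { archived.insert(id) }
    func isArchived(_ id: UUID) -> Bool { archived.contains(id) }
}

// The entity: a typed description of the app object the action operates on.
struct NoteEntity: AppEntity {
    static let typeDisplayRepresentation: TypeDisplayRepresentation = "Note"
    static let defaultQuery = NoteQuery()

    let id: UUID
    let title: String

    var displayRepresentation: DisplayRepresentation {
        DisplayRepresentation(title: "\(title)")
    }
}

// The query: how the system turns identifiers back into entities.
struct NoteQuery: EntityQuery {
    func entities(for identifiers: [UUID]) async throws -> [NoteEntity] {
        await NoteStore.shared.notes(for: identifiers)
    }
}

// The action: a title, a typed parameter, and a perform() body. No phrase list.
struct ArchiveNoteIntent: AppIntent {
    static let title: LocalizedStringResource = "Archive Note"

    @Parameter(title: "Note")
    var note: NoteEntity

    func perform() async throws -> some IntentResult & ProvidesDialog {
        await NoteStore.shared.archive(note.id)
        return .result(dialog: "Archived \(note.title).")
    }
}

let sampleNote = NoteEntity(id: UUID(), title: "Groceries")
print(sampleNote.title)
```

Note what is absent: there is no table of utterances anywhere in the declaration. The intent carries a title, typed parameters, and behavior, and resolving a user's wording to it is left to the system.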
The second change is entity indexing into Spotlight's semantic catalog. When your app indexes its data objects as App Intents entities, those objects become eligible for Siri's personal context retrieval, the same retrieval pillar that pulls a buried email or an old photo from first-party apps. In practice, a message, note, or mail-equivalent object living in a non-Apple app can now appear in Siri's answers next to Apple's own data, rather than being invisible to it. For any app that stores user content, this is the difference between being searchable by the assistant and being a black box.
The third primitive is View Annotations. You mark SwiftUI or UIKit views so Siri AI can reason about what is currently on screen inside your app; this is the implementation surface behind the "on-screen awareness" capability Apple advertised for the assistant itself. Without annotations, a question like "what does this mean?" asked against your UI has nothing structured to read; with them, Siri gets a typed view of the content the user is looking at.
The migration note is the part that will bite teams quietly. iOS 27's semantic intent matching replaces hard-coded utterance lists, so existing SiriKit intents are now resolved by the new matcher rather than your old phrase tables. The risk is that a phrasing your users relied on no longer routes the way it did, and there is no compile error to warn you: mismatches are silent. Developer testing opened June 8, 2026, ahead of the fall 2026 GA, which is the window to exercise every current intent against the new behavior. Treat the beta as a regression suite for your existing Siri surface, not just a place to wire up the new schemas. The actions you already shipped are the ones most likely to drift.
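One low-tech way to run that regression pass is to keep a table of phrases and the intent each one used to reach, record what the beta actually routes them to, and diff the two. The sketch below assumes you collect the observed routing by hand or from UI tests; the phrases, the intent names, and `findMismatches` are hypothetical, and no Siri API is called.

```swift
// A phrase whose observed routing differs from what previously shipped.
// `observed` is nil when the phrase no longer resolves to any intent.
struct RoutingMismatch: Equatable {
    let phrase: String
    let expected: String
    let observed: String?
}

// Compares expected routing (from your old phrase tables) with observed routing
// (recorded on the beta) and returns mismatches in phrase order.
func findMismatches(expected: [String: String], observed: [String: String]) -> [RoutingMismatch] {
    var mismatches: [RoutingMismatch] = []
    for (phrase, want) in expected.sorted(by: { $0.key < $1.key }) {
        let got = observed[phrase]
        if got != want {
            mismatches.append(RoutingMismatch(phrase: phrase, expected: want, observed: got))
        }
    }
    return mismatches
}

// Hypothetical data: three phrases that shipped, and what a beta run recorded.
let shippedRouting = [
    "start my run": "StartWorkoutIntent",
    "log water": "LogWaterIntent",
    "pause workout": "PauseWorkoutIntent",
]
let betaRouting = [
    "start my run": "StartWorkoutIntent",
    "log water": "OpenAppIntent",  // routed somewhere else
    // "pause workout" did not resolve at all
]
let drift = findMismatches(expected: shippedRouting, observed: betaRouting)
for item in drift {
    print("\(item.phrase): expected \(item.expected), observed \(item.observed ?? "nothing")")
}
```

Because the failure mode is silent, the value is in having the expected table written down before the matcher changes underneath it.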
Gemini and Others as Siri Subagents: the Delegation Design
The delegation design lets users route Siri queries to an external chat provider instead of Apple's own models. According to WWDC coverage, third-party chatbots including Claude and Gemini will be selectable as "Extensions" within Siri, so a user picks a preferred model and Siri hands the query off rather than answering natively. This is a user-configured switch, not an automatic router: the press-reported System Orchestrator handles routing among Apple Intelligence tiers, and Extensions sit on top as an explicit opt-in destination.
For builders, the practical implication is distribution. If your product wraps an external chat provider — say, a frontend over Gemini or Claude — then once Extensions ship, your users may already have a Siri entry path into that provider with no additional integration work on your side. The provider relationship, not your app, is what gets surfaced. That cuts both ways: you inherit reach you didn't build, but the routing and presentation are Apple's, and your differentiation above the raw model gets flattened at the Siri layer.
Two things are unresolved, and both matter before you plan around this:
- The privacy boundary at delegation is unspecified. Apple's published Apple Intelligence materials describe on-device processing and Private Cloud Compute for Apple's own models, with cloud data not stored and used only for the request. But no data-flow document has been released covering what passes to an external Extension, when, or under what consent. The privacy story Apple sells for its own stack does not automatically extend to a third-party model you route a query into.
- Open SDK vs. curated partner list is not clarified. WWDC session materials name Claude and Gemini as selectable providers, but say nothing about whether any provider can self-register or only Apple-approved partners can participate. Given that Apple's reported model deal with Google runs at roughly $1 billion per year, a curated list is the more likely starting shape, but that is inference, not a documented program.
The actionable read: don't architect a launch around Siri Extensions yet. Developer testing opened June 8, 2026, ahead of the fall 2026 GA, so the beta is where the enrollment model and any data-flow terms should surface. Until they do, treat the Extension path as confirmed in concept and undefined in mechanics.
Capability Segmentation by Chip: iPhone 16 vs. 15 Pro vs. Everything Else
Siri AI is gated by silicon, not just by OS version, and the gap between "can run iOS 27" and "can run Siri AI" is wide. iOS 27 itself supports iPhone 11 and iPhone SE (2nd generation) and later, the same reach as iOS 26. That means the Snow Leopard-style performance work (app launches up to 30% faster, AirDrop up to 80% quicker, and iPad external-drive transfers up to 5x faster, all against an iOS 26.4.2 prerelease baseline) lands broadly, while the assistant story does not.
The full Apple Intelligence and Siri AI stack requires an on-chip ML accelerator above a fixed bar: all iPhone 16 models and later, iPhone 15 Pro and 15 Pro Max, iPads and Macs with M1 or later, iPad mini with A17 Pro, and Vision Pro with M2 or later. The standard iPhone 15 and 15 Plus are excluded. The Pro/Pro Max split inside the iPhone 15 line is the cleanest illustration that this is a Neural Engine constraint, not a model-year cutoff.
A further subset is carved out for advanced Siri voice expressivity and pace controls: iPhone 17 Pro/Pro Max, iPhone Air, iPads with M4 or later and at least 12GB unified memory, Macs with M3 or later and at least 12GB unified memory, and Vision Pro with M5. So there are effectively three tiers, not two.
| Capability tier | Minimum hardware |
|---|---|
| iOS 27 base (performance/reliability) | iPhone 11, iPhone SE (2nd gen) and later |
| Apple Intelligence + Siri AI | iPhone 16 (all), iPhone 15 Pro/Pro Max; M1+ iPad/Mac; iPad mini A17 Pro; Vision Pro M2+ |
| Advanced Siri voice expressivity/pace | iPhone 17 Pro/Pro Max, iPhone Air; M4 iPad ≥12GB; M3 Mac ≥12GB; Vision Pro M5 |
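For analytics or support tooling, the iPhone rows of the table above can be encoded as a small lookup. This is a sketch under stated assumptions: `IPhoneModel`, `CapabilityTier`, and `capabilityTier(for:)` are hypothetical names, the rules are transcribed from the table rather than from any Apple API, and iPad, Mac, and Vision Pro are left out to keep it short. A shipping app should ask the system whether a feature is available rather than infer it from the model name.

```swift
// Hypothetical model of the iPhone lineup; only what the table above needs.
enum IPhoneModel {
    case numbered(generation: Int, isPro: Bool)  // iPhone 11, 15 Pro, 16, 17 Pro, ...
    case air                                     // iPhone Air
    case se(generation: Int)                     // iPhone SE
}

enum CapabilityTier {
    case unsupported    // cannot run iOS 27
    case iOS27Base      // performance and reliability work only
    case siriAI         // Apple Intelligence + Siri AI
    case advancedVoice  // Siri AI plus voice expressivity and pace controls
}

// Rules transcribed from the article's table, iPhone rows only.
func capabilityTier(for model: IPhoneModel) -> CapabilityTier {
    switch model {
    case .air:
        return .advancedVoice
    case .numbered(let generation, let isPro):
        if generation == 17 && isPro { return .advancedVoice }
        if generation >= 16 || (generation == 15 && isPro) { return .siriAI }
        return generation >= 11 ? .iOS27Base : .unsupported
    case .se(let generation):
        // SE (2nd gen) and later run iOS 27, but no SE appears in the Siri AI rows.
        return generation >= 2 ? .iOS27Base : .unsupported
    }
}

let iPhone15Tier = capabilityTier(for: .numbered(generation: 15, isPro: false))
let iPhone15ProTier = capabilityTier(for: .numbered(generation: 15, isPro: true))
print(iPhone15Tier, iPhone15ProTier)  // iOS27Base siriAI
```

The two results at the bottom are the iPhone 15 split the article calls out: same model year, different tier.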
The harder ceiling is regional. Siri AI will not initially ship in the EU on iOS, iPadOS, and watchOS, an exclusion reported as Digital Markets Act compliance, and Apple Intelligence remains unavailable in China pending regulatory approval. That removes two of Apple's three largest markets at launch, and the assistant arrives English-only in beta "later in 2026" on top of those exclusions. For anyone sizing the addressable user base behind a Siri AI integration, the practical near-term reach is a high-end, non-EU, non-China, English-speaking slice of installed devices, far narrower than the iOS 27 install count implies.
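When sizing that slice from your own analytics, the launch constraints stack as a simple filter. The sketch below encodes the exclusions as the article states them. `LaunchRules` and its members are hypothetical, how Apple actually determines a user's region is not documented here, and treating macOS and visionOS in the EU as included is an inference from the fact that only iOS, iPadOS, and watchOS are named as excluded.

```swift
enum Platform {
    case iOS, iPadOS, watchOS, macOS, visionOS
}

enum LaunchRules {
    // ISO 3166-1 alpha-2 codes for the 27 EU member states.
    static let euRegions: Set<String> = [
        "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "GR", "HU", "IE",
        "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK", "SI", "ES", "SE",
    ]
    // Platforms the article names as excluded in the EU at launch.
    static let euExcludedPlatforms: Set<Platform> = [.iOS, .iPadOS, .watchOS]

    // Rough estimate of whether a user falls inside the initial Siri AI beta audience.
    static func isInLaunchAudience(
        platform: Platform,
        regionCode: String,
        languageCode: String,
        hardwareEligible: Bool
    ) -> Bool {
        guard hardwareEligible else { return false }
        // English-only beta: accept "en" and regional variants such as "en-GB".
        guard languageCode.lowercased().hasPrefix("en") else { return false }
        let region = regionCode.uppercased()
        if region == "CN" { return false }  // Apple Intelligence unavailable in China
        if euRegions.contains(region) && euExcludedPlatforms.contains(platform) { return false }
        return true
    }
}

let germanIPhone = LaunchRules.isInLaunchAudience(
    platform: .iOS, regionCode: "DE", languageCode: "en", hardwareEligible: true)
let usIPhone = LaunchRules.isInLaunchAudience(
    platform: .iOS, regionCode: "US", languageCode: "en-US", hardwareEligible: true)
print(germanIPhone, usIPhone)  // false true
```

Running a filter like this over real install data gives a reach figure grounded in your own user base rather than in the iOS 27 install count.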
The Delivery Question: Apple Promised Siri AI Before and Missed

Apple has promised an AI-rebuilt Siri before and missed the date, which is the single most important caveat for anyone planning around Siri AI. On March 7, 2025, Apple publicly acknowledged that its promised AI-enhanced Siri features were delayed and would arrive "in the coming year" (video: MacRumors). More than a year later, at WWDC 2026, the assistant still is not generally available.
At the June 8, 2026 keynote, Apple shipped the framing, with Siri AI as the centerpiece of the next generation of Apple Intelligence, but the product itself slipped to a beta arriving "later in 2026," launching English-only, with the EU excluded on iOS, iPadOS, and watchOS and China unavailable pending regulatory approval. The pattern from 2025 repeats: a confident announcement, then a deferred ship date.
The scheduling gap matters concretely. iOS 27 itself is slated for fall 2026, but Siri AI rides a separate, unspecified later schedule, and Apple's release does not quantify the interval between OS general availability and the assistant's arrival. That unaddressed gap is the planning risk: you can know when the OS lands without knowing when the AI layer behind your integration reaches users.
"It's the kind of thing that, until you're using it every day, you don't fully buy it." The line paraphrases the skepticism running through independent WWDC coverage, which frames Siri AI as a credibility test Apple has yet to pass (video: Snazzy Labs).
For developers, the split decision is clean. The App Intents and Foundation Models work is worth starting now against the developer betas that opened June 8, 2026: schemas, entity indexing, and View Annotations are testable today, and the on-device path runs at no per-request cost. But any end-user reach projection that depends on Siri AI being live is premature until Apple publishes a confirmed GA date. Build the integration; do not yet promise the audience.
Watch / Sources
- CNET — WWDC 2026: Everything Revealed in 13 Minutes
- Snazzy Labs — Apple finally admitted it
- MacRumors — WWDC 2026: Everything Apple Announced! (New Siri AI & iOS 27)
Frequently asked questions
Can I call Apple Foundation Models from my app in the iOS 27 developer beta today?
Yes. The Foundation Models framework — a native Swift API for on-device Apple Foundation Models and Private Cloud Compute — is available in the iOS 27 developer beta that opened June 8, 2026. It includes multimodal prompts, Dynamic Profiles, and Evaluations, and lets third-party providers conform to Apple's Language Model protocol. On-device models run offline at no per-request cost, and App Store Small Business Program apps under 2 million total first-time downloads can use Private Cloud Compute at no cloud API cost. Full GA ships fall 2026; Siri AI itself is a separate, later beta.
Is Apple's cloud-tier AI actually running on Google Gemini?
Unconfirmed by Apple. Apple's official iOS 27 and Apple Intelligence pages name only "next-generation Apple Foundation Models" and cite no external provider. Secondary reporting from Bloomberg and mlq.ai describes a flagship "AFM Cloud Pro" at roughly 1.2 trillion parameters running on Nvidia Blackwell B200 GPUs in Google Cloud under a reported ~$1 billion-per-year deal. Treat those figures as reported-but-unconfirmed: Apple has published no confirmation, no benchmark against Gemini, and no leaderboard comparison.
What code changes are needed to connect my app to Siri AI?
Implement App Intents and index your entities into Spotlight's semantic catalog so Siri AI can resolve them. Add View Annotations (SwiftUI or UIKit) to expose on-screen content for Siri's on-screen awareness. If you are bringing your own external LLM, adopt Apple's Language Model protocol. Siri matches intents semantically from natural language rather than from fixed utterance phrases, so the prior rigid phrase-to-action mapping is gone.
Why is Siri AI not available in the EU at launch?
The exclusion is reported as Digital Markets Act compliance, the same issue that delayed earlier Apple Intelligence features in the EU. Siri AI will not initially be available in the EU on iOS, iPadOS, and watchOS, and no confirmed EU rollout date has been announced. China is separately excluded pending regulatory approval, with no timeline given.
Which devices get the full Siri AI stack versus just the iOS 27 base?
The iOS 27 base release, meaning the performance and reliability improvements, supports iPhone 11, iPhone SE (2nd generation), and later. The full Siri AI and Apple Intelligence stack requires an iPhone 16 or later (all models), iPhone 15 Pro or 15 Pro Max, an iPad or Mac with M1 or later, iPad mini with A17 Pro, or Vision Pro with M2 or later. Advanced Siri voice expressivity is gated further: iPhone 17 Pro/Pro Max, iPhone Air, iPads with M4 or later and 12GB+ unified memory, Macs with M3 or later and 12GB+ unified memory, and Vision Pro with M5.