v0.109: Fable 5's refusal arrives as HTTP 200

SDK v0.108–0.109: Fable 5 refusal-in-200, Managed Agents cron, vault credentials, and an enum fix in 0.109.1.

On June 9, 2026, the official Anthropic Python SDK shipped three releases inside eight hours, then a cleanup six days later. The headline is a new failure mode: Claude Fable 5 can refuse a request and still return HTTP 200.

What does the v0.108–0.109 batch contain?

The v0.108–0.109 batch is a four-release train for the anthropic Python SDK that introduces two new model identifiers, a refusal-as-success response shape, and typed surfaces for Managed Agents. Three releases landed on a single day (v0.108.0 at 16:37, v0.109.0 at 20:04, and v0.109.1 at 23:55), with a v0.109.2 cleanup following on June 15. This is Stainless-style codegen cadence: API-spec changes ship as frequent small versions, not batched quarterly drops.

Quick Answer: The v0.108–0.109 batch adds claude-fable-5 and claude-mythos-5, server-side and client-side refusal fallbacks, Managed Agents typed surfaces, and the frontier_llm refusal category, across four releases shipped June 9–15, 2026.

The substantive surface lands in v0.108.0; the 0.109.x line refines it. Here is what each release carries, per the repository CHANGELOG and GitHub Releases:

| Release | Shipped | What it adds |
| --- | --- | --- |
| v0.108.0 | Jun 9, 16:37 | claude-fable-5 and claude-mythos-5; server-side fallback on refusal (beta); client-side BetaRefusalFallbackMiddleware |
| v0.109.0 | Jun 9, 20:04 | Managed Agents typed surfaces (cron scheduling, vault env credentials); auto-injected managed-agents-2026-04-01 beta header |
| v0.109.1 | Jun 9, 23:55 | Adds frontier_llm to the refusal category enum (the value absent at launch) |
| v0.109.2 | Jun 15 | Removes retired model identifiers from the type stubs |

The connective theme is refusal handling: v0.108.0 introduces fallback on refusal and v0.109.1 patches a missing refusal category, while v0.109.2 prunes retired models from the typed client. If you call Fable 5 or Mythos 5, the sections below break down what changes; if you touch neither model nor Managed Agents, the practical impact is mostly the type-stub cleanup in 0.109.2.

Fable 5's refusal is a 200, not an exception

Here is the problem: claude-fable-5 can return an HTTP 200 whose body carries stop_reason: "refusal" instead of raising an API error. Any handler that treats 200 as success and reads content.text will silently discard the refusal. No exception fires and no status code trips your retry logic; the request simply produced nothing usable, and your code never notices.

The detail you actually need lives in stop_details, which carries a category and a human-readable explanation. At launch the documented categories were cyber, bio, frontier_llm, and reasoning_extraction. The practical migration step: branch on stop_reason before you ever touch content.text.
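A minimal sketch of that branch, written against a plain dict that mirrors the response body rather than the SDK's typed objects. The stop_reason, stop_details, and category names come from the text above; the "explanation" key, the RefusalError class, and the extract_text helper are illustrative assumptions, not SDK names.

```python
from typing import Any, Mapping


class RefusalError(Exception):
    """Hypothetical exception type: raised when a 200 response is a refusal."""

    def __init__(self, category: str, explanation: str) -> None:
        super().__init__(f"model refused ({category}): {explanation}")
        self.category = category
        self.explanation = explanation


def extract_text(message: Mapping[str, Any]) -> str:
    """Return the answer text, but only after ruling out a refusal."""
    if message.get("stop_reason") == "refusal":
        details = message.get("stop_details") or {}
        category = details.get("category", "unknown")
        # "explanation" is an assumed key name for the human-readable reason.
        explanation = details.get("explanation", "")
        raise RefusalError(category, explanation)

    # Only now is it safe to read the content blocks.
    return "".join(
        block.get("text", "")
        for block in message.get("content", [])
        if block.get("type") == "text"
    )
```

Raising on refusal is a design choice: it turns the silent 200 back into something your existing error handling can see. Returning a tagged result would work just as well if you prefer not to use exceptions for control flow.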

"Refusal responses are returned with HTTP 200 and a stop_reason of refusal; the stop_details object includes the category and an explanation," — Anthropic's refusals and fallback documentation (source: platform.claude.com).

This is Fable-specific, not universal. claude-mythos-5 — access-gated through Project Glasswing, with a 1M-token context window and pricing of $10 per million input and $50 per million output tokens — lacks Fable 5's safety classifiers. So refusal-in-200 is a Fable 5 behavior you opt into by model choice, not a blanket change across every Anthropic model.

One enum gap made this worse at launch. The frontier_llm category was missing from the SDK's typed enum until v0.109.1 added it. Typed clients that validated category values or branched on them could mishandle exactly those refusals, the ones flagging requests that might assist competing-model development, until the upgrade.

When you do wire up fallback, the response tells you what ran:

  • Fallback content blocks mark the handoff point where one model stopped and another took over.
  • usage.iterations entries of type message and fallback_message let you trace which model produced which tokens, and bill accordingly.
  • Top-level model reflects the model that actually answered — not the one you requested — so never assume the responder matches your request parameter.

The takeaway for any Fable 5 integration: a 200 is a delivery receipt, not a content guarantee. Inspect stop_reason first, read stop_details.category to classify, and trust usage.iterations plus the top-level model field to reconstruct what actually executed.
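To make the tracing step concrete, here is a hedged sketch that summarizes a response dict. It assumes usage.iterations is a list of objects each carrying a "type" of message or fallback_message, as described above; any other fields on those entries are unknown here and deliberately not read. The summarize_execution name is hypothetical.

```python
from collections import Counter
from typing import Any, Dict, Mapping


def summarize_execution(requested_model: str, message: Mapping[str, Any]) -> Dict[str, Any]:
    """Report which model answered and whether a fallback iteration ran."""
    usage = message.get("usage") or {}
    iterations = usage.get("iterations") or []

    # Count iteration entries by their "type" field only.
    counts = Counter(entry.get("type") for entry in iterations)

    return {
        "requested": requested_model,
        # Top-level model is the one that actually answered.
        "answered_by": message.get("model"),
        # Decide from the iteration types, not from comparing model strings,
        # since a returned identifier may not match the requested string exactly.
        "fell_back": counts.get("fallback_message", 0) > 0,
        "iteration_types": dict(counts),
    }
```

Logging this summary per request gives you the raw material for attributing cost to the model that did the work.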

Where automatic retry fires — and where it can't

Platform-side refusal retry fires in exactly two environments. Server-side fallback retries a Fable 5 classifier refusal inside one API call and returns a single message, but it is beta-only: gated behind the header server-side-fallback-2026-06-01 and available only on the Claude API direct and Claude Platform on AWS. Everywhere else, retry is your code's job, not the platform's.

The split matters because the cloud resellers are explicitly excluded. Message Batches, Amazon Bedrock, Vertex AI, and Microsoft Foundry do not support server-side fallback; on Bedrock, Vertex AI, and Foundry you wire up the SDK's client-side middleware instead. That interceptor is BetaRefusalFallbackMiddleware (paired with BetaFallbackState), which catches a refused request, reissues it against the fallback target, and attaches the fallback-credit-2026-06-01 header automatically.

Middleware coverage is uneven across language SDKs:

| Environment / SDK | Retry path |
| --- | --- |
| Claude API direct, Claude Platform on AWS | Server-side fallback (one API call) |
| Python, TypeScript, Go, Java, C# | BetaRefusalFallbackMiddleware (client-side) |
| Ruby, PHP, raw HTTP | None; implement direct retry yourself |
| Message Batches | None; collect refused items, resubmit manually |

So Ruby and PHP teams, and anyone hitting the raw HTTP endpoint, get no pre-built helper and must hand-roll the retry-and-reheader loop themselves. Batch users face a sharper edge: there is no automatic path at all, so refused items have to be gathered from the result set and resubmitted in a fresh request.
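For the batch case, the gather-and-resubmit step might look like the sketch below. It works on plain dicts and assumes each result line carries a custom_id plus a result object whose message holds stop_reason, and that requests are resubmitted as custom_id plus params; treat those shapes, and the helper names, as assumptions to check against your own result files. The claude-opus-4-8 target is the fallback model this article names for Fable 5 refusals.

```python
from typing import Any, Dict, Iterable, List, Mapping

FALLBACK_MODEL = "claude-opus-4-8"


def refused_ids(results: Iterable[Mapping[str, Any]]) -> List[str]:
    """Collect the custom_ids of batch items that succeeded at transport level but refused."""
    ids = []
    for item in results:
        result = item.get("result") or {}
        message = result.get("message") or {}
        if result.get("type") == "succeeded" and message.get("stop_reason") == "refusal":
            ids.append(item["custom_id"])
    return ids


def build_resubmission(
    originals: Mapping[str, Mapping[str, Any]],
    ids: Iterable[str],
    fallback_model: str = FALLBACK_MODEL,
) -> List[Dict[str, Any]]:
    """Rebuild request entries for the refused items, pointed at the fallback model."""
    return [
        {"custom_id": cid, "params": {**originals[cid], "model": fallback_model}}
        for cid in ids
    ]
```

Keeping the original request parameters keyed by custom_id is what makes the resubmission possible; the result set alone does not contain the prompts you sent.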

Why a dedicated credit header exists comes down to cost. Prompt caches are per-model, so a refusal retried on a different model would otherwise re-pay the full cache-write price. Refusal responses can carry a fallback_credit_token to offset that re-charge, plus a fallback_has_prefill_claim flag for prefilled requests. The credit is what keeps a cross-model retry from quietly doubling your input bill.

One constraint applies to every path: at launch, the only permitted fallback target for a Fable 5 refusal is Claude Opus 4.8 (claude-opus-4-8). You don't get to pick an arbitrary cheaper model; the handoff goes to Opus 4.8 or nowhere. Practically, that means budgeting for Opus-tier pricing on the fraction of traffic Fable 5 declines, and instrumenting which environment you're in before assuming any retry happens for free.
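Where no helper exists, a hand-rolled retry reduces to a few lines. The sketch below is transport-agnostic: send is a hypothetical callable you supply that posts a request dict and returns the parsed response body. It deliberately leaves out the fallback credit token, because the wire format for presenting it is not described here.

```python
from typing import Any, Callable, Dict

FALLBACK_MODEL = "claude-opus-4-8"

Request = Dict[str, Any]
Response = Dict[str, Any]


def send_with_fallback(send: Callable[[Request], Response], request: Request) -> Response:
    """Send once; if the model refuses, reissue the same request against the fallback model."""
    first = send(request)
    if first.get("stop_reason") != "refusal":
        return first

    # Copy the request so the caller's dict is not mutated, then swap the model.
    retry_request = {**request, "model": FALLBACK_MODEL}
    return send(retry_request)
```

The retry is attempted exactly once. If the fallback response is itself unusable, the caller still has to inspect stop_reason on whatever comes back.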

The decision tree is short: on Anthropic direct or AWS, send the beta header and let the platform retry; on Bedrock, Vertex, or Foundry with a supported SDK, install the middleware; on Ruby, PHP, raw HTTP, or Batches, write the retry yourself and resubmit explicitly.
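That tree is small enough to encode directly, which is handy as a startup check. In this sketch the platform and SDK labels, the return strings, and the retry_path function are all made-up identifiers for illustration; only the routing logic comes from the paragraph above.

```python
SERVER_SIDE_PLATFORMS = frozenset({"claude_api", "claude_platform_aws"})
MIDDLEWARE_SDKS = frozenset({"python", "typescript", "go", "java", "csharp"})


def retry_path(platform: str, sdk: str, batch: bool = False) -> str:
    """Pick the refusal-retry strategy for a given deployment."""
    # Message Batches has no automatic path on any platform.
    if batch:
        return "manual_resubmit"
    # Server-side fallback is a platform feature, so it is checked before the SDK.
    # (Assumption: any client that can send the beta header can use it.)
    if platform in SERVER_SIDE_PLATFORMS:
        return "server_side_fallback"
    # Elsewhere, the client-side middleware exists only in these SDKs.
    if sdk in MIDDLEWARE_SDKS:
        return "client_middleware"
    # Ruby, PHP, and raw HTTP get no helper.
    return "hand_rolled_retry"
```

For example, retry_path("bedrock", "python") selects the middleware, while retry_path("vertex", "ruby") tells you the retry loop is yours to write.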

Scheduled invocations and vault-stored secrets

The other half of the batch sits one release later. v0.109.0 adds typed surfaces for Anthropic's Managed Agents, a beta, pre-built harness for long-running and asynchronous work, modeled around four objects: agents, environments, sessions, and events. Every request carries the managed-agents-2026-04-01 beta header, which the SDK injects automatically so you don't hand-set it on each call.

The headline capability is autonomous scheduling. A scheduled deployment lets an agent open sessions on its own, driven by a POSIX cron expression paired with an IANA timezone. The finest granularity is one minute; there is no sub-minute trigger.

The typed surface exposes the controls you'd want for a production cron job:

  • upcoming_runs_at — the computed next-fire times, so you can verify the schedule resolves the way you read it.
  • Pause, unpause, and archive controls, plus manual one-off runs outside the schedule.
  • Run records for each invocation, and up to 10 seconds of jitter to spread load off exact-minute boundaries.
  • A hard ceiling of 1,000 scheduled deployments per organization; plan fan-out against that cap, not against per-agent intuition.
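Two of those limits are easy to check locally before you ever call the API. The sketch below is a pre-flight check only, with hypothetical helper names: it confirms that a cron string has the five POSIX fields (a sixth, seconds-style field would imply sub-minute scheduling) and computes headroom under the 1,000-deployment cap. It does not parse field values or validate the timezone name.

```python
from typing import List

MAX_SCHEDULED_DEPLOYMENTS = 1000  # per-organization ceiling cited above


def check_schedule(cron: str, timezone: str) -> List[str]:
    """Return the five cron fields, or raise if the schedule is obviously malformed."""
    fields = cron.split()
    if len(fields) != 5:
        raise ValueError(
            f"expected 5 cron fields (minute hour day-of-month month day-of-week), got {len(fields)}"
        )
    if not timezone.strip():
        raise ValueError("an IANA timezone name such as 'America/New_York' is required")
    return fields


def remaining_capacity(existing_deployments: int) -> int:
    """How many more scheduled deployments fit under the organization cap."""
    return max(0, MAX_SCHEDULED_DEPLOYMENTS - existing_deployments)
```

After creating a deployment, comparing upcoming_runs_at against what you expected is still the authoritative check, since it reflects how the platform resolved the expression.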

Credentials get their own model. Secrets live in vaults, and vault credentials come in three categories: mcp_oauth, static_bearer, and environment_variable. An env-var credential is keyed by secret_name and stored in the sandbox as an opaque placeholder. The real value is substituted only at egress, the point where the request leaves the sandbox, so the agent itself never reads the raw secret.

That design closes the obvious leak: a prompt-injected agent can reference a credential by name without ever being able to exfiltrate its contents. The trade-offs at this stage are worth knowing before you build against them:

  • No self-hosted sandbox support yet — vault credentials assume Anthropic-managed environments.
  • Credential keys are immutable after creation; to rename, you delete and recreate.
  • A duplicate key returns a 409, so creation is not idempotent — handle the conflict explicitly.
  • A maximum of 20 credentials per vault.
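Because creation is not idempotent, provisioning code needs an explicit "already there" branch. The following sketch shows one way to structure it. Everything named here is hypothetical: create stands in for whatever call creates a credential, and CredentialConflict stands in for however your client surfaces the 409.

```python
from typing import Callable, Collection

MAX_CREDENTIALS_PER_VAULT = 20  # per-vault limit cited above


class CredentialConflict(Exception):
    """Hypothetical stand-in for an HTTP 409 on a duplicate credential key."""


class VaultFull(Exception):
    """Hypothetical error raised locally before exceeding the per-vault limit."""


def ensure_credential(
    create: Callable[[str, str], None],
    existing_names: Collection[str],
    secret_name: str,
    value: str,
) -> str:
    """Create the credential if absent; treat a duplicate key as 'already exists'."""
    if secret_name in existing_names:
        return "exists"
    if len(existing_names) >= MAX_CREDENTIALS_PER_VAULT:
        raise VaultFull(f"vault already holds {len(existing_names)} credentials")
    try:
        create(secret_name, value)
    except CredentialConflict:
        # Our listing was stale or another writer got there first.
        return "exists"
    return "created"
```

Note that "exists" says nothing about whether the stored value matches. Since keys are immutable, changing a key means deleting and recreating the credential.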

None of this affects developers who only call the Messages API. But if you were wiring agents to a cron and stuffing secrets into environment variables by hand, v0.109.0 gives you typed primitives — scheduled deployments and vaults — that the SDK now validates for you instead.

The enum omission that silently misfired in typed handlers

v0.109.0 still shipped the refusal surface introduced in v0.108.0 without the frontier_llm value in its typed enum, so any Fable 5 refusal in that category could break a strictly typed handler until the next patch landed. Exhaustive match/switch statements would fall through, and Pydantic Literal validators would raise on an unrecognized string. The fix arrived hours later in v0.109.1, which simply "add[s] the frontier_llm refusal category".

The gap mattered because the refusal plumbing was already live. A client could receive an HTTP 200 carrying stop_reason: "refusal" and stop_details with a category your code had never been told existed. Validation logic written against the 0.109.0 stubs would treat a legitimate response as malformed.

What does frontier_llm actually cover? Anthropic documents it as requests that could assist development of competing AI models under its commercial terms, and notes that benign machine-learning work may also trigger it. So this is not an edge case reserved for adversarial prompts. A developer running ordinary ML experiments through Fable 5 can hit it, which makes the missing enum value a production hazard rather than a theoretical one.

The complete refusal category set is four values, and all four must appear in any exhaustive match for a typed client to be safe:

  • cyber
  • bio
  • frontier_llm
  • reasoning_extraction

These are the documented categories returned in a refusal's stop_details. If your handler branches on category, or if a downstream service serializes the value into a typed schema, omitting any one of them reintroduces the same fall-through that 0.109.0 had.

The practical takeaway: pin to at least v0.109.1 before shipping Fable 5 refusal logic to production. Developers still on 0.109.0 with category-typed handlers should upgrade first; the two releases came out the same day, June 9, 2026, so there is no reason to stay on the earlier one. If you cannot upgrade immediately, treat refusal categories as an open string set in your validators rather than a closed Literal, and log unknown values instead of raising on them.
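The open-set approach can be sketched in a few lines. The four category strings are the documented ones listed above; the normalize_category helper, the "unknown" bucket, and the logger name are illustrative choices, not SDK behavior.

```python
import logging
from typing import Optional

logger = logging.getLogger("refusal_handler")

KNOWN_CATEGORIES = frozenset({"cyber", "bio", "frontier_llm", "reasoning_extraction"})


def normalize_category(raw: Optional[str]) -> str:
    """Map a refusal category to a known value, tolerating ones we have not seen."""
    if raw in KNOWN_CATEGORIES:
        return raw
    # Log and bucket instead of raising, so a newly added category
    # cannot turn a valid refusal into a parse failure.
    logger.warning("unrecognized refusal category: %r", raw)
    return "unknown"
```

Downstream code then branches on five values (the four documented ones plus "unknown") and stays correct if the server starts returning a category your installed stubs do not list.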

After 0.109.2: what's gone from the type stubs

Once your refusal handling is solid, the last release in the train is the easiest to reason about. Version 0.109.2, published June 15, 2026, is a chore-only cleanup that removes retired model identifiers from the SDK's type definitions. There is no behavioral change and no API-surface change; the client talks to the same endpoints it did on 0.109.1.

"remove retired models from API and SDKs" — anthropics/anthropic-sdk-python CHANGELOG (source: GitHub Releases)

The practical effect lands at type-check time, not runtime. If your code references a removed identifier through the SDK's typed model enum, you will see a type or lint error on upgrade. Raw string literals are unaffected: a hard-coded model name passes through and is resolved server-side, so the failure mode is a stricter local toolchain rather than a broken request.

That makes the upgrade order straightforward. A clean sequence for teams moving onto Fable 5 or Mythos 5:

  • Upgrade to 0.109.2 first to clear retired-identifier noise from the typed model list and surface any stale references in one pass.
  • Audit refusal handling so all four documented categories (cyber, bio, frontier_llm, and reasoning_extraction) are present, since a 200 with stop_reason: "refusal" is a normal success path for Fable 5.
  • Then enable the retry path you need: server-side fallback on the Claude API or Claude Platform on AWS, or the client-side middleware on Bedrock, Vertex, and Foundry where server-side fallback is unavailable.

The takeaway: 0.109.2 is the safe landing point for the whole June 9 batch. It removes dead model names without touching behavior, so upgrade to it, confirm your category enum is complete, and turn on fallback only once refusal-as-success is handled. The SDK's automated codegen cadence means more of these small releases are coming — pin your version and read the changelog, because the type stubs now move faster than the prose that explains them.

Last updated: 2026-06-18.

Frequently asked questions

What is stop_reason: refusal and how is it different from an API error?

A refusal is a structurally successful response: the transport layer returns HTTP 200, but the model declines to answer, setting stop_reason: "refusal" with stop_details carrying a category and explanation. Unlike a 4xx or 5xx, the SDK raises no exception; your code must inspect stop_reason before consuming content blocks, or it will read a body that contains a decline rather than an answer.

Do Bedrock, Vertex, and Foundry users need to implement their own retry on refusal?

Yes. Server-side fallback is beta-only on the Claude API and Claude Platform on AWS; it is unavailable on Message Batches, Amazon Bedrock, Vertex AI, and Microsoft Foundry. On Bedrock, Vertex AI, and Foundry, use the SDK's BetaRefusalFallbackMiddleware, documented for Python, TypeScript, Go, Java, and C#. Ruby and PHP have no middleware and require direct retry logic; batch users must collect refused items and resubmit manually.

What are the four Fable 5 refusal categories?

The documented categories are cyber, bio, frontier_llm, and reasoning_extraction. The frontier_llm category, which covers requests that could assist development of competing models plus some benign ML work, was absent from the typed SDK enum until v0.109.1. Any code on 0.109.0 with category-typed handlers should upgrade before handling live traffic.

What exactly broke in typed code when frontier_llm was absent from the enum?

Code that constrained stop_details.category to a closed set of values (exhaustive match/switch statements, or Pydantic Literal and enum validators) had no path for frontier_llm before v0.109.1. A refusal classified that way would either raise a validation error on parse or silently fall through to the default branch, mishandling a live Fable 5 decline rather than routing it correctly.

Is Mythos 5 generally available, and does it also return HTTP 200 on refusal?

No on both counts. Claude Mythos 5 (claude-mythos-5) is access-restricted through Project Glasswing and lacks Fable 5's safety classifiers, so it does not return stop_reason: "refusal"; that behavior is Fable 5-specific. Both models share a 1M-token context window and pricing of $10 per million input and $50 per million output tokens.