Anthropic paused the SDK billing split on June 15 itself

Anthropic paused the SDK credit split. Live usage credits system, CI cost baselines, and what triggers a restart.

Did the June 15 SDK Billing Change Actually Take Effect?

No. Anthropic paused the SDK billing split on June 15, 2026, the exact day it was scheduled to launch, and the original change is not in effect as of June 16. The company's Help Center article, the primary source for the policy, now carries a pause notice stating that "nothing has changed: Claude Agent SDK, claude -p, and third-party app usage still draw from your subscription's usage limits." If you were bracing for a separate monthly credit pool to start metering your agents today, it did not happen.

The change that was announced, reported May 14, 2026, would have meant that, "Starting June 15, 2026, Claude Agent SDK and claude -p usage no longer counts toward your Claude plan's usage limits." That architecture is real and fully specified; Anthropic shelved the launch, not the design.

"Starting June 15, 2026, Claude Agent SDK and claude -p usage no longer counts toward your Claude plan's usage limits." — Anthropic Help Center, now superseded by a pause notice (source: support.claude.com)

Current state, plainly: SDK calls, the claude -p headless CLI, Claude Code GitHub Actions, and third-party apps authenticating via the SDK all still draw from your subscription's included limits. Nothing moved to a dedicated credit pool. Anthropic has committed to updating the plan and sharing details before anything takes effect, which means the Help Center article itself is your earliest reliable signal. Watch support.claude.com/en/articles/15036540 and your account email.

Treat this as a live risk, not a cancellation. The pause is indefinite, and the metered-credit model Anthropic described — standard API rates, no rollover, hard stop by default once exhausted — is exactly what teams running CI, cron, and scheduled agents should plan against. The remaining sections break down what was proposed, what is already live, and how to audit your automation before any restart.

What Was Announced: The Proposed Programmatic Credit Split

The proposed change drew a hard line between human and machine traffic. Announced May 14, 2026 and scheduled to take effect June 15, programmatic Claude usage would have stopped counting against the flat-rate subscription pool and instead been metered at standard API rates from a separate, dedicated monthly credit bucket. The Help Center article states it verbatim: "Starting June 15, 2026, Claude Agent SDK and claude -p usage no longer counts toward your Claude plan's usage limits."

The traffic that would have moved to the metered bucket: Agent SDK usage in your own projects, the claude -p headless CLI, Claude Code GitHub Actions, and third-party apps authenticating through the SDK. What would have stayed on the subscription pool: interactive Claude Code in the terminal or IDE, and Claude conversations on web, desktop, and mobile.

The boundary is interactive-human versus programmatic-automated. If a person is typing and waiting for a response, it stays subscription-funded. If a process is running unattended, it is metered. That puts CI runners, cron jobs, scheduled agents, and PR-review bots squarely on the credit side, and those are exactly the workloads teams scale up because no one is babysitting them. Anthropic's release notes confirm that the SDK exposes Claude Code's full agent loop and tools, so any headless invocation of that loop would have been treated as programmatic traffic.

Under the paused design, each consumer tier received a fixed monthly credit allowance, metered at standard API rates, with no rollover and a hard stop once exhausted unless "usage credits" overflow was explicitly enabled:

| Plan tier | Dedicated agent credit / month | Metering | Default when exhausted |
| --- | --- | --- | --- |
| Pro | $20 | Standard API rates | Requests rejected (hard stop) |
| Max 5x | $100 | Standard API rates | Requests rejected (hard stop) |
| Max 20x | $200 | Standard API rates | Requests rejected (hard stop) |

The design choice that matters most for planning: credits would not roll over, so unused allowance evaporated each month, and the default failure mode was closed. Once the bucket emptied, agent requests were rejected rather than silently billed, unless an operator opted into overflow spend. That is the model worth budgeting against even now, because the pause did not retract the structure, only its start date.
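To make the no-rollover and fail-closed behavior concrete, here is a minimal toy model of the paused design. It is a sketch for budgeting experiments, not Anthropic's implementation: the `CreditBucket` class, its field names, and the rule for splitting a request across the bucket boundary are all assumptions made for illustration. Only the per-plan dollar amounts come from the table above.

```python
from dataclasses import dataclass

# Monthly allowances in USD, taken from the paused design's tier table.
PLAN_CREDIT_USD = {"Pro": 20.0, "Max 5x": 100.0, "Max 20x": 200.0}


@dataclass
class CreditBucket:
    """Hypothetical model of a per-plan agent credit bucket."""
    plan: str
    overflow_enabled: bool = False  # stands in for opt-in overflow spend
    remaining: float = 0.0
    overflow_spend: float = 0.0

    def start_month(self) -> None:
        # No rollover: leftover credit is discarded, not added to the new month.
        self.remaining = PLAN_CREDIT_USD[self.plan]

    def charge(self, cost_usd: float) -> bool:
        """Return True if the request is accepted, False if it is rejected."""
        if cost_usd <= self.remaining:
            self.remaining -= cost_usd
            return True
        if self.overflow_enabled:
            # Assumed rule: drain the bucket, bill the rest as overflow.
            self.overflow_spend += cost_usd - self.remaining
            self.remaining = 0.0
            return True
        return False  # fail closed (hard stop)


bucket = CreditBucket("Pro")
bucket.start_month()
results = [bucket.charge(8.0) for _ in range(3)]
print(results, bucket.remaining)
```

With overflow off, the third $8 request is rejected and $4 is left stranded; calling `start_month()` again resets the balance to $20 rather than $24.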

The Usage Credits System That Is Live Right Now

While the programmatic split sits paused, a different overflow mechanism is fully live and worth wiring into your budget today: usage credits. Usage credits are an opt-in spend layer that lets Pro, Max 5x, and Max 20x subscribers keep working after they exhaust their plan's included allocation, with subsequent usage metered at standard API pricing rather than failing closed. This is the same standard-rate metering the paused SDK design borrowed; here it applies to interactive Claude and Claude Code terminal usage, and it is something you control now.

The feature was renamed from "Extra usage" to "usage credits" during May 18–22, 2026, and the /extra-usage command became /usage-credits; the old command and URL still redirect, and the CLI output strings were updated to match. So if a runbook or alias references the old name, it will keep functioning, but new docs and command output now use the new label.

Mechanically, credits sit on top of the plan rather than replacing it. Session limits still reset every five hours, and credits only activate once that included allocation is spent; charges are billed separately from the subscription. You enable them under Settings > Usage, set a monthly cap (or choose unlimited spend), prepay, and optionally configure auto-reload. A $2,000 daily redemption cap applies, and credits cover both Claude conversations and Claude Code terminal usage, with combined consumption counting toward your limits. For teams running claude -p inside scheduled jobs, this is the difference between a build that stalls and one that quietly keeps spending (video: The Stack Underflow).

One edge worth knowing: minor overshoot is possible because Anthropic checks the limit before a request and computes actual token consumption after, so a single call can tip slightly past a cap you set.
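The overshoot follows directly from the order of operations: the admission check sees only past spend, while the cost of the current call is known only afterwards. A minimal sketch, assuming a simple loop in which `run_with_cap` and its arguments are hypothetical names and the dollar amounts are made up:

```python
def run_with_cap(cap_usd: float, request_costs: list[float]) -> float:
    """Admit requests while recorded spend is below the cap.

    The check happens before each request; the actual cost is added
    only after the request completes, so the final total can exceed
    the cap by up to one request's cost.
    """
    spent = 0.0
    for cost in request_costs:
        if spent >= cap_usd:  # pre-request check: uses past spend only
            break
        spent += cost         # post-request accounting: actual cost
    return spent


total = run_with_cap(10.0, [4.0, 4.0, 4.0, 4.0])
print(total)  # 12.0: the third call was admitted at $8 spent and pushed past $10
```

In practice this means a cap is better treated as "cap plus one worst-case call", which matters most for agents whose single runs are large.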

The plumbing differs by plan tier. Team owners pre-purchase credits so members keep using Claude, Cowork, and Claude Code after seat limits are reached; seat-based Enterprise is billed at month-end on actual usage; and usage-based Enterprise plans have no credits mechanism at all, because every token bills at API rates from the first request. Org-wide, seat-tier, and per-user monthly limits are all configurable, which gives finance a real lever before any future SDK restart lands.

Token Rates and Cost Baselines: What Your Agents Actually Burn

Your agents burn budget at standard API rates, and those rates are the number you should be modeling against, not the paused per-plan credit figures. As of June 2026, Opus 4.8/4.7 cost $5 per million input tokens and $25 per million output tokens, Sonnet 4.6/4.5 cost $3/$15, Haiku 4.5 costs $1/$5, and Fable 5 and limited-availability Mythos 5 cost $10/$50. These are the per-token economics that any future SDK restart would meter against, so they are the right baseline for re-planning automated workloads now.

| Model | Input ($/M tokens) | Output ($/M tokens) |
| --- | --- | --- |
| Opus 4.8 / 4.7 / 4.6 / 4.5 | $5 | $25 |
| Opus 4.8 fast mode | $10 | $50 |
| Sonnet 4.6 / 4.5 | $3 | $15 |
| Haiku 4.5 | $1 | $5 |
| Fable 5 / Mythos 5 | $10 | $50 |

Two recent shifts can quietly inflate those numbers. First, Opus 4.8 fast mode runs at $10/$50 per million tokens, roughly 2x standard Opus pricing for about 2.5x the speed, and Opus 4.8 became the default model for Max, Team Premium, and Enterprise pay-as-you-go accounts during May 25–29, 2026. If your agents inherited that default without an explicit model pin, you may already be paying Opus rates on work a Sonnet or Haiku run could handle.

Second, the tokenizer changed. Opus 4.7 and later use a tokenizer that can consume up to 35% more tokens for the same text compared with earlier models. That means a per-run cost estimate carried over from a pre-4.7 model is structurally low. Recheck your assumptions if you upgraded, because the same prompt and the same output now move more tokens across the meter at the same per-token rate.
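A rough per-run estimator shows how the rate table and the tokenizer change combine. This is a planning sketch only: the dictionary keys are shorthand labels chosen here (not API model identifiers), the token counts in the example are invented, and the `tokenizer_inflation` parameter simply scales an old token count by the "up to 35%" worst case quoted above. Actual billing should still come from the Console or the Usage and Cost API.

```python
# USD per million tokens as (input, output), from the article's rate table.
# Keys are illustrative shorthand, not real model identifiers.
RATES = {
    "opus": (5.0, 25.0),
    "opus-fast": (10.0, 50.0),
    "sonnet": (3.0, 15.0),
    "haiku": (1.0, 5.0),
}


def run_cost_usd(model: str, input_tokens: int, output_tokens: int,
                 tokenizer_inflation: float = 0.0) -> float:
    """Estimate one run's cost in USD.

    tokenizer_inflation=0.35 models the worst case where token counts
    measured on a pre-4.7 model grow by 35% on the newer tokenizer.
    """
    in_rate, out_rate = RATES[model]
    scale = 1.0 + tokenizer_inflation
    return (input_tokens * in_rate + output_tokens * out_rate) * scale / 1_000_000


# Hypothetical nightly job: 1M input tokens, 100k output tokens.
for name in ("haiku", "sonnet", "opus", "opus-fast"):
    print(name, round(run_cost_usd(name, 1_000_000, 100_000), 2))

# Same job on Opus, with token counts carried over from an older tokenizer.
print(round(run_cost_usd("opus", 1_000_000, 100_000, tokenizer_inflation=0.35), 3))
```

Pinning the model explicitly in each job, rather than inheriting an account default, is what makes an estimate like this stay true over time.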

Do not trust the SDK's own cost fields for budgeting. The total_cost_usd and costUSD values are client-side estimates derived from a bundled price table, and Anthropic explicitly says not to use them for billing; the authoritative figures live in the Console Usage and Cost tabs and the Usage and Cost API. Wire your measurement to those sources before any restart lands, so your per-scheduled-run token burn is grounded in actuals rather than a stale local table.

For a sense of scale, Anthropic's own cost documentation cites enterprise deployments averaging roughly $13 per developer per active day and $150–$250 per developer per month, with 90% of users staying under $30 per active day. The tail matters more for automation: agent teams running in plan mode use about 7x more tokens than a standard interactive session (video: The Engineering Shift). A CI pipeline or cron job that spins up multi-agent plan-mode runs unattended is exactly where that 7x multiplier compounds, and it is exactly the traffic a programmatic credit split would have metered separately.

Bundle Discounts and Overflow Policy: Levers Worth Understanding Now

Discounted usage bundles are the cleanest lever for capping that compounding cost: they are prepaid credit blocks bought at a discount that activate only after your plan's included limits are exhausted. Anthropic prices them at $50 face value for $45 (10% off), $250 for $200 (20%), and $1,000 for $700 (30%). Because the discount scales with size, a team with predictable monthly overage saves the most by pre-buying one large bundle rather than topping up in small increments.

Bundles draw down from a single pool shared across Claude on web, Desktop, Mobile, Claude Code, Cowork, and approved third-party apps on the account, and they only begin to deplete once included limits are exceeded. Purchase ceilings differ by plan: individual Pro and Max users can buy up to $2,000 per month in bundles, while Team owners can buy up to $3,000. Above that bundle cap, usage does not stop; it continues at standard (undiscounted) credit rates, so the cap limits your discount, not your spend.

| Bundle face value | Price paid | Discount |
| --- | --- | --- |
| $50 | $45 | 10% |
| $250 | $200 | 20% |
| $1,000 | $700 | 30% |

Monthly bundle purchase cap: $2,000 for individual Pro/Max, $3,000 for Team owners. Overage beyond the cap bills at standard credit rates.
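The "one large bundle beats many small ones" point is easy to check with arithmetic. The sketch below uses only the three published price points; the function names and the $1,000 overage figure are illustrative assumptions, and it ignores the monthly purchase cap for simplicity.

```python
import math

# (face value, price paid) in USD, from the bundle table.
BUNDLES = [(50, 45), (250, 200), (1000, 700)]


def discount_pct(face: int, price: int) -> float:
    """Discount as a percentage of face value."""
    return round(100 * (face - price) / face, 1)


def cost_to_cover(overage_usd: float, face: int, price: int) -> int:
    """Price of covering an overage using only one bundle size."""
    return math.ceil(overage_usd / face) * price


# Hypothetical team with about $1,000 of predictable monthly overage.
for face, price in BUNDLES:
    print(face, discount_pct(face, price), cost_to_cover(1000, face, price))
```

Covering the same $1,000 of overage costs $900 in $50 bundles, $800 in $250 bundles, and $700 in a single $1,000 bundle, so the sizing decision is worth up to $200 a month at that volume.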

Bundles are the discount lever; overflow policy is the failure-mode lever, and neither setting is neutral. Enabling usage credits lets work continue at standard API pricing after included limits are hit, which prevents hard request failures but exposes uncapped, API-rate spend on automated traffic. Disabling it does the opposite: the system fails closed, so an unattended cron job rejects requests rather than running up a bill. For agents in CI, a hard stop you did not anticipate is its own outage.

  • Credits on, no cap: agents keep running; spend is unbounded at API rates, which is risky for unattended jobs.
  • Credits on, explicit monthly cap: the recommended posture. Set a spend ceiling via /usage-credits (changing it needs billing access).
  • Credits off: fails closed; predictable cost but scheduled agents break silently when limits hit.

For teams running claude -p in CI/CD, that default question is the one to settle deliberately rather than discover in production (video: The Stack Underflow). Pair an enabled-with-cap policy with a pre-bought bundle sized to your measured overage, and you get continuity, the steepest discount, and a hard ceiling at once.

Auditing Your Automation Inventory Before a Restart

Before any restart notice lands, build a complete inventory of programmatic Claude traffic, because every one of those calls falls on the metered side of the proposed split. Under the announced design, Agent SDK usage, the claude -p headless CLI, Claude Code GitHub Actions, and third-party apps authenticating via the SDK would all draw from the separate metered credit, while only interactive terminal/IDE and chat usage stayed on the subscription pool. The boundary is automated-vs-human, so your audit list is essentially your automation list.

Enumerate, repo by repo:

  • Every claude -p invocation in shell scripts, Makefiles, and CI steps.
  • GitHub Actions workflows that run Claude Code (PR review bots, doc generators, triage labelers).
  • Cron jobs and scheduled runners that fire agents unattended.
  • Internal or customer-facing apps built on the Python/TypeScript SDK.
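A first pass at the enumeration above can be scripted. The following is a minimal sketch, assuming that a plain-text search is good enough to find entry points: it walks a checkout and reports lines that invoke `claude -p` or reference `claude-code-action`. The second pattern and the `scan` helper are assumptions chosen for illustration; SDK imports, cron tables outside the repo, and wrapper scripts need their own patterns.

```python
import re
from pathlib import Path

# Illustrative markers of programmatic Claude traffic; extend for your stack.
PATTERNS = [
    re.compile(r"\bclaude\s+-p\b"),     # headless CLI invocation
    re.compile(r"claude-code-action"),  # assumed marker for GitHub Actions usage
]


def scan(root: str) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, stripped line) for each hit."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip rather than abort the audit
        for lineno, line in enumerate(text.splitlines(), start=1):
            if any(p.search(line) for p in PATTERNS):
                hits.append((str(path.relative_to(root)), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    for rel_path, lineno, line in scan("."):
        print(f"{rel_path}:{lineno}: {line}")
```

The output is a starting list, not the inventory itself: each hit still needs an owner, a schedule, and a model pin recorded next to it.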

Pay special attention to recently expanded capability. Week 24 (June 8–12, releases v2.1.166–v2.1.176) added nested subagents with background chains capped at five levels. Each level multiplies token exposure per run: a single scheduled trigger can now fan out into a tree of delegated work, and agent teams already consume roughly 7x more tokens than standard sessions when teammates run in plan mode. An inventory that counts only top-level entry points will undercount real burn.

Harden each unattended agent the same way you would lock down a service account: scope tool allow-lists to the minimum the job needs, cap iteration counts so a stuck loop cannot run unbounded, and reserve Opus-class models for runs that genuinely need them. This matters most for anything customer-facing, where you cannot lean on a subscription cushion at all.
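One way to enforce those three controls consistently is to build every headless invocation through a single helper. The sketch below assumes the Claude Code CLI accepts `--allowedTools`, `--max-turns`, and `--model` alongside `-p`, and that `sonnet` is a valid model alias; verify all of these against `claude --help` for your installed version before relying on them. The helper name and the example tool list are hypothetical.

```python
def hardened_claude_argv(prompt: str, allowed_tools: list[str],
                         max_turns: int, model: str = "sonnet") -> list[str]:
    """Build an argv list for an unattended `claude -p` run.

    Flag names are assumptions about the CLI; check `claude --help`.
    """
    if not allowed_tools:
        raise ValueError("refusing to run without an explicit tool allow-list")
    if max_turns < 1:
        raise ValueError("max_turns must be at least 1")
    return [
        "claude", "-p", prompt,
        "--allowedTools", ",".join(allowed_tools),  # minimum tools for the job
        "--max-turns", str(max_turns),              # bound a stuck loop
        "--model", model,                           # explicit pin, no inherited default
    ]


argv = hardened_claude_argv("Summarize the diff", ["Read", "Grep"], max_turns=5)
print(argv)
# To execute: subprocess.run(argv, check=True, timeout=600)
```

Passing an argv list to `subprocess.run` rather than a shell string also avoids quoting problems when prompts contain user-supplied text.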

"Third-party developers may not offer claude.ai login or rate limits in SDK products unless previously approved." — Anthropic, Claude Agent SDK documentation (source: code.claude.com).

That warning has a direct budgeting consequence: apps you ship to users should be costed against API or cloud-provider billing from the first token, not against a personal plan. Note also that the SDK's total_cost_usd and costUSD fields are client-side estimates from a bundled price table; Anthropic says not to use them for billing and to treat the Console or the Usage and Cost API as authoritative.

For the heaviest unattended batch jobs (nightly migrations, bulk classification, large-scale code rewrites), the cleaner answer is to route them to the standard API behind their own dedicated budget guardrails rather than leaning on subscription overflow. Practitioner estimates put a heavy SDK user's effective increase at roughly 5–10x under the paused design, so isolating that traffic now gives you a metered, capped lane that a restart cannot disrupt. Do the inventory while the change is still paused; it is far cheaper than discovering the scope after the meter switches on.

What to Watch: Signals the Pause Is Lifting

The single signal that matters most is the primary source itself: Anthropic's Help Center article at support.claude.com/en/articles/15036540, which states the company will update the article and share details before anything takes effect. Put a browser change-alert or RSS monitor on that URL now. It is the canonical restart notice, and it will flip before any account-level enforcement does.
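If a browser alert is not practical, a scheduled script can do the same job by comparing a hash of the page against the last one seen. This is a minimal sketch under several assumptions: the `https://` scheme is added to the URL given above, the page is fetchable without authentication, and hashing the raw response is stable enough to be useful (dynamic markup may cause false positives, in which case hash only the extracted article text). The function names are hypothetical.

```python
import hashlib
import urllib.request
from typing import Optional

# URL from the article; the scheme is an assumption.
ARTICLE_URL = "https://support.claude.com/en/articles/15036540"


def fingerprint(body: bytes) -> str:
    """Stable digest of a fetched page body."""
    return hashlib.sha256(body).hexdigest()


def has_changed(previous: Optional[str], body: bytes) -> bool:
    """True only when a prior fingerprint exists and differs from the new one."""
    return previous is not None and previous != fingerprint(body)


def fetch(url: str = ARTICLE_URL, timeout: float = 10.0) -> bytes:
    """Download the page body; raises on network or HTTP errors."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


# Offline demonstration with stand-in page bodies.
old = fingerprint(b"pause notice: nothing has changed")
print(has_changed(old, b"pause notice: nothing has changed"))  # False
print(has_changed(old, b"updated plan: new start date"))       # True
```

Run from cron, the script would store the last fingerprint in a file and send a notification when `has_changed` returns True.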

Track three secondary signals in parallel. First, the platform release notes at platform.claude.com, where billing-scope changes tend to surface alongside version bumps. Second, the billing and usage section of your account settings, where a reinstated programmatic-credit line item would appear before it bites. Third, the email address on file for your account admin: Anthropic said it would notify affected plans directly. Make sure that address is monitored, not a dormant shared inbox.

Translate the announced numbers into your own task math before the meter returns. If the split restarts at the published amounts, a Pro plan's $20/month dedicated credit covers roughly 30–50 medium Sonnet-class coding tasks, which is useful for light, attended work but thin for continuous automation. Opus 4.8 changes the arithmetic sharply: its fast mode bills $10/$50 per million input/output tokens, about double standard pricing for roughly 2.5x speed, so a scheduled CI agent running Opus-class jobs can drain a $20 credit in hours rather than weeks.
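The task math is simple enough to keep in a script next to your inventory. The sketch below assumes a "medium" task uses 100,000 input tokens and 10,000 output tokens and that a CI agent fires four times an hour; both figures are invented for illustration and should be replaced with measurements from the Console. The rates and the $20 credit come from the text above.

```python
def task_cost_usd(input_tokens: int, output_tokens: int,
                  in_rate: float, out_rate: float) -> float:
    """Cost of one task given USD-per-million-token rates."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


def tasks_covered(credit_usd: float, cost_per_task: float) -> int:
    """Whole tasks that fit inside a fixed credit."""
    return int(credit_usd // cost_per_task)


def hours_to_drain(credit_usd: float, cost_per_task: float,
                   runs_per_hour: float) -> float:
    """Hours until a credit is exhausted at a steady run rate."""
    return credit_usd / (cost_per_task * runs_per_hour)


# Assumed task size: 100k input + 10k output tokens.
sonnet_task = task_cost_usd(100_000, 10_000, 3.0, 15.0)      # $0.45
opus_fast_task = task_cost_usd(100_000, 10_000, 10.0, 50.0)  # $1.50

print(tasks_covered(20.0, sonnet_task))                 # 44
print(tasks_covered(20.0, opus_fast_task))              # 13
print(round(hours_to_drain(20.0, opus_fast_task, 4), 1))
```

Under these assumptions the Sonnet figure lands inside the 30–50 range quoted above, while the same job on Opus 4.8 fast mode at four runs an hour exhausts the credit in a little over three hours.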

"Once you put claude -p in a pipeline, you are no longer budgeting for a person at a keyboard — you are budgeting for a process that runs whether or not anyone is watching." — The Stack Underflow, on claude -p in CI/CD (video: The Stack Underflow)

That framing sets the decision point at restart. Teams running unattended batch work at Opus 4.8 rates will likely find dedicated API billing cheaper and more predictable than paying standard rates through subscription overflow, where spend is uncapped by default and harder to attribute per job. The concrete takeaway: keep the inventory and per-run token measurements current while the change is paused, watch article 15036540 for the flip, and pre-decide which workloads move to a metered API key the day the meter switches on.

Frequently asked questions

Did Anthropic's June 15 SDK billing change actually take effect?

No. Anthropic paused the change on June 15, 2026, the same day it was scheduled to start. The Help Center article now states plainly that "nothing has changed": Claude Agent SDK, claude -p, and third-party app usage still draw from your subscription's usage limits. Anthropic says it will update the plan and share details before anything takes effect, so the restart notice will appear on article 15036540 first. Treat the split as a live risk, not a current state.

What is the difference between usage credits and the paused credit pool?

They are two separate mechanisms that are easy to conflate. Usage credits are live now and opt-in: they let Pro, Max 5x, and Max 20x subscribers keep working after hitting included limits by switching subsequent usage to standard API pricing, and they apply to all Claude usage including interactive conversations and Claude Code terminal sessions. The paused credit pool was different: a separate, mandatory metered bucket reserved exclusively for programmatic and SDK traffic, with no opt-out, that would have rejected requests once exhausted unless overflow was enabled. One is voluntary overflow for everyone; the other was a forced split for automated workloads.

Which of my Claude usage counts against my subscription limits right now?

All of it. As of June 16, 2026, SDK usage, claude -p, Claude Code terminal and IDE sessions, web/desktop/mobile conversations, Claude Code GitHub Actions, and third-party apps authenticating through the SDK all draw from the same subscription pool. The programmatic-versus-interactive split that would have moved CI runners, cron jobs, and PR-review bots onto a metered credit is not in effect. Nothing about how your automated agents bill has changed yet.

How do I accurately measure what my CI and SDK agents cost?

Use the Console Usage and Cost tabs, or the Usage and Cost API; these are authoritative. Do not rely on the SDK's total_cost_usd or costUSD fields: Anthropic states these are client-side estimates derived from a bundled static price table and explicitly says not to use them for billing decisions. For per-run accuracy, measure token burn per scheduled job through the Console tabs (video: The Stack Underflow). This baseline matters because enterprise deployments average roughly $150–$250 per developer per month, and agent teams in plan mode can use about 7x more tokens than standard sessions.

Should I enable usage credits overflow now, before any billing restart?

It depends on your failure preference. Enabling usage credits prevents hard stops when you exceed plan limits, but it exposes uncapped API-rate spend; leaving it off fails closed, so requests are rejected once limits are hit. If you enable it, set an explicit monthly spend limit via /usage-credits rather than choosing unlimited, and review that cap against your Console cost baselines (changing the cap requires billing access). For Team and seat-based Enterprise plans, owners control whether members can continue past seat limits and pre-purchase or settle credits accordingly.

Last updated: 2026-06-16. Reviewed against the Anthropic Help Center pause notice and live usage-credits documentation as of this date.