Fugu hides the seams: multiple AIs, billed as a whole

Fugu wraps a swappable LLM pool in an OpenAI-compatible endpoint. Setup, tier comparison, billing, and EU restrictions.

Fugu hides the seams: multiple AIs, billed as a whole
Share

Sakana AI's pitch with Fugu is unusual: stop chasing one bigger model, and instead train a model whose job is to direct other models. The seams between vendors are hidden, and you get back a single answer on a single bill.

What Is Sakana Fugu?

Sakana Fugu is a language model trained to act as a coordinator rather than a soloist. When a request arrives, Fugu decomposes the task, routes the sub-tasks to the most suitable external frontier LLMs — OpenAI, Anthropic, and Google among them — verifies their outputs, and synthesizes a final answer, all invisible to the caller . Sakana describes it as "a multi-agent system that behaves like a single model," aimed at frontier-level quality while reducing single-vendor lock-in, since the agent pool is swappable .

Two tiers ship. fugu targets lower latency, lets you customize or exclude vendors in the pool, and suits interactive coding, code review, and chatbots. fugu-ultra runs a fixed pool, routes among one to three agents depending on difficulty, and is tuned for hard, multi-step work like Kaggle competitions, literature review, and cybersecurity analysis .

Released in June 2026, Fugu comes from Sakana AI — co-founded in 2023 by Llion Jones, a co-author of the 2017 "Attention Is All You Need" paper, and David Ha, former head of research at Stability AI . It builds on the Trinity and Conductor ICLR 2026 papers and the technical report at arXiv:2606.21228 . One caveat before you start: Fugu is not available in the EU/EEA, the UK, or Switzerland at launch while GDPR compliance is pending, under Terms of Service effective June 12, 2026 .

Before You Integrate Fugu

Fugu hides the seams: multiple AIs, billed as a whole

Setup is a console-and-credential exercise, not a model download. Sign in to the Sakana console with Google or email, then generate an API key; you must be at least 18 under the Terms of Service effective June 12, 2026.

Two checks matter before your first call:

  • Region. Fugu is blocked in the EU/EEA, the UK, and Switzerland at launch while GDPR work continues, so verify your access region first .
  • Data policy. Sub-tasks route to external frontier models from OpenAI, Anthropic, or Google, so review your organization's policy before sending proprietary code .

For fugu only — not fugu-ultra, whose pool is fixed — you can exclude specific vendors or models during key creation or in console settings . A training-data opt-out also lives in settings; enable it before sending any non-public material .

Plugging Fugu Into a Codebase

Fugu hides the seams: multiple AIs, billed as a whole

Because Fugu is OpenAI-compatible, integration is a base URL swap: point the OpenAI SDK at Sakana with base_url='https://api.sakana.ai/v1' and api_key='YOUR_FUGU_KEY', then call model fugu or fugu-ultra . Existing chat.completions calls keep working unchanged. Supported endpoints are /v1/chat/completions, /v1/responses, and /v1/models, which lists available model IDs .

Sakana recommends the Responses API — client.responses.create() — over Chat Completions for tool use, multimodal input, and reasoning or function-call management, while Chat Completions remains supported . The mental model is one bill for many agents: this illustrative (not executed against the live API) sketch shows the single-line-item shape Fugu presents — sub-tasks fan out, one charge comes back.

class Fugu:
    """One interface that routes work to multiple AIs and returns one bill."""

    prices = {"planner": 0.003, "writer": 0.002, "auditor": 0.001}

    def run(self, task):
        usage = {
            "planner": f"plan({task})",
            "writer": f"draft({task})",
            "auditor": f"check({task})",
        }
        result = " -> ".join(usage.values())
        total = sum(self.prices[name] for name in usage)
        return {"result": result, "bill": f"${total:.3f}", "line_item": "Fugu"}


response = Fugu().run("launch email")
print(response)

Three runtime details bite if you miss them. Reasoning effort accepts only high and xhigh (max is an alias for xhigh); every other value is rejected . Complex fugu-ultra jobs can run long, so set timeout=120.0 or higher to avoid default client cut-offs . Streaming works with the usual stream=True, and for production stability pin the dated alias fugu-ultra-20260615 rather than the floating fugu-ultra tag .

For agentic coding, Sakana ships a Codex integration installed in one line: curl -fsSL https://sakana.ai/fugu/install | bash, then launch codex-fugu . The bootstrap clones github.com/SakanaAI/fugu into ~/.fugu, pins Codex, deploys config, and stores your key; non-interactive installs pass SAKANA_API_KEY and --yes . The installer officially supports Ubuntu and macOS only — Windows users need WSL Ubuntu or manual setup, since the bash pipe won't run in PowerShell . If codex-fugu isn't found after install, reopen the terminal to refresh PATH .

What Fugu Hides From You

Fugu hides the seams: multiple AIs, billed as a whole

Fugu's convenience comes with deliberate opacity. Sakana treats provider selection and the routing plan as proprietary: for any given query, it does not expose which underlying models ran or how the task was decomposed . Usage fields do separate visible-model tokens from orchestration tokens — and those orchestration tokens are real, billable usage folded into the final price — but you get no per-provider breakdown .

The benchmarks deserve the same skepticism. The model page reports Fugu Ultra / Fugu at SWE Bench Pro 73.7 / 59.0, Terminal Bench 2.1 82.1 / 80.2, and GPQA Diamond 95.5 / 95.5 . These are self-reported. Baseline frontier scores are provider-reported rather than re-run, and Fable 5 and Mythos Preview were left out of Fugu's agent pool because they were not publicly accessible — so the headline comparisons are not fully independent, apples-to-apples audits, and no third-party replication existed at the time of reporting .

Two more constraints matter before onboarding. Fugu is blocked entirely across the EEA, UK, and Switzerland pending GDPR compliance, with no launch timeline given for those regions . And per the Terms of Service (effective June 12, 2026), Sakana guarantees nothing about accuracy, completeness, or legal compliance, and routes sub-tasks to external models like OpenAI, Anthropic, and Google — review your data-handling obligations before sending sensitive or proprietary code .

Taking Fugu Further

Once Fugu is wired in, match the plan and tier to your workload. Subscriptions cover both models: Standard at $20/month, Pro at $100/month (10× Standard usage), and Max at $200/month (30× Standard) — with a second month free for subscriptions made before July 31, 2026. For production, pay-as-you-go consumption tokens get higher routing priority than monthly-plan tokens, which matters for latency-sensitive traffic .

fugu-ultra-20260615≤272K context>272K context
Input / 1M tokens$5$10
Output / 1M tokens$30$45
Cached input / 1M$0.50$1.00

Note that orchestration tokens are real, billable usage folded into that consolidated price . Reserve fugu-ultra for multi-step code challenges, patent investigation, paper reproduction, and security analysis; for fast interactive tasks, fugu gives lower latency and a provider pool you can constrain. Pick the tier by job difficulty, not by default.

Frequently asked questions

Is Sakana Fugu available in Europe?

No. Fugu is not available in the EU/EEA, the UK, or Switzerland at launch. The Terms of Service (effective 2026-06-12) explicitly exclude the European Economic Area, the United Kingdom, and Switzerland while Sakana works toward GDPR compliance . No timeline for those regions has been published, so plan around it rather than waiting on it.

What is the difference between fugu and fugu-ultra?

The two tiers trade latency against answer quality. fugu targets low latency and everyday quality for interactive coding, code review, and chatbots, and lets you exclude specific providers or models from the pool when you create or edit an API key . fugu-ultra uses a fixed pool, routes among one to three agents depending on difficulty, and is tuned for hard, multi-step problems. Its dated alias fugu-ultra-20260615 is priced at $5 input and $30 output per 1M tokens under standard context .

Can I see which underlying LLMs Fugu used for my request?

No. Sakana treats provider selection and the routing plan as proprietary and does not expose them for any query . Fugu Ultra usage fields separate visible model tokens from orchestration tokens, but there is no per-provider breakdown . The opacity is by design — if you need an audit trail of which vendor handled each sub-task, Fugu will not give you one.

Does Fugu work with the OpenAI Python SDK without major changes?

Yes. Fugu is OpenAI-compatible: set the SDK's base_url to https://api.sakana.ai/v1 and pass your Fugu credential as api_key . Existing Chat Completions calls run as-is, though Sakana recommends the Responses API for new integrations that need tool use, multimodal input, or reasoning management . Raise your client-side timeout for fugu-ultra, since complex jobs can run long .

Does the Fugu installer work on Windows?

Only through WSL Ubuntu. The one-line bash-pipe installer (curl -fsSL https://sakana.ai/fugu/install | bash) will not run in PowerShell or a native Windows terminal, so the supported path is WSL Ubuntu or the manual setup; macOS and Ubuntu are officially supported . For non-interactive installs, pass SAKANA_API_KEY as an environment variable and --yes to skip prompts; if codex-fugu isn't found afterward, reopen the terminal to refresh PATH .