Sakana Fugu Review: Japan's Multi-Agent AI Beats GPT-5.5

Japan just shipped a new kind of AI model — and it's beating GPT-5.5 on coding benchmarks. Sakana AI, a Tokyo-based research lab, launched Fugu in June 2026. It's not a single model. It's a multi-agent orchestration system that coordinates multiple frontier models behind one API. Think of it as an AI that assembles its own team based on your task, delegates work to the right specialists, then hands you back a single answer.

What Is Sakana Fugu?

Sakana AI was founded in 2023 by former Google DeepMind researchers. Their name comes from the Japanese word for fish (魚) — and Fugu is the Japanese blowfish, famous for being both exotic and lethal if handled wrong. The AI tool borrows the name for good reason: it looks simple on the surface but has serious technical depth underneath.

Fugu is built on two research papers accepted at ICLR 2026: TRINITY and Conductor. TRINITY uses a lightweight coordinator that assigns Thinker, Worker, and Verifier roles to different LLMs across multiple turns. Conductor is trained with reinforcement learning to discover natural-language coordination strategies automatically. You don't write orchestration logic or choose which models run. Fugu figures that out per request.

Fugu vs. Fugu Ultra: Which One Do You Need?

Sakana ships two variants, both through the same OpenAI-compatible endpoint. If you already use Claude Opus 4.8 or GPT-5.5, switching takes minutes.

Fugu

Balanced

Low latency, solid output quality. Best for everyday coding, code review, and chatbot workflows. You can exclude specific model providers to meet data privacy or compliance needs.

Fugu Ultra

Performance

Deeper agent pool, optimized for accuracy above all else. Early users rely on it for Kaggle competitions, patent research, paper reproduction, and cybersecurity assessments.

Benchmark Results: How Does It Actually Stack Up?

These scores come directly from Sakana AI's published technical report, compared against publicly reported scores for Opus 4.8, GPT-5.5, and Gemini 3.1 Pro. Neither Fable 5 nor Claude Mythos Preview are in Fugu's agent pool since they're not publicly accessible.

Benchmark	Fugu	Fugu Ultra	Opus 4.8	Gemini 3.1 Pro	GPT-5.5
SWE Bench Pro	59.0	73.7 ★	69.2	54.2	58.6
LiveCodeBench	92.9	93.2 ★	87.8	88.5	85.3
LiveCodeBench Pro	87.8	90.8 ★	84.8	82.9	88.4
GPQA-D (Doctoral Science)	95.5 ★	95.5	92.0	94.3	93.6
Humanity's Last Exam	47.2	50.0 ★	49.8	44.4	41.4
TerminalBench 2.1	80.2	82.1 ★	74.6	70.3	78.2
CharXiv Reasoning	85.1	86.6 ★	84.2	83.3	84.1

★ = top score. Source: Sakana AI technical report, June 2026. Provider-reported scores used for Opus 4.8, Gemini 3.1 Pro, and GPT-5.5.

What Real Users Are Saying

The user feedback lines up with the benchmarks. Here's what early users reported on the Sakana website:

"Where other tools flagged about three issues, Sakana Fugu surfaced more than twenty. It's become the model I run all my reviews through."

Software Engineer · Code Review

"Normally 3–4 days of work. With Fugu I had a full analysis in a few hours, including connections between papers I would never have spotted on my own."

Industry Researcher · Patent Landscape

"Given one scoped instruction, Sakana Fugu drove a full security assessment end-to-end — recon, XSS/SQLi checks, auth review, and a clean report — staying inside scope."

Security Engineer · Security Assessment

These match patterns you see across other agentic AI tools — but Fugu Ultra's multi-model coordination appears to deliver qualitatively better answers on complex tasks than single-model agents.

Pricing: Is It Worth It?

Every plan — including Standard — gives you access to both Fugu and Fugu Ultra. You're not locked into the cheaper model.

Standard

$20/month

Lightweight daily use

Baseline usage allowance

Pro

$100/month

Focused working sessions

10x Standard allowance

Max

$200/month

Heavy long-running workloads

20x Standard allowance

Pay-as-you-go (Enterprise)

Fugu Ultra — Input

$5 / 1M tokens

Fugu Ultra — Output

$30 / 1M tokens

Fugu — Pricing

Top-tier model rate
No fee stacking

Rates increase for contexts above 272K tokens. Token usage and cost are reported per request.

🎁

Limited-time offer: Subscribe before July 31, 2026

Get your second month completely free at your initial subscription tier. That's $20 off on Standard, $100 off on Pro, or $200 off on Max.

Who Should Try Sakana Fugu?

Fugu fits well with AI-powered workflows that need strong reasoning, coding, or research — especially when reliability matters more than raw speed. If you're browsing the AI coding tools directory, it belongs on your shortlist.

✓ Good fit if you...

Run complex, multi-step coding or reasoning tasks
Do research across long documents or patent archives
Want frontier performance without single-vendor lock-in
Need model opt-out for compliance (Fugu standard)
Compete on Kaggle or run autonomous ML experiments

✗ Not ideal if you...

Are based in the EU/EEA (not available yet)
Need ultra-low latency for real-time consumer apps
Require a free tier to evaluate first
Need to see which specific model handled your query

Final Verdict

Sakana Fugu is not a marketing stunt. The benchmark scores come from real tasks — code that runs, science questions that need doctoral-level reasoning, sequential decision-making under uncertainty. The multi-agent approach produces results that individual frontier models can't match on their own, and the pricing is competitive with paying for multiple AI subscriptions separately.

The Standard plan ($20/month) is cheap enough to test without commitment. And the free second-month offer running through July 2026 makes right now a low-risk time to try it. If you're already spending on Claude, GPT-5, and Gemini separately, Fugu Ultra might cover all three.

9.0

Overall Score

9.4

Benchmarks

8.5

Value

9.2

Ease of Use

Try Sakana Fugu →

Frequently Asked Questions

Is Sakana Fugu available outside Japan?

Yes — it's accessible globally except EU/EEA countries, where Sakana is still working on GDPR compliance. Users in North America, Asia, and other regions can sign up today.

Is Fugu compatible with my existing OpenAI setup?

Yes. Fugu uses an OpenAI-compatible API. You point your existing client at the Fugu endpoint with your API key and start sending requests — no SDK migration required.

What's the difference between Fugu and Fugu Ultra?

Fugu balances speed and quality for everyday work. Fugu Ultra maximizes answer quality by coordinating a deeper pool of expert agents — better for hard, multi-step tasks like research, Kaggle, or paper reproduction.

Can I control which models Fugu uses?

For Fugu (the standard model), yes — you can opt out specific providers from the agent pool via the console settings. For Fugu Ultra, the pool is fixed to maintain peak performance.

Is there a free trial?

There's no free tier, but the Standard plan is $20/month and Sakana is offering a free second month for anyone who subscribes before July 31, 2026. That's 2 months for the price of one to test it properly.