Model & API Releases

News & Releases Model & API Releases

LatentSync 1.6 needs 18 GB VRAM — that's the broadcast bar

LatentSync 1.6, MuseTalk 1.5, and MOVA compared on VRAM requirements, fps, resolution ceiling, and which runs locally

Sungjae Lee

Jul 09, 2026

News & Releases Model & API Releases

OCR 4's 72% win: the study was Mistral's

Mistral OCR 4: spatial layout, 170 languages, $4/1K pages API. The 72% win is vendor-run; OlmOCRBench is the verifiable score.

Sungjae Lee

Jul 04, 2026

Claude / Anthropic News & Releases Model & API Releases

Opus 4.8 and the 11-day Bun port: what the vendor claim hides

Opus 4.8 flags code flaws 4x more, deprecates `budget_tokens`, and adds Dynamic Workflows for long-running pipelines.

Sungjae Lee

Jun 26, 2026

News & Releases Model & API Releases

Quicksilver's 37KB classifier: Suno and Udio, nothing else

UChicago's SAND Lab released Quicksilver: a 37KB extension flagging AI music from Suno and Udio, all on-device.

Sungjae Lee

Jun 26, 2026

News & Releases Model & API Releases

Animation-only, 60% pricier: what 1.5 actually delivers

Image-only, $0.08–$0.25/sec, native audio sync — what the 1.5 release delivers and what launch coverage omits.

Sungjae Lee

Jun 26, 2026

News & Releases Model & API Releases

ArgusRed refuses nothing. What actually constrains it?

Cosine's pen-test model, a Go enforcement layer, confirmed-only findings, and no published model card or eval.

Sungjae Lee

Jun 25, 2026

News & Releases Model & API Releases

Amazon wants to sell you Trainium racks. Neuron is the catch.

Amazon moves toward merchant Trainium3: rack specs, Neuron porting friction, and what Jassy's $50B TAM actually means.

Sungjae Lee

Jun 25, 2026

News & Releases Model & API Releases

Mistral OCR 4: the 72% is preference voting, not proof

OCR 4's 72% win rate: blind preference study, 600+ docs, undisclosed competitors, no published methodology.

Sungjae Lee

Jun 25, 2026

Claude / Anthropic News & Releases Model & API Releases

Snowflake's CEO declared a tie. The iteration ledger didn't.

Snowflake CEO tested GLM-5.2 vs Claude Opus 4.7: 103 tasks, pass@3 near-tie, 2x token use, 5.7x cheaper output. Here's what the numbers actually mean for builders.

Sungjae Lee

Jun 24, 2026

News & Releases Model & API Releases

Amazon Folded Apparel Printing Into Alexa. The AI Is Unnamed.

Amazon's Alexa now generates merch from a description. The image model powering it remains unattributed as of June 2026.

Sungjae Lee

Jun 24, 2026

News & Releases Model & API Releases

Analysts Called It Basic. Thomson Reuters Lost 16% Anyway.

Anthropic legal automation: TR -16%, RELX -14%, Wolters -13%. Inside the AI capex debate and which SaaS moats survive.

Sungjae Lee

Jun 24, 2026

News & Releases Model & API Releases

Seedance 2.5: thirty seconds of AI video, zero documentation

ByteDance's FORCE conference on June 23 described Seedance 2.5 with 30-second one-shot video generation and 50 multimodal references. Official documentation still shows Seedance 2.0.

Sungjae Lee

Jun 23, 2026

News & Releases Model & API Releases

Mercury 2 abandons autoregressive decoding and hits 1,009/s

Mercury 2 hits 1,009 tok/s via diffusion decoding. Claim sourcing, API migration, and workload fit analysis.

Sungjae Lee

Jun 22, 2026

News & Releases Model & API Releases

The fastest LLM inference engine takes 28 minutes to start

vLLM, SGLang, TensorRT-LLM, and llama.cpp throughput compared on H100 with TTFT, cold-start, and per-workload guidance.

Sungjae Lee

Jun 22, 2026

News & Releases Model & API Releases

Tesla filed 'Megapod' — no AI rack exists to buy

Tesla's USPTO filing for 'Megapod' covers a modular AI rack with servers, cooling, and software. No price, no ship date.

Sungjae Lee

Jun 22, 2026

News & Releases Model & API Releases

Midjourney's body scanner has no AI — the CEO admitted it

Midjourney Medical's USCT scanner: 500k transducers, no FDA clearance, no live AI, and ~12 scans completed as of June 2026.

Sungjae Lee

Jun 22, 2026

OpenAI / Codex News & Releases Model & API Releases

The doctor-vs-AI health exam that only one lab graded

GPT-5.5 Instant outscored physicians on HealthBench Professional. OpenAI built the benchmark, supplied the physicians, and ran the evaluation.

Sungjae Lee

Jun 21, 2026

Google / Gemini News & Releases Model & API Releases

Gemini Omni is paywalled. 3.5 Flash is the backend.

3.5 Flash: GA, $1.50/M input, API-callable. Gemini Omni: subscription-only, no endpoint. Decision guide for builders.

Sungjae Lee

Jun 21, 2026

News & Releases Model & API Releases

Firefly routes to Kling, Veo, Runway. Your IP, not Adobe's.

Firefly AI assistant routes to Kling, Veo, Runway, and 25+ other models. Adobe's indemnity covers native outputs only.

Sungjae Lee

Jun 21, 2026

News & Releases Model & API Releases

Grok joins Databricks at DAIS — bring your own xAI credential

Grok 4.3 in Databricks Agent Bricks via BYOK. Unity AI Gateway controls, $5/1k tool calls, open partnership terms.

Sungjae Lee

Jun 21, 2026

News & Releases Model & API Releases

CrankGPT: Pi 5 Offline Voice AI — 0.8s TTFB, No Grid, Full Benchmark Breakdown

Pi 5, hand crank, no internet: CrankGPT's full ASR/TTS/LLM stack and llama-bench latency figures explained.

Sungjae Lee

Jun 21, 2026

Claude / Anthropic News & Releases Model & API Releases

Opus 4.8 is a one-line swap. The xhigh recalibration isn't.

Opus 4.8 vs 4.7: +4.9 pts SWE-bench Pro, xhigh recalibrated, GA subagent fleets. Drop-in API; effort tiers changed.

Sungjae Lee

Jun 21, 2026

News & Releases Model & API Releases

SubQ's 56× gain: Appen ran the study. SubQ paid Appen.

SubQ 1.1 Small: 56× FLOP reduction at 1M, 99% NIAH, Appen-measured, unnamed donor base, private API, no public weights.

Sungjae Lee

Jun 20, 2026

Claude / Anthropic News & Releases Model & API Releases

Every Opus 4.8 chart beats 4.7. The asterisks matter.

Opus 4.8 vs 4.7: SWE-Bench Pro up 4.9 pts, 1M-context recall nearly doubled, Dynamic Workflows launched, pricing flat.

Sungjae Lee

Jun 20, 2026

News & Releases Model & API Releases

Napier logs multiplications — the 17× is unconfirmed

Tensordyne's Napier uses log arithmetic claiming 13× throughput over GB300. Tape-out complete; production Q2 2027.

Sungjae Lee

Jun 19, 2026

News & Releases Model & API Releases

Deezer's Detector Crosses to Spotify. The 99.8% Is Unaudited.

Deezer's AI Music Detector scans Spotify, Apple Music, and Tidal for AI-generated tracks — free, no account required.

Sungjae Lee

Jun 19, 2026

News & Releases Model & API Releases

Grok 4.3 is GA on Bedrock — AWS's own list says otherwise

Grok 4.3 on Bedrock via Mantle: reported GA June 15, unconfirmed in the provider list. Model IDs and pricing breakdown.

Sungjae Lee

Jun 18, 2026

Google / Gemini News & Releases Model & API Releases

Google I/O: Gemini's cheaper tier outscored the old flagship

Gemini 3.5 Flash at Google I/O 2026: agentic vs predecessor, Computer Use gap, $1.50/M pricing, and migration checklist.

Sungjae Lee

Jun 17, 2026

Claude / Anthropic News & Releases Model & API Releases

Opus 4 retired. Opus 4.8 costs 67% less — mind the tokenizer

Sonnet 4 and Opus 4 retired June 15 — exact model IDs, breaking changes, and the Opus 4.8 tokenizer caveat explained.

Sungjae Lee

Jun 16, 2026

News & Releases Model & API Releases

Ideogram 4 turns the bounding box into a layout primitive

Ideogram 4.0: JSON-schema layout, per-element bounding boxes, 9.3B weights (non-commercial), and API from $0.03/image.

Sungjae Lee

Jun 15, 2026

News & Releases Model & API Releases

50+ slash commands and no way to find them. Lens fixes that.

Open-source skill navigator for Claude Code: /c, /cc, /cp, /cpp, /ccp, /cr mapped. v3.18 drops the /cpp word cap and adds unlimited HTML slide docs.

Sungjae Lee

Jun 14, 2026

News & Releases Model & API Releases

An OVHcloud alum founded Gladia. Now OVHcloud wants it.

OVHcloud announced exclusive negotiations to acquire Gladia, a Paris STT startup. No valuation or close date disclosed.

Sungjae Lee

Jun 12, 2026

News & Releases Model & API Releases

Mistral's chip ambition: conditional. Its EU cluster: 44 MW.

What Mensch said about Mistral chips, and what's confirmed: EU cluster specs, ASML deal, and the sovereign compute bet.

Sungjae Lee

Jun 09, 2026

News & Releases Model & API Releases

MAI Is Already in Your IDE — the Flagship Is Still Gated

Seven MAI models at Build 2026: what's live in Copilot, what's gated, and what Microsoft's technical report claims — all vendor-reported until verified.

Sungjae Lee

Jun 07, 2026

Claude / Anthropic News & Releases Model & API Releases

MAI-Thinking-1 beats Anthropic's top model — per Microsoft

Seven MAI models at Build 2026. MAI-Thinking-1 is a 35B-active sparse MoE — specs, claimed scores, and what's still unverified externally.

Sungjae Lee

Jun 07, 2026

Meta / Llama News & Releases Model & API Releases

Meta's always-on pendant will record everyone in the room — not just you

An internal Alex Himel memo, reported by The Information, reveals Meta's AI pendant roadmap: ambient audio capture, real-time transcription, and a Wearables for Work subscription tier — built on the Limitless acquisition.

Sungjae Lee

Jun 03, 2026

News & Releases Model & API Releases

3,000 tok/s on MI300X by deleting the kernel scheduler

Kog AI monokernel: ~3,000 tok/s on AMD MI300X by eliminating kernel launches. Technical read with caveats.

Sungjae Lee

Jun 03, 2026

News & Releases Model & API Releases

Booed at graduation — the AI skeptics you'll be shipping to

MIT Technology Review's May 2026 Hype Index covers graduation boos, Gen Z sentiment (46%), and record AI fundraising.

Sungjae Lee

May 31, 2026

Google / Gemini News & Releases Model & API Releases

Omni skips the re-render — nine demos show the difference

Gemini Omni and 3.5 Flash demo breakdown: nine I/O 2026 clips, scene-preservation vs parallel coding, API availability.

Sungjae Lee

May 31, 2026

Google / Gemini News & Releases Model & API Releases

Waymo Now Trains on Interactive Worlds Built From Street View

Genie 3 generates interactive worlds from real Street View geometry. Waymo is already using it for rare-event training.

Sungjae Lee

May 30, 2026

Google / Gemini News & Releases Model & API Releases

DeepMind Guides Blind Runners On-Device — No Cloud, No Tether

DeepMind's chest-mounted AI system lets blind runners navigate independently using dual-path on-device inference, with no cloud and no tether.

Sungjae Lee

May 30, 2026

News & Releases Model & API Releases

grok-build-0.1 in Kilo Code, No API Key Needed for SuperGrok

SuperGrok and X Premium+ subscribers can now authenticate into Kilo Code and run grok-build-0.1 inside VS Code or JetBrains — no API key management required.

Sungjae Lee

May 29, 2026

News & Releases Model & API Releases

xAI grok-build-0.1 API Public Beta: Token Costs and SDK Support

xAI's coding model exits the $299 CLI gate. Here's what the public API beta actually offers developers.

Sungjae Lee

May 29, 2026

News & Releases Model & API Releases

Cohere's First Frontier Model Has Benchmark Gaps

Cohere's first open-weight frontier model: benchmark gaps, native citation design, and the enterprise sovereignty case.

Sungjae Lee

May 28, 2026

News & Releases Model & API Releases

What grok-build-0.1's Caching Incident Revealed

xAI's grok-build-0.1 hit public beta in May 2026. Here's what the spec says — and what the caching incident revealed.

Sungjae Lee

May 28, 2026

Claude / Anthropic News & Releases Model & API Releases

Claude Opus 4.8 Hits 69.2% SWE-Bench Pro — What Else Changed

Anthropic ships Opus 4.8 with 69.2% SWE-Bench Pro, mid-conversation system messages, and adaptive thinking.

Sungjae Lee

May 28, 2026

Google / Gemini News & Releases Model & API Releases

Gemini 3.5 Flash Goes GA With a Breaking thinking_level Change

Gemini 3.5 Flash is GA: 1M-token context, a breaking thinking_level change, and full pricing breakdown.

Sungjae Lee

May 28, 2026

News & Releases Model & API Releases