Model & API Releases
New AI model and API releases — capabilities, pricing, and what actually changed for developers.
An OVHcloud alum founded Gladia. Now OVHcloud wants it.
OVHcloud announced exclusive negotiations to acquire Gladia, a Paris STT startup. No valuation or close date disclosed.
Mistral's chip ambition: conditional. Its EU cluster: 44 MW.
What Mensch said about Mistral chips, and what's confirmed: EU cluster specs, ASML deal, and the sovereign compute bet.
MAI Is Already in Your IDE — the Flagship Is Still Gated
Seven MAI models at Build 2026: what's live in Copilot, what's gated, and what Microsoft's technical report claims — all vendor-reported until verified.
MAI-Thinking-1 beats Anthropic's top model — per Microsoft
Seven MAI models at Build 2026. MAI-Thinking-1 is a 35B-active sparse MoE — specs, claimed scores, and what's still unverified externally.
Meta's always-on pendant will record everyone in the room — not just you
An internal Alex Himel memo, reported by The Information, reveals Meta's AI pendant roadmap: ambient audio capture, real-time transcription, and a Wearables for Work subscription tier — built on the Limitless acquisition.
3,000 tok/s on MI300X by deleting the kernel scheduler
Kog AI monokernel: ~3,000 tok/s on AMD MI300X by eliminating kernel launches. Technical read with caveats.
Booed at graduation — the AI skeptics you'll be shipping to
MIT Technology Review's May 2026 Hype Index covers graduation boos, Gen Z sentiment (46%), and record AI fundraising.
Omni skips the re-render — nine demos show the difference
Gemini Omni and 3.5 Flash demo breakdown: nine I/O 2026 clips, scene-preservation vs parallel coding, API availability.
Waymo Now Trains on Interactive Worlds Built From Street View
Genie 3 generates interactive worlds from real Street View geometry. Waymo is already using it for rare-event training.
DeepMind Guides Blind Runners On-Device — No Cloud, No Tether
DeepMind's chest-mounted AI system lets blind runners navigate independently using dual-path on-device inference—no cloud, no tether.
grok-build-0.1 in Kilo Code, No API Key Needed for SuperGrok
SuperGrok and X Premium+ subscribers can now authenticate into Kilo Code and run grok-build-0.1 inside VS Code or JetBrains — no API key management required.
xAI grok-build-0.1 API Public Beta: Token Costs and SDK Support
xAI's coding model exits the $299 CLI gate. Here's what the public API beta actually offers developers.
Cohere's First Frontier Model Has Benchmark Gaps
Cohere's first open-weight frontier model: benchmark gaps, native citation design, and the enterprise sovereignty case.
What grok-build-0.1's Caching Incident Revealed
xAI's grok-build-0.1 hit public beta in May 2026. Here's what the spec says — and what the caching incident revealed.
Claude Opus 4.8 Hits 69.2% SWE-Bench Pro — What Else Changed
Anthropic ships Opus 4.8 with 69.2% SWE-Bench Pro, mid-conversation system messages, and adaptive thinking.
Gemini 3.5 Flash Goes GA With a Breaking thinking_level Change
Gemini 3.5 Flash is GA: 1M-token context, a breaking thinking_level change, and full pricing breakdown.
Genie 3, Built on Street View, Is Already in Production Use at Waymo
Genie 3 generates interactive worlds from real Street View geometry. Waymo is already using it for rare-event training.
The AI That Guides Blind Runners With No Cloud and No Tether
DeepMind's chest-mounted AI system lets blind runners navigate independently using dual-path on-device inference—no cloud, no tether.
SuperGrok Subscribers Can Now Use grok-build-0.1 Without an API Key
SuperGrok and X Premium+ subscribers can now authenticate into Kilo Code and run grok-build-0.1 inside VS Code or JetBrains — no API key management required.
xAI grok-build-0.1 API Public Beta: Token Costs and SDK Support
xAI's coding model exits the $299 CLI gate. Here's what the public API beta actually offers developers.
Command A+: Why Enterprises Would Choose It Despite the Benchmark Gaps
Cohere's first open-weight frontier model: benchmark gaps, native citation design, and the enterprise sovereignty case.
grok-build-0.1: What the Caching Incident Revealed About the Real Spec
xAI's grok-build-0.1 hit public beta in May 2026. Here's what the spec says — and what the caching incident revealed.
Claude at 69.2% on SWE-Bench Pro: Will Agentic Coding Change?
Anthropic ships Opus 4.8 with 69.2% SWE-Bench Pro, mid-conversation system messages, and adaptive thinking.
Gemini 3.5 Flash GA: thinking_level Breaks Existing Code
Gemini 3.5 Flash is GA: 1M-token context, a breaking thinking_level change, and full pricing breakdown.