Model & API Releases

New AI model and API releases — capabilities, pricing, and what actually changed for developers.

An OVHcloud alum founded Gladia. Now OVHcloud wants it.

OVHcloud announced exclusive negotiations to acquire Gladia, a Paris STT startup. No valuation or close date disclosed.

Mistral's chip ambition: conditional. Its EU cluster: 44 MW.

What Mensch said about Mistral chips, and what's confirmed: EU cluster specs, ASML deal, and the sovereign compute bet.

MAI Is Already in Your IDE — the Flagship Is Still Gated

Seven MAI models at Build 2026: what's live in Copilot, what's gated, and what Microsoft's technical report claims — all vendor-reported until verified.

MAI-Thinking-1 beats Anthropic's top model — per Microsoft

Seven MAI models at Build 2026. MAI-Thinking-1 is a 35B-active sparse MoE — specs, claimed scores, and what's still unverified externally.

Meta's always-on pendant will record everyone in the room — not just you

An internal Alex Himel memo, reported by The Information, reveals Meta's AI pendant roadmap: ambient audio capture, real-time transcription, and a Wearables for Work subscription tier — built on the Limitless acquisition.

3,000 tok/s on MI300X by deleting the kernel scheduler

Kog AI monokernel: ~3,000 tok/s on AMD MI300X by eliminating kernel launches. Technical read with caveats.

Booed at graduation — the AI skeptics you'll be shipping to

MIT Technology Review's May 2026 Hype Index covers graduation boos, Gen Z sentiment (46%), and record AI fundraising.

Omni skips the re-render — nine demos show the difference

Gemini Omni and 3.5 Flash demo breakdown: nine I/O 2026 clips, scene-preservation vs parallel coding, API availability.

Waymo Now Trains on Interactive Worlds Built From Street View

Genie 3 generates interactive worlds from real Street View geometry. Waymo is already using it for rare-event training.

DeepMind Guides Blind Runners On-Device — No Cloud, No Tether

DeepMind's chest-mounted AI system lets blind runners navigate independently using dual-path on-device inference—no cloud, no tether.

grok-build-0.1 in Kilo Code, No API Key Needed for SuperGrok

SuperGrok and X Premium+ subscribers can now authenticate into Kilo Code and run grok-build-0.1 inside VS Code or JetBrains — no API key management required.

xAI grok-build-0.1 API Public Beta: Token Costs and SDK Support

xAI's coding model exits the $299 CLI gate. Here's what the public API beta actually offers developers.

Cohere's First Frontier Model Has Benchmark Gaps

Cohere's first open-weight frontier model: benchmark gaps, native citation design, and the enterprise sovereignty case.

What grok-build-0.1's Caching Incident Revealed

xAI's grok-build-0.1 hit public beta in May 2026. Here's what the spec says — and what the caching incident revealed.

Claude Opus 4.8 Hits 69.2% SWE-Bench Pro — What Else Changed

Anthropic ships Opus 4.8 with 69.2% SWE-Bench Pro, mid-conversation system messages, and adaptive thinking.

Gemini 3.5 Flash Goes GA With a Breaking thinking_level Change

Gemini 3.5 Flash is GA: 1M-token context, a breaking thinking_level change, and full pricing breakdown.

Street View-Based Genie 3, Already in Production Use at Waymo

Genie 3 generates interactive worlds from real Street View geometry. Waymo is already using it for rare-event training.

AI That Guides Blind Runners With No Cloud and No Tether

DeepMind's chest-mounted AI system lets blind runners navigate independently using dual-path on-device inference—no cloud, no tether.

SuperGrok Subscribers Can Now Use grok-build-0.1 Without an API Key

SuperGrok and X Premium+ subscribers can now authenticate into Kilo Code and run grok-build-0.1 inside VS Code or JetBrains — no API key management required.

xAI grok-build-0.1 API Public Beta: Token Costs and SDK Support

xAI's coding model exits the $299 CLI gate. Here's what the public API beta actually offers developers.

Command A+: Why Enterprises Would Choose It Despite Benchmark Gaps

Cohere's first open-weight frontier model: benchmark gaps, native citation design, and the enterprise sovereignty case.

grok-build-0.1: What the Caching Incident Revealed About the Real Spec

xAI's grok-build-0.1 hit public beta in May 2026. Here's what the spec says — and what the caching incident revealed.

Claude at 69.2% on SWE-Bench Pro: Will Agentic Coding Change?

Anthropic ships Opus 4.8 with 69.2% SWE-Bench Pro, mid-conversation system messages, and adaptive thinking.

Gemini 3.5 Flash GA: thinking_level Breaks Existing Code

Gemini 3.5 Flash is GA: 1M-token context, a breaking thinking_level change, and full pricing breakdown.