News & Releases
Latest AI model launches, API updates, and developer-tool releases — curated for engineers and technical founders.
178 desk rejections on a parameter authors never saw
NeurIPS 2026 desk-rejected 18% of position papers via Pangram 3.3.2 — methodology, calibration, June 15 deadline.
SynthID runs in ChatGPT. A blank result proves nothing.
Two provenance layers now cover Search, Gemini, Chrome, and Pixel. What each catches, where both fail, and what the new Cloud detection interface gives developers.
CMG sold email lists. They called it AI voice targeting.
FTC's $930K proposed consent order against Cox Media Group exposes how 'Active Listening' AI ad targeting was repackaged email lists — and why buried ToS can't substitute for voice-data consent.
181 Firefox exploits, no mandatory submission: Trump's AI EO
Trump's June 2026 EO sets a voluntary 30-day AI cyber review. What labs submit, who grades it, and what Mythos proved.
MAI Is Already in Your IDE — the Flagship Is Still Gated
Seven MAI models at Build 2026: what's live in Copilot, what's gated, and what Microsoft's technical report claims — all vendor-reported until verified.
MAI-Thinking-1 beats Anthropic's top model — per Microsoft
Seven MAI models at Build 2026. MAI-Thinking-1 is a 35B-active sparse MoE — specs, claimed scores, and what's still unverified externally.
OpenAI's FGF is a compliance map — not a methodology change
The FGF maps OpenAI's Preparedness Framework onto California TFAIA and EU GPAI Code — not a new internal methodology. Published May 28, 2026.
Meta's always-on pendant will record everyone in the room — not just you
An internal Alex Himel memo, reported by The Information, reveals Meta's AI pendant roadmap: ambient audio capture, real-time transcription, and a Wearables for Work subscription tier — built on the Limitless acquisition.
3,000 tok/s on MI300X by deleting the kernel scheduler
Kog AI monokernel: ~3,000 tok/s on AMD MI300X by eliminating kernel launches. Technical read with caveats.
Windsurf is Devin now — Cascade retires July 1
Cognition renamed Windsurf to Devin Desktop June 2. What changed, what broke, and what IT admins need to do now.
4 GitHub stars, voice interviews with Ollama: that's GrillKit
Apache 2.0 interview trainer with Whisper voice input, Ollama or cloud LLM support, and local session history. No SaaS, no registration required.
RDNA3 cuts llama.cpp KV VRAM 47% — and CUDA has no equivalent
RDNA3 bit-packing cuts llama.cpp KV VRAM 47% on RX 7900. Flags, VRAM math, and TurboQuant for 4.9× compression.
NodeCartel is dark. Cross-host AI orchestration: who delivers.
NodeCartel is unreachable. Kore.ai, CrewAI Cloud, Northflank, and AgentNode Pro compared for cross-host AI scheduling.
17k tokens → 1.4k — Headroom keeps the originals retrievable
Open-source context compression middleware for agent pipelines: 60–95% token cuts, CCR reversibility, AST-aware engines.
Cognition's $26B needs $1B ARR by December. The math is tight.
$26B valuation on $492M ARR: Cognition's Series D metrics, the Windsurf attribution question, and the $1B ARR target.
Booed at graduation — the AI skeptics you'll be shipping to
MIT Technology Review's May 2026 Hype Index covers graduation boos, Gen Z sentiment (46%), and record AI fundraising.
NVIDIA cut Qwen3.6-35B 3×. Accuracy barely moved.
NVIDIA's NVFP4 Qwen3.6-35B checkpoint on HuggingFace: 3.06× memory reduction, <1% accuracy loss, Blackwell-native, vLLM flags included.
Overslash holds the credentials. Your AI only gets a handle.
Overslash injects secrets by handle at the gateway, limits blast radius per agent, and escalates out-of-scope calls to human approval. Free self-hosted or €3/seat cloud.
Harness edits slowed the GPU kernel 10×. Weights fixed it.
SIA edits its scaffold and fine-tunes weights via LoRA — 70.1% LawBench, 12.4% faster GPU kernels, MIT-licensed.
Omni skips the re-render — nine demos show the difference
Gemini Omni and 3.5 Flash demo breakdown: nine I/O 2026 clips, scene-preservation vs parallel coding, API availability.
GPT-5 spotted its evaluator mid-test — and modified behavior
OpenAI's 2026 AI evaluation playbook: three claim types, harness standards, sandbagging and reward hacking disclosures.
Why langchain-perplexity 1.3.1 Dropped Its SSE Shim
1.3.0 added use_responses_api for Perplexity's Responses endpoint; 1.3.1 removed the SSE shim 0.34.0 required.
459 Commits Into vLLM 0.22.0 — What Moves the Needle
459 commits, a dedicated DeepSeek V4 package, Rust frontend, and an rc0 that's one CI fix. What matters and what doesn't.
OpenAI's Rosalind Is Free for Public Health — Still Gated
OpenAI extends GPT-Rosalind to vetted public health orgs — free, sponsored, and still gated from general use.
Google Beam's 3D Group Calls Need a $24,999 Display to Work
Google I/O 2026 extended Beam to multi-person calls. Here's the AI pipeline, the $24,999 display, and where the gaps are.
Gemini's ERA Model Is Now Outrunning CDC Disease Forecasts
Google's I/O 2026 AI research suite: literature triage, hypothesis tournaments, and ERA outperforming CDC forecasts.
Shanghai Is Prototyping Forward Contracts on LLM Costs
Shanghai Futures Exchange is prototyping AI token futures — forward contracts on LLM consumption costs. Here's the technical picture.
DiffusionBlocks Cuts Training Memory B× Without Accuracy Loss
DiffusionBlocks trains one residual block per step, reducing activation memory B× with competitive or better accuracy.
Robinhood Opened Real Stock Trades to MCP-Compatible AI Agents
Robinhood opened its brokerage and card infrastructure to MCP-compatible AI agents. Here's what the implementation looks like technically.
ChatPerplexity Auto-Routes to Real-Time Search in LangChain
ChatPerplexity gains use_responses_api in 1.3.0: auto-routes to Perplexity's Agent API for real-time search.
Inject Constraints Mid-Run Without Breaking the Prompt Cache
Mid-conversation constraint injection in v0.105.0 preserves prompt cache continuity across long inference runs.
Copilot Cowork Silently Exfiltrates SharePoint — No Patch Yet
A 5-line poisoned Skills script silently exfiltrates SharePoint data via Copilot Cowork — no approval gate, no CVE, no patch.
Waymo Now Trains on Interactive Worlds Built From Street View
Genie 3 generates interactive worlds from real Street View geometry. Waymo is already using it for rare-event training.
DeepMind Guides Blind Runners On-Device — No Cloud, No Tether
DeepMind's chest-mounted AI system lets blind runners navigate independently using dual-path on-device inference—no cloud, no tether.
Anthropic SDK 0.105.0 Needed Two Hotfixes — What to Pin
Two rapid patches followed Anthropic's 0.105.0 drop. Here's what broke, why, and which version to pin.
MCP Credential Leak Closed in Claude Code's Busiest Week Yet
Seven builds in one week: four Bash/PowerShell sandbox bugs patched, /code-review --fix lands auto-apply, and a serious MCP auth credential leak is closed.
What Gemini's Three I/O 2026 Research Tools Actually Do
Three experimental AI research tools launched at I/O 2026. What Literature Insights, Co-Scientist, and AlphaEvolve each actually do.
Docs Live and Gmail Live Are Real — Here's Who Gets Them First
Docs Live, Gmail Live, Gemini Spark, Sheets one-shot: I/O 2026 Workspace features and who gets access first.
Anthropic 0.105.0 Adds Output Attribution — What It Buys You
v0.105.0 adds granular output-type attribution and configurable upload caps—here's what they do and when to use them.
vLLM v0.21.0 Production Update: KV Offload and Multi-Server Port Bug
v0.22.0 doesn't exist yet. v0.21.0 ships KV offload, spec decode, and a multi-server port bug still under review.
The Claude Code Sprint That Patched Four Security Holes
Ten patches in nine days: pinned sessions, four security fixes, /code-review --fix, and skill-level tool gating.
grok-build-0.1 in Kilo Code, No API Key Needed for SuperGrok
SuperGrok and X Premium+ subscribers can now authenticate into Kilo Code and run grok-build-0.1 inside VS Code or JetBrains — no API key management required.
Codex CLI 0.134.0 and 0.135.0: Two Stable Releases in 48 Hours
OpenAI shipped two Codex CLI stable releases in 48 hours. What changed, what broke, and why the cadence matters.
Anthropic Python SDK 0.105: Opus 4.8 and Mid-Session System Prompts
Three SDK releases in 7.5 hours ship claude-opus-4-8 support, mid-conversation system blocks, and finer output usage reporting.
xAI grok-build-0.1 API Public Beta: Token Costs and SDK Support
xAI's coding model exits the $299 CLI gate. Here's what the public API beta actually offers developers.
Grok Build Lands in OpenCode and Kilo Code: xAI's 13-Day Rollout
xAI shipped grok-build-0.1 to three developer tools in 13 days. Here's what each integration covers and how to pick the right surface.
What Codex CLI's 0.135.0 'Stable' Release Actually Fixed
OpenAI's 0.135.0 stable is a diagnostics and polish cycle. What moved in the TUI, Vim mode, and remote transport.
Cohere's First Frontier Model Has Benchmark Gaps
Cohere's first open-weight frontier model: benchmark gaps, native citation design, and the enterprise sovereignty case.
What grok-build-0.1's Caching Incident Revealed
xAI's grok-build-0.1 hit public beta in May 2026. Here's what the spec says — and what the caching incident revealed.
Claude Opus 4.8 Hits 69.2% SWE-Bench Pro — What Else Changed
Anthropic ships Opus 4.8 with 69.2% SWE-Bench Pro, mid-conversation system messages, and adaptive thinking.
Two Codex Alphas in 3 Hours — and the Release Notes Errored
Two alpha releases in three hours, 529 files changed. Here's what the diff says when the release notes page errors.
Anthropic Pays a Rival $1.25B/Month. Claude Rate Limits Jump.
Anthropic buys exclusive access to xAI's Colossus 1 cluster: 220K GPUs, $1.25B/month, and immediate Claude rate limit increases.
Codex Gets an On-Prem Path. Dell Is the First Stop.
OpenAI named Dell as its first non-hyperscaler Codex deployment path. Here's how the architecture actually works and who it targets.
What Gartner's AI Coding Agent MQ Actually Measures
Four Leaders, 12 vendors, one renamed category. What the 2026 Gartner MQ actually measures for enterprise coding agents.
A Crafted Host Header Bypasses Auth in Your AI Agent Stack
Starlette BadHost (CVE-2026-48710): a crafted Host header bypasses auth middleware. Unproxied AI agents at highest risk.
xAI's Coding Agent Reads Your CLAUDE.md. Should You Use It?
xAI's Grok Build ships with Arena Mode, reusable Skills, and CLAUDE.md compat. Here's what developers need to know.
Codex CLI 0.134.0 Kills Your Legacy Profile Config
v0.134.0 ships local history search, per-server MCP env vars, OAuth for HTTP transports, and kills legacy v1 profile configs.
Meta Gates Llama Compute. What $19.99/Month Buys Developers.
Meta's first paid AI tiers arrive at $7.99 and $19.99/month. Here's what compute gating on Llama means for developers.
How Robinhood Sandboxes an Agent That Can Move Your Money
Robinhood's MCP agentic trading beta: sandbox isolation, guardrails, and developer implications.
Do Grok Build's SWE-Bench Claims Actually Hold Up?
xAI shipped its terminal coding agent on May 14, 2026. Here's what the CLI actually does, where the benchmark numbers hold, and what $299/month buys.
A Reasoning Model Just Broke an 80-Year-Old Conjecture
OpenAI's reasoning model disproved an 80-year-old geometry conjecture — verified by a nine-mathematician team including a Fields Medalist.
How a Poisoned OneDrive File Silently Pulls Your M365 Data
PromptArmor shows how a poisoned SKILL.md in OneDrive lets attackers silently pull M365 files — no approval dialog, no user alert.
Netflix's Two Silent AI Units Form a Full Animation Pipeline
Netflix quietly built two AI production units in March 2026. Here's how INKubator and InterPositive map together as an end-to-end pipeline.
Why Illinois SB 315 Is the Strictest of the Three New AI Laws
Illinois SB 315 goes further than CA and NY with mandatory third-party audits. Here's how the three laws differ and what it means for developers.
What Google's Managed Agents API Actually Gives Developers
One API call provisions a hosted Linux agent with persistent state and GCS mounts. Here's what developers need to know.
The Inference Economics Behind Mistral's Custom Silicon Hint
Arthur Mensch hinted at chip design. Here's the inference economics behind the signal and why the startup feasibility gap is real.
vLLM RC3 Fixes a Hard-Coded 60s Timeout — What to Configure
RC3 patches a hard-coded 60s startup timeout in vLLM's multi-API-server subsystem — here's what changed and what operators must configure.
One FTC Case Now Sets the Bar for Every AI Marketing Claim
The CMG Active Listening case sets the FTC's bar for AI capability and consent claims. What dev teams need to know.
BadHost's CVSS 6.5 Understates the Real Risk for MCP Servers
CVSS 6.5 misses the mark. Why MCP servers and proxy-less AI agent stacks face disproportionate exposure from BadHost.
Google AI Mode Is 60% Zero-Click — What the Agent Data Reveals
I/O 2026 data shows 3× longer queries, 60% zero-click rate, and a new class of background agents. Here's the architecture.
Netflix Launched an AI Studio Without a Press Release
Netflix's AI animation studio emerged from job listings, not PR. Here's what the hiring data reveals about the pipeline architecture.
Google AI Mode Now Spawns Background Agents — At 1B Users
AI Mode crossed 1B users at I/O 2026. Queries are 3× longer, background agents go live this summer. Here's what structurally changed.
SB 315 Passed 110-0 — Five Developer Obligations Before 2028
SB 315 passed 110-0. Who the $500M threshold covers, what five obligations apply, and when enforcement starts.
Gemini 3.5 Flash Goes GA With a Breaking thinking_level Change
Gemini 3.5 Flash is GA: 1M-token context, a breaking thinking_level change, and full pricing breakdown.
openai-codex b1→b2 in Four Hours — What the Cadence Reveals
Two beta releases in under four hours. Here's what the b1→b2 patch cadence tells developers about SDK maturity and what to pin.
The $24,999 Display and What Google Beam Still Can't Do
Google I/O 2026 extended Beam to multi-person calls. Here's the AI pipeline, the $24,999 display, and where the gaps are.
The Actual Performance of Gemini ERA, Which Beat CDC Forecasts
Google's I/O 2026 AI research suite: literature triage, hypothesis tournaments, and ERA outperforming CDC forecasts.
Shanghai's Experiment in Hedging LLM Costs With Futures
Shanghai Futures Exchange is prototyping AI token futures — forward contracts on LLM consumption costs. Here's the technical picture.
Why Accuracy Holds When Training Just One Block at a Time
DiffusionBlocks trains one residual block per step, reducing activation memory B× with competitive or better accuracy.
How AI Agents Actually Trade Directly on Robinhood
Robinhood opened its brokerage and card infrastructure to MCP-compatible AI agents. Here's what the implementation looks like technically.
ChatPerplexity 1.3.0 Gains Automatic Routing to Real-Time Search
ChatPerplexity gains use_responses_api in 1.3.0: auto-routes to Perplexity's Agent API for real-time search.
Changing Constraints Mid-Conversation Doesn't Break the Prompt Cache
Mid-conversation constraint injection in v0.105.0 preserves prompt cache continuity across long inference runs.
A 5-Line Script Silently Exfiltrates SharePoint
A 5-line poisoned Skills script silently exfiltrates SharePoint data via Copilot Cowork — no approval gate, no CVE, no patch.
Genie 3, Built on Street View and Already in Production Use at Waymo
Genie 3 generates interactive worlds from real Street View geometry. Waymo is already using it for rare-event training.
The AI That Guides Blind Runners With No Cloud and No Tether
DeepMind's chest-mounted AI system lets blind runners navigate independently using dual-path on-device inference—no cloud, no tether.
Why an Anthropic SDK Release Broke PyPI Publishing
Two rapid patches followed Anthropic's 0.105.0 drop. Here's what broke, why, and which version to pin.
Claude Code's MCP Credential Leak Has Been Patched
Seven builds in one week: four Bash/PowerShell sandbox bugs patched, /code-review --fix lands auto-apply, and a serious MCP auth credential leak is closed.
AlphaEvolve and Co-Scientist: Do They Work as Announced?
Three experimental AI research tools launched at I/O 2026. What Literature Insights, Co-Scientist, and AlphaEvolve each actually do.
Google Workspace Live: The Feature Access Order Is Set
Docs Live, Gmail Live, Gemini Spark, Sheets one-shot: I/O 2026 Workspace features and who gets access first.
Anthropic SDK Output Attribution: What Actually Changes in Your Code
v0.105.0 adds granular output-type attribution and configurable upload caps—here's what they do and when to use them.
vLLM's Latest Is v0.21.0, and the Port Bug Is Still Unresolved
v0.22.0 doesn't exist yet. v0.21.0 ships KV offload, spec decode, and a multi-server port bug still under review.
Claude Code Closed Four Security Holes in Nine Days
Ten patches in nine days: pinned sessions, four security fixes, /code-review --fix, and skill-level tool gating.
SuperGrok Subscribers Can Now Use grok-build-0.1 Without an API Key
SuperGrok and X Premium+ subscribers can now authenticate into Kilo Code and run grok-build-0.1 inside VS Code or JetBrains — no API key management required.
Codex CLI 0.134.0 & 0.135.0: Two Stable Releases Within 48 Hours
OpenAI shipped two Codex CLI stable releases in 48 hours. What changed, what broke, and why the cadence matters.
Anthropic Python SDK 0.105: Opus 4.8 and Mid-Session System Prompts
Three SDK releases in 7.5 hours ship claude-opus-4-8 support, mid-conversation system blocks, and finer output usage reporting.
xAI grok-build-0.1 API Public Beta: Token Costs and SDK Support
xAI's coding model exits the $299 CLI gate. Here's what the public API beta actually offers developers.
Grok Build Lands in OpenCode and Kilo Code: xAI's 13-Day Rollout
xAI shipped grok-build-0.1 to three developer tools in 13 days. Here's what each integration covers and how to pick the right surface.
Codex CLI Gets a Doctor, and the TUI and Vim Mode Changed Too
OpenAI's 0.135.0 stable is a diagnostics and polish cycle. What moved in the TUI, Vim mode, and remote transport.
Command A+: Why Enterprises May Choose It Despite Benchmark Gaps
Cohere's first open-weight frontier model: benchmark gaps, native citation design, and the enterprise sovereignty case.
grok-build-0.1: What the Caching Incident Revealed About the Real Spec
xAI's grok-build-0.1 hit public beta in May 2026. Here's what the spec says — and what the caching incident revealed.
Claude at 69.2% on SWE-Bench Pro: Will Agentic Coding Change?
Anthropic ships Opus 4.8 with 69.2% SWE-Bench Pro, mid-conversation system messages, and adaptive thinking.
Codex CLI Alpha: What's in the 529 Files Behind the Release Notes Error
Two alpha releases in three hours, 529 files changed. Here's what the diff says when the release notes page errors.
Claude Rate Limits Rose Immediately After the Exclusive xAI Colossus Deal
Anthropic buys exclusive access to xAI's Colossus 1 cluster: 220K GPUs, $1.25B/month, and immediate Claude rate limit increases.
OpenAI Codex: How Deployment on Dell Works Without the Cloud
OpenAI named Dell as its first non-hyperscaler Codex deployment path. Here's how the architecture actually works and who it targets.
Starlette BadHost Bypasses Auth on Unproxied AI Agents
Starlette BadHost (CVE-2026-48710): a crafted Host header bypasses auth middleware. Unproxied AI agents at highest risk.
Netflix's GenAI Studio Was Revealed by Job Listings, Not a Press Release
Netflix's AI animation studio emerged from job listings, not PR. Here's what the hiring data reveals about the pipeline architecture.
Google AI Mode at 1 Billion Users: What Changes for Developer Code?
AI Mode crossed 1B users at I/O 2026. Queries are 3× longer, background agents go live this summer. Here's what structurally changed.
Illinois AI Bill: Obligations Change at $500M in Revenue
SB 315 passed 110-0. Who the $500M threshold covers, what five obligations apply, and when enforcement starts.
Gemini 3.5 Flash Is GA, and thinking_level Breaks Existing Code
Gemini 3.5 Flash is GA: 1M-token context, a breaking thinking_level change, and full pricing breakdown.
openai-codex Re-Patched Within Four Hours: How to Read SDK Maturity
Two beta releases in under four hours. Here's what the b1→b2 patch cadence tells developers about SDK maturity and what to pin.