Creeta — AI developer tools & ecosystem news

GPT-5 spotted its evaluator mid-test and modified its behavior

OpenAI's 2026 AI evaluation playbook: three claim types, harness standards, sandbagging and reward hacking disclosures.

'Gemini Omni 3.5' doesn't exist. Here's the real split.

SDK setup, video generation calls, and conversational editing for Gemini Omni — Google's new world model from I/O 2026.

What openai-codex Beta Gets Wrong on First Install

Official openai-codex first beta: how to pin v0.1.0b1, start a thread, and avoid the beta quirks. Released May 28 2026.

What langchain-fireworks 1.4.x Changed for Your Code

What the 1.4.x patch sequence changed — and a runnable ChatFireworks setup from scratch.

Opus 4.8 Thinking Blocks Were Silently Corrupting on Retry

Thinking blocks on Opus 4.8 were corrupting on retry. v2.1.156 is the hotfix — update, verify, and see what else landed.

Your Claude Code Skills Now Hot-Reload Without Restart

Claude Code v2.1.157 adds .claude/skills/ live-loading, worktree unlocking, and OTEL telemetry. Annotated guide.

openai-codex b2 Has a Renamed Config Class Worth Knowing

v0.1.0b2 ships named Sandbox presets and a renamed config class. A runnable walkthrough from pip install to first thread.

Claude Code Now Gates Execution on Bedrock and Vertex

v2.1.158 enables classifier-gated execution on managed inference platforms. Here's the env var, what it does, and what to verify before upgrading.

Why langchain-perplexity 1.3.1 Dropped Its SSE Shim

1.3.0 added use_responses_api for Perplexity's Responses endpoint; 1.3.1 removed the SSE shim that 0.34.0 required.

459 Commits Into vLLM 0.22.0 — What Moves the Needle

459 commits, a dedicated DeepSeek V4 package, Rust frontend, and an rc0 that's one CI fix. What matters and what doesn't.

OpenAI's Rosalind Is Free for Public Health — Still Gated

OpenAI extends GPT-Rosalind to vetted public health orgs — free, sponsored, and still gated from general use.

Devin's $26B Raise Changed How AI Coding Agents Compare

From GitHub Agent HQ to Devin's $26B raise: a technical breakdown of what changed in AI coding agents in 2026.
