Creeta

News & Releases | Dev Tools & SDK Changelogs

17k tokens → 1.4k — Headroom keeps the originals retrievable

Open-source context compression middleware for agent pipelines: 60–95% token cuts, CCR reversibility, AST-aware engines.

Creeta

Jun 01, 2026

News & Releases | Funding, Strategy & Policy

Cognition's $26B needs $1B ARR by December. The math is tight.

$26B valuation on $492M ARR: Cognition's Series D metrics, the Windsurf attribution question, and the $1B ARR target.

Creeta

May 31, 2026

News & Releases | Model & API Releases

Booed at graduation — the AI skeptics you'll be shipping to

MIT Technology Review's May 2026 Hype Index covers graduation boos, Gen Z sentiment (46%), and record AI fundraising.

Creeta

May 31, 2026

Claude / Anthropic | Build & Learn | Daily How-To

Opus 4.8 kills budget_tokens — here's what else moved

Opus 4.8: fast mode, mid-session system prompts, 1K cache floor. Old budget_tokens syntax returns 400.

Creeta

May 31, 2026

Meta / Llama | vLLM / Ollama | Build & Learn | Daily How-To

llama-bench skipped FA on capable GPUs — b9437 corrects it

llama.cpp b9437 (May 30): -fa goes auto, -ngl to -1 in llama-bench. Your pre-b9437 comparisons need a flag audit.

Creeta

May 31, 2026

Build & Learn | Daily How-To

Qwen3.6-35B NVFP4 runs on one H100 — A100 owners are out

FP4-quantized Qwen3.6-35B fits in ~23 GB on Hopper. vLLM serve commands, env vars, DGX Spark config, and gotchas.

Creeta

May 31, 2026

Featured posts

Gemma 4 12B skips the audio encoder. Is 16 GB enough?

Mariner retired at I/O 2026. The VM-backed successor is Antigravity.

Mistral's chip ambition: conditional. Its EU cluster: 44 MW.

178 desk rejections on a parameter authors never saw

v0.22.0: chips sat idle while the frontend stalled

Meta Business AI launches globally: phased rollout, paid tier still undecided
