Your daily AI news digest
Andrej Karpathy, a founding OpenAI researcher who shaped much of the modern LLM training playbook and then spent the past two years building Eureka Labs as a one-person AI education project, is joining Anthropic. Wes Roth's video walks through the move and what it signals about where talent gravity is now strongest in the frontier-lab landscape. The short explanation for the move is Anthropic's safety-forward research culture, the room to publish, and a research agenda that explicitly courts the kind of interpretability work Karpathy has been gesturing toward in his own essays.
The hire matters because Karpathy is not a generic researcher; he is a one-person brand whose endorsements move developer attention. Anthropic gets a magnet for the next round of recruiting and a credible internal voice on training and architecture. OpenAI loses, again, on the perception axis even as it continues to outship on the product axis. The story underneath the story is that the field's top individual contributors increasingly choose the lab whose values they can live with, not the one with the highest compensation ceiling.
The latest alpha of Datasette Agent integrates with the new Datasette 1.0a30 release through a redesigned "Start a new agent chat" entry point in the Jump to menu. The hosted version at agent.datasette.io is now live behind GitHub authentication, giving builders a working reference for how to point an LLM agent at a real SQL database without rolling the runtime themselves.
Nate B. Jones argues the current AI capex cycle is running into industrial and business limits faster than the headlines suggest. Power, fab capacity, memory supply, and enterprise willingness to absorb six-figure agent pilots are all converging into a constraint set that the current valuation models do not yet price in. The framing pairs cleanly with this week's Epoch AI chip-cost data.
Google Cloud COO Francis de Souza makes the case for security-from-the-start in enterprise AI rollouts, while Google itself is fielding scrutiny over API billing surprises and slow credential-revocation timelines. The piece is a useful temperature check on a hyperscaler asking its customers to do what it is still building the muscle to do itself.
Epoch AI's latest data insight finds that high-bandwidth memory rose from a minority share to roughly 63% of total AI chip component spending between Q1 2024 and Q4 2025. Logic die costs stayed roughly flat, and packaging fell as a percentage of the whole. The chart reframes the supply story: the next AI bottleneck is HBM allocation, not GPU dies.
Huang refreshes her widely shared "essential AI skills" tour for the current tool landscape. The list has shifted from prompting basics toward agent orchestration, evaluation discipline, and the production skills that distinguish a working AI engineer from a power user. It is a useful sanity check for anyone scoping a 2026 AI learning plan.
The latest Datasette alpha introduces a Jump-to menu for fast search across databases, tables, and debug pages, plus a plugin hook that lets extensions register their own menu items. It is the structural change that makes Datasette Agent's chat entry point possible, and it is the kind of unsexy plumbing that quietly defines what a tool can do next.
Willison quotes Flask creator Armin Ronacher on the unique frustration of receiving long, plausible, and largely useless AI-generated issue reports on open-source projects. Ronacher argues for a return to short, specific, human-authored bug reports with the actual observation and error included. It is the maintainer-side complaint that pairs with every "AI is great at coding" story.
TechCrunch tests Amazon's Bee, an AI wearable that records and transcribes the conversations around it for later recall. The reviewer finds genuine professional utility, and then immediately runs into the privacy concerns of a device that continuously uploads ambient audio to a cloud the user does not own. The product is shipping; the social contract for it is not.
New research finds LLM coding agents lose roughly 30 percentage points in assertion pass rates as structural requirements accumulate, a phenomenon the authors call "constraint decay." The paper is the rare quantitative answer to the common practitioner complaint that agents do well on toy tasks and fold on real backends.
Reasonix is a terminal-resident coding agent built around DeepSeek as the default model. The pitch is a Claude-Code-shaped workflow at a fraction of the per-token cost. It is also another data point suggesting that the open-weights stack is reaching parity on the agent harness as well as the model.
Inspired by the free PDF re-release of Usborne's "Creepy Computer Games" books, Willison ported the 1983 BASIC game Mad House to interactive JavaScript. It is the kind of weekend project that doubles as a tiny showcase for how an LLM-paired developer attacks a translation task end-to-end.
Karpathy joining Anthropic, Epoch AI's memory-share data, and Nate B. Jones's "wall" thesis all describe the same shift from a model-centric AI economy to an infrastructure-and-talent-centric one. The labs that win the next two years are the ones that lock in scarce HBM, scarce researchers, and scarce enterprise trust before the cycle turns.