The New York Times reports that the rapid adoption of AI coding tools has let workers generate massive volumes of code, leaving companies scrambling to review and secure what the tools produced. In January, Anthropic published research showing that the biggest skill decline from using AI coding tools occurs in debugging — the exact skill that code review requires. The tools that created the volume degraded the capacity to review it.

The Design

The problem AI coding tools were built to solve was real. Developers spent too much time on routine code — boilerplate, tests, documentation, standard patterns. The work was tractable but slow. The solution was autocomplete at the scale of entire functions, then files, then projects. GitHub Copilot launched in June 2021. Within months, 30% of new code on GitHub was being written with Copilot's assistance. The premise was straightforward: reduce the time between conception and implementation. Get the code written. Everything downstream would benefit.

The premise delivered. Developers adopted AI coding tools faster than any previous developer technology. By 2025, 84% were using or planning to use AI assistants. Startups built entire companies around AI-assisted development. The tools became infrastructure. And the code kept coming.

The Warning

In December 2022, a Stanford study found that developers using GitHub Copilot produced less secure code than those who didn't. Not marginally less secure — systematically. The AI suggested plausible patterns that introduced real vulnerabilities. The warning was published, widely cited, and did not slow adoption. The tools were too useful to slow.

In June 2023, the Wall Street Journal reported that IT executives were warning about a different problem: generative AI "lowers the barrier for code creation," which could "result in growing levels of complexity, technical debt, and confusion as they try to manage a ballooning pile of software." The volume concern preceded the volume by two years.

The Bottleneck Moves

By mid-2025, the pile had arrived. Ox Security, which scans code for vulnerabilities, reported it was scanning over 100 million lines per day — not just more code, but a new category of code with a systematically different security profile than what human developers write. The review infrastructure was being built in parallel with the production infrastructure, but the production side was faster.

The Bloomberg report from March 2026 named what was happening: AI coding agents had kicked off a "productivity panic" among executives. The UCB study it cited found that developers using AI tools were working longer hours, not shorter ones. The tool that was supposed to reduce developer workload had increased it.

Amazon's trajectory tells the story: four months of steering developers toward AI tools, then an internal memo restricting junior and mid-level engineers from AI-assisted code changes after a "trend of incidents." On the same day — March 10, 2026 — Anthropic launched Code Review for Claude Code: an AI agent that reviews developer pull requests for bugs. A typical review costs $15 to $25 in token usage. AI to review what AI wrote.

The Skill That Declined

In January 2026, Anthropic published research on how AI coding tools affect developer skills. The findings: AI can speed up some tasks by 80%, and the biggest performance decline for developers who use AI tools occurs in debugging — reading code, tracing failures, understanding what a system is actually doing as opposed to what it was supposed to do.

The volume of code increased. The capacity to review that code — the debugging skill that code review requires — decreased. The tools did not just move the bottleneck. They degraded the human capability that the new bottleneck demands.

The tools made developers faster at production and slower at comprehension. The debt that accumulated is not in the code. It is in the understanding.

What Cognitive Debt Is

The term emerged in February 2026 from Margaret-Anne Storey's research: as AI and agents accelerate development, "cognitive load and cognitive debt are likely to become bigger threats to developers than technical debt." Technical debt is code that was written to work but not to last — shortcuts that accumulate interest. Cognitive debt is something different: code that was written, reviewed, and shipped, but that nobody fully understands. The debt is in the gap between what the system does and what the team knows it does.

A developer who writes a function understands it — the edge cases, the assumptions baked into the logic, the reason a particular approach was taken instead of the obvious one. An AI that generates a function has no such knowledge. It produces output that is syntactically correct and often functionally correct. And the developer who deploys it, under pressure to ship, in an environment where review is automated, inherits a system whose internal logic they cannot fully reconstruct.
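The gap is easy to illustrate with a sketch. The function below is hypothetical, not drawn from any study — but it is the kind of output AI tools produce constantly: syntactically correct, passing the obvious test, and carrying two unstated assumptions that only surface at debugging time.

```python
def deduplicate(items):
    """Return the unique items from a list.

    Plausible generated code: correct for the common case,
    but it bakes in two assumptions that nothing in the code states.
    """
    # Assumption 1: order does not matter (set() discards input order).
    # Assumption 2: every element is hashable (a list element raises TypeError).
    return list(set(items))


# The obvious test passes, so an automated review waves it through:
assert sorted(deduplicate([3, 1, 3, 2])) == [1, 2, 3]

# The assumptions surface later, in production:
# deduplicate([[1], [2]]) raises TypeError on unhashable elements,
# and any caller that relied on input order gets a reshuffled result.
```

A developer writing this by hand and aware of the ordering requirement would likely have reached for `list(dict.fromkeys(items))` instead, which preserves order — and would know why. That "why" is exactly what the generated version does not carry.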

This is the debt that doesn't appear on the sprint board. It shows up when something breaks and nobody knows why. It shows up in the incidents.

The Bottleneck Returns

GitHub Copilot was designed to eliminate the developer as a bottleneck in software production. It succeeded. The bottleneck moved from writing to reviewing. AI code reviewers moved it again — from reviewing to understanding. And understanding is the one step that cannot be delegated without cost, because it is what makes debugging possible when the system fails.

The developer is the bottleneck again — not for generating code, but for comprehending what the tools produced. A codebase where the institutional knowledge lives not in any developer's head but in a log of prompts, reviewed through automated systems at $25 per pull request. The tools kept their promise. They made it easier to produce software. They made it harder to understand it.

In 2022, a Stanford study found that Copilot users wrote less secure code. Three years later, Anthropic found they lost the skill to debug it. The warning was not wrong. The adoption continued anyway.