
The Great Toil Shift: AI Didn't Remove Your Drudge Work, It Moved It
- 4 minutes - Mar 5, 2026
- #ai #technical-debt #developer-experience #productivity #process
One of the clearest promises of AI coding tools was relief from developer toil: the repetitive, low-value work—debugging boilerplate, writing tests for obvious code, fixing the same style violations—that keeps engineers from doing the interesting parts of their jobs. The premise was simple: AI does the tedious parts, humans do the creative parts.
The data from 2026 tells a more nuanced story. According to Sonar’s analysis and Opsera’s 2026 AI Coding Impact Benchmark Report, the amount of time developers spend on toil hasn’t decreased meaningfully. It’s shifted. High AI users spend roughly the same 23–25% of their workweek on drudge work as low AI users—they’ve just changed what they’re doing with that time.
From Debugging to Managing AI Output
Before AI tools, developer toil was concentrated in debugging, repetitive refactoring, and writing boilerplate. After AI adoption, that toil has largely moved to:
- Managing technical debt introduced by AI-generated code
- Rewriting AI output that was wrong, incomplete, or written in another codebase's idiom
- Reviewing AI-generated PRs at higher volume (and higher defect rates)
- Debugging issues caused by AI code that appeared correct but had subtle logic errors
This is what Sonar calls “the great toil shift.” The work didn’t disappear—the conveyor belt moved the pile. And for many developers, the new pile feels worse: debugging AI-generated code you didn’t write and don’t fully understand is more cognitively taxing than debugging code you wrote yourself.
GenAI-Induced Technical Debt Is Now a Research Category
Researchers analyzing 6,540 LLM-referencing comments in GitHub repositories identified a new category of self-admitted technical debt they named GIST: GenAI-Induced Self-Admitted Technical Debt. Among the 81 concrete code comments they analyzed in depth, the most common patterns were:
- Postponed testing: “AI wrote this, tests TBD”
- Incomplete adaptation: Code ported from a generic pattern that doesn’t fully fit the codebase
- Limited understanding: Explicit comments noting the developer doesn’t understand why the AI-generated code works
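Because these patterns live in comments, they are cheap to surface in your own repos. Here is a minimal sketch of the kind of comment scan the research describes, using invented marker keywords and classification rules (the study's actual methodology was more involved; every regex here is an illustrative assumption):

```python
import re

# Hypothetical markers for LLM-referencing comments; tune for your repos.
LLM_MARKERS = re.compile(r"\b(copilot|chatgpt|gpt|llm|ai)\b", re.I)

# Illustrative keyword rules for the three GIST patterns above.
GIST_PATTERNS = {
    "postponed_testing": re.compile(r"\btests?\s+(tbd|todo|later)\b|\buntested\b", re.I),
    "incomplete_adaptation": re.compile(r"\badapt\b|doesn'?t\s+fit|\bgeneric\b", re.I),
    "limited_understanding": re.compile(r"not\s+sure\s+why|don'?t\s+understand", re.I),
}

def classify_comment(comment: str) -> list[str]:
    """Return GIST categories for an LLM-referencing comment, else []."""
    if not LLM_MARKERS.search(comment):
        return []
    return [name for name, pat in GIST_PATTERNS.items() if pat.search(comment)]

print(classify_comment("# AI wrote this, tests TBD"))
print(classify_comment("# Copilot suggestion, not sure why it works"))
print(classify_comment("# plain old human TODO"))
```

Running the scan over a repo's comment corpus gives a rough GIST inventory; anything in the `limited_understanding` bucket is a candidate for immediate review, for the reasons discussed next.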
That last category is particularly striking. In a traditional codebase, self-admitted technical debt usually represents a conscious shortcut: “I know how this should work; I didn’t have time to do it right.” GIST is different—it represents genuinely unknown terrain, where the developer can’t assess what they don’t know about their own code. That’s a harder class of debt to pay down.
The 88%/93% Split
Sonar surveyed developers on AI’s impact on technical debt and got a revealing result: 88% report at least one negative impact (most commonly: AI creates code that appears correct but isn’t), and 93% report at least one positive impact (most commonly: AI improves documentation). The simultaneous truth—AI is both helping and hurting on technical debt—explains why teams struggle to assess their net position.
The negative side has teeth: AI-generated code introduces 15–18% more security vulnerabilities, and the Opsera report found that teams without active debt management increased their technical debt by 30–41% after AI adoption. That’s not a rounding error; it’s a structural drag that compounds over time.
Why “Use AI More” Doesn’t Solve This
The instinctive response to AI-induced toil is “better prompts” or “more mature tooling.” Those help at the margins. But the toil shift is structural, not a prompting failure. AI models don’t know your codebase, your team’s implicit conventions, or the history of why a given module looks the way it does. Without that context, they produce code that works in isolation and creates friction in integration.
The fixes are also structural:
Invest in codebase context. Tools that give AI models access to your architecture, naming conventions, and past decisions—through AGENTS.md files, project-specific instructions, or repo-specialized models like SERA—reduce the “generic code that doesn’t fit” problem. The less AI has to guess about your system, the less adaptation debt it creates.
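What that context can look like in practice: a hypothetical AGENTS.md fragment that spells out the conventions an AI tool would otherwise have to guess (paths and rules here are invented for illustration):

```markdown
# AGENTS.md (example)

## Conventions
- Services live under `internal/services/`, one package per domain.
- Errors are wrapped with context, never swallowed.
- New endpoints need a table-driven test beside the handler.

## History worth knowing
- `billing/legacy.go` predates the v2 API. Do not extend it;
  route new work through `billing/v2`.
```

The "history worth knowing" section is the part generic models can never infer on their own, and it is exactly where incomplete-adaptation debt tends to originate.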
Measure debt-per-AI-PR separately. If you can’t see where technical debt is coming from, you can’t address it systematically. Tracking defect density and rework rates for AI-generated vs. human-generated code gives you data to work with.
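As a minimal sketch of that separation, assuming your review tooling can tag each merged PR's origin and count defects later traced back to it (the field names are hypothetical):

```python
from collections import defaultdict

def defect_density(prs: list[dict]) -> dict[str, float]:
    """Defects per 1,000 changed lines, split by PR origin."""
    defects: dict[str, int] = defaultdict(int)
    lines: dict[str, int] = defaultdict(int)
    for pr in prs:
        defects[pr["origin"]] += pr["defects"]
        lines[pr["origin"]] += pr["lines_changed"]
    return {origin: 1000 * defects[origin] / lines[origin] for origin in lines}

# Illustrative data, not real benchmark numbers:
prs = [
    {"origin": "ai", "lines_changed": 800, "defects": 4},
    {"origin": "ai", "lines_changed": 1200, "defects": 6},
    {"origin": "human", "lines_changed": 1000, "defects": 3},
]
print(defect_density(prs))
```

Even a crude split like this turns "AI code feels buggier" into a number you can track release over release.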
Budget for AI code maintenance, not just AI code generation. The efficiency gains of AI generation are real but front-loaded. The cost shows up later in review, rework, and eventual refactoring. If your velocity metrics only capture the generation side, you’re optimizing for the wrong thing.
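One way to make that concrete is to discount raw output by the downstream work it generates. The metric below is an illustrative sketch, not an industry standard, and the cost split is invented for the example:

```python
def net_velocity(generated_loc: int, rework_loc: int, review_hours: float,
                 hours_per_100_loc: float = 1.0) -> float:
    """Surviving lines of code per hour of total effort.

    Illustrative: raw output minus rework, divided by generation
    effort (assumed hours per 100 LOC) plus review/rework hours.
    """
    surviving = generated_loc - rework_loc
    effort = generated_loc / 100 * hours_per_100_loc + review_hours
    return surviving / effort

# The same 2,000 LOC looks very different once rework and review count:
print(net_velocity(2000, 0, 0))       # generation-only view
print(net_velocity(2000, 600, 15.0))  # after rework and review time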
The great toil shift doesn’t make AI coding tools a bad investment—the generation speed is genuinely useful. But it’s a corrective to the narrative that AI removes drudge work. It moves it. Building with that understanding changes how you plan, measure, and maintain a codebase in 2026.