The VP Geek Speaks
- Home /
- The VP Geek Speaks

The METR Study One Year Later: When AI Actually Slows Developers
In early 2025, METR (Model Evaluation and Transparency Research) ran a randomized controlled trial that caught the industry off guard. Experienced open-source developers—people with years on mature, high-star repositories—were randomly assigned to complete real tasks either with AI tools (Cursor Pro with Claude) or without. The result: with AI, they took 19% longer to finish. Yet before the trial they expected AI to make them about 24% faster, and after it they believed they’d been about 20% faster. A 39-point gap between perception and reality.

Getting Your Team Unstuck: A Manager's Guide to AI Adoption
You’ve got AI tools in place. You’ve encouraged the team to use them. But the feedback is lukewarm or negative: “We tried it.” “It’s not really faster.” “We don’t see the benefit.” As a manager, you’re stuck between leadership expecting ROI and a team that doesn’t feel it.
The way out isn’t to push harder or to give up. It’s to change how you’re leading the adoption: create safety to experiment, narrow the focus so wins are visible, and align incentives so that “seeing benefits” is something the team can actually achieve. This guide is for engineering managers whose teams are struggling to see any performance benefits from AI in their software engineering workflows—and who want to turn that around.

When AI Slows You Down: Picking the Right Tasks
One of the main reasons teams don’t see performance benefits from AI is simple: they’re using it for the wrong things.
AI can make you faster on some tasks and slower on others. If the mix is wrong—if people lean on AI for complex design, deep debugging, and security-sensitive code while underusing it for docs, tests, and boilerplate—then overall you feel no gain or even a net loss. The tool gets blamed, but the issue is task fit.

Start Here: Three AI Workflows That Show Results in a Week
When a team has tried AI and concluded “we don’t see the benefit,” the worst move is to push harder on the same, vague usage. A better move is to pick a few concrete workflows where AI reliably helps, run them for a short time, and measure the outcome. That gives the team something tangible to point to—“this is where AI helped us.”
Here are three workflows that tend to show results within a week and are a good place to start for teams struggling to see performance benefits from AI in their software engineering workflows.

The Trust Collapse: Why 84% Use AI But Only 33% Trust It
Usage of AI coding tools is at an all-time high: the vast majority of developers use or plan to use them. Trust in AI output, meanwhile, has fallen. In recent surveys, only about a third of developers say they trust AI output, with a tiny fraction “highly” trusting it—and experienced developers are the most skeptical.
That gap—high adoption, low trust—explains a lot about why teams “don’t see benefits.” When you don’t trust the output, you verify everything. Verification eats the time AI saves, so net productivity is flat or negative. Or you use AI only for low-stakes work and conclude it’s “not for real code.” Either way, the team doesn’t experience AI as a performance win.

Measuring What Matters: Getting Real About AI ROI
When a team says they don’t see performance benefits from AI, the first question to ask isn’t “Are you using it enough?” It’s “How are you measuring benefit?”
A lot of organizations track adoption (who has a license, how often they use the tool) or activity (suggestions accepted, chats per day). Those numbers go up and everyone assumes AI is working. But cycle time hasn’t improved, quality hasn’t improved, and the team doesn’t feel faster. So you get a disconnect: the dashboard says success, the team says “we don’t see it.”

OpenClaw for Teams That Gave Up on AI
Lots of teams have been here: you tried ChatGPT, Copilot, or a similar assistant. You used it for coding, planning, and support. After a few months, the verdict was “meh”—maybe a bit faster on small tasks, but no real step change, and enough wrong answers and extra verification that it didn’t feel worth the hype. So you dialed back, or gave up on “AI” as a productivity lever.
If that’s you, the next step isn’t to try harder with the same tools. It’s to try a different kind of tool: one built to do a few concrete jobs in your actual environment, with access to your systems and a clear way to see that it’s helping. OpenClaw (and tools like it) can be that next step—especially for teams that are struggling to see any performance benefits from AI in their software engineering workflows.

Why Your Team Isn't Seeing AI Benefits (And It's Not the Tools)
You rolled out AI coding tools. You got licenses, ran the demos, and encouraged the team to try them. Months later, the feedback is lukewarm: “We use it sometimes.” “It’s okay for small stuff.” “I’m not sure it’s actually faster.” Nobody’s seeing the dramatic productivity gains the vendor promised.
If this sounds familiar, you’re not alone. Research shows that while 84% of developers use or plan to use AI tools, only 55% find them highly effective—and trust in AI output has dropped sharply. Adoption doesn’t equal impact. The gap between “we have AI” and “AI is helping us ship better, faster” is where most teams get stuck.

The Documentation Problem AI Actually Solves
I’ve spent the past several weeks writing critically about AI tools—the productivity paradox, comprehension debt, burnout risks, vibe coding dangers. Those concerns are real and important.
But I want to end this series on a genuinely positive note, because there’s one area where AI tools deliver clear, consistent, unambiguous value for engineering teams: documentation.
Documentation is the unloved obligation of software development. Everyone agrees it’s important. Nobody wants to write it. The result is that most codebases are woefully underdocumented, and the documentation that does exist is often outdated, incomplete, or wrong.

Your AI-Generated Codebase Is a Liability
If a quarter of Y Combinator startups have codebases that are over 95% AI-generated, we should probably talk about what that means when those companies get acquired, audited, or sued.
AI-generated code looks clean. It follows conventions. It passes linting. It often has reasonable test coverage. By most surface-level metrics, it appears to be high-quality software.
But underneath the polished exterior, AI-generated codebases carry risks that traditional codebases don’t. Security vulnerabilities that look correct. Intellectual property questions that don’t have clear answers. Structural problems that emerge only under stress. Dependency chains that nobody consciously chose.
Categories
Tags
- 100pounds
- 2020
- Adblock-Plus
- Adoption
- Agents
- Agile
- Ai
- Apache
- Apple
- Architecture
- Authorize-Net
- Automation
- Bing
- Bingbot
- Blog
- Book-Reviews
- Books
- Burnout
- Business-Tools
- Cache
- Career
- Chatgpt
- Chrome
- Cloudflare
- Code-Quality
- Code-Review
- Coding
- Compass
- Conversion
- Css
- Culture
- Design-Patterns
- Developer-Experience
- Development
- Disqus
- Docker
- Documentation
- Firefox
- Future-of-Work
- Gemini
- Genesis-Framework
- Getting-Started
- Ghost-Tag
- Github-Copilot
- Githubpages
- Google-Slides
- Google-Workspace
- Governance
- Helper
- Hiring
- How-Not-To
- How-To
- Html
- Hugo
- Infrastructure
- Internet-Explorer
- Interviews
- Iphone-6
- Javascript
- Jekyll
- Jquery
- Junior-Developers
- Knowledge-Management
- Laravel
- Leadership
- Legal
- Lessons-Learned
- MacOS
- Magento
- Magento 2
- Magento2
- Management
- Meetings
- Mental-Health
- Mentorship
- Metr
- Metrics
- Microsoft
- Moltbot
- Mysql
- Netlify
- Nginx
- Nodejs
- Open-Source
- Openclaw
- OSX
- Performance
- Personal
- Php
- Policy
- Presentations
- Productivity
- Programming
- Python
- Quality
- Rant
- Remote-Work
- Research
- Responsive-Web-Design
- Retrospective
- Roi
- Safari
- Sales
- Scrum
- Security
- Series
- Sitecatalyst
- Sota
- Sql
- Sql-Server
- Tasks
- Teams
- Technical-Debt
- Testing
- Tier-Pricing
- Tips
- Tmobile
- Tools
- Trust
- Unittest
- Ux
- Varnish
- Verification
- Vibe-Coding
- Visual-Studio
- Web-Development
- Windows-7
- Windows-Vista
- Woocommerce
- Wordpress
- Workflow
- Workflows
- Xml