Martin Gratzer

The Gap Nobody Measures in AI Adoption

I ran a small internal baseline survey across one engineering org, expecting to confirm what I already assumed: this org was ahead of the curve. The results below are aggregated and anonymized. On adoption, the assumption held: roughly 80% use AI tools every day, and nearly everyone uses them weekly, against an industry-wide weekly benchmark closer to 75%.

Then I asked a different question: “Do you have a structured workflow for working with AI?” Only a small minority said yes. That’s where a different pattern showed up.

The spread in outcomes

The finding I didn’t expect wasn’t about adoption at all. It was about range. Reported time savings went from roughly break-even to more than a full workday per week saved. Within one org, with access to the same tools, the spread in outcomes was hard to ignore.

The variable wasn’t which tool people used. Engineers on the same tool reported everything from break-even to several hours saved. What separated the top from the rest was whether they’d built a workflow, not just better prompts: a repeatable loop for breaking down the task, exploring the codebase, implementing in steps, generating tests early, and reviewing for drift before opening a PR.

One respondent laid it out clearly: AI helps understand the details of a problem, maintain focus during implementation, then write tests fast enough to catch inconsistencies before manual testing. That’s what I mean by workflow: a repeatable sequence of steps where the gains compound across the task. I wrote about building that kind of workflow myself in Forging a Workflow.

The engineers who hadn’t built workflows were still getting value, saving a couple of hours most weeks. But without a repeatable loop, those gains stayed flat instead of compounding week over week.

What the answers actually said

The numbers told one story. The open-ended answers told a sharper one.

Most respondents described their prompting as “hit or miss.” Many said AI often misses relevant codebase context. Nearly half said AI-generated code still needs substantial changes before it’s production-ready. Several respondents described the output as drifting toward generic or overly academic patterns instead of established team conventions.

The top improvement request wasn’t a better model or broader access. It was codebase-aware tooling. Peer learning from effective colleagues came next.

Everyone uses AI. Most people find it useful. But the usefulness plateaus unless the workflow and codebase context around it evolve too. Laura Tacho, presenting DX data at the Pragmatic Summit, put it plainly: “spray-and-pray does not work.” The blockers aren’t model quality or tooling gaps. They’re context, conventions, and unclear expectations. The tools work. What’s missing is the infrastructure around them.

Try it yourself

If you want to know where your org actually stands, here are the questions from the survey that revealed the most. Eight questions, ten minutes, and you’ll quickly see whether this pattern exists in your team too.

  1. How often do you use AI tools for development? (Daily / Weekly / Monthly / Never)
  2. How confident are you in your prompting? (Confident / Hit or miss / Struggling)
  3. Estimate your net weekly time impact — time saved minus time spent reviewing and fixing. (Loss / Break-even / 1-2h / 3-5h / 6-8h / 9+h)
  4. How often does AI-generated code need substantial changes before it’s production-ready? (Rarely / Sometimes / Often / Almost always)
  5. How often does AI miss relevant context about your codebase? (Rarely / Sometimes / Often / Almost always)
  6. What context limitations frustrate you most? (Architecture / Conversation memory / File scope / Conventions / Outdated patterns)
  7. What would most improve your effectiveness? (Prompting techniques / Security practices / Codebase-aware tooling / Use case workshops / Peer learning)
  8. Do you have a structured, repeatable workflow for working with AI? (Yes / No / Tried but it didn’t stick)

Questions 1-2 tell you where adoption stands. Questions 3-6 reveal the integration gap. Questions 7-8 tell you where to invest.

What you’re looking for is the spread. If everyone saves roughly the same amount, your org is applying AI consistently, even if the level is low. If the range is wide, the gap is probably workflow, not access. Your high performers can usually tell you what the rest are missing. Look at what keeps coming back in review. If reviewers are repeatedly correcting conventions, architecture, or missing context, the problem usually starts before the PR.
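The spread check above is easy to automate. Here is a minimal sketch, assuming you’ve exported the question 3 answers from your survey tool as a list of strings; the response data and the midpoint-hours mapping are illustrative, not from the actual survey:

```python
from collections import Counter

# Hypothetical question 3 responses (net weekly time impact) from a survey export.
responses = ["Break-even", "1-2h", "3-5h", "1-2h", "9+h", "Break-even",
             "3-5h", "6-8h", "1-2h", "3-5h", "Loss", "9+h"]

# Question 3 buckets mapped to rough midpoint hours, just for a spread estimate.
midpoints = {"Loss": -1, "Break-even": 0, "1-2h": 1.5,
             "3-5h": 4, "6-8h": 7, "9+h": 9}

counts = Counter(responses)
hours = [midpoints[r] for r in responses]
spread = max(hours) - min(hours)

# Print the distribution, then the spread: a wide spread with identical tool
# access points at a workflow gap, not an access gap.
for bucket in midpoints:
    print(f"{bucket:>10}: {counts.get(bucket, 0)}")
print(f"Spread: {spread} hours")
```

A tight cluster around one bucket means consistent (if possibly low) returns; a spread spanning most of the scale is the signal to go talk to your high performers.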

If your answers look anything like these, you’re not alone. Industry-wide, about two thirds of developers say AI misses relevant codebase context, and 66% say AI code is “almost right, but not quite.” DX data shows organizations hovering around a 10% productivity gain for several quarters without breaking through. High adoption, low integration. That’s the gap nobody measures.
