We Learned This by Doing, Not Theorizing
The first version of how we worked with AI was wrong in ways that look obvious in hindsight and were invisible at the time.
We'd come into a new engagement, talk to the client about their domain, write up what we'd learned, and then put our heads down to build. We treated the AI tools as accelerators on the build phase. The agents wrote code faster than our humans would have. We shipped working software faster than we'd shipped working software before. We were, in the conventional sense, doing well.
And we kept noticing something. The work was good but it wasn't quite right. The client would react to the first version and say "yeah, this is what I asked for, but it's not what I meant." We'd patch. They'd react. We'd patch. The build phase was three times faster than it used to be. The wrap-up phase, where we figured out what we should have actually built, took just as long as it always had.
We started writing about what we noticed in a shared doc. Then we changed how we worked, deliberately, on the next engagement. Then again on the one after. The shape of the work we ended up with looks like our AI-native way of working. But the version in our heads in early 2024 was not the version we use now. It changed because the work changed it.
That's the only honest origin story for what we do. We learned it by doing it.
Why doesn't theory work for this kind of methodology?
There's a temptation, when you're inside a new technology shift, to try to think your way to the right way of working. Read everything. Map out the principles. Define the framework. Then go execute against it.
That move makes sense in fields where the underlying physics are stable enough that you can reason about them in advance. It works less well when the physics keep shifting. The agents we work with now barely resemble the ones we were working with eighteen months ago. The right way to collaborate with them has changed correspondingly. A framework written in early 2024 would have prescribed practices that the better tools made unnecessary by mid-2025. A framework written today would prescribe practices that won't survive whatever the tools become next year.
So instead of writing the framework, we write the engagement notes. Every engagement teaches us something the previous ones didn't. Some lessons replicate, and those graduate into the way we work. Some lessons turn out to have been specific to one client or one product, and those get demoted. The methodology is the residue of that filtering, not the cause of it.
We've learned, the hard way, that the cost of a wrong abstract framework is higher than the cost of no abstract framework. A bad framework gives you confidence in the wrong direction. No framework leaves you paying attention to the work itself, which is where the answers actually are.
"We are uncovering better ways of developing software by doing it and helping others do it."
That line is the part of Agile that's often forgotten. The values that followed it weren't theory. They were observations from practice, written by people who had been doing the work and noticing what kept helping. The credibility of Agile, at the moment it was written, was that it didn't sound smart. It sounded earned.
That posture is the right one for any working methodology, in any era. Especially this one.
What did the work actually teach you that theory wouldn't have?
A few of the bigger lessons, named specifically.
The bottleneck is almost never code. Our first version of "working with AI" treated code generation as the high-leverage step. We were wrong by a factor of ten. The bottleneck on a real engagement is deciding what to generate. We learned this by shipping fast in the wrong direction enough times to notice the pattern. No amount of generation speed makes up for unclear aim. The theoretical version of this insight is obvious. The practical version, where you actually rearrange your workflow to spend most of your energy on the front of the funnel instead of the middle, took us a year to internalize.
Reviewing code is the wrong altitude. This one surprised us. Early on, we put a lot of effort into reviewing what the agents produced at the code level. The code was usually fine. The thing that needed reviewing was the thinking the code embodied: does this solve the right problem, does the data model support where we're going, is this the simplest version that could work? The shift from reviewing output to reviewing direction was a workflow change we didn't predict and couldn't have predicted from first principles. We made it because review meetings kept feeling like they were missing the real conversation.
Domain expertise has to be in the room continuously, not at the start. We knew this in theory. We didn't do it in practice for a while. We'd extract requirements at the beginning, then run with them. The clients who pushed us to keep them in the work were the ones whose products turned out better than ours alone would have. We wrote about why in our post "closest to the problem." The general principle is old. The practical implementation, where you actually structure an engagement so the expert is at the keyboard with you in week eight, not just week one, took practice.
Vibe coding hits a specific kind of wall. We watched enough clients show up to us after hitting it that we eventually wrote a whole post on what kind of wall it is and why. We couldn't have written that post in 2023. We didn't know enough yet. We'd have produced a plausible-sounding theory that missed the specific failure mode. By 2025, we'd seen the failure mode enough times that we could describe it concretely.
None of these lessons are exotic. All of them changed how we work. Each of them, in retrospect, looks predictable. We predicted none of them in advance.
What does "learning by doing" actually look like for a team?
There's a posture inside this that's worth naming, because it's easy to dilute.
It isn't "we'll figure it out as we go" in the casual sense. That's improvisation, and improvisation produces inconsistent results. Learning by doing in the way we mean it is a deliberate practice. You go into the work paying attention to what's harder than it should be, what's failing in ways that surprise you, what's working in ways you didn't expect. You write that down. You compare across engagements. You change something on the next one. You watch what happens.
Done well, this is closer to applied research than to "winging it." You're treating each engagement as an experiment that produces both the deliverable the client paid for and the data about how the work should be structured for the next engagement. Both outputs are first-class. Neither is sacrificed for the other.
The cost of this practice is that you have to actually notice what's happening. Most teams don't, because noticing requires slowing down at exactly the moment you'd rather speed up. The first version of the engagement is exciting. The retrospective where you figure out what you got wrong is less exciting. Skipping it is the cheapest way to keep doing the same thing for years.
The benefit is that the methodology compounds. The team that did fifty engagements over two years isn't twice as good as the team that did twenty-five engagements in one year. It's three or four times as good, because each engagement updated the playbook the next one used. We described the math of this in the methodology post. The shape is exponential, not linear, when the practice is real.
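One way to make that arithmetic concrete, as a sketch and not the model from the methodology post: assume each engagement improves the team by a fixed amount. In the linear version the gain is added on each time; in the compounding version it multiplies what was already there. The function names and the rates below are hypothetical, chosen only to illustrate the shape.

```python
# Hypothetical model, not measured data: "rate" is an assumed improvement
# per engagement, and effectiveness starts at 1.0 for a team with no history.

def compounding(engagements: int, rate: float) -> float:
    """Each engagement multiplies effectiveness by (1 + rate)."""
    return (1.0 + rate) ** engagements

def linear(engagements: int, rate: float) -> float:
    """Each engagement adds a fixed increment to effectiveness."""
    return 1.0 + rate * engagements

def advantage(model, rate: float, more: int = 50, fewer: int = 25) -> float:
    """How many times better the team with more engagements is under a model."""
    return model(more, rate) / model(fewer, rate)

for rate in (0.03, 0.045, 0.057):
    print(
        f"rate={rate:.1%}: "
        f"linear {advantage(linear, rate):.2f}x, "
        f"compounding {advantage(compounding, rate):.2f}x"
    )
```

Under these assumptions, the linear model can never put the fifty-engagement team more than twice as far ahead, while the compounding model reaches roughly three times at about 4.5% per engagement and roughly four times at about 5.7%. The real numbers depend on how much each engagement actually updates the playbook.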
Is the methodology done?
No. It won't be.
The thing we've learned that we're most confident about is that whatever we're doing in 2026 will look incomplete in 2028. The tools will be different. The agents will be different. The shape of the collaboration will adapt to capabilities that don't exist yet. The right move isn't to lock in the practice now. The right move is to keep the practice teachable enough that it survives the next shift, and humble enough that we don't pretend we already know what that shift will look like.
This is also why we publish what we're learning. The pieces in this manifesto series are not the framework. They're the things we've noticed are true so far. The format is closer to engagement notes than to documentation. We expect parts of it to need rewriting. We'd rather be honest about that than sell certainty we haven't earned.
If you take one thing from this post, take this. Anyone selling you a complete, finished methodology for building software with AI right now is selling you a guess wrapped in confidence. The honest version is messier and lives closer to the work.
The way we work didn't come from a whiteboard. It came from the engagements that taught us things we couldn't have predicted. That's also where the next version of it will come from. The work is the teacher.
Frequently asked
Why does it matter how a methodology was developed?
A methodology developed from theory tends to be neat and wrong in the messy specifics.
How does a working practice become a methodology?
By doing the work, noticing what consistently helps and what consistently hurts, naming the patterns, and testing them again on the next engagement.
Doesn't all methodology come from practice eventually?
Some does. A lot doesn't.
What's the most surprising thing you learned by doing this work?
That the bottleneck is almost never code generation. It's deciding what to generate.
How is this different from Agile?
It's the same philosophical move, applied to a new physics. Agile uncovered better ways of working by doing the work in the early 2000s, when changing requirements were the new reality.