Product Guide

How Long Will This Take? A Better Way to Estimate Software Projects in the AI Era

Generative Labs

"How long will this take?"

It's a fair question. You're investing in building something, and you want to know what you're looking at. A week? A month? Three months?

The honest answer is: we can tell you, but not in hours. And that's not a dodge. It's actually a more accurate way to plan.

Why hour estimates don't work anymore

Hour estimates were designed for a world where you figure out what to build, write a spec, and hand it to a development team. Requirements, then execution. The execution was the expensive part, so that's what you estimated.

AI changed that equation. The part that used to take days (writing code, wiring up integrations, standing up infrastructure) takes a fraction of the time. What didn't change is the part that determines whether the product is right: your domain knowledge, the product decisions, the collaboration that shapes what gets built. When code is 5% of the effort instead of 80%, estimating code hours is like estimating a report by timing the printer.

The thinking was always the hard part. Now it's nearly the only part. And that changes what estimation even means.

  • Co-creation means redirects, and redirects are good. The best products are built collaboratively. You see something take shape, you react, you redirect. "Actually, this should work differently." That's the collaboration working. But hour estimates treat every redirect as a deviation. They punish the very thing that produces the best outcomes.
  • The landscape moves faster than any fixed number. Something scoped as a custom build today might become a one-line integration next week because a new capability shipped. An agent that needed hand-built tooling last month might have native support tomorrow. Hour estimates anchored to today's landscape are wrong before you execute them.
  • Your insight is the most valuable input. A five-minute conversation about how your customers actually behave can save a week of building in the wrong direction. That kind of contribution doesn't fit on a timesheet. But it's where the real value is created.

How we size work instead

We use buckets. Four of them:

  • An hour — Small, well-understood, minimal unknowns. Copy changes, color adjustments, config tweaks.
  • A day — Clear scope, some decisions to make, but the path is visible. A new form, a straightforward integration, a page layout.
  • A week — Real complexity. Multiple moving parts, design decisions, unknowns that need to be worked through. A new feature module, a workflow with several states, something that touches multiple parts of the system.
  • A month — A complete system or capability. Complex integrations with many moving parts, large features that need to be broken down into smaller pieces before anyone can estimate them meaningfully. In practice, month-sized work almost always gets decomposed into week-sized and day-sized tasks. That's the point: the bucket tells you the scale of what you're looking at, and signals that the next step is breaking it apart.

Notice these are singular. Not "3 hours" or "2.5 days." An hour. A day. A week. A month. Each bucket is a size category, like small / medium / large / extra-large. The point is to capture the magnitude of the work, not the precise duration.

And ranges between buckets are fine. "This is between a day and a week" is a perfectly good estimate. It tells you everything you need to know about what you're looking at.

Why buckets instead of hours

Think of it like this: if someone asks you how far away a restaurant is, you say "about ten minutes" or "across town, maybe half an hour." You don't say "17 minutes and 42 seconds." You know that traffic, parking, and which route you take will all affect the actual time. The rough estimate is more honest and more useful than false precision.

Software estimation works the same way. The bigger the work, the less meaningful precise numbers become. Buckets match your actual confidence level: you can tell the difference between a day-sized task and a week-sized one, but you can't reliably tell the difference between a 14-hour task and an 18-hour one. So why pretend?

How buckets become project estimates

Individual tasks get bucketed. Then we add them up.

Say a module has 12 tasks. Three are hour-sized, five are day-sized, three are between a day and a week, and one is week-sized. That adds up to roughly two to three weeks for the module.

That's a range, not a fixed number. And the range is honest. It accounts for the reality that some day-sized tasks will take half a day and others will take a day and a half. It accounts for discovery (you'll see something during the build that changes a task's shape). It accounts for the fact that the collaboration itself produces better work, and better work sometimes means adjusting course.

When we scope a larger engagement, we bucket the major modules, add up the ranges, and give you an overall estimate like "this phase is likely six to eight weeks." That's a real number you can plan around. It's just not a fake-precise one.
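The roll-up above can be sketched in a few lines. The per-bucket day ranges here are assumptions chosen for illustration (they are not fixed conversion rates), so the totals are indicative of the mechanic rather than of any particular module's estimate.

```python
# Roll bucketed tasks up into a module-level range.
# The (low, high) working-day ranges per bucket are assumptions
# for illustration, not fixed conversion rates.
BUCKETS = {
    "hour": (0.1, 0.2),
    "day": (0.5, 1.5),
    "day-to-week": (1.0, 3.0),  # "between a day and a week"
    "week": (3.0, 5.0),
}

def roll_up(task_counts):
    """Sum each bucket's (low, high) range into one module-level range."""
    low = sum(BUCKETS[bucket][0] * n for bucket, n in task_counts.items())
    high = sum(BUCKETS[bucket][1] * n for bucket, n in task_counts.items())
    return low, high

# The 12-task module from the example above.
module = {"hour": 3, "day": 5, "day-to-week": 3, "week": 1}
low, high = roll_up(module)
print(f"{low:.1f} to {high:.1f} working days")
```

The output is a range, not a number, which is the point: the spread comes directly from the honest width of each bucket, not from padding layered on afterward.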

Buckets size the work; availability sets the pace

One more thing worth understanding: the buckets measure the work, not the calendar.

A week-sized task is a week of effort. If your team is working full-time, that's a calendar week. If you have five hours a week available for collaboration, that same task stretches across several weeks on the calendar. The work is the same size. The pace is different.

This matters because it keeps the estimation honest and independent of your engagement level. A client with a full-time team and a client with a few hours a week get the same sizing for the same work. What changes is how quickly you move through it. No padding, no stretching. Just a clear picture of effort, and a realistic conversation about pace.
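The effort-versus-pace arithmetic can be made concrete with a small sketch. The figure of 40 effort-hours per "week of effort" is an assumption for illustration; the mechanic is what matters: effort stays fixed, and calendar time scales with the hours you can put in.

```python
# Effort is fixed; calendar time scales with availability.
# 40 effort-hours per week-sized bucket is an illustrative assumption.
EFFORT_HOURS_PER_WEEK_BUCKET = 40

def calendar_weeks(effort_weeks, hours_available_per_week):
    """Calendar weeks needed to burn down a fixed amount of effort."""
    total_hours = effort_weeks * EFFORT_HOURS_PER_WEEK_BUCKET
    return total_hours / hours_available_per_week

print(calendar_weeks(1, 40))  # full-time team: 1.0 calendar week
print(calendar_weeks(1, 5))   # 5 hours/week: 8.0 calendar weeks
```

Same week-sized task in both calls; only the pace changes, which is why two clients with very different availability still get the same sizing for the same work.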

What this means for you

A few things this approach makes possible:

Honest conversations about trade-offs. When a feature is bucketed as "a week" and you need it faster, we can talk about what to simplify, what to defer, and what's essential. That conversation is harder when everything is denominated in hours, because cutting "15 hours" from a "40-hour" estimate feels arbitrary.

Room for the work to breathe. The best products come from iteration. When you're co-creating and a better direction emerges mid-build, that's a win, not a setback. In a bucket-based model, that redirect is expected. In an hour-based model, it's a change order.

No incentive to pad. Hour estimates create an incentive to add buffer (everyone does it). Bucket estimates don't need buffer because the range already accounts for variability. What you see is what we actually think.

Better alignment on what matters. Instead of tracking whether task #47 came in at 3.5 hours or 4 hours, we're focused on whether the module is on track and whether the product is heading in the right direction. The altitude of the conversation stays where it's most useful.

The point of all this

AI changed what software estimation is about. It's no longer about predicting how long someone will sit at a keyboard. It's about having a shared language for the thinking, the decisions, and the collaboration that actually determine whether the product is right.

The next time someone quotes you a project at "480 hours," ask yourself: do they actually know it's 480? Or do they know it's months, and they multiplied to make it look precise? The honest version is more useful. It lets you focus on what matters: is this the right thing to build, and what's the clearest path to something real?

That's what the buckets are for. Not to obscure the answer to "how long will this take?" but to give you one you can actually trust.

Follow the thinking.

We write when we learn something worth sharing. No schedule, no spam.