Is Your AI-Built App Ready for Real Customers?

Bill Cava/June 24, 2026

You built something with AI. Vibe coded it, maybe, or prompted your way to a working product over a few weekends. It works. People have used it, and it did what it was supposed to do. Now someone has asked the question that changes the temperature in the room: is this actually ready for real customers?

So you go looking for an answer, and the internet hands you a "production readiness checklist" written for a site reliability team running a fleet of microservices. Service-level objectives. Observability budgets. Incident runbooks. None of it is wrong. All of it is for a different person solving a different problem. You did not build a microservice fleet. You built a product, with AI, and you need to know whether it is safe to put your name and your customers behind it.

That question has a good answer. It is just not the one the search results give you.

This is not only a solo-founder problem. We see it constantly with small teams and companies: a lean operation builds its own internal tools with AI instead of buying software, the tools work, and then everything stalls at the last mile, because the data is sensitive (client records, investment memos, proprietary files) and nobody on the team can say whether it is safe to deploy. "We don't know what we don't know" is the exact phrase we hear. Customer-facing or internal, the bar is the same: can this hold real data and real use without becoming a liability?

What does "production ready" mean for an AI-built app?

Production ready means your app can hold real customers without becoming a liability. Not "it runs." Not "the demo went well." It means the app holds up when something real happens to it: a bad actor, a traffic spike, a compliance question, a customer doing something you never thought to test. That is the whole bar.

The reason this matters more now is the same reason building got easier. AI made it possible to ship a working app in a weekend, which is genuinely new and genuinely good. But shipping was never the hard part of software. The hard part was everything that stands between "it works" and "it keeps working when the world pushes on it." AI compressed the building. It did not compress that gap. If anything, it widened it, because the app that comes out the other end looks more finished than it is.

We have written before about building the right thing in the first place, the product judgment the tools skipped. This is the engineering half of the same story. The tools handed you the ability to build. They did not hand you the layer that keeps what you built alive.

Why doesn't the usual production readiness advice fit?

Because it was written for SRE teams, not for the person who built with AI. The standard checklists assume a staff of engineers, a platform, and someone whose full-time job is reliability. You have a product and a question. The advice answers "how do we operate a service at scale," when your question is "is the thing I made safe to sell."

Search "production readiness checklist" and the top results are platform tools built for engineering organizations. That framing gap is exactly why none of it lands. It is not that the advice is bad. It is that it is aimed past you. The builder who shipped with AI does not need a service mesh. They need to know whether their app leaks customer data, whether someone can walk past the login, and whether the code will calcify into something nobody can safely change in six months.

So forget the operations-team version. The version that matters to you is shorter, and it is about exposure, not infrastructure.

What actually breaks in AI-built apps?

The failures are real, recent, and documented. In 2026, the security firm RedAccess found roughly 380,000 exposed assets across vibe-coded apps built with the AI tool Lovable: unauthenticated endpoints, exposed keys, ways to reach data that should have been locked. These were not abandoned experiments. They were live apps with real users, and the security crisis was covered widely as it broke.

The number is striking, but the pattern underneath it is the real lesson. Every one of those apps worked. They behaved correctly in the demo, which is the only condition most of them were ever tested in. The holes only showed up when something real applied pressure. The Moltbook app, shipped in early 2026 by a founder who wrote essentially no code, was found to be exposed within days of launch by researchers who simply looked. The app was not broken. It ran fine. It just was not safe, and from the inside there was no way to tell the difference.

That is the trap. AI coding tools are confident. They generate a login flow, a data layer, a payment screen, and they do not stop to flag the thing they left open, because they do not know they left it open. The result looks done. We have written about why "looks done" is the most expensive illusion in software, and security is where it costs the most. An app that looks finished and is quietly exposed is worse than an obviously broken one, because nobody goes looking.

In the demo, "works" and "ready" look identical. Real customers are where they part.

What should you check before real customers?

Production readiness for an AI-built app comes down to five questions. Each is a door, not a checkbox.

Security: can someone get to data they should not have? Not "does it have a login." The real question is whether a motivated person can walk around it. The common AI-built failures are concrete and repeatable: secrets sitting in the browser where anyone can read them, data access left open by default, missing checks on what users are allowed to send. You usually cannot see any of this while the app is working normally, which is the point.
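One way to make the first of these failures visible is to scan the files your app actually ships to the browser for strings that look like credentials. The sketch below is a minimal Python illustration, not a substitute for a real secret scanner. The function name `find_exposed_secrets` and the two patterns are assumptions chosen for the example, and real keys come in many more formats.

```python
import re

# Illustrative patterns only; a real scanner covers many more key formats.
# Which strings count as "secret-looking" is an assumption of this sketch.
SECRET_PATTERNS = {
    "stripe_live_secret_key": re.compile(r"sk_live_[0-9A-Za-z]{16,}"),
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
}


def find_exposed_secrets(bundle_text: str) -> list[tuple[str, str]]:
    """Return (pattern_name, redacted_match) pairs found in client-side code."""
    findings = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(bundle_text):
            value = match.group(0)
            # Redact so the report itself does not leak the key.
            findings.append((name, value[:8] + "..."))
    return findings


# Hypothetical snippet of JavaScript that was shipped to the browser.
# The key is fake and built from a repeated character.
sample_bundle = 'const client = new Payments("sk_live_' + "a" * 24 + '");'
print(find_exposed_secrets(sample_bundle))
```

A clean result from a scan like this proves little. A hit, on the other hand, is exactly the kind of hole that stays invisible while the app is working normally.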

Data and compliance: do you know your exposure? This is the scariest one because it is invisible until it is not. If your app takes card payments, PCI rules almost certainly apply. If it collects personal data from people in the EU, the UK, or California, privacy regimes such as GDPR, UK GDPR, or the CCPA likely apply. Most builders genuinely do not know which of these touch them, and AI tools do not tell you. These are exposures to identify with qualified counsel, not boxes to self-certify. The goal here is simply to know what you are responsible for before a customer or a regulator asks.

Durability and scale: does it survive real conditions? An app that worked for ten friendly beta users may not behave the same for ten thousand real ones. Error handling that looked fine in development can silently swallow failures in production. The architecture that felt great for a prototype may have a ceiling you will discover the hard way, on the day a campaign sends real traffic. None of these requires heroics to address. All of them require someone to ask the questions the AI was never prompted to ask.
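The error-handling point is easiest to see side by side. Below is a minimal Python sketch of a handler that swallows a failure and one that surfaces it. The names (`save_order_swallowing`, `save_order_surfacing`, `flaky_database_write`) are hypothetical, and the failing write is simulated. It illustrates the pattern and does not describe how any particular app is built.

```python
import logging

logger = logging.getLogger("orders")


def flaky_database_write(order: dict) -> None:
    """Hypothetical stand-in for a write that fails under production load."""
    raise ConnectionError("database connection pool exhausted")


def save_order_swallowing(order: dict) -> bool:
    # Looks fine in development: no crash, no error screen.
    try:
        flaky_database_write(order)
    except Exception:
        pass  # The failure disappears; the caller is told the save worked.
    return True


def save_order_surfacing(order: dict) -> bool:
    # Same write, but the failure is recorded and reported to the caller.
    try:
        flaky_database_write(order)
    except ConnectionError:
        logger.exception("order %s was not saved", order.get("id"))
        return False
    return True


order = {"id": "A-100", "total_cents": 4200}
print(save_order_swallowing(order))  # True, although nothing was saved
print(save_order_surfacing(order))   # False, and the failure is logged
```

The first version is the one that demos well. The second is the one that tells you a customer's order was lost.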

Maintainability: can you still change it later? This is the slow-burn risk. AI generates code faster than anyone reviews it, and without architectural discipline the codebase accumulates patterns that are locally fine and globally fragile. GitClear's 2025 analysis of 211 million changed lines found the share of refactored code falling from 25% in 2021 to under 10% in 2024, with code cloning roughly quadrupling. Translation: across the industry, AI-era code is being copied and shipped faster than it is being understood. The bill for that arrives as a product you can no longer safely change.
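To make the cloning trend concrete, here is a rough Python sketch of how duplicated blocks can be spotted in a codebase: collect every window of consecutive normalized lines and report the windows that occur more than once. This is a simplified illustration, not GitClear's methodology. The function name `find_cloned_blocks` and the window size of three lines are assumptions.

```python
from collections import defaultdict


def find_cloned_blocks(source: str, window: int = 3) -> dict[str, list[int]]:
    """Map each repeated block of `window` lines to its 1-based start lines."""
    # Normalize by stripping whitespace; skip blank lines but keep numbering.
    lines = [(number, text.strip())
             for number, text in enumerate(source.splitlines(), start=1)
             if text.strip()]
    seen = defaultdict(list)
    for i in range(len(lines) - window + 1):
        block = "\n".join(text for _, text in lines[i:i + window])
        seen[block].append(lines[i][0])
    return {block: starts for block, starts in seen.items() if len(starts) > 1}


# Hypothetical source file in which the same tax logic was pasted twice.
sample = """\
def price_with_tax(amount):
    rate = 0.2
    tax = amount * rate
    return amount + tax

def invoice_total(amount):
    rate = 0.2
    tax = amount * rate
    return amount + tax
"""

for block, starts in find_cloned_blocks(sample).items():
    print(f"repeated at lines {starts}:")
    print(block)
```

Each repeated block is a place where a future change, such as a new tax rate, has to be made more than once and can be missed.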

Live health: is it actually working in the wild? Shipping is a step, not the finish line. Whether the app is healthy for real users, and whether you would know if it were not, is its own discipline. We have written about the feedback loop that tells you what is really happening once customers are in the building.

So how do you know if your app is ready?

You get an outside read from someone who has seen the failure modes. That is the honest answer. Each of these five questions takes judgment that is hard to apply to your own work, because you cannot audit for problems you do not know exist. The roughly 380,000 exposed assets were not found by re-running the AI. They were found by people who knew where to look.

This is the part AI did not change, and the part it made more important. AI amplifies your direction, and it amplifies it whether your direction was right or wrong. Point it at a clean design and it builds fast. Point it past a security hole and it builds right past the hole, just as fast, with the same confidence. The speed is a gift only if someone is holding the judgment that makes the speed safe. Production readiness is not a property of the app. It is a property of the review that stood behind it.

Security is a process, not a product.

Bruce Schneier, Secrets and Lies

What that read actually looks for is specific, not mystical. Where the secrets live, and whether the browser can see them. Whether data access is closed by default or open until someone remembers to close it. What happens on the paths nobody demoed: the malformed input, the duplicate request, the user who refreshes at the wrong moment. Which privacy and payment rules touch your data, and whether you are quietly on the wrong side of one. Whether the next ten features can be added without the whole thing getting more fragile. None of this is visible while the app is working normally, and all of it is answerable by someone who has watched these exact things go wrong before.

None of this is an argument against building with AI. You should keep building with AI. It is an argument for one more step before you put real customers on what you made: a set of eyes that already knows the difference between an app that works and an app that is ready.

AI removed the cost of building. It did not remove the cost of a wrong turn nobody caught. "Works" and "ready" are not the same word, and the distance between them is exactly the distance a customer, or a regulator, or a bad actor, will find first if you do not.

Frequently asked

What does production ready mean for an app?

Production ready means an app can hold real customers without becoming a liability. It covers five things: security, data and compliance, durability and scale under real conditions, whether the code can still be changed safely later, and whether the app is actually healthy for real users once it ships. An app that works in a demo has cleared none of these by default.

How do I know if my AI-built app is secure?

You usually cannot tell from the outside, which is the problem. The common failures in AI-built apps are secrets left in the browser, data access left open by default, and missing input checks. None of them show up while the app is working normally. A security review by someone who knows the failure patterns is how you find out.

Does my app need to be PCI or GDPR compliant?

If your app takes card payments, PCI rules almost certainly apply. If it collects personal data from users in the EU, the UK, or California, privacy regimes such as GDPR, UK GDPR, or the CCPA likely apply. Most builders do not know which rules apply to them. These are exposures to identify with qualified counsel, not problems to self-certify.

Can AI tools build a production-ready app on their own?

AI tools build apps that work, fast. They do not flag the gaps between working and ready, because they are confident even when they skip security, data integrity, or error handling. The judgment that closes that gap is the part AI did not hand you. It still takes a human who has seen the failure modes.

What is the difference between an app that works and one that is production ready?

An app that works behaves correctly on the happy path, the route you take in a demo. A production-ready app also holds up when something real happens: a bad actor, a traffic spike, a compliance question, a customer doing something you never tested. The gap between the two is where most AI-built apps get into trouble.
