Why Not Just Launch It? (AI Edition)

Whenever I teach product discovery, someone will raise their hand and ask a variant of this question:

“This feels slow. Wouldn’t it be faster to just launch it and see what happens?” 

The answer is no. If you skip product discovery you may get bits into production faster, but you’ll create value far more slowly. This is not an intuitive observation, but it’s one that more product leaders agree with today, which is why we’ve seen a steady uptake of evidence-guided development.

But since the advent of gen-AI coding, with its claims of “10x-ing” development speed, the trend seems to be reversing. Output focus and “just launch it!” are back in fashion, and it’s starting to feel like the early 2000s all over again.

Microsoft employees used to get this decorative Ship-It plaque (affectionately known as a tombstone) on which to place metal stickers commemorating every project launch they participated in. I had one on my desk back in 2003, as did all of my colleagues. (Image source: Engin Uzuncaova on LinkedIn)

No one knows yet how much AI truly accelerates the development of production-ready, maintainable code in real-world systems (as opposed to limited first-time projects). It certainly feels faster, but it’s fair to say that the 10x claims are mostly hype at this point. Still, the question deserves serious consideration: if we’re able to accelerate development enough, does launching without testing start to make sense?

Put another way – Do we still need product discovery in the age of AI? 

To answer this important question, I’m going to use a semi-fictional example, and we’re going to tackle it twice: with and without product discovery.

Conversational Search — An AI-Powered Big Bet 

AcmeProperty is a real-estate marketplace that monetizes through paid ads and premium placements. Search is a key function of the service, but it has long been speculated that unsophisticated users struggle to use it in full, causing them to miss properties. All past attempts to simplify Search have failed.

With the advent of AI, a new idea surfaces: Conversational Search—letting users describe what they want in plain language to an AI chatbot, which then runs the search. The chatbot would also appear on the results page to help refine and compare listings.

This idea is seen as a top priority. In a matter of days a developer vibe-codes a prototype which is demoed to the leadership team. Impressed, they give the green light. 

At this point, we split into two timelines.

Scenario 1: Just Do It

Development and Launch (7 weeks)

A full project team—five engineers, a designer, and a PM—is assigned. They spend 7 weeks building the feature with the help of AI coding. The work takes longer than expected: AI coding seems to struggle with the large legacy codebase of AcmeProperty. Training the AI chatbot to search is harder than expected as it sometimes ignores instructions and creates inconsistent outputs. 

Eventually the feature ships to all users. The conversational chatbot now sits prominently at the top of every page, pushing the classic search down to a less visible spot.

At the next all-hands, the company celebrates the launch and the prototype developer receives praise. Separately, the board is briefed on the successful adoption of AI in product and engineering. Spirits are high.

Iteration 1 (4 weeks)

In the first week after the launch, Conversational Search gets very high usage: 68% of weekly active users. By week 2 this drops to 21%, and by week 3 it is a mere 7%. Classic searches also decline by 7%, as does engagement with property ads. Revenue declines by 3% (although this could just be normal week-over-week fluctuation).

Leadership requests an urgent review. The team admits they don’t know what’s causing the low usage. The feature, they argue, is obviously useful; users may just need a stronger nudge to form the habit.

The leaders approve two follow-up features: an interactive onboarding guide to teach users how to use conversational search, and an occasional pop-up to remind them of its existence. Both are built and launched within a week.

Iteration 2 (3 weeks)

In the week following the launch, Conversational Search surges to 26% usage, but the following week it recedes to 6%, and declines further to 5.5% the week after. Customer support receives many complaints about the incessant promos. Searches, property-page engagement, and revenue continue to decline.

Leadership calls another review. The executives are visibly unhappy—this is not the news they wanted to give to the board. The CPO suggests a bold move: hide classic search behind an “Advanced Search” link, leaving conversational search as the only visible option. Desperate for an improvement, the leaders approve the change. 

The new UI is rushed into production in two short days. 

End Game (3 days) 

Initial data is fuzzy, but within 3 days a strikingly dark picture emerges: searches plummet by 18%, engagement with property ads declines by 22%, revenue takes a 19% hit.

Panicked, the leaders order an immediate rollback of Conversational Search and its supporting features. Classic search is reinstated and the metrics recover, though not fully. Some users and advertisers have apparently left for good.

At the next all-hands, the CEO calls the initiative “a bold strategic bet that didn’t pay off” and “a learning experience”. The true costs are not shared: roughly 50 person-weeks of development spent, a loss of 250,000 weekly active users, and USD 150,000 in potential revenue. No retrospective is held, and no changes to development practices are implemented. The one takeaway is: “Chatbots don’t work for our use case”.



Scenario 2: Product Discovery + Delivery

We rewind the clock. Leadership, impressed by the prototype, asks for further investigation. A PM, a designer, and an engineer are assigned.

Initial Triage (15 min)

The trio meets and evaluates the idea. They assign impact, confidence, and ease (ICE) scores based purely on gut-feel estimates:

  • Impact: 5% lift to the north star (property ad engagements) → medium-high (7/10)
  • Ease: 15 person-weeks → medium (5/10)
  • Confidence: very low (0.1/10) — based only on internal opinions and the AI theme

The Confidence Meter helps teams generate confidence scores based on supporting evidence.
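As a sketch of the arithmetic, here is one common way to combine ICE scores: multiplying the three estimates (some teams average them instead). The numbers are the triage estimates from the list above.

```python
def ice_score(impact, confidence, ease):
    """Combine the three estimates into one priority score.

    All inputs use the 0-10 scales above; multiplying means a
    near-zero confidence drags down even a high-impact idea.
    """
    return impact * confidence * ease

# Initial gut-feel triage for Conversational Search:
initial = ice_score(impact=7, confidence=0.1, ease=5)
# After the one-day deeper assessment:
revised = ice_score(impact=4, confidence=0.3, ease=4)
print(round(initial, 1), round(revised, 1))  # 3.5 4.8
```

Note how the score barely moves even though impact and ease both dropped: the small gain in confidence dominates, which is exactly why gathering evidence is the highest-leverage next step.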

Deeper Assessment (1 day)

Each member investigates the idea deeper:

  • PM — Runs a data query and finds that 87% of users already use the current search in full, meaning fewer users than previously estimated need a new form of Search. 
  • Engineer — Breaks down the work and recalculates the cost to be 20 person-weeks.
  • Designer — Tests an AI model for translating natural language into structured search parameters, and finds that it is inconsistent, sometimes ignoring instructions or adding criteria. 

The trio meets and revises the ICE estimates based on the new evidence:

  • Impact → medium-low (4/10)
  • Ease → medium-low (4/10)
  • Confidence → still very low (0.3/10), but slightly improved due to the investigation

Usability Test (2 weeks)

With the help of AI, the trio plans and launches a usability test with 10 users. The engineer upgrades the prototype to work in a lab test environment.

Results:

  • Only 2 of 10 participants report that searching is a challenge for them.
  • Only 1 of 10 prefers searching with the chatbot over classic search.
  • But 6 of 10 participants love the idea of a search assistant on the results page, especially as a way to track and compare properties. Several users showed how they maintain separate spreadsheets to do this today, having to copy and paste data from the website.

Looking at the new evidence the trio concludes that Conversational Search is better split into two ideas:

  1. Chatbot Search (initiating a search using a chatbot)
  2. Search Co-Pilot (AI assistance after search: filtering, comparing, creating tables)

They create revised ICE estimates for both: 

  • Chatbot Search: impact 2/10, ease 7/10, confidence 2/10 
  • Search Co-Pilot: impact 5/10, ease 6/10, confidence 2/10

Leadership reviews the findings. They appreciate the evidence-backed recommendation but aren’t ready to kill Chatbot Search just yet. They ask for more testing and allocate two more engineers.

A/B Test (10 weeks)

The team builds both features and tests them in a three-way split experiment:

  • A) Control — The current interface unchanged
  • B) Chatbot Search — The chatbot sits prominently at the top, pushing classic search down. The search results page remains unchanged.
  • C) Search Co-Pilot — Classic search, but the chatbot appears on the search results page and has a new table-view feature inspired by user spreadsheets. 

Development takes 7 weeks. The test runs for three weeks.

Results:

Variant B: Chatbot Search (compared to the control group)

  • 65% of users try chatbot search once (novelty effect), but usage plummets to 8% by week 3. 
  • 8% fewer searches
  • 6% fewer ad engagements
  • Revenue down 3% (not statistically significant)

Variant C: Search Co-Pilot (compared to the control group)

  • 45% of users use the feature in week 1, 43% in week 2, 41% in week 3.
  • 4.5% more searches
  • 5.5% more property engagements
  • Revenue up 1.5% (not statistically significant)

The team recommends shelving Chatbot Search and launching Search Co-Pilot. Leadership agrees.

Delivery (2 weeks)

After two more weeks of testing and bug fixes Search Co-Pilot is launched. 

Post-launch, long-term data:

  • 35% of users use the feature 
  • Property engagements up 4%
  • Revenue up 1.8%

Results Summary

Scenario 1 (Just Do It) wins on time-to-market, but on everything else Scenario 2 (Discovery + Delivery) is clearly better, and that’s before considering the embarrassment of launching and rolling back a bad feature in full view of users, customers, and colleagues.

You might argue that I picked an example that favors discovery. Sure, Conversational Search might have been a slam dunk out of the gate, but idea success statistics tell us that’s not very likely. Decades of A/B test research show that if you don’t test and improve your ideas, your chances of success are somewhere between 8% and 33%. 

What does that mean across all the ideas you develop, not just one? Product discovery can easily 10x your outcomes.
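To make that claim concrete, here’s a back-of-the-envelope expected-value sketch. Every number is an illustrative assumption (idea volume, hit rates, win sizes), not data from the scenarios above:

```python
# A year of product work, value normalized to 1.0 per baseline win.
IDEAS = 24

# Just Do It: build all 24 ideas; ~8% succeed (the low end of the
# A/B-test success range cited above).
just_do_it = IDEAS * 0.08 * 1.0

# Discovery + Delivery: cheap tests kill two thirds of the ideas early,
# so only 8 get built, but vetting lifts the hit rate and iteration
# (like the Search Co-Pilot pivot) enlarges each win.
discovery = 8 * 0.5 * 2.5

print(round(just_do_it, 2), round(discovery, 2))  # 1.92 10.0
```

Even these modest assumptions produce roughly a 5x gap in expected value; raise the hit rate or the win size a little, or let the learning compound across teams, and 10x is within reach.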

Final Thoughts

The reason some companies want to “just launch it” in the AI era isn’t AI. It’s the same old desire by some people to avoid scrutiny of their ideas. AI is simply the latest excuse.

But in the 2020s, not validating ideas is amateurish. Yes, things are accelerating, but that just means good companies are learning faster, further widening the gap with those that don’t. Product discovery is just as crucial in the AI era as it was before it, perhaps even more so. If, like me, you feel this point is lost in the AI noise, please consider sharing this article with colleagues and social contacts.
