What Does Generative AI Mean For Product Development?

If you’re a product person, Generative AI should be firmly on your radar, not least because some think it will replace you soon. Hyperbole aside, it’s very reasonable to assume Gen-AI will affect, and maybe transform, product development. But what’s going to change? And are there any downsides?

As this is a very new technology, no one knows the answers for sure. It’s easy to get carried away by the hype, but there’s a bigger picture to consider. In this article I’ll explain why I feel Gen-AI may be a double-edged sword: it offers a massive promise of acceleration, but also a risk of degrading product culture and producing more low-quality features, faster. I’ll quickly cover some ways you may use Gen-AI and then discuss these potential risks.

Ways to use Generative AI 

1. Generating Entire Products 

This may be the most utopian (or dystopian, depending on whom you ask) scenario: you input an elaborate prompt, the AI does its thing, and Poof! out comes a fully designed and coded feature or product.

In one sense, this is already a reality. There are dozens of Generative AI app builders that will create a simple application for you on demand, no coding or design knowledge needed. I’m not sure how good or robust these apps are, but, like no-code and interactive prototyping tools, Gen-AI app builders look like an interesting way to create prototypes to test your ideas.  

I would hazard a guess that for most real-world development projects, full automation is years away, and it may turn out to be a very tough problem for AI, much like self-driving cars. I’ll list the reasons later in the article.

2. Co-Piloting Development 

While the bots can’t replace us completely (yet), they can definitely accelerate our work: 

  • Coding/code-reviewing/debugging: this too is already a reality. According to Inbal Shani, GitHub’s CPO, GitHub Copilot is actively used by 1.5 million developers across 37,000 organizations, with high levels of satisfaction and retention. These are remarkable numbers for such a young technology.
  • Creating dev artifacts: specs, designs, test plans, product content, interview scripts, experiment plans…   
  • Processing and analyzing data: cleaning, summarizing, reporting, finding insights… 
  • Ideating: generating ideas, proposing goals, suggesting approaches
  • Acting as a knowledge-base: product data, frameworks, processes, templates, book summaries…
  • Testing code: both test development and running the tests
  • … and probably others we haven’t thought of yet

This is what gets practitioners most excited. The bots seem infinitely knowledgeable and capable; truly the closest thing we have to an intelligent co-pilot.  

3. Powering Your Product

Many tech companies are considering how to add Gen-AI capabilities to their products, offering those same powerful benefits to their users. Most will use off-the-shelf models like GPT, perhaps with some fine-tuning. A few will opt to develop their own models, which currently requires large amounts of time, data, compute, and memory (though we should expect these costs to fall over time).
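In practice, “powering your product” with an off-the-shelf model usually starts as a thin wrapper around a hosted API. A minimal sketch of that pattern, with the function names, the prompt, and the fake model all being illustrative stand-ins rather than any specific vendor’s API, might look like this:

```python
from typing import Callable

def summarize_ticket(ticket_text: str, complete: Callable[[str], str]) -> str:
    """Wrap a hosted chat model behind one function.

    `complete` is whatever client you use (OpenAI, Anthropic, a local
    model...) reduced to "prompt in, text out", so the rest of your
    product never depends on a specific vendor.
    """
    prompt = (
        "Summarize this support ticket in one sentence for a product manager:\n"
        + ticket_text
    )
    return complete(prompt).strip()

# Before you have API access (or in tests), stub the model:
def fake_model(prompt: str) -> str:
    return "  User cannot reset their password.  "

print(summarize_ticket("pw reset link 404s", fake_model))
# → User cannot reset their password.
```

Keeping the model behind a narrow interface like this also makes it easier to swap providers or fine-tuned variants later without rewriting product code.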

The Risks of Generative AI in Product Development

Missing Context 

I wrote a whole post on this one. The bottom line is that although the bot can produce a compelling user story, OKR, or business model, these are not necessarily the right ones for you. The model is missing a lot of context about your specific product, users, market, and company. Even if you try to condense all the information into an elaborate set of prompts (which I find unlikely), the bot will ultimately output a generic artifact based on what the “industry” (or more accurately, what the model was trained on) tends to do. 

It’s important to note that humans suffer from context problems too. Our minds struggle to process a lot of information and may fall for cognitive biases such as Recency Bias and Confirmation Bias that fixate us on a small subset of the data. Even smart and capable people need help thinking broadly and deeply. 

This may be a big opportunity for AI. Imagine an AI system that is constantly fed all your business and product data: every user action, customer feedback, sales call, interview, competitive offering, past experiment… Equipped with the right model, this system could help you detect patterns and uncover insights, opportunities, and threats. It might suggest goals and ideas, and answer that all-important question: “what evidence do we have in support of this idea?”. Such a system could be a truly helpful co-pilot when developing business plans, product specs, and other artifacts.

I’m sure someone somewhere is working on such a product right now, although this too may turn out to be a tough problem, partly because of the next issue. 

Hallucinations

A major downside of the current generation of ML models is their tendency to confidently spew out nonsense. For example, here’s what ChatGPT (3.5) said when I asked it to summarize my book Evidence-Guided:

“Itamar Gilad’s book ‘Evidence Guided’ explores the concept of using evidence to make better decisions in both personal and professional life. Gilad emphasizes the importance of data-driven decision-making and provides practical strategies for incorporating evidence into everyday choices. From understanding cognitive biases to utilizing data effectively, the book offers valuable insights for anyone seeking to make more informed decisions.”

This summary is 80% wrong. I definitely did not write a self-help book about making data-driven decisions. The model obviously has no idea what my book is about, but instead of saying “I don’t know” it produced this plausible-sounding text, probably based on the title alone.

This is a by-product of the fact that large language models (LLMs) are designed to produce the most plausible-sounding answer, rather than the most accurate one. The two often overlap, but not always.
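Real LLMs are vastly more complex, of course, but the failure mode can be caricatured in a few lines. In this toy sketch (the counts and phrases are invented purely for illustration), the “model” simply returns the continuation that was most frequent in its training data, with no notion of whether it is true for this particular case:

```python
# A toy "language model": given a context, return whichever continuation
# was most common in training data -- truth is never consulted.
training_counts = {
    "a book with 'evidence' in the title is about": {
        "making data-driven decisions": 7,   # the common pattern for such titles
        "evidence-guided product development": 2,  # the actual topic here
        "courtroom procedure": 1,
    },
}

def most_plausible(context: str) -> str:
    counts = training_counts[context]
    return max(counts, key=counts.get)

print(most_plausible("a book with 'evidence' in the title is about"))
# → making data-driven decisions  (plausible by frequency, wrong in fact)
```

The statistically likeliest continuation wins, which is exactly how a model can produce a confident, plausible-sounding summary of a book it knows nothing about.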

That makes Gen-AI unreliable. Imagine working with a colleague who is very capable, knowledgeable, and eager to help. This person can do a lot, but he has one major flaw: he thinks he knows everything, and even when he doesn’t, he tries to bullshit his way through the task at hand, producing something that looks right even if it’s way off. Would you trust this person with critical work?

I’m sure AI companies are working to reduce the rate of hallucinations, but this is likely yet another tough problem. More broadly, Gen-AI is not a one-size-fits-all solution for every class of problem; other AI approaches have their place, as do good old human-written algorithms.

New Costs and Risks

If you’ve tried using any Gen-AI product, you know there’s an awful lot of trial and error involved. OK, this prompt didn’t work; what if we tweak it this way? Let’s run it again. Each run comes with a wait time, which accumulates across the development cycle. You can partly compensate by buying more memory and compute, or by opting for a higher tier of service with your AI provider, but either way it can get expensive, and you’re never sure how long it will take (if ever) to achieve a satisfactory result.
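That trial-and-error loop is worth making explicit, because each pass through it costs wall time and API charges. A rough sketch of the loop, where `generate` and `looks_good` are hypothetical stand-ins for a real model call and your acceptance check, might be:

```python
import time

def generate(prompt: str) -> str:
    """Stand-in for a real model call (which costs tokens and wall time)."""
    time.sleep(0.01)  # real calls take seconds, not milliseconds
    return "draft output for: " + prompt

def looks_good(output: str) -> bool:
    """Stand-in for your acceptance check (human review, schema check...)."""
    return output.startswith("draft")

def iterate(prompt: str, max_attempts: int = 5):
    """The trial-and-error loop: run, inspect, tweak, repeat.

    Returns the first acceptable output plus the attempt count, so the
    cost of iteration stays visible rather than hidden."""
    for attempt in range(1, max_attempts + 1):
        output = generate(prompt)
        if looks_good(output):
            return output, attempt
        prompt += " (be more specific)"  # the inevitable tweak
    return None, max_attempts

result, attempts = iterate("write a spec for feature X")
print(attempts)  # every extra attempt adds latency and API charges
```

Counting attempts like this is a cheap way to surface how much of your cycle time is going into re-prompting rather than building.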

If you embed Generative AI into your software, things get even less deterministic. We were trained to develop software that follows some human-understandable logic (at least until it devolves into spaghetti code), but what happens when at the heart of the system sits a mysterious statistical model that no one understands? How does this affect design, coding, experimentation, testing, debugging, and maintenance? 

Another cost is that of using third-party Generative AI APIs. Right now Big Tech, startups, and open-source projects are all jockeying for position, so the market is very competitive, but sooner or later these organizations will need to recoup the immense costs of assembling datasets, developing models, and operating the service, so charges will likely creep up.

In other words, while Gen-AI gives us powerful new capabilities, it also brings new uncertainties and complexities. As Marty Cagan and Marily Nika point out, Gen-AI introduces all four classes of product risk: feasibility, usability, value, and business viability.

Accelerating Feature Factories and Junk Features

This one worries me the most. It’s no secret that many company leaders measure the success of their product orgs by their output. For these managers Gen-AI is a godsend, helping to further optimize the feature factory to produce more working code in less time and at lower cost. It’s not hard to imagine some of these cost-cutting measures:

  • Less thinking — It’s already hard to get people to surface assumptions, run experiments, conduct product discovery, analyze results, and take evidence-guided action. Gen-AI with its confident outputs may be the perfect antidote to thinking, focusing everyone on execution instead.
  • Copy-cat mentality — just do what everyone else is doing; the bot says so.
  • Waterfall — a good product spec is the result of cross-functional collaboration and constant iteration, as are a good design mockup and working code. But if we’re each producing ready-made, complete artifacts using Gen-AI tools, collaboration and iteration may take a back seat to classic waterfall development. 
  • Headcount cuts — if a design bot can produce convincing UI designs at the click of a button, do we still need a UX designer? If it can produce user stories at a fraction of the cost of a full-time PM, can we scale one PM across ten teams? 

The bottom line may be a major regression in culture and quality — an acceleration of the feature factory model, producing more junk features and junk products at a faster pace. If you want to see what that feels like just look at what’s happening to books, articles, images, and social posts. 

Gen-AI-Utopia or Techno-Hell?

While obviously very powerful, Gen-AI is not without its risks. We may use Gen-AI to empower individuals, teams, and managers while keeping tabs on the limitations of the technology and leaving the flight controls in human hands. We can use Gen-AI to build true intelligence in the org by combining what machines do best (crunching lots of data, detecting patterns) with what humans do best (empathizing with other humans, prioritizing the most important things, making decisions with partial information…).

On the other hand we may use Gen-AI like some companies use Agile: “optimize” engineering, design, and product work for maximum throughput creating a “production machine” that is increasingly devoid of deep thinking and judgement. Like every other technology or process, Gen-AI is going to be overused, misused, and abused. Whether or not we do these things heavily depends on our very own human intelligence.
