A PM does not need to be the best prompt engineer in the building. A PM needs enough prompting fluency to scope what a model can reliably do, write a spec an engineer can build against, and judge whether the output is actually good. That is a different skill than squeezing out the last 2 percent of performance, and it is more important. Here is the framework I use.
The five levers
When an output is not good enough, the fix is almost always one of five things, in roughly this order of leverage:
- Role and context. Tell the model who it is and what situation it is in. "You are a senior PM reviewing a feature spec" produces sharply different output than a cold instruction. This single lever moves quality more than any other.
- Task specificity. Vague task, vague output. Specify the exact deliverable, the format, the constraints, what to exclude.
- Examples (few-shot). One or two worked examples of input-to-output teach the model your standard better than a paragraph of description. Show, don't tell.
- Output structure. Ask for JSON, a table, named sections. Structure forces completeness and makes output testable.
- Reasoning room. For hard tasks, let the model think step by step before answering. Cutting straight to the answer on a complex task is where mistakes hide.
If an output disappoints, walk these five before you conclude "the model can't do this."
Role-based prompting is the PM's superpower
The technique I lean on most is role-based prompting — instructing the model to reason as a specific expert. On my AI-assisted PRD generator, the core move was a framework that told the model to simulate senior-PM reasoning: ask the questions a senior PM would ask, flag the risks they would flag, structure the document the way they would. The output quality jumped not because the model got smarter but because the role focused its existing capability.
This generalizes. Want a critical review? Assign a skeptical reviewer role. Want options, not a single answer? Assign a role whose job is to generate alternatives. You are not adding knowledge — you are directing attention.
What PMs get wrong
Two failure modes I see constantly:
- Treating the prompt as fixed. The first prompt is a draft. Prompts are iterated against examples, not written once. If you do not have a handful of test inputs you check every change against, you are guessing.
- Confusing a good demo with a good feature. A prompt that works on the three inputs you tried is a demo. A prompt that holds up across a representative set is a feature. The gap between them is exactly the eval set — which is why prompting and evaluation are the same discipline viewed from two angles.
The honest scope
Prompt fluency lets a PM answer the questions that actually block AI features: Is this reliably possible or only sometimes? What does the spec need to say? Is this output good enough to ship? You do not need to win a prompting contest. You need to make those three calls correctly, and the five levers plus a small test set get you there.