How I Use AI as a Product Manager: One Toolkit, Many Jobs

I run product for a large classifieds marketplace, and over the last few months AI stopped being something I experimented with on the side and became the default way I do the job. It isn't one killer feature — it's a dozen small workflows, each of which used to cost me a meeting, a ticket, or an afternoon. This is the umbrella tour: one toolkit (Claude Code, a handful of MCP connectors, and a computer-use agent for the messy parts of the web) pointed at a lot of different jobs. A couple of these grew big enough to deserve their own deep-dives, which I link to rather than repeat.

Research I don't read myself

The first thing I stopped doing was reading documentation to understand our own systems. I connected our Confluence to Claude Code through MCP and asked it to work out the current subscription architecture — account types, product tiers, how they relate. Instead of me opening dozens of pages, the model read them, pulled out the key points, and handed back a summary plus enough context to actually start the work I opened Confluence for. I still read the pages that matter; I just no longer read the forty that don't in order to find the three that do.

Someone else's Figma, folded into our docs

A colleague dropped a dense Figma board on me — an analysis of business accounts with far more data than I wanted to click through. Through the Figma MCP I ran a small pipeline in a couple of prompts: read the board, summarize it, translate it into Russian, and post the result as a Confluence page. What would have been an hour of squinting at frames and retyping became a summarized, shareable doc almost immediately. The point isn't Figma specifically — it's that "ingest someone else's messy artifact and give me something I can use" is now a two-prompt job.

Competitor teardowns without opening forty tabs

For competitor analysis I use the Chrome DevTools MCP, which lets the model drive a real browser the way a person would — navigating, taking screenshots, clicking, and pulling content off the page, OCR included. It walks a competitor's site and hands back a summary of what they offer and how they present it. I'll be honest that this is the heavier end of the toolkit — it's more setup than a normal person wants — but when I need a structured read of a live site rather than a static scrape, nothing else I've tried comes close.

A scoring tool for a decision that had nothing to do with work

The example I like most isn't work at all. My wife was applying to residency programs in the US, and there was an overwhelming amount of historical data on hospitals to weigh. In a few prompts I built a small interface that scored programs across five parameters with weights I could adjust, then sorted them. It turned a spreadsheet nobody wanted to stare at into a tool we could actually reason with. That's the generalizable lesson: any large, messy dataset can become a purpose-built mini-tool in an afternoon — and the dataset doesn't have to be a work dataset for that to be worth it.
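
For a sense of how small such a tool can be, here is a minimal sketch of a weighted scorer in Python. It is not the interface I built. The five parameter names, the weights, and the program data are all invented for illustration, and the values are assumed to be normalized to the 0..1 range beforehand.

```python
# Hypothetical parameters and weights; adjust the weights to change the ranking.
WEIGHTS = {
    "match_rate": 0.30,
    "location": 0.25,
    "program_size": 0.15,
    "visa_support": 0.20,
    "reputation": 0.10,
}

# Made-up example data, already normalized to 0..1 (higher is better).
PROGRAMS = {
    "Program A": {"match_rate": 0.8, "location": 0.4, "program_size": 0.6,
                  "visa_support": 1.0, "reputation": 0.5},
    "Program B": {"match_rate": 0.6, "location": 0.9, "program_size": 0.5,
                  "visa_support": 0.5, "reputation": 0.7},
    "Program C": {"match_rate": 0.9, "location": 0.2, "program_size": 0.8,
                  "visa_support": 0.0, "reputation": 0.9},
}


def weighted_score(values, weights):
    """Return the weighted average of the values, using the given weights."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(values[name] * w for name, w in weights.items()) / total_weight


def rank_programs(programs, weights):
    """Score every program and sort from highest to lowest score."""
    scored = [(name, weighted_score(vals, weights)) for name, vals in programs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_programs(PROGRAMS, WEIGHTS):
        print(f"{name}: {score:.2f}")
```

Dividing by the total weight means the weights do not have to sum to 1, so you can nudge a single weight up or down and re-sort without rebalancing the rest.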

Stakeholder dashboards before I pull in an engineer

At an earlier dropshipping company I used the same trick for stakeholder communication: drafting analytics dashboards — top SKUs, on-time versus late deliveries — as working prototypes rather than slides. In one or two prompts you get something a colleague can click, which changes the conversation from "imagine a chart that…" to "here, does this answer it?" I don't ship these prototypes; I use them to align on what we actually want before a single engineer or designer spends real time on it.
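
The numbers behind a prototype like that are usually simple aggregates. A rough sketch in Python, assuming a hypothetical list of order records (the field names and the data are invented, not taken from the real system):

```python
from collections import Counter
from datetime import date

# Hypothetical order records; field names are illustrative only.
ORDERS = [
    {"sku": "SKU-1", "units": 3, "promised": date(2024, 1, 10), "delivered": date(2024, 1, 9)},
    {"sku": "SKU-2", "units": 5, "promised": date(2024, 1, 10), "delivered": date(2024, 1, 12)},
    {"sku": "SKU-1", "units": 4, "promised": date(2024, 1, 11), "delivered": date(2024, 1, 11)},
    {"sku": "SKU-3", "units": 1, "promised": date(2024, 1, 12), "delivered": date(2024, 1, 15)},
    {"sku": "SKU-2", "units": 1, "promised": date(2024, 1, 13), "delivered": date(2024, 1, 13)},
]


def top_skus(orders, n=3):
    """Return the n SKUs with the most units sold, as (sku, units) pairs."""
    units = Counter()
    for order in orders:
        units[order["sku"]] += order["units"]
    return units.most_common(n)


def delivery_split(orders):
    """Count deliveries that arrived on or before the promised date versus later."""
    on_time = sum(1 for order in orders if order["delivered"] <= order["promised"])
    return {"on_time": on_time, "late": len(orders) - on_time}


if __name__ == "__main__":
    print(top_skus(ORDERS, n=2))
    print(delivery_split(ORDERS))
```

A delivery on the promised date counts as on time here. That is exactly the kind of definition worth settling with stakeholders while the dashboard is still a prototype.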

User stories and bug triage on tap

Two of my most routine PM chores now live in a Special Tasks folder of dedicated Claude Code conversations. One knows our user-story template: I give it context, it gives me a properly formatted story. Another takes the raw bug reports QA sends over and proposes a prioritization I can sanity-check. The operative word is sanity-check. This is exactly where AI will confidently hand you something wrong — it doesn't hold all the context and it will happily hallucinate — so triage output is a draft I validate, never a decision I outsource.

Answering my own data questions

The single most useful one: I connected Superset to Claude Code through MCP, which gives the model access to our ClickHouse data. Now I can ask business questions and get SQL, charts, and pulled numbers without adding myself to an analyst's queue. Prioritization and scoring that need real business context — the kind of thing I'd previously wait days for — I can now draft myself and confirm with the data team afterward. It doesn't replace our analysts; it means I only involve them when the question is genuinely worth their time.

The two workflows that earned their own write-ups

Two of these grew past "handy trick." The first is brand monitoring: a computer-use agent (OpenClaw, not Claude Code this time) that sweeps news, Reddit, and X for mentions of us and our competitors and files them into a Notion table by category — it caught an emerging complaint wave before anyone on the team saw it. The second is search-quality evals, where an agent judges thousands of real search results a week so we can finally track relevance over time. Both started as exactly the kind of small experiment above and earned a full write-up of their own; the links go to those.

What actually makes this work

The pattern under all of this is the same, and it is not "AI writes my code" or "AI does my job." It is that a product manager can now turn a vague question into a working artifact in the time it used to take to schedule a meeting about it. Need to understand a subscription architecture, tear down a competitor, score a messy dataset, or draft a dashboard for a stakeholder? Each of those used to mean waiting in an engineer's, an analyst's, or a designer's priority queue. Now it is a few well-scoped prompts and a browser tab. The catch, and the part nobody selling AI tools says loudly enough, is that the model hallucinates, misreads context, and states wrong things with total confidence. So I validate every output by hand before it reaches a colleague. Speed only helps if you keep the judgment step you were tempted to skip.

A few habits make the difference between this working and this producing confident garbage:

Prompt in English. Output quality is noticeably higher than with prompts in other languages. Research on multilingual models reports the same gap, and I feel it daily even writing as a non-native speaker.
Chunk the work and reset context. Small, well-scoped tasks beat one giant thread. I clear context between features so the model isn't dragging stale assumptions forward.
Use Plan Mode. Letting the model plan before it acts noticeably raises the quality of what it produces.
Know the model landscape. For coding I still reach for Opus 4.5, but Gemini and Codex have almost closed the gap; the difference is now small enough that the choice of model rarely decides the outcome.
Context is everything. The more you document — systems, decisions, templates — the more the AI can reuse, which is the whole reason the Confluence and Superset connections pay off.
Prefer the direct tool. I mostly use Claude Code directly rather than through Cursor; Cursor's limits were tighter, and going straight to Claude Code was simpler.
Validate everything, by hand. This is the one rule I never break.

Originally discussed (in Russian) on my Telegram channel, where I share cases like this from work and side projects.