How I Scope Minimum Viable Data Products That Prove Value Fast

You don’t need a perfect pipeline to prove value. I share my method for building lightweight data solutions that get early feedback and build trust fast.

Apr 16, 2025

Greetings, Data Engineer,

Most data teams want to impress. So they start big.

They blueprint, spec, design architecture diagrams. They pick tools. Build robust pipelines. Then? They ship… something.

And the stakeholder says: “I expected something completely different.”

You and I know that feeling. It’s rough.

But it’s also avoidable.

The solution is to scope Minimum Viable Data Products (MVDPs). Think of these like rough sketches that solve a real problem fast.

You don't need Airflow. You don’t need dbt. You don’t even need clean data.

You need one user. One problem. One tiny but valuable output.

Let me show you how.

🧠 Understanding the Problem

😣 Why It’s a Challenge

Data teams love to build. That’s the good part.

But too often, you and I build for tech reasons, not user reasons. You chase clean models, perfect joins, or gold-layer pipelines before you even know if anyone cares.

And the result? Big output. Low impact.

But here’s the kicker—most stakeholders can’t describe what they need until they see something. You ask: “What do you want?” They shrug. You build anyway. Six weeks later, they say: “Not this.”

That's not their fault.

It’s the game. You need to show before they know.

So instead of building from specs, start with a tiny, ugly MVP. Show it. Watch reactions. Learn what matters.

❌ Consequences of Not Doing MVPs

Let me tell you what happens when you skip this.

You spend weeks building. Cleaning data. Writing tests. Stitching five sources together. It looks slick. It even works. You demo it.

And the stakeholder tilts their head and says, “Hmm… I thought this would be something completely different.”

Oof.

Sometimes, they weren’t even asking for this. They were just thinking out loud. You took it seriously. Now you're deep into a project no one actually needed.

Other times you did solve the problem. But they didn’t feel the pain until they saw your version. Now they know what they really wanted, and it’s not this.

Meanwhile, your backlog’s filling with edge-case requests and “maybe someday” ideas that sound important but aren’t tied to any clear outcome. You’re building fast, but not learning. Not proving value. Just delivering tasks.

And here’s the worst part: You think you're shipping. But you’re losing trust.

Because users don’t need polish.

They need clarity.

They need signal.

You and I want momentum, not scope creep. Feedback, not silence. Trust, not nice words.

That’s what MVDPs unlock.

Let’s build one.

🪜 Step #1: Define the Problem (Not the Tech)

You don't start with a dataset. You start with a question. And not just any question—the kind that makes a stakeholder pause.

Try this during a 15-minute Zoom or Slack DM: If I could automate one annoying part of your work this week, what would it be?

Real answers I’ve heard:

“I waste so much time chasing overdue clients.”
“I always have to ping R&D to get updated subscription counts.”
“I don’t trust the revenue numbers in Salesforce.”

Now translate that into a problem statement you can actually scope.

✏️ Example:

Finance can’t prioritise debt collection because overdue data is buried in monthly Excel exports.

That’s tight. That’s operational. And it leads you straight to a solvable data problem.

👉 Don’t move forward until you’ve written this one sentence down and read it back to your stakeholder. You want them nodding.

🎯 Step #2: Set a Success Goal

Once you have the problem, you need a result to chase. But keep it tiny.

Here’s a cheat sheet:

Ask: “If we solve this, what becomes easier?”
Ask: “How would you measure if this helps?”

You want numbers. Or time. Or reduced clicks.

Examples:

“Save 3 hours per week of manual filtering.”
“Help us follow up with 10+ clients earlier each month.”
“Catch revenue anomalies within 48 hours instead of 7 days.”

These are small wins. But stackable.

📌 Write it down in plain text in your ticket, Notion doc, or even the SQL header.

🖼️ Step #3: Sketch the Outcome, Not the System

Stop thinking pipelines. Start thinking pixels.

Open Miro. Or Figma. Or my favourite tool, Excalidraw.

Ask yourself:

“What would they see that would solve this?”
“What’s the format they’d consume this in?”

✏️ Example for overdue invoices:

A Google Sheet with 4 columns: Client, Amount Due, Days Late, Owner
Sorted by Days Late DESC
Refreshed every morning before 9am
Delivered via Slack DM with a message: “Here’s your top 10 overdue clients for the day 👇”

That’s it. That’s your MVP interface.

You could even mock it first using dummy data in Google Sheets. Just show it.
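
If you'd rather script the mock than type it by hand, here is a minimal sketch. Everything in it is an assumption for illustration: the client names, amounts, and owners are invented, and `build_mock_rows` and `write_mock_csv` are hypothetical helper names. It writes a four-column CSV that you can import into Google Sheets.

```python
import csv

# The four columns from the sketch above
COLUMNS = ["Client", "Amount Due", "Days Late", "Owner"]


def build_mock_rows():
    """Return invented overdue-invoice rows, most overdue first."""
    rows = [
        {"Client": "Acme Ltd", "Amount Due": 12500.00, "Days Late": 12, "Owner": "Maria"},
        {"Client": "Globex", "Amount Due": 4300.50, "Days Late": 45, "Owner": "Ivan"},
        {"Client": "Initech", "Amount Due": 980.00, "Days Late": 3, "Owner": "Maria"},
        {"Client": "Umbrella Co", "Amount Due": 22000.00, "Days Late": 31, "Owner": "Petra"},
    ]
    # Equivalent to "Sorted by Days Late DESC"
    return sorted(rows, key=lambda row: row["Days Late"], reverse=True)


def write_mock_csv(path, rows):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)


mock_rows = build_mock_rows()
write_mock_csv("overdue_mock.csv", mock_rows)  # import this file into Google Sheets
```

The point is speed, not realism. A handful of fake rows is enough for the stakeholder to react to the format.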

Ask: “Would this help?”

If yes, now you build.

You won’t believe how often I do this.

📊 Step #4: Identify Only the Must-Have Inputs

You’re ready to grab data—but only the minimum viable inputs.

Ask: “What is the one table that gives me 80% of what I need?”

In the invoice example, maybe it’s this:

SELECT
  client_id,
  due_date,
  amount,
  status
FROM
  accounting_db.invoices
WHERE
  status != 'Paid';

Skip joins for now. Just get it in a working format.

🐥 If you don’t have DB access, ask for a CSV:

“Can you export your invoice table into Google Drive as invoices_march.csv?”

Use Google Drive + Colab + pandas to crunch it in a notebook if needed.

Don’t overdo data prep. You’ll clean it later.
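
As a rough sketch of the CSV route, assuming the export has the same four columns as the SQL query above. The sample rows are invented, and `load_unpaid` is a hypothetical helper name. The sketch reads from an in-memory string so it runs anywhere; in Colab you would pass the path to the real file instead.

```python
import io

import pandas as pd

# Stand-in for the exported CSV; these rows are invented sample data
SAMPLE_CSV = """client_id,due_date,amount,status
C001,2025-02-10,1200.00,Open
C002,2025-03-01,560.50,Paid
C003,2025-01-15,9800.00,Overdue
"""


def load_unpaid(source) -> pd.DataFrame:
    """Read only the must-have columns and drop paid invoices."""
    df = pd.read_csv(
        source,
        usecols=["client_id", "due_date", "amount", "status"],
        parse_dates=["due_date"],  # the only prep done here: real dates
    )
    return df[df["status"] != "Paid"].reset_index(drop=True)


unpaid = load_unpaid(io.StringIO(SAMPLE_CSV))
print(unpaid)
```

Using `usecols` keeps you honest about the minimum viable inputs: any extra columns in the export are ignored until a user asks for them.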

⚙️ Step #5: Build the Simplest Working Pipeline

Now, wire it up with the lowest friction stack.

💡 Here’s a sample MVP pipeline stack:

Data: invoices.csv on Google Drive
Processing: DuckDB via Python in a Jupyter Notebook
Output: Save result to overdue_clients.csv
Delivery: Upload to Google Sheets + Slack integration via Zapier

Here’s how that might look in Python:

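import duckdb
import pandas as pd

# Load the export and parse due_date as a real date column
df = pd.read_csv('invoices.csv', parse_dates=['due_date'])
con = duckdb.connect()  # in-memory database, nothing to deploy

# DuckDB can query the local DataFrame `df` by name.
# date_diff replaces julianday(), which is a SQLite function, not a DuckDB one.
result = con.execute("""
    SELECT client_id, amount,
           date_diff('day', CAST(due_date AS DATE), current_date) AS days_late
    FROM df
    WHERE status != 'Paid'
    ORDER BY days_late DESC
    LIMIT 10
""").fetchdf()

# Save the top 10 for upload to Google Sheets
result.to_csv('overdue_clients.csv', index=False)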

📬 Then automate sending that via Slack using Zapier or a Python Slack bot.
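
For the Python route, here is one possible sketch using a Slack incoming webhook, which accepts a JSON body with a `text` field. The rest is assumption: the sample rows are invented, `SLACK_WEBHOOK_URL` is a hypothetical environment variable name, and `format_overdue_message` and `build_slack_request` are hypothetical helpers. If no webhook is configured, the script just prints the message.

```python
import json
import os
import urllib.request


def format_overdue_message(rows):
    """Turn (client_id, amount, days_late) rows into a Slack-ready text block."""
    lines = ["Here's today's overdue list:"]
    for client_id, amount, days_late in rows:
        lines.append(f"- {client_id}: {amount:,.2f} due, {days_late} days late")
    return "\n".join(lines)


def build_slack_request(webhook_url, text):
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Invented sample rows; in practice these would come from overdue_clients.csv
sample_rows = [("C003", 9800.0, 92), ("C001", 1200.0, 66)]
message = format_overdue_message(sample_rows)

# Hypothetical variable name; set it to your own webhook URL to actually send
webhook_url = os.environ.get("SLACK_WEBHOOK_URL")
if webhook_url:
    with urllib.request.urlopen(build_slack_request(webhook_url, message)) as resp:
        print(resp.status)
else:
    print(message)
```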

Even a manual Slack message works: “Here’s today’s overdue list ⬇️”

You didn’t deploy. You didn’t build infra. You proved value.

📬 Step #6: Deliver It Directly to the User

Here’s the biggest unlock: Don’t wait for them to “check the dashboard.”

Push results to where they already live.

🚀 Slack example:

“Hey [Name], here’s your top 10 overdue clients for today. Let me know if this format works for you.”

Drop the link to the Google Sheet or CSV.

💬 Ask: “Is this helpful?”

You’re listening for feedback like:

“Can we add ‘Client Owner’ to this?”
“This is great—can we automate it?”
“Could we get this weekly too?”

If they forward it to others? You nailed it.

🔁 Step #7: Log Feedback & Plan the Next Iteration

You shipped something. Now comes the gold: feedback.

This is where most teams either move on or overreact. And this pisses me off!

Don’t do either.

Instead, give them a week or two. Then, send a short Slack message:

🗣️ “Was this useful? What’s missing? What did you ignore?”

Here’s how to structure the responses:

✅ “Yes, this helps.”

➤ Great. Ask what they'd change or add. You’re building V2.

❓ “It’s not quite what I need.”

➤ Dig. Ask what they expected. Compare it with the original problem statement.

😶 “I haven’t used it yet.”

➤ That’s feedback, too. Maybe the timing’s off. Maybe delivery is clunky.

Maybe the problem’s not that painful.

Create a simple running doc or Notion table:

Feedback         | Action                | Priority | Notes
Add Client Owner | Include join with CRM | High     | Needed for follow-ups
Automate updates | Schedule script daily | Medium   | Manual for now
Weekly summary   | Email format          | Low      | Nice-to-have
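
If the log lives next to your notebook rather than in Notion, a tiny structure like this could do the same job. It is only a sketch: `FeedbackItem`, `PRIORITY_ORDER`, and `next_iteration` are hypothetical names, and the entries are the example rows above.

```python
from dataclasses import dataclass

# Lower number means "do this first"
PRIORITY_ORDER = {"High": 0, "Medium": 1, "Low": 2}


@dataclass
class FeedbackItem:
    feedback: str
    action: str
    priority: str
    notes: str


# Entries in the order they arrived, not in priority order
log = [
    FeedbackItem("Weekly summary", "Email format", "Low", "Nice-to-have"),
    FeedbackItem("Add Client Owner", "Include join with CRM", "High", "Needed for follow-ups"),
    FeedbackItem("Automate updates", "Schedule script daily", "Medium", "Manual for now"),
]


def next_iteration(items):
    """Return the feedback items ordered by priority, highest first."""
    return sorted(items, key=lambda item: PRIORITY_ORDER[item.priority])


for item in next_iteration(log):
    print(f"[{item.priority}] {item.feedback} -> {item.action} ({item.notes})")
```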

Each cycle is a checkpoint. MVPs are about learning, not perfection.

Your job: keep shipping slightly better versions until the user stops thinking about the problem—because it’s handled.

🙅 Avoiding Common Mistakes

Here’s what trips most people up:

🚫 Over-designing early

➤ If you’re sketching out dbt DAGs before your user sees anything, you’re moving too slow.

🚫 Waiting for clean data

➤ MVPs don’t need clean. They need useful.

🚫 Asking “Do you like it?”

➤ That invites compliments. Ask “Is this helpful?” instead.

🚫 Confusing feedback with rejection

➤ Rough drafts aren’t failures. They’re accelerants.

💭 Final Thoughts

Minimum Viable Data Products aren’t a shortcut. They’re a strategy.

They let you learn fast. Show value fast. Earn trust fast.

Instead of “launching dashboards,” you’re solving problems.

And that’s what gets data teams a seat at the table.

In the future? New tools will lower the bar even more. Auto-ingest, no-code joins, self-service reporting—they’re all coming. But none of that matters without the mindset:

What’s the smallest thing I can build to prove value this week?

Answer that consistently, and you don’t just build data products.

You build momentum.

Cheers,

Yordan

Data Gibberish