Data Engineering Solutions: Discovering What Your Stakeholders Truly Need
Key Questions to Uncover Pain Points and Find Solutions
Hi there,
Data Engineering is an exciting job. It's full of new tools and technical challenges.
But while thinking about technicalities, you may often forget the most crucial element: the people who'll use your creations. No wonder many data engineers feel like they are building solutions in a vacuum.
In this article, you and I will explore the process of data product ideation. You will learn how to identify your key players, uncover their pain points, and create solutions that will make them say, "This is exactly what I needed!"
The data product ideation approach helps not only when working with stakeholders but also when implementing your own ideas. I demonstrated it to a friend last week, and he was wowed!
Oh, and before I forget, paid Data Gibberish members can find a prepopulated data product ideation template at the end of the article.
Let's roll up our sleeves and get started!
Reading time: 15 minutes
🖼️ Setting the Stage
Imagine you are a data consultant. Your customer, Needle and Thread, is a clothing company that wants to make more sense of all the data they collect.
The company distributes clothes to different stores, which periodically send them reports. The reports are, in fact, CSV files, which Needle and Thread store on an FTP server.
Every month, somebody from the company looks through the CSVs, aggregates the data in an Excel sheet, and presents the numbers to the organisation.
The organisation is looking for a faster, more straightforward, and more automated way. They also said they have significant future ideas but don't know how to implement them.
Based on what I told you, you may already have a favourite way of building a solution for Needle and Thread. But committing to it now would be a terrible mistake. You first need to interview the people who will use it.
Here are some tips to make your interviews a smash hit:
Listen more, talk less: Your job is to uncover their pain points, not showcase your technical prowess. I hate it when people interrupt others.
Ask open-ended questions: "How do you feel about the current reporting process?" will get you much further than "Is the current process okay?"
Dig deeper: When you hear something interesting, don't be afraid to ask "Why?" multiple times. You might be surprised at what you uncover!
My framework to identify the perfect solution consists of 5 questions:
Who is my target persona?
What problem do they have?
Why is this a problem?
How can I solve the problem?
Is it working?
Let's begin with the first step.
🧑 Step #1: Identifying Your Stakeholders
If you are lucky, you'll know who you need to work with and who you need to get the signoff from. Most often, however, this is not clear from the start.
It's like being at a party where your mission is to find the VIPs. But instead of looking for the fanciest outfits, you're searching for the people who'll have the biggest impact on (and be most impacted by) your data solutions.
So, how do you spot these VIPs in your organisation? Here are a few techniques:
The Org Chart Deep Dive: Start by examining your company's structure. Who's calling the shots in different departments? These folks are likely to be critical stakeholders.
The Water Cooler Method: Informal chats can be goldmines of information. Ask around about who's been vocal about data-related issues.
The Meeting Crasher: Okay, don't crash meetings, but pay attention to who's regularly invited to data-centric discussions.
Once you've got your list, it's time to prioritise. Think of it like creating a guest list for a dinner party. You can't invite everyone, so who gets the golden tickets? Consider two factors:
Influence: How much sway does this person have over decisions related to your project?
Interest: How invested are they in the outcomes of your work?
Plot these on a simple 2x2 grid. The people who land in the high-influence, high-interest quadrant are your priority stakeholders.
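If you prefer code to sticky notes, here is a minimal sketch of that grid. The 1 to 10 scale, the threshold, the quadrant labels, and the people are all my assumptions for illustration, not a fixed method.

```python
from dataclasses import dataclass


@dataclass
class Stakeholder:
    name: str
    influence: int  # 1 (low) to 10 (high); the scale is an assumption
    interest: int   # 1 (low) to 10 (high)


def quadrant(person: Stakeholder, threshold: int = 5) -> str:
    """Place a stakeholder in one cell of the 2x2 influence/interest grid."""
    high_influence = person.influence > threshold
    high_interest = person.interest > threshold
    if high_influence and high_interest:
        return "priority"
    if high_influence:
        return "keep satisfied"
    if high_interest:
        return "keep informed"
    return "monitor"


# Hypothetical people and scores, for illustration only.
stakeholders = [
    Stakeholder("Sales director", influence=9, interest=9),
    Stakeholder("Store manager", influence=3, interest=8),
    Stakeholder("IT administrator", influence=7, interest=2),
]

for person in stakeholders:
    print(f"{person.name}: {quadrant(person)}")
```

The scores are judgment calls, so treat the output as a conversation starter rather than a verdict.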
Let's bring this to life with our clothing company scenario. You've done your homework and identified Jenna, the CFO, as a critical stakeholder.
Why? She has high influence (she controls the purse strings) and high interest (she's struggling with the current data situation).
But don't stop there! What's her background? Her goals? Her fears?
As your project evolves, look for new players entering the scene. After all, the only constant is change!
🫤 Step #2: Understanding the Problem Space
Alright, you've identified your VIP stakeholders. Now, it's time to dive into the problem space.
Here, you want to understand your stakeholders' existing data sources and systems.
Here are some techniques to help you become a data landscape expert:
The Data Inventory: Create a catalogue of all data sources. Where's the data coming from? How often is it updated? Who's responsible for it?
The Flow Diagram: Map out how data moves through the organisation. This can help you spot bottlenecks and inefficiencies.
The Quality Check: Assess the reliability and accuracy of your data sources. Are there gaps, inconsistencies, or duplications?
The Tech Stack Review: What tools and technologies are currently in use? Are they up to date? Are they talking to each other effectively?
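The Quality Check can start very small. Below is a rough sketch that counts exact duplicate rows and rows with empty fields. The column names and sample rows are hypothetical, and a real check would cover far more cases.

```python
from collections import Counter

# Hypothetical rows, as they might look after reading a store's CSV report.
rows = [
    {"store": "S01", "date": "2024-07-01", "units_sold": "12"},
    {"store": "S01", "date": "2024-07-01", "units_sold": "12"},  # duplicate
    {"store": "S02", "date": "2024-07-01", "units_sold": ""},    # gap
]


def quality_report(rows: list[dict[str, str]]) -> dict[str, int]:
    """Count exact duplicate rows and rows with at least one empty field."""
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    duplicates = sum(n - 1 for n in counts.values() if n > 1)
    incomplete = sum(
        1 for row in rows if any(value.strip() == "" for value in row.values())
    )
    return {"rows": len(rows), "duplicates": duplicates, "incomplete": incomplete}


print(quality_report(rows))
# {'rows': 3, 'duplicates': 1, 'incomplete': 1}
```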
Let's revisit the Needle and Thread scenario. Here, you discover that the company stores all the financial data on an FTP server.
You dig deeper and learn more:
The distributors regularly upload the CSV files manually to the FTP server.
Different distributors use different naming conventions, making it hard to consolidate data.
One employee spends 2 weeks every month aggregating the data from the CSVs to prepare the monthly report for the CFO.
Working with CSVs doesn't make sense from a technical point of view. But it's okay from a business standpoint.
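The naming-convention finding is worth a quick experiment. Here is a sketch that maps differently named columns onto one canonical set. I am assuming the inconsistency shows up in column headers, and every alias below is made up, because the scenario does not list the real names.

```python
import csv
import io

# Hypothetical aliases: the scenario does not list the real column names.
COLUMN_ALIASES = {
    "store": "store_id",
    "store_id": "store_id",
    "shop": "store_id",
    "qty": "units_sold",
    "units": "units_sold",
    "units_sold": "units_sold",
}


def normalise_header(name: str) -> str:
    """Map a raw column name to its canonical form, or fail loudly."""
    key = name.strip().lower().replace(" ", "_")
    if key not in COLUMN_ALIASES:
        raise ValueError(f"Unknown column: {name!r}")
    return COLUMN_ALIASES[key]


def read_report(text: str) -> list[dict[str, str]]:
    """Parse CSV text and return rows keyed by canonical column names."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {normalise_header(column): value for column, value in row.items()}
        for row in reader
    ]


# Two distributors, two conventions, one consolidated result.
report_a = "Store ID,Units Sold\nS01,12\n"
report_b = "shop,qty\nS02,7\n"
print(read_report(report_a) + read_report(report_b))
```

Failing on an unknown column is a deliberate choice here: a silent guess would hide exactly the inconsistencies you are trying to surface.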
Jenna called you because she dislikes the manual reporting process and wants a "real-time" view of their financials.
During your interview, you also discover Jenna's frustration with the lack of forecasting capabilities. She dreams of predicting next quarter's sales as easily as checking the weather forecast.
But why is this a problem? Let's take a close look.
📉 Step #3: Grasping The Problem Impact
Most often, your stakeholders don't know what they need. Your role as an expert is to help them figure it out.
So, as your stakeholder and you discuss the current situation, you need to ask a lot of questions:
Is that a problem?
Why is this a problem?
What would you achieve if this problem didn't exist?
You ask these questions repeatedly until you and your stakeholders clearly understand the problem and its impact. You may find that your client's situation has a smaller impact than expected. This may even send you back to the problem definition.
When it comes to Needle and Thread, you find out that the CFO's biggest problem is how slow the data collection is. Add the 2 weeks of aggregation to the month the report covers, and the oldest numbers Jenna sees are about half a quarter old.
And that is true when everything else is fine. Due to the manual nature of the reporting process, there are occasional mistakes in the reports, which cause further delays.
The entire process is rigid, so it's nearly impossible to check historical data. Not to mention, the person who does the manual reporting could do something far more productive.
Now that you have the problem and impact definitions, it's time to find a solution.
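To make the "half a quarter old" claim concrete, here is the back-of-the-envelope arithmetic. The 30-day month and 91-day quarter are my rounding assumptions.

```python
REPORTING_PERIOD_DAYS = 30   # assumed length of the month the report covers
AGGREGATION_DAYS = 14        # the 2 weeks of manual work from the scenario
QUARTER_DAYS = 91            # assumed length of a quarter

# Age of the oldest data point on the day the report is finally presented.
oldest_data_age = REPORTING_PERIOD_DAYS + AGGREGATION_DAYS
share_of_quarter = oldest_data_age / QUARTER_DAYS

print(f"Oldest data is {oldest_data_age} days old, "
      f"about {share_of_quarter:.0%} of a quarter")
# Oldest data is 44 days old, about 48% of a quarter
```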
🧠 Step #4: Brainstorming Solution Concepts
Believe it or not, you have already gone through the most challenging steps of your data product ideation. Once you know who you serve and what problem you need to address, you need to find a way to improve the situation.
Before jumping straight into the solution, make sure you dump as many ideas as you can. Here are some methods to improve your brainstorming process:
The Classic Brainstorm: Get a diverse group together, set a time limit, and let the ideas fly. Remember, quantity beats quality at this stage!
The Reverse Brainstorm: Instead of asking, "How can we solve this?" ask, "How could we make this problem worse?" Then, flip those ideas on their heads.
The SCAMPER Technique: This stands for Substitute, Combine, Adapt, Modify, Put to another use, Eliminate, and Reverse. Apply each of these to your problem and see what ideas emerge.
The Analogy Approach: Think of a similar problem in a different field. How was it solved there? Can you adapt that solution?
Once you've got an array of ideas, it's time to prioritise. One effective technique is the Impact/Effort matrix. Plot your ideas on a grid where one axis is the potential impact, and the other is the effort required. Focus on the high-impact, low-effort ideas first – your "quick wins".
Don't pick just one idea. I usually have one top choice and a handful of alternatives. As you generate these ideas, consider the balance between stakeholder needs and technical feasibility.
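Here is one way to sketch the Impact/Effort ranking and the "top choice plus alternatives" habit in code. The 1 to 5 scores, the impact-minus-effort ordering, and the ideas themselves are assumptions for illustration. Your team may weigh things differently.

```python
from dataclasses import dataclass


@dataclass
class Idea:
    name: str
    impact: int  # 1 (low) to 5 (high); scores are team estimates
    effort: int  # 1 (low) to 5 (high)


def rank_ideas(ideas: list[Idea]) -> list[Idea]:
    """Sort ideas so high-impact, low-effort ones come first."""
    # Lower (effort - impact) is better; ties go to the higher-impact idea.
    return sorted(ideas, key=lambda idea: (idea.effort - idea.impact, -idea.impact))


# Hypothetical ideas and scores.
ideas = [
    Idea("Sales forecasting model", impact=5, effort=5),
    Idea("Automated weekly dashboard", impact=5, effort=2),
    Idea("Rename CSV files by hand", impact=1, effort=3),
]

top_choice, *alternatives = rank_ideas(ideas)
print("Top choice:", top_choice.name)
print("Alternatives:", [idea.name for idea in alternatives])
```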
For the clothing company, my primary solution is to:
Use PostgreSQL as a database.
Implement an ETL process with Airflow and Polars.
Set up Metabase for the data visualisations.
I hate FTP servers for many reasons, but the existing one does the trick in Needle and Thread's case. It also acts as a data lake, which means you don't need a more powerful warehousing solution like ClickHouse or Databricks.
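To show the shape of that pipeline without any setup, here is a deliberately simplified extract-transform-load sketch. It uses only Python's standard library: `csv` stands in for Polars, an in-memory SQLite database stands in for PostgreSQL, and there is no Airflow scheduling at all. The file names, columns, and table name are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical report content; real files would be downloaded from the FTP server.
REPORTS = {
    "store_a_2024_07.csv": "store_id,month,revenue\nS01,2024-07,1200.50\n",
    "store_b_2024_07.csv": (
        "store_id,month,revenue\nS02,2024-07,800.00\nS02,2024-07,150.25\n"
    ),
}


def extract(reports: dict[str, str]) -> list[dict[str, str]]:
    """Read every CSV report into one flat list of rows."""
    rows: list[dict[str, str]] = []
    for text in reports.values():
        rows.extend(csv.DictReader(io.StringIO(text)))
    return rows


def transform(rows: list[dict[str, str]]) -> dict[tuple[str, str], float]:
    """Aggregate revenue per store and month."""
    totals: dict[tuple[str, str], float] = {}
    for row in rows:
        key = (row["store_id"], row["month"])
        totals[key] = totals.get(key, 0.0) + float(row["revenue"])
    return totals


def load(totals: dict[tuple[str, str], float], connection: sqlite3.Connection) -> None:
    """Upsert the aggregates so re-running the pipeline is safe."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS monthly_revenue ("
        "store_id TEXT, month TEXT, revenue REAL, PRIMARY KEY (store_id, month))"
    )
    connection.executemany(
        "INSERT OR REPLACE INTO monthly_revenue VALUES (?, ?, ?)",
        [(store, month, revenue) for (store, month), revenue in totals.items()],
    )
    connection.commit()


connection = sqlite3.connect(":memory:")  # SQLite as a stand-in for PostgreSQL
load(transform(extract(REPORTS)), connection)
print(connection.execute("SELECT * FROM monthly_revenue ORDER BY store_id").fetchall())
# [('S01', '2024-07', 1200.5), ('S02', '2024-07', 950.25)]
```

In the real stack, each of the three functions would become an Airflow task, and Metabase would read from the resulting table.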
The goal isn't to find the perfect solution right away. It's to generate a range of possibilities you can refine and combine. After all, every data revolution starts with a single "What if...?".
♻️ Step #5: Validating Your Ideas
Before you start coding away, it's time for a reality check. Here's how to validate your ideas:
Create Minimum Viable Products (MVPs): This is like making a prototype of your data solution. It doesn't need all the bells and whistles, just enough to demonstrate the core concept. For the clothing company's CFO, this could be a simple dashboard showing critical financial metrics.
Run Pilot Tests: Choose a small group of users (including your key stakeholders) to test your MVP. This is your chance to get real-world feedback before investing heavily in development.
Gather Feedback: Don't just ask, "Do you like it?" Dig deeper with questions like "How would this fit into your daily workflow?" or "What features are missing that would make this indispensable for you?"
Iterate, Iterate, Iterate: Use the feedback to refine your ideas. Maybe Jenna loves the dashboard concept but wants to see forecasting capabilities. Back to the drawing board you go!
Validation isn't about proving you're right. It's about learning what's truly valuable to your stakeholders. Be prepared for surprises – sometimes, the feature you thought was a game-changer turns out to be a dud. And sometimes, a seemingly minor addition becomes the star of the show.
Let's say you've created a basic dashboard for Needle and Thread. During the validation process, you might discover:
The CFO loves the real-time updates but finds the interface cluttered.
Jenna's excited about the potential for forecasting, but it needs more granular data to be truly useful.
An unexpected benefit: other department heads are interested in similar dashboards for their areas.
On top of that, you agree that real-time reporting means weekly updates. This timing aligns with the scheduled C-Suite meetings.
This feedback is gold dust for your project. It helps you prioritise your development efforts and uncover new opportunities. It ensures you're building something that people will use and value.
Validation is not a one-time event. It's an ongoing process throughout your project. Keep checking in with your stakeholders, keep testing, and keep refining. Your goal is to create a data product so good that people wonder how they ever managed without it.
Are you enjoying this article? Please show some love ❤️ by liking my latest LinkedIn post.
😖 Common Pitfalls in Data Product Ideation
Let's shine a light on some common pitfalls.
Mistake #1: The Solution-First Trap 🪤
You know that feeling when you've just learned a cool new tech? Suddenly, every problem looks like a nail for your shiny hammer.
That's the solution-first trap. It's tempting to start with a solution and then go hunting for a problem it can solve. But you are here to solve real stakeholder problems, not showcase your technical prowess.
How to avoid it:
Always start with the stakeholder's problem. Ask yourself, "Am I solving a real need, or am I just excited about this technology?"
Mistake #2: The Short-Term vs. Long-Term Tug-of-War 🕛
Do you go for the quick win that will make your stakeholders happy now? Or do you invest in a solution that will pay off in the long run?
How to avoid it:
Aim for a balanced approach. Can you design a solution that addresses immediate needs while laying the groundwork for future expansion?
In the Needle and Thread case, the CFO dreams of live views and forecasting. We started with up-to-date dashboards and plan to add predictive capabilities later.
Mistake #3: The Scalability Oversight 🏋️
Picture this: You've built a beautiful data solution that works perfectly for today's data volume. Fast-forward six months, and it's creaking under increased data load. This is a real story, by the way.
How to avoid it:
Always design with growth in mind. Ask yourself, "How will this solution perform if our data volume doubles? Triples? Increases tenfold?"
It's better to overengineer a bit now than to rebuild from scratch later. But remember to balance that with the YAGNI ("You Aren't Gonna Need It") principle.
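You can put a number on the "doubles, triples, tenfold" question early. The sketch below times a stand-in pipeline step on synthetic data at each multiple. The aggregation function, the row shape, and the base volume are hypothetical, so swap in your own step and a realistic row count.

```python
import time


def aggregate(rows: list[tuple[str, float]]) -> dict[str, float]:
    """Stand-in for a pipeline step: total revenue per store."""
    totals: dict[str, float] = {}
    for store, revenue in rows:
        totals[store] = totals.get(store, 0.0) + revenue
    return totals


def synthetic_rows(count: int) -> list[tuple[str, float]]:
    """Generate fake rows spread across 100 hypothetical stores."""
    return [(f"S{i % 100:03d}", 1.0) for i in range(count)]


def time_at_scale(
    base_rows: int, factors: tuple[int, ...] = (1, 2, 3, 10)
) -> dict[int, float]:
    """Time the step at each multiple of today's volume, in seconds."""
    timings: dict[int, float] = {}
    for factor in factors:
        rows = synthetic_rows(base_rows * factor)
        start = time.perf_counter()
        aggregate(rows)
        timings[factor] = time.perf_counter() - start
    return timings


for factor, seconds in time_at_scale(base_rows=50_000).items():
    print(f"{factor}x volume: {seconds:.4f}s")
```

If the time grows much faster than the volume, you have found your future bottleneck while it is still cheap to fix.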
Mistake #4: The Stakeholder Echo Chamber 🫧
It's easy to get caught up in the needs of your primary stakeholder and forget about other users or departments. But data doesn't exist in a vacuum!
How to avoid it:
Consider how your solution might impact or benefit other departments, even if you primarily serve the CFO.
Mistake #5: The "Perfect Solution" Paralysis 💯
While you spend months trying to design the perfect solution, your stakeholders keep struggling with their data problems.
How to avoid it:
Embrace iterative development. Start with a solid MVP that addresses the core need, then refine and expand based on feedback. A good solution today is better than a perfect solution next year!
Mistake #6: The Communication Breakdown 💬
You create a fantastic data solution, but your stakeholders seem underwhelmed. Why? Often, it's because the value hasn't been effectively communicated.
How to avoid it:
Learn to speak your stakeholders' language. Translate technical features into business benefits.
Don't just show them a pipeline. Show them how it will save time, improve decision-making, and impact the bottom line.
You're already ahead of the game by staying aware of these common traps. Keep your stakeholders' needs at the forefront, stay flexible, and don't be afraid to adjust course as you go.
🏗️ Advanced Techniques for Data Product Ideation
You've mastered the basics of data product ideation. Now it's time to level up! Let's explore advanced techniques that can take your ideation game from good to great.
Design Thinking in Data Engineering 🤔
You might think design thinking is just for UX designers, but it is also a powerful tool for data engineers. The idea here is to understand users, challenge assumptions, and redefine problems.
Here's how you can apply it to data engineering:
Empathise: Go beyond interviewing stakeholders. Shadow them for a day to understand their workflow and pain points.
Define: Create detailed personas for your data product users. What are their goals, frustrations, and motivations?
Ideate: Use techniques like "Crazy 8s," where you sketch eight ideas in eight minutes. This exercise forces you to think beyond the obvious solutions.
Prototype: Build quick, low-fidelity prototypes of your data solutions. This could be as simple as a paper mockup of a dashboard.
Test: Get your prototypes in front of users early and often. Observe how they interact with your solution and gather feedback.
For Jenna, the CFO, this might mean creating a cardboard mockup of a financial dashboard and asking her to "interact" with it. You might be surprised at what you learn!
Data Storytelling 📊
Data without context is just numbers. Your job isn't to provide data. Your job is to help stakeholders understand and act on it. That's where data storytelling comes in.
Know your audience: Tailor your story to your stakeholders. The CFO might be more interested in different aspects of the data than the marketing manager.
Set the scene: Provide context. Why is this data important? What's the bigger picture?
Build tension: Highlight the problem or opportunity the data reveals.
Offer resolution: Show how your proposed solution addresses the issue.
Call to action: What should your client do with this information?
Instead of just presenting sales figures, you could tell a story about how seasonal trends affect revenue. You can also explain how your forecasting solution could help the company prepare for these fluctuations.
Cross-Industry Inspiration 💡
I already touched on this with the Analogy Approach. Sometimes, the best ideas come from unexpected places. Look at how other industries are solving similar problems:
How do financial services companies handle real-time data processing?
What can healthcare's approach to data privacy teach us?
How do tech companies create user-friendly data interfaces?
You might find that a solution from a completely different industry sparks an idea for your clothing company.
And that's everything for today. Let's wrap it up.
🏁 Summary
You and I explored the art of data product ideation, from identifying stakeholders to brainstorming solutions and avoiding common pitfalls.
Throughout this process, I emphasised the importance of keeping stakeholders at the centre of everything. Remember, you're not just building data solutions – you're solving real problems for real people.
But this is just the beginning. The real magic happens when you take these ideas and turn them into reality.
Remember these key takeaways:
Validate your ideas early and often.
Don't be afraid to think outside the box.
Always start with the stakeholder's needs.
Learn from setbacks and iterate, iterate, iterate.
Keep an eye on the future while solving today's problems.
As promised at the beginning of the article, I have a data product ideation template. You can find this and a few more in the "Resources for Paid Members" section.
Now, I have a challenge for you. Take one idea from this article – just one – and apply it to your data problem.
Remember, every data revolution starts with a single idea. What will yours be?
Until next time,
Yordan
📚 Picks of the Week
Unless you live under a rock, you’ve heard about (or experienced) the CrowdStrike outage last week. The linked write-up has a ton of details. (link)
Do you need a quick practical tutorial on AWS? Read this excellent collaboration by two of my favourite data engineering newsletter writers. (link)
Data storytelling is not only for data pros. Learn how software engineers can also use this technique to influence their organisations. (link)
❓ Monthly Ask Me Anything
Do you want to learn anything about data and data engineering? Don’t miss the opportunity to participate in this month’s Ask Me Anything.
Maybe you have something bigger in mind? Reply to this email and reach out to me. I love chatting with friendly people.