Hi there,
You are currently reading the special monthly issue of Data Gibberish. These monthly recaps allow you to catch up on what you missed last month. Enjoy your reading!
Demystifying Data Flow: ETL and ELT Explained Simply
Many data professionals struggle to decide between ETL and ELT. But mastering them is crucial for effective data management and analysis. I wrote a two-part series to help you make the best choice for your data product.
In part one of my ETL/ELT series, I explain these concepts in depth.
ETL (Extract, Transform, Load) has been around for ages. Some call it old school, but be aware: it still has its use cases.
On the flip side, ELT (Extract, Load, Transform) has become the standard nowadays, thanks to the modern data stack hype and the power of cloud computing.
I also share real-world examples of ETL and ELT pipelines I've worked on.
ETL vs. ELT: Which Data Integration Approach Reigns Supreme?
In part two of the ETL/ELT series, we dive much deeper into the practical side of things. Here, you learn how ETL and ELT compare in terms of costs, scalability, security, and more.
Why ETL is a Good Fit for You:
Complex Transformations: ETL is well-suited for scenarios where data transformations must be performed before loading into the data warehouse.
Existing Infrastructure: Since you have dedicated ETL servers and staging areas, ETL can leverage your existing setup.
Data Security: ETL allows for data masking or encryption before loading, which aligns with your high security and compliance needs.
Batch Processing: ETL is ideal for batch processing, a good match when real-time processing isn't critical for you.
Technical Expertise: Your team’s expertise in ETL processes ensures efficient design and maintenance of ETL pipelines.
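To make the "transform before load" idea concrete, here is a minimal Python sketch of an ETL flow. The CSV source, the email-masking rule, and the SQLite "warehouse" are illustrative assumptions, not part of the original articles:

```python
import csv
import hashlib
import sqlite3


def extract(path):
    """Read raw rows from a CSV source file."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def transform(rows):
    """Mask sensitive fields BEFORE the data reaches the warehouse --
    this is the key difference from ELT."""
    for row in rows:
        row["email"] = hashlib.sha256(row["email"].encode()).hexdigest()
        row["amount"] = float(row["amount"])
    return rows


def load(rows, conn):
    """Insert already-clean rows into the target table."""
    conn.executemany(
        "INSERT INTO orders (email, amount) VALUES (:email, :amount)", rows
    )
    conn.commit()
```

Because the masking happens in `transform`, raw emails never touch the warehouse, which is exactly why ETL suits strict security and compliance needs.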
Why ELT is a Good Fit for You:
Large Data Volumes: ELT is designed to efficiently handle large to massive data volumes by leveraging modern data warehouses' processing power.
Simpler Transformations: ELT performs transformations within the data warehouse, which suits your relatively simple transformation needs.
Cloud Infrastructure: Since you rely on cloud-based data warehouses, ELT can take full advantage of the scalability and flexibility of cloud platforms.
Real-Time Processing: ELT supports real-time or near-real-time data processing, which is critical for your use case.
Cost-Effectiveness: ELT can be more cost-effective, especially when using cloud-based solutions that align with your budget preferences.
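By contrast, an ELT flow lands raw data first and lets the warehouse's own SQL engine do the transformation. The sketch below uses SQLite to stand in for a cloud warehouse; the table names and cleanup rules are assumptions for illustration:

```python
import sqlite3


def extract_load(rows, conn):
    """Land the raw data in the warehouse as-is -- no pre-processing."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders (email TEXT, amount TEXT)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (:email, :amount)", rows
    )
    conn.commit()


def transform(conn):
    """Transform INSIDE the warehouse, using its SQL engine --
    this is what lets ELT scale with the warehouse's compute."""
    conn.execute(
        """
        CREATE TABLE orders AS
        SELECT lower(email) AS email,
               CAST(amount AS REAL) AS amount
        FROM raw_orders
        """
    )
    conn.commit()
```

Keeping the raw table around also means you can re-run or change the transformation later without re-extracting anything from the source.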
I also included a questionnaire to help you choose between these approaches for your next data pipeline.
People Analytics For All: Building a Data-Driven Business Revolution
17% manager turnover reduction. That is just one of the many benefits organisations using People Analytics have observed.
Using data can help your organisation eliminate unnecessary HR processes and focus on what matters. People Analytics increases employee satisfaction and productivity.
Here’s what you can learn from my last article:
What People Analytics is, why it matters, and why it's hard
Our People Analytics project's requirements, goals, and challenges
What the possible solutions were, and which short- and long-term options we picked
Click the link to read the article and drive a positive organisational transformation.
Data Engineering Solutions: Discovering What Your Stakeholders Truly Need
When designing data products, it's easy to get too generic: you try to solve everyone's problem with one solution. Not to mention how hyped you get about the technicalities.
Understanding stakeholder needs is crucial in data engineering. Here's a step-by-step process for designing data solutions that solve real problems.
First, identify your stakeholders.
Use techniques like examining the org chart, informal chats, and observing who attends critical meetings. Prioritise stakeholders based on their influence and interest.
Next, understand the problem space.
Conduct stakeholder interviews with open-ended questions to uncover pain points. Focus on five key questions: Who is the target persona? What problem do they have? Why is it a problem? How can you solve it? What would the impact be?
Analyse the current data landscape.
Audit existing data sources and systems, map data flows, and assess data quality. Identify gaps between the current state and desired outcomes.
Brainstorm solution concepts.
Use methods like classic brainstorming, reverse brainstorming, and the SCAMPER technique. Prioritise ideas using an Impact/Effort matrix.
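The Impact/Effort prioritisation step can be sketched in a few lines of Python. The quadrant names ("quick win", "big bet", "fill-in", "money pit") are one common labelling of the matrix, and the ideas and scores below are hypothetical placeholders for your own workshop output:

```python
# Hypothetical ideas with 1-10 scores -- replace with your own workshop output.
ideas = [
    {"name": "Self-serve dashboard", "impact": 8, "effort": 3},
    {"name": "Real-time CDC pipeline", "impact": 9, "effort": 9},
    {"name": "Ad-hoc SQL access", "impact": 4, "effort": 2},
    {"name": "ML feature store", "impact": 6, "effort": 8},
]


def quadrant(idea, threshold=5):
    """Classify an idea into the classic Impact/Effort quadrants."""
    high_impact = idea["impact"] >= threshold
    high_effort = idea["effort"] >= threshold
    if high_impact and not high_effort:
        return "quick win"      # do these first
    if high_impact and high_effort:
        return "big bet"        # plan carefully
    if not high_impact and not high_effort:
        return "fill-in"        # do when idle
    return "money pit"          # avoid


# Rank ideas: highest impact-per-effort first.
for idea in sorted(ideas, key=lambda i: i["impact"] - i["effort"], reverse=True):
    print(f'{idea["name"]}: {quadrant(idea)}')
```

The point isn't the arithmetic; it's forcing the group to score every idea on the same two axes before debating favourites.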
Validate your ideas.
Create minimal viable products (MVPs) and run pilot tests. Gather feedback and iterate based on stakeholder input.
Some tips
Avoid common pitfalls like solution-first thinking, neglecting scalability, and poor communication. Use advanced techniques like design thinking, data storytelling, and leveraging emerging technologies.
Throughout the process, keep your stakeholders' needs at the forefront. Continuous iteration and stakeholder engagement are essential.
I also included a data product ideation section for paid Data Gibberish members.
The Counterintuitive Approach to Fostering DataOps Culture That Actually Works
Are you tired of struggling to get buy-in for your DataOps initiatives?
Here's a different approach that works.
The Problem 🤯
Traditional DataOps approaches often fail because:
Long development cycles delay value delivery
Over-engineered solutions are expensive to maintain
Systems can't adapt to changing business needs
Focus is on technical excellence instead of business value
Slow adoption and limited support from higher-ups
The Solution 🏰
Think of it like building a sandcastle:
Quick to build and looks impressive
Won't last forever, but easy to change or rebuild
You get better each time you do it
Key aspects:
Prioritize speed and business value over perfection
Deliver results quickly and iteratively
Show immediate impact to secure investment
Make calculated trade-offs to move fast
Communicate risks upfront
Benefits 🙌
Faster time-to-value: Deliver results quickly, build momentum, secure buy-in
Increased agility: Respond rapidly to changing requirements
Improved collaboration: Align stakeholders on goals and objectives
Cost reduction: Automate processes, improve efficiency
Scalability: Adapt quickly to changing business needs
Until next time,
Yordan
❓ Ask Me Anything
Do you want to learn anything about data and data engineering? Don’t miss the opportunity to participate in this month’s Ask Me Anything.
Maybe you have something bigger in mind? Reply to this email and reach out to me. I love chatting with friendly people.
😍 How Am I Doing?
I love hearing from you. How am I doing with Data Gibberish? Is there anything you'd like to see more or less of? Which aspects of the newsletter do you enjoy the most?
Use the links below, or even better, hit reply and say hello. Be honest!