How to Organise Your dbt Projects Better And Boost Domain Expertise
Learn how to apply screaming architecture to make more sense of your dbt models
Data modelling is intricate. You can get lost in any dbt project, even more so without documentation or people who can explain the business object dynamics. By placing your files in the right place, you can make business objects and processes first-class citizens. This doesn’t just make debugging easier; it also helps you understand the business domain.
In this week’s issue of Data Gibberish, we discuss:
What is wrong with your file organisation in your dbt projects today.
What screaming architecture is and why it’s superior to the standard dbt structure.
Reading time: 5 minutes
But before that, let me share three fabulous pieces of knowledge I read last week.
Picks Of The Week
Some of the brightest people I’ve been working with have no CS degree. This article outlines why you can be a successful engineer without a degree. (link)
A couple of weeks ago, I wrote about how we’ve been unit testing dbt for the last 4 years. Last Wednesday, an extensive article came out about the unit testing framework dbt Labs introduced last year. (link)
Let’s welcome the industry veteran and former MongoDB CTO, Mark Porter, as the new dbt Labs CTO. I’m keen to see what this change will lead to. (link)
Now, let’s jump to the main topic this week.
You Are Doing It Wrong
Suppose your company has two products: A and B. You need to build a unified view showcasing all users, payments and movements across the organisation’s products. If you follow the community standards, your dbt project will probably look like this:
Figure: Community standards to structure dbt projects
In this standard file organisation, your models directory has a base directory containing separate subdirectories for each product. Then, in each product_a and product_b directory, you can have even more subfolders where you transform and clean the data for each base object.
You also have a staging directory with some more advanced business logic. This is usually where you match and join users from different products into a single users model.
Finally, there is a separate folder where you store various models, split into directories dedicated to departments.
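To make the description above concrete, here is a minimal Python sketch of such a layer-first layout. Only base, staging, product_a, product_b and users come from the text. The file names, the payments objects, the finance department and the marts folder name are assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical layer-first layout. File names, "payments", "finance" and
# "marts" are assumed; the article does not name the department folder.
LAYER_FIRST = [
    "models/base/product_a/users/base_product_a__users.sql",
    "models/base/product_a/payments/base_product_a__payments.sql",
    "models/base/product_b/users/base_product_b__users.sql",
    "models/base/product_b/payments/base_product_b__payments.sql",
    "models/staging/users.sql",
    "models/staging/payments.sql",
    "models/marts/finance/revenue.sql",
]


def top_level_dirs(paths):
    """Return the directory names directly under models/, in first-seen order."""
    seen = []
    for path in paths:
        name = PurePosixPath(path).parts[1]  # parts[0] is "models"
        if name not in seen:
            seen.append(name)
    return seen


print(top_level_dirs(LAYER_FIRST))  # ['base', 'staging', 'marts']
```

The first thing a reader sees under models is a list of technical layers. None of the names at that level says anything about users, payments or finance.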
Although it makes some sense, this file organisation is horrendous. Here are some reasons:
Led by technology: Your top-level base and staging directories mean nothing to the business. You let your technology lead your way.
No problem visibility: When technology leads your code, your code tells nothing about the problem it solves. Neither new people nor you can easily understand what magic you are making, not to mention where it happens.
Challenging maintenance: You’ll find it difficult to support your project. Even if you know which files you need to change, you still need to navigate the whole project for every change.
Now you may think:
Everybody structures their dbt projects that way. Are you saying everybody is wrong?
Well, no, but actually, yes. The good news is it’s elementary to fix that, address the issues I outlined, and even get some more benefits. Let me show you!
Introducing Screaming Architecture
In 2011, Uncle Bob Martin introduced a concept called Screaming Architecture. In his article, Uncle Bob gives you an example of a building blueprint. When you look at the blueprint, you can immediately tell whether the building is a house, a library or something else. The blueprint screams the purpose of the project.
Unlike a building’s blueprint, your project’s structure says nothing about its intent. Instead, your architecture screams about layers, underlying sources and technical processes.
But don’t worry. You can borrow civil engineering practices and apply them in analytics engineering. You don’t need to touch your code; you only need to move some files around.
You need just a second to realise this dbt project has user and payment objects. You can also immediately tell there’s something for the finance department. Sure, you still have the base directory, but you can quickly tell this is where you store the logic for the user sources.
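As a rough illustration of "moving some files around", the Python sketch below maps a layer-first path to a domain-first one. It is only a sketch under assumptions: the to_domain_first helper, the file names, the plural users folder and the marts folder are hypothetical, and the target layout is one possible reading of the structure described above, not the author's exact project.

```python
from pathlib import PurePosixPath


def to_domain_first(path: str) -> str:
    """Map a hypothetical layer-first model path to a domain-first path."""
    parts = PurePosixPath(path).parts
    root, layer, rest = parts[0], parts[1], parts[2:]

    if layer == "base":
        # base/<product>/<object>/<file> -> <object>/base/<product>/<file>
        product, obj, filename = rest
        return str(PurePosixPath(root, obj, "base", product, filename))

    if layer == "staging":
        # staging/<object>.sql -> <object>/<object>.sql
        (filename,) = rest
        obj = PurePosixPath(filename).stem
        return str(PurePosixPath(root, obj, filename))

    # Assumed department layer: marts/<department>/<file> -> <department>/<file>
    return str(PurePosixPath(root, *rest))


# Hypothetical file names, used only to show the mapping.
examples = [
    "models/base/product_a/users/base_product_a__users.sql",
    "models/base/product_b/users/base_product_b__users.sql",
    "models/staging/users.sql",
    "models/marts/finance/revenue.sql",
]

for old in examples:
    print(old, "->", to_domain_first(old))
```

After the move, the top level under models lists business objects and departments, and the base folder sits inside the object it belongs to. The SQL inside each file stays the same. Note that dbt applies folder-level configuration by path in dbt_project.yml, so any such configuration would need to follow the files.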
And if that is not enough, screaming architecture doesn’t only address the problems we discussed. This organisation also increases scalability and domain understanding. It extends what you already know about SOLID principles.
Now, suppose your company acquires another, smaller product. Even if you know nothing about the organisation, you can easily understand how the users of products A and B relate to each other. You quickly become a domain expert, and including the users from the new product is a piece of cake.
By now, you should be pretty clear about the benefits of screaming architecture and how to apply it in your dbt projects. Let’s summarise.
Summary
Standard dbt architecture distances you from the domain and puts the technology on a pedestal. It encourages you to turn your code into an onion, with the actual knowledge hidden under too many layers.
You learned how you can address these issues by implementing screaming architecture. Putting business objects on the front line helps you better understand business dynamics. This organisation also helps with scalability, modularity, maintainability, and flexibility.
Ready to level up your code?
Embrace the power of screaming architecture to unlock better business understanding, scalability, and maintainability. Implement it today!
How Did You Find This Post?
Did you enjoy that piece? Follow me on LinkedIn for daily updates.