Merge vs Rebase in Git: Data Engineer’s Guide to Choosing the Right Workflow
Learn the inner workings of merge and rebase in Git, uncover their key differences, and get actionable advice on when to use each.
Greetings, curious reader,
When working on a branch in Git, you’ll often need to pull updates from main into your branch. This process ensures your branch has the latest changes and avoids conflicts when merging your work into main. At this point, you’re faced with two common options: merge or rebase.
Both approaches are valid, but they serve different purposes. For beginners, the decision between merge and rebase can be confusing.
Merge adds a new commit to combine histories, while rebase rewrites history to create a linear sequence. Your choice will impact your branch’s commit history and how you collaborate with others on a project.
In this article, I will explain how merge and rebase work under the hood, provide examples, and help you make an informed choice. Along the way, you’ll better understand Git’s internals, including how branches, commits, and pointers work together.
🔀 Option #1: Merge
📘 What Is Merge?
Merge is a method of integrating changes from one branch into another.
When you merge main into your current branch, Git combines the changes from both branches and creates a new merge commit to represent this operation.
This new merge commit acts as a “bridge” that connects the histories of the two branches.
Merge does not modify the commits in either branch. Instead, it keeps the history of both branches intact, maintaining a clear record of all changes.
As a result, your Git history will reflect both the original commits from main and the new merge commit that brings the two branches together.
Merge is Git’s default and most commonly used method for integrating changes, as it is straightforward, safe, and doesn’t alter existing commits.
⚙️ How Merge Works Internally in Git
To understand merge, it’s helpful to look at Git’s internals. In Git, a branch is just a pointer to the latest commit in a series of commits. Each commit is a snapshot of your project that points to its parent commit.
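If you want to see the “branch is just a pointer” idea for yourself, here is a minimal sketch. It assumes a throwaway repository in a temporary directory; the file name pipeline.sql and the demo identity are made up for illustration. Save it as a script rather than pasting it into an interactive shell, because set -e exits on the first error.

```shell
set -e
repo=$(mktemp -d)                        # throwaway directory for the demo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is called main
git config user.name "Demo"              # hypothetical identity, local to this repo
git config user.email "demo@example.com"

echo "v1" > pipeline.sql
git add pipeline.sql
git commit -q -m "Base commit on main"
echo "v2" >> pipeline.sql
git commit -q -am "Second commit on main"

# A branch name resolves to exactly one commit ID: the tip of the branch.
git rev-parse main

# The raw commit object names its snapshot (tree) and its parent commit.
git cat-file -p main

# Creating a branch only adds another pointer to the same commit.
git branch feature-branch
git rev-parse feature-branch
```

Both rev-parse calls print the same ID, because no new commit was created when the branch was added.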
Imagine you’re working on a branch called feature-branch, which started from main. While you’ve been working, the main branch has moved forward with new commits. Now, you want to pull those updates into feature-branch.
Here’s what happens when you run git merge main:
1. Git Finds the Merge Base: The merge base is the last shared commit between feature-branch and main. This is the point where the two branches diverged. Git uses this as a reference to calculate the differences between the branches.
2. Compares Changes: Git looks at the changes introduced on main since the merge base and the changes made on feature-branch.
3. Creates a Merge Commit: Git combines the changes from both branches and creates a new merge commit on feature-branch. This commit ties the histories of main and feature-branch together.
The merge commit acts as a “checkpoint” that records the merge event. Both branches’ histories are preserved, and you can see precisely where the merge occurred.
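The three steps above can be sketched with plain Git commands. This is an illustration under assumptions, not a view into Git’s actual implementation: the repository is a throwaway one, and the file names and demo identity are invented.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is called main
git config user.name "Demo"; git config user.email "demo@example.com"

echo base > base.txt; git add base.txt; git commit -q -m "Base commit on main"
base=$(git rev-parse HEAD)

git checkout -q -b feature-branch
echo feature > feature.txt; git add feature.txt; git commit -q -m "Commit on feature-branch"

git checkout -q main
echo main > main.txt; git add main.txt; git commit -q -m "Commit on main"

git checkout -q feature-branch

# Step 1: the merge base is the last commit both branches share.
mb=$(git merge-base feature-branch main)
echo "merge base: $mb"

# Step 2: what each side changed since the merge base.
git diff --stat "$mb" main
git diff --stat "$mb" feature-branch

# Step 3: combine both sides in a new merge commit on feature-branch.
git merge -q --no-edit main

# The merge commit has two parents: the old feature-branch tip and the tip of main.
git log -1 --format='parents: %P'
```

The two parent IDs on the last line are what make the merge commit a “bridge” between the histories.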
✨ Why Use Merge?
The primary advantage of merge is that it preserves the full history of all changes. You can trace every commit in both branches, including the merge commit. This makes it ideal for collaborative projects where it’s important to keep a record of who contributed what and when.
Merge is also beginner-friendly because it doesn’t rewrite history. Your existing commits remain untouched, and you don’t have to worry about accidentally overwriting changes. This makes merge a safe and predictable option for pulling changes into your branch.
When I introduced Git at my previous workplace years ago, we used this approach. It was much easier, as everybody except me was a first-time user.
However, merge does have a drawback: it creates extra merge commits. And this can clutter your Git history, especially in projects with frequent updates.
While this doesn’t affect functionality, it can make the history harder to read and follow.
🚦 Example Use Case: Pulling Changes with Merge
Let’s say you’re working on feature-branch, which is two commits ahead of main. Meanwhile, main has received two new commits that you want to pull into your branch.
1. Switch to Your Branch: Run git checkout feature-branch to ensure you’re on the branch you want to update.
2. Merge main into Your Branch: Run git merge main. Git will find the merge base, compare the changes, and create a new merge commit.
3. Resolve Any Conflicts: If the two branches contain conflicting changes, Git will pause the merge and ask you to resolve the conflicts manually. Once resolved, stage the files with git add and run git commit to finalise the merge.
After the merge, your branch will include all the changes from main as well as a new merge commit. The history will clearly show the integration point.
* Merge commit (new)
|\
* | Commit on main
| * Commit on feature-branch
* | Commit on main
| * Commit on feature-branch
|/
* Base commit on main
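The conflict step is the one that usually worries first-time users, so here is a small sketch of it. I’m assuming a throwaway repository where both branches edit the same line of a made-up config.txt; the values 10, 20, 30, and 25 are arbitrary.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is called main
git config user.name "Demo"; git config user.email "demo@example.com"

echo "threshold = 10" > config.txt
git add config.txt; git commit -q -m "Base commit on main"

git checkout -q -b feature-branch
echo "threshold = 20" > config.txt
git commit -q -am "Commit on feature-branch"

git checkout -q main
echo "threshold = 30" > config.txt
git commit -q -am "Commit on main"

git checkout -q feature-branch

# Both branches changed the same line, so Git pauses the merge.
git merge --no-edit main || echo "Merge paused: resolve the conflict"

# config.txt now contains conflict markers (<<<<<<<, =======, >>>>>>>).
# Resolve by writing the content you actually want, then stage the file.
echo "threshold = 25" > config.txt
git add config.txt

# Finalise the merge; --no-edit keeps the default merge message.
git commit -q --no-edit

git log --graph --oneline
```

If you change your mind halfway through, git merge --abort returns the branch to its state before the merge started.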
🔄 Option #2: Rebase
📘 What Is Rebase?
Rebase is another method for pulling changes from one branch into another. Unlike merge, rebase rewrites the history of your branch to create a linear sequence of commits.
When you rebase feature-branch onto main, Git re-applies your commits from feature-branch on top of the tip of main and then points feature-branch at the last re-applied commit. This process creates new versions of your commits, effectively rewriting your branch’s history.
The result is a clean, linear history that looks like your commits were made after the latest changes in main. I usually say it’s like a time machine.
This can make the history more straightforward to read and debug, but it comes at the cost of rewriting commits.
⚙️ How Rebase Works Internally in Git
Let’s break down what happens inside Git during a rebase:
1. Git Sets Your Commits Aside: Imagine feature-branch has two commits, and main has two new commits. When you run git rebase main, Git works out which commits exist on feature-branch but not on main and records them in a to-do list inside the .git directory. I like to think of this as a “rebase buffer”, although that is my own label rather than a Git term. The original commits are not deleted.
2. Moves to the New Base: Git checks out the tip of main. At this point, the history you are standing on contains only the commits from main.
3. Re-applies Your Commits: Git takes the recorded commits and re-applies them one by one on top of the new base. These re-applied commits are technically new commits with new commit IDs. Once the last one is applied, Git points feature-branch at it.
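To check the “new commits with new commit IDs” claim, you can compare the branch tip before and after a rebase. A minimal sketch, assuming a throwaway repository with invented file names:

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is called main
git config user.name "Demo"; git config user.email "demo@example.com"

echo base > base.txt; git add base.txt; git commit -q -m "Base commit on main"

git checkout -q -b feature-branch
echo one > one.txt; git add one.txt; git commit -q -m "Feature commit 1"
echo two > two.txt; git add two.txt; git commit -q -m "Feature commit 2"

git checkout -q main
echo main > main.txt; git add main.txt; git commit -q -m "Commit on main"

git checkout -q feature-branch
old_tip=$(git rev-parse HEAD)

git rebase -q main

new_tip=$(git rev-parse HEAD)
echo "tip before rebase: $old_tip"
echo "tip after rebase:  $new_tip"

# In this simple case ORIG_HEAD still points at the pre-rebase tip,
# so the original commits have not been thrown away.
git rev-parse ORIG_HEAD

# The history is linear: main's commits first, then the re-applied ones.
git log --graph --oneline
```

The two tips differ even though the file contents of the feature commits are unchanged, because each re-applied commit has a different parent.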
✨ Why Use Rebase?
The most significant advantage of rebase is the clean, linear history it produces. This makes it easier to follow the progression of changes, especially in long-running projects.
A linear history is also helpful during debugging, as it eliminates the need to account for merge commits. This is the main reason I prefer and usually recommend this flow.
However, rebase is more complex than merge and comes with risks. For example, if the branch is shared with others, rewriting its history leaves their copies out of sync with yours, and a forced push can overwrite their work.
Rebase should only be used on branches where you work alone.
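Here is a sketch of why this matters once a branch has been pushed. It assumes a local bare repository standing in for your real remote, and it is only meant for a branch that nobody else uses. The paths and file names are made up.

```shell
set -e
work=$(mktemp -d)
git init -q --bare "$work/remote.git"    # stand-in for a hosted remote
git init -q "$work/local"; cd "$work/local"
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is called main
git config user.name "Demo"; git config user.email "demo@example.com"
git remote add origin "$work/remote.git"

echo base > base.txt; git add base.txt; git commit -q -m "Base commit on main"
git push -q origin main

git checkout -q -b feature-branch
echo feature > feature.txt; git add feature.txt; git commit -q -m "Commit on feature-branch"
git push -q origin feature-branch

git checkout -q main
echo main > main.txt; git add main.txt; git commit -q -m "Commit on main"

git checkout -q feature-branch
git rebase -q main

# The remote still holds the old commit, so a plain push is refused.
git push -q origin feature-branch 2>/dev/null || echo "Push rejected: history was rewritten"

# --force-with-lease overwrites the remote branch only if it is still
# where your last fetch or push left it.
git push -q --force-with-lease origin feature-branch
```

Anyone who had already pulled the old feature-branch would now have commits that no longer exist on the remote, which is exactly the inconsistency described above.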
🚦 Example Use Case: Pulling Changes with Rebase
Imagine you’re working on feature-branch and want to incorporate the latest changes from main.
1. Switch to Your Branch: Run git checkout feature-branch.
2. Rebase Your Branch onto main: Run git rebase main. Git will set your commits aside, move to the tip of main, and re-apply your commits on top of the new base.
3. Resolve Any Conflicts: If there are conflicts, Git will pause the rebase and prompt you to resolve them. Once resolved, stage the files with git add and run git rebase --continue to carry on. Because commits are re-applied one at a time, you may need to do this more than once.
After the rebase, your branch history will look like your commits were created after the latest changes in main.
* Commit on feature-branch (re-applied)
* Commit on feature-branch (re-applied)
* Commit on main
* Commit on main
* Base commit on main
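Conflicts during a rebase work a little differently from merge conflicts, so here is a small sketch. I’m assuming a throwaway repository where both branches edit the same line of a made-up config.txt; the values are arbitrary. GIT_EDITOR=true is only there so the script doesn’t open an editor for the commit message.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is called main
git config user.name "Demo"; git config user.email "demo@example.com"

echo "threshold = 10" > config.txt
git add config.txt; git commit -q -m "Base commit on main"

git checkout -q -b feature-branch
echo "threshold = 20" > config.txt
git commit -q -am "Commit on feature-branch"

git checkout -q main
echo "threshold = 30" > config.txt
git commit -q -am "Commit on main"

git checkout -q feature-branch

# Re-applying the feature commit on top of main hits the same line.
git rebase main || echo "Rebase paused: resolve the conflict"

# Resolve the conflict, stage the file, and let the rebase carry on.
echo "threshold = 25" > config.txt
git add config.txt
GIT_EDITOR=true git rebase --continue

# One straight line of commits, with no merge commit.
git log --graph --oneline
```

If the rebase gets messy, git rebase --abort puts the branch back where it was before you started.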
⚖️ Side-by-Side Comparison
📜 Commit History
Merge preserves all branch histories and creates a new merge commit, while rebase rewrites the branch history to make it linear. Merge is better for preserving context, and rebase is better for simplicity.
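To see the two histories next to each other, you can apply both strategies to copies of the same branch. This is a sketch under assumptions: the repository is a throwaway one, commit_file is a small helper I made up for the demo, and feature-merged and feature-rebased are hypothetical branch names.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is called main
git config user.name "Demo"; git config user.email "demo@example.com"

# Hypothetical helper: create one file and commit it.
commit_file() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$2"
}

commit_file base "Base commit on main"

git checkout -q -b feature-branch
commit_file feature1 "Commit on feature-branch"
commit_file feature2 "Commit on feature-branch"

git checkout -q main
commit_file main1 "Commit on main"
commit_file main2 "Commit on main"

# Two copies of the same starting point, one per strategy.
git branch feature-merged feature-branch
git branch feature-rebased feature-branch

git checkout -q feature-merged
git merge -q --no-edit main

git checkout -q feature-rebased
git rebase -q main

echo "--- merge ---"
git log --graph --oneline feature-merged
echo "--- rebase ---"
git log --graph --oneline feature-rebased

# The resulting files are identical; only the history differs.
git diff --stat feature-merged feature-rebased
```

The final diff prints nothing. The merged branch has six commits including one merge commit, while the rebased branch has five commits in a straight line.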
😵‍💫 Complexity
Merge is easier to use and safer for beginners because it doesn’t modify existing commits. Rebase is more advanced and requires caution, especially when resolving conflicts.
🆚 Choosing the Right Workflow
Use merge to preserve all history and avoid rewriting commits. This is the default and safest option, especially for beginners.
On the other hand, if you prefer a clean, linear history and want long-term maintainability, use rebase. It’s ideal for teams who rely on traceability.
💭 Final Thoughts
When choosing between merge and rebase, remember that tools are secondary. What matters are the people using them and the problems they solve together.
In data engineering, collaboration is crucial. The workflow that keeps your team aligned and productive will always outweigh technical preferences.
And remember, your projects require effective communication, clear workflows, and trust among team members. Whether your team uses merge or rebase doesn’t matter as much as whether everyone understands the workflow and feels confident using it.
A consistent, team-wide Git strategy will reliably deliver better results than individual optimisation.
Merge and rebase also reflect a broader reality in data engineering: tools come and go, but fundamentals last. Git, like the databases and orchestration frameworks we use, is just one part of the data engineering toolkit.
Its workflows teach important lessons about collaboration, consistency, and trade-offs. These lessons apply to version control and every part of the data engineering process.
While the field of data engineering continues to evolve—with the rise of real-time processing, data mesh, and AI-driven automation—the need for solid fundamentals remains constant.
Knowing how to collaborate, manage complexity, and deliver reliable systems will always be essential.
Merge and rebase may seem like small decisions, but they reflect these deeper skills every data engineer must master to be effective.
Ultimately, Git workflows enable teams to work together and deliver value.
Choose the best approach for your team, experiment to refine it, and stay focused on the bigger picture: building systems that empower others to make better decisions with data.
🏁 Summary
When pulling changes into your branch, choosing between merge and rebase involves understanding their behaviour, trade-offs, and how they fit into your team’s workflow.
Merge is simple, safe, and preserves a detailed history, while rebase provides a clean, linear history at the cost of rewriting commits.
Both approaches are valuable, but their effectiveness depends on the context in which they’re used.
Key Points to Remember
Use Merge:
- When working on shared branches or collaborative projects.
- If you want to preserve a detailed history for tracking all changes.
- When you prefer a more straightforward, safer workflow, especially for beginners.
Use Rebase:
- When working on private feature branches that haven’t been shared.
- To create a polished, linear history that’s easier to read and debug.
- If you’re confident resolving conflicts multiple times and comfortable with Git internals.
Both merge and rebase are powerful tools with different purposes. The best choice depends on your team’s preferences, project requirements, and confidence in Git.
Start experimenting with both on small projects and work towards mastering them over time.
Until next time,
Yordan