n247s

Reputation: 1918

git graceful rebase/complete merge to branch

Good day,

I am currently trying to figure out how I can 'overwrite' a branch without using something drastic like 'rebase'.

For example:

branch::master
|- dir_a
|  |- file_a
|  |- file_b
|- file_c

branch::dev
|- dir_a
|  |- file_b
|- file_c        [edited compared to branch::master]


branch::master   [after merge from branch::dev with strategy 'ours' (keep changes from branch::dev)]
|- dir_a
|  |- file_a     [is also merged, but should be deleted]
|  |- file_b
|- file_c        [edited version from branch::dev]

To fix the above problem, I could do a 'rebase', but that would break lots of stuff, since the master (head) branch is not a private/personal one.

So the question is: is there a good way to completely overwrite/replace the content of the master branch so that dependencies on the current master branch are not broken?


information that may influence the possibilities:

If any additional clarification is needed, feel free to ask. Thanks for thinking along!

Upvotes: 0

Views: 129

Answers (2)

knittl

Reputation: 265131

If you want to record a merge without actually combining any content, you can join the two histories by selecting the "ours" merge strategy:

git checkout dev
git merge -s ours master

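Your history will record a merge of master into dev, and the history of both branches will stay intact, but the file contents will look as if master's changes never existed (they only reflect dev's content). Because the merge commit is made on dev, master itself only changes once you fast-forward it to that commit.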
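A minimal sketch of this on the layout from the question, run in a throwaway repository. The file contents, commit messages, and identity are placeholders, and the final fast-forward of master is an assumed follow-up step, not something stated above.

```shell
set -eu
cd "$(mktemp -d)"                          # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# Starting point shared by both branches
mkdir dir_a
echo a > dir_a/file_a
echo b > dir_a/file_b
echo c > file_c
git add .
git commit -q -m "base"

# dev: file_a removed, file_c edited
git checkout -q -b dev
git rm -q dir_a/file_a
echo "c edited" > file_c
git commit -q -am "dev: drop file_a, edit file_c"

# master moves on independently
git checkout -q master
echo "b changed on master" > dir_a/file_b
git commit -q -am "master: edit file_b"

# Join the histories while keeping dev's content untouched
git checkout -q dev
dev_tree=$(git rev-parse 'dev^{tree}')
git merge -q -s ours -m "merge master, keep dev content" master

# master is an ancestor of the new merge commit, so this is a plain fast-forward
git checkout -q master
git merge -q --ff-only dev
```

After the fast-forward, master has only gained commits, and its files are exactly what dev had before the merge.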

Upvotes: 0

torek

Reputation: 487745

Is there a good way to completely overwrite/replace the content of the master branch ...

There is more than one, but before you pick any, you need to be clear on what you mean by branch and content.

The thing to keep in mind about branches is that each branch name simply says my latest commit is _____ (fill in the blank with a raw commit hash ID). The commits that are "on" the branch depend on that commit.

Every commit has a unique hash ID. That hash ID means that commit, and not any other commit. Once a commit is made, no part of it can ever change. All commits are totally frozen for all time. Commits are mostly permanent, too: at most, you can stop using a commit, and take away the means of finding that commit, as we'll see in a moment.

Each commit holds two things: its main data, a full and complete snapshot of all files, and its metadata, which is information about the commit. The metadata includes the name and email address of the person who made it, and the log message explaining why they made that commit. But most important of all, at least for Git, is this: When you, or anyone, make a new commit, Git records the raw hash ID of its immediate parent commit. That hash ID is the hash ID of the commit you had checked out just a moment ago, which was the last commit in the branch. Again, no part of this data (snapshot) or metadata (including parent hash ID) can ever change from here on out, so this commit always remembers the previous branch-tip commit.

Hence, the "content of a branch" can be:

  • the raw hash ID of the last commit in the branch; or
  • that hash ID and hence that commit, plus the hash ID stored in that commit and hence the previous commit, plus another hash ID stored in that commit and hence another previous commit, plus ...
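To see both readings in a real repository, this sketch builds three commits in a throwaway directory and then follows the links by hand. The file name, messages, and identity are placeholders.

```shell
set -eu
cd "$(mktemp -d)"                          # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

for n in 1 2 3; do
    echo "version $n" > file.txt
    git add file.txt
    git commit -q -m "commit $n"
done

tip=$(git rev-parse master)       # the single hash ID that the branch name stores
git cat-file -p "$tip"            # tree (the snapshot), parent, author, committer, message
parent=$(git rev-parse "$tip^")   # the parent hash ID recorded inside the tip commit
git rev-list master               # the whole chain, found by walking parent links backwards
```

The first reading of "content" is just `$tip`; the second is everything `git rev-list` prints.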

If we draw these commits—which requires making assumptions about their hash IDs; here I'll just replace each actual hash ID with single uppercase letters—we get this kind of picture:

             I--J   <-- master
            /
...--F--G--H
            \
             K--L   <-- dev

That is, the name master holds the raw hash ID of commit J (whatever it really is). Commit J holds the raw hash ID of commit I, which holds the hash ID of commit H. Meanwhile the name dev holds the hash ID of commit L, which holds that of K, which holds that of H.

Note that commits up through H are on both branches.

In this situation, the branches join up at H, at least, when viewed the way Git views them: starting at the end and working backwards, one commit at a time.


[1] Technically, the index contains the files' names, their mode (+x or -x), and a reference to the frozen-format content.


How branches typically grow

To make a new commit, you might start with git checkout name. Let's say the name in this case is master. This finds the commit to which the name points—in this case, J—and copies the read-only contents of the commit out of the commit, into Git's index and into your work-tree. The work-tree is where you can see and edit your files. This also attaches the special name HEAD to the name master:

             I--J   <-- master (HEAD)
            /
...--F--G--H
            \
             K--L   <-- dev

Git can now see that the current branch is master (by seeing where HEAD is attached) and the current commit is J (by seeing where master points). Your work-tree now has the files that you work with, which are ordinary files, not freeze-dried (compressed and Git-only) committed files, and in Git's index there are copies[1] of the freeze-dried files from J, ready to go into a new commit.

You can now modify the work-tree all you like. When you're done, you can run git add on various files. This copies each added file's content back into Git's index, squishing it down to the freeze-dried form that Git will store in a commit, replacing the previous copy. Or, you can git add a new file, or git rm an existing file; either way, Git updates the index accordingly.

Then, you run git commit. Git simply packages up whatever is in the index, adds your name and the current time, adds your log message, and writes that out as a new commit. The new commit's parent is the current commit and the hash ID is a unique checksum of the new commit's content, which can no longer ever be changed. We'll call this new commit N (skipping over M). N points back to J:

                  N
                 /
             I--J   <-- master (HEAD)
            /
...--F--G--H
            \
             K--L   <-- dev

but now we get the trick that makes branches useful: Git now writes the new commit's hash ID into the branch name. So we now have:

             I--J--N   <-- master (HEAD)
            /
...--F--G--H
            \
             K--L   <-- dev

Note that HEAD has not changed: it's still attached to master.
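The edit, add, commit cycle can be watched directly. This is a small sketch in a throwaway repository; the file name and identity are placeholders, and the shell variables J and N simply mirror the letters used in the drawings.

```shell
set -eu
cd "$(mktemp -d)"                          # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "first" > file.txt
git add file.txt
git commit -q -m "J"
J=$(git rev-parse master)         # where the name master points before the new commit

git symbolic-ref --short HEAD     # prints: master

echo "second" > file.txt          # edit the work-tree copy
git add file.txt                  # copy the new content into the index
git commit -q -m "N"              # package the index up as a new commit
N=$(git rev-parse master)         # the name master now holds the new hash ID

git symbolic-ref --short HEAD     # still master: HEAD is attached to the same name
git rev-parse "$N^"               # prints J's hash ID: N points back to J
```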

Now let's get rid of commit N. It won't actually go away—it will still be in our repository—but we'll arrange for the name master to identify commit J again. To do that, we find the actual hash ID of J, or any proxy for it, and use git reset --hard:

git reset --hard <hash-of-J>

Now we have:

                  N   [abandoned]
                 /
             I--J   <-- master (HEAD)
            /
...--F--G--H
            \
             K--L   <-- dev

If you do this in your repository, without using git push to send new commit N to some other Git repository, only you will have commit N. No one else will ever know you did this!
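A sketch of the same reset in a throwaway repository, with placeholder file names and identity. The reflog lookup at the end is one way to find the abandoned hash ID again; the text above does not prescribe it.

```shell
set -eu
cd "$(mktemp -d)"                          # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "first" > file.txt
git add file.txt
git commit -q -m "J"
J=$(git rev-parse HEAD)

echo "second" > file.txt
git commit -q -am "N"
N=$(git rev-parse HEAD)

git reset -q --hard "$J"          # master names J again; index and work-tree match J

git rev-parse master              # J's hash ID
git cat-file -t "$N"              # prints: commit (N is still stored in the repository)
git reflog -n 2                   # the reflog remembers where HEAD was before the reset
git rev-parse 'HEAD@{1}'          # N's hash ID, recovered from the reflog
```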

Using git reset arbitrarily

You can, with git reset, move any branch to any commit. But if you move a branch "backwards", like we just did, the commits that "fall off the end" become hard to find. Suppose you forced the name master to point to commit L. How, then, would you find commit J? What about commit I? They seem to be gone!

This could be what you want. But if some other Git repository already has commits I and J, perhaps someone using that repository has built new commits that link back to J. They won't appreciate you asking them to forget all their commits, along with I and J.

If it's OK to make everyone else forget I and J, you can just reset master to point to L. You then have to use git push --force or equivalent to convince some other Git repository to forget I and J (and maybe additional commits) too, and everyone else who's using a clone of this repository needs to make sure their Git forgets I and J, and so on.

This is the fastest and easiest way to make master match dev, but it is also the most disruptive to everyone else. Git "likes" it when branches grow and "dislikes" it when they lose commits.
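To make the disruption concrete, here is a sketch with a local bare repository standing in for the shared one. The directory layout, file name, and identity are placeholders chosen for the example.

```shell
set -eu
work=$(mktemp -d)
remote="$work/remote.git"
git init -q --bare "$remote"               # stands in for the shared repository
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
git remote add origin "$remote"

echo base > file.txt
git add file.txt
git commit -q -m "H"
git branch dev

echo master > file.txt
git commit -q -am "J"
git push -q origin master                  # the shared repository now has J

git checkout -q dev
echo dev > file.txt
git commit -q -am "L"

git checkout -q master
git reset -q --hard dev                    # master now names L; J has fallen off the end

# A plain push is refused, because it would make the remote lose J.
if git push -q origin master 2>/dev/null; then
    echo "unexpected: plain push succeeded"
else
    echo "plain push rejected"
fi

git push -q --force origin master          # tell the remote to forget J as well
```

Anyone who cloned before the forced push still has J on their master and has to reset or rebase their own work to match.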

How merge works (abbreviated)

Let's look now at what happens if you run git merge dev (with master pointing to J again). Git needs, for this merge operation, three input commits. One is your current commit, which merge calls ours. One is any other commit of your choice, which merge calls theirs. The third one—which in some sense is the first one; at least, it gets number 1 in a moment—Git finds on its own.

By using git merge dev, you selected commit L as the theirs commit. That's because the name dev selects commit L. Git now works backwards from both branch tips to find the best shared commit, which in this case is clearly commit H. Git calls this the merge base.

Remember that each commit holds a snapshot. So commit H has a snapshot, and commit J has one, and commit L has one. Git can, at any time, compare any two snapshots to see what is different. The command that does this for you is git diff. The merge operation wants to know what is different, so it runs the internal equivalent of:

git diff --find-renames <hash-of-H> <hash-of-J>   # what we changed on master
git diff --find-renames <hash-of-H> <hash-of-L>   # what they changed on dev

Git can now combine these two sets of changes, and apply the combined changes to the snapshot in H (not J, not L, but H: the merge base). If we added a file and they didn't, Git adds the file. If we deleted a file and they didn't, Git deletes the file. If they changed a file and we didn't, Git changes the file.
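The merge base and the two diffs can be inspected by hand before merging. In this sketch the file names, contents, and identity are placeholders; H, J, and L correspond to the letters in the drawings.

```shell
set -eu
cd "$(mktemp -d)"                          # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "line" > shared.txt
git add shared.txt
git commit -q -m "H"
H=$(git rev-parse HEAD)
git branch dev

echo "ours" > ours.txt                     # we add a file on master
git add ours.txt
git commit -q -m "J: add ours.txt"

git checkout -q dev
echo "line changed by them" > shared.txt   # they change a file on dev
git commit -q -am "L: edit shared.txt"
git checkout -q master

base=$(git merge-base master dev)          # the commit Git finds on its own: H
git diff --find-renames --name-status "$base" master   # what we changed
git diff --find-renames --name-status "$base" dev      # what they changed
```

The first diff lists only the added ours.txt and the second only the modified shared.txt, so combining them touches each file once and nothing conflicts.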

The result, if all goes well, is the combination of our changes and their changes. Git will stuff this result back into the index—which is the place from which Git makes commits—while also updating our work-tree so that we can see what it did. Then, since all went well, Git makes a new commit.

The new commit, which we'll call M for merge, has not one but two parents. The first parent is the same as usual: M links back to existing commit J, which is where master was a moment ago. But Git adds commit L, the one we chose to merge, as a second parent. So now the picture is:[2]

             I--J
            /    \
...--F--G--H      M   <-- master (HEAD)
            \    /
             K--L   <-- dev

Commit M has a snapshot. Its snapshot was made by Git (rather than by you), which combined changes from the common starting point—the merge base—and applied the combined changes to the merge-base snapshot.
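A sketch of such a conflict-free merge, showing both the two parents and the combined snapshot. File names, contents, and identity are placeholders.

```shell
set -eu
cd "$(mktemp -d)"                          # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "line" > shared.txt
git add shared.txt
git commit -q -m "H"
git branch dev

echo "ours" > ours.txt
git add ours.txt
git commit -q -m "J: add ours.txt"
J=$(git rev-parse HEAD)

git checkout -q dev
echo "line changed by them" > shared.txt
git commit -q -am "L: edit shared.txt"
L=$(git rev-parse HEAD)

git checkout -q master
git merge -q --no-edit dev                 # no conflicts, so Git makes M on its own

git log -1 --format=%P                     # two parent hash IDs: J first, then L
cat shared.txt                             # their change was applied
ls ours.txt                                # our added file is still there
```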

If you get merge conflicts, the index takes on an expanded role and helps with resolving the conflicts. It becomes up to you, not Git, to determine the final snapshot in ultimate merge commit M. But if there aren't any merge conflicts, Git will normally make the merge on its own. Normally is an important word here.


[2] Commit N still exists, but with no way to find it, you can't see it any more, and we don't need to bother to draw it in. Eventually, Git will remove it entirely, typically some time after at least 30 days have passed. Until then, you can get it back if you want it: you just have to find its hash ID.


That's a normal merge; you may want an overwriting merge

Suppose that you could tell Git: Start the merge, but don't make merge commit M yet. Git would find the merge base as usual, combine your changes and their changes as usual, and update your work-tree and Git's index as usual ... and then stop, without making the commit.

You can do exactly that, using git merge --no-commit. Once you have done that, you can replace the merge result with whatever you like. Put any file you like into your work-tree, and use git add to have Git copy it into the index. The index copy, which is what will go into the new merge commit, now matches whatever file you put in the work-tree.

You can add new files and remove files entirely here. Whatever you do, it's up to you: you control everything at this point. You can make the merge have any content you want.

What you want (according to your question, anyway; think about whether this is really what you want) is to completely ignore the difference from the merge base H to your commit J, and just take the difference from H to their commit L. Of course, that will match the snapshot in commit L exactly.

There is a fast and easy way, using git merge -s ours, to tell Git that it should completely ignore their branch's changes and just use your version of everything, i.e., to take commit J. Unfortunately, that's the opposite of what you want here: you're asking for a way to completely ignore your branch's changes and just use their version of everything, i.e., to take commit L.

Fortunately, there's a rather magical command[3] you can use that is fast and easy, to get what you want here:

git merge --no-commit dev
git read-tree --reset -u dev

This git read-tree -u command tells Git: Replace the index and my work-tree contents with the files that are stored in the commit I name here. The --reset option tells Git to throw out conflict entries, if the merge generated merge conflicts (and should be harmless if not). So now the index, and your work-tree, match the snapshot in L. Now you can finish the merge:[4]

git merge --continue

This makes new merge commit M from the content stored in the index, which—thanks to the -u option of the git read-tree command—you can also see in your work-tree. So now you have:

             I--J
            /    \
...--F--G--H      M   <-- master (HEAD)
            \    /
             K--L   <-- dev

where the content of commit M exactly matches the content of commit L. The first parent of M is still J, and the second is still L, as they would be for any other merge. But you've "killed off" the changes you had in I and J, while only adding a commit to master. All other Gits will be happy enough to add commit M to their collections of commits.
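The whole sequence can be tried end to end on the layout from the question. This is a sketch in a throwaway repository: the file contents and identity are placeholders, master gets a deliberately conflicting edit so that --reset has something to discard, and GIT_EDITOR=true is only there so the merge message is accepted without opening an editor. It assumes a Git version that has git merge --continue.

```shell
set -eu
cd "$(mktemp -d)"                          # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# H: the layout from the question
mkdir dir_a
echo a > dir_a/file_a
echo b > dir_a/file_b
echo c > file_c
git add .
git commit -q -m "H"
git branch dev

# J on master: an edit that will conflict with dev
echo "c edited on master" > file_c
git commit -q -am "J"
J=$(git rev-parse HEAD)

# L on dev: file_a deleted, file_c edited
git checkout -q dev
git rm -q dir_a/file_a
echo "c edited on dev" > file_c
git commit -q -am "L"
L=$(git rev-parse HEAD)

git checkout -q master
git merge --no-commit dev >/dev/null 2>&1 || true   # stops, here with a conflict in file_c
git read-tree --reset -u dev                        # index and work-tree now match L
GIT_EDITOR=true git merge --continue >/dev/null     # make M with the default message

git diff --quiet master dev && echo "master matches dev"
git log -1 --format=%P                              # J first, then L
```

Here dir_a/file_a is gone from master and file_c carries dev's edit, yet master has only gained one commit on top of J.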


[3] It's not really magic at all. It's a plumbing command, meant for use in scripts, rather than for users. The read-tree command is astonishingly complicated, though: it implements (some of) merging, subtree operations, and all kinds of other cleverness. It's a significant part of the reason that the index, which is otherwise kind of a pain, exists.

[4] If you like, you can run git commit instead. In old Git versions that don't have git merge --continue, you will have to use git commit. It is OK to use git commit directly: all git merge --continue does is check first that there is a merge to finish, then runs git commit for you. Using git merge --continue, you check that you really are finishing a merge.


Consider this option as well

Suppose you don't want a merge. That is, suppose you want to keep the progression:

             I--J   <-- master (HEAD)
            /
...--F--G--H
            \
             K--L   <-- dev

the way it is, but add a new commit O to master, whose parent is J, but whose content matches that of L on dev?

This is easy to do, too. Having freshly done a git checkout master, so that your index and work-tree match J, you now run:

git read-tree -m -u dev

(the -u option is only accepted together with -m, --reset, or --prefix; with no preceding merge there can't be merge conflicts, so -m is enough and --reset should not be required).

This replaces Git's index and your work-tree contents, reading them from commit L, just as we did in the merge example. Now you can run git commit to make O:

             I--J--O   <-- master (HEAD)
            /
...--F--G--H
            \
             K--L   <-- dev

The content of commit O now matches the content of L; the history obtained by starting at master—commit O—and working backwards reads O, then J, then I, then H, and so on.
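A sketch of this non-merge variant on the layout from the question. File contents and identity are placeholders. It passes -m along with -u, since read-tree rejects -u when none of -m, --reset, or --prefix is given.

```shell
set -eu
cd "$(mktemp -d)"                          # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# H: the layout from the question
mkdir dir_a
echo a > dir_a/file_a
echo b > dir_a/file_b
echo c > file_c
git add .
git commit -q -m "H"
git branch dev

# J on master
echo "b edited on master" > dir_a/file_b
git commit -q -am "J"
J=$(git rev-parse HEAD)

# L on dev: file_a deleted, file_c edited
git checkout -q dev
git rm -q dir_a/file_a
echo "c edited on dev" > file_c
git commit -q -am "L"

git checkout -q master                     # index and work-tree match J
git read-tree -m -u dev                    # index and work-tree now match L
git commit -q -m "O: make master match dev"

git log -1 --format=%P                     # a single parent: J
git diff --quiet master dev && echo "master matches dev"
```

Unlike the merge variant, nothing in the history records that O came from dev, so a later git merge dev will still treat H as the merge base.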

(Which of these options do you want? That's up to you.)

Upvotes: 1
