Isaac Joy

Reputation: 95

Remove a file from a git branch but keep it in another?

Recently I was working with someone else in 2 different branches on the same project. I accidentally merged the branch develop, which contained his code, into mine. I then saw his files in my merge request and subsequently learnt the hard way that doing

git rm HIS_FILES

on his unwanted files in my branch, would not only delete them from my branch, but from his too (and the entire git index).

What I want to know is, how would I correctly remove his files from my branch, so that they are not also deleted from his? Do I make a new branch once I realised his files are in my branch? Do I revert to the previous commit before I merged branch develop into my local one?

Thanks for any help

Upvotes: 2

Views: 3630

Answers (3)

user273083

Reputation: 11

I then saw his files in my merge request and subsequently learnt the hard way that doing

git rm HIS_FILES

on his unwanted files in my branch, would not only delete them from my branch, but from his too (and the entire git index).

I ran into the same problem when I forgot to run

git commit -m "HIS_FILES deleted"

after

git rm HIS_FILES

Perhaps you did this:

git rm HIS_FILES
git checkout HIS_BRANCH

I suppose you left out git commit after git rm. The deletion was then still uncommitted when you switched branches, which is why it looked as if rm had affected his branch too.

The right way to delete the files from your branch only is to run these commands while you are on your branch:

git rm HIS_FILES
git commit -m "HIS_FILES deleted"

If you then run

git checkout HIS_BRANCH

you will see that his files are still there.
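Both cases can be tried in a throwaway repository. This is only a sketch: the branch names his_branch and my_branch and the file his_file.txt are made up for the demonstration, and it assumes both branches start at the same commit, so that Git lets the uncommitted deletion travel along when you switch.

```shell
set -e                                  # stop at the first failing command
repo=$(mktemp -d)                       # throwaway repository
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

git checkout -q -b his_branch
echo "his work" > his_file.txt
git add his_file.txt
git commit -q -m "Add his_file.txt"
git checkout -q -b my_branch            # my_branch starts out containing his file

# Case 1: git rm without git commit. The staged deletion follows you.
git rm -q his_file.txt
git checkout -q his_branch
uncommitted_status=$(git status --short)
echo "$uncommitted_status"              # the deletion shows up as staged here too

# Case 2: go back and commit the deletion on my_branch first.
git checkout -q my_branch
git commit -q -m "his_file.txt deleted"
git checkout -q his_branch
files_on_his_branch=$(ls)
echo "$files_on_his_branch"             # his_file.txt is back
```

In case 1 nothing was actually removed from any commit. The deletion was only pending in the index, and git status on his branch reports it as a staged change.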

Upvotes: 1

torek

Reputation: 488519

(This is technically a comment rather than an answer, but I wanted to be able to use formatting ... and, it would never fit.)

[git rm] would not only delete [files] from my branch, but from his too (and the entire git index).

This is not the case. Moreover, this is not the right way to understand Git's index.

The index has three names: Git calls it the index sometimes, but then calls it the staging area other times. In a few places, Git uses the word cache. These mostly all refer to the same thing, whose concrete implementation is mostly just a file in the .git directory named index.[1] But the index, regardless of which of these names you use for it, has little to do with existing commits. The main function of the index is that it is where you build your proposed next commit.

When we talk about the index using the phrase staging area, we're concerned with the copies of files that are kept in the index.[2] The way Git works, you have at your fingertips, at all times, three copies of each file! You pick a commit (using git checkout, or the new git switch in Git 2.23 or later) to be your current commit. That commit holds a snapshot of all of your files, in a special read-only (and Git-only) compressed format.

These frozen format files inside a commit cannot be changed. Nothing about any existing commit can be changed. There are some big advantages to this: for instance, since the files can't be changed, they can be shared. If commit A has some version of some file, and commit Z has the same version of the same file, the two commits can simply share the one underlying file. (This is actually based on file content, not file name.) But that has a big disadvantage too: it means you can't actually do anything new, i.e., do any new work, with these frozen files.

Git therefore needs a way, and a place, to defrost and decompress—i.e., rehydrate—the frozen-and-compressed (dehydrated) files. The place Git puts them is in your work-tree. Files in your work-tree are plain ordinary everyday files, as provided by your computer, so you can get work done.

So this explains two of the three copies of all of your files: there's the dehydrated copy of README.md in the current (or HEAD) commit, and there's the ordinary and useful copy of README.md in your work-tree where you can work with it. But what's this third copy doing?

The answer is: it's sitting there in your index (or "staging area") in the freeze-dried format, ready to go into a new commit. If you run git commit right now, Git will build the new commit from the freeze-dried copies of files that are in the index. Why not use the ones from the commit? That should be obvious: it's because you can't change those copies! But you can replace the freeze-dried copies that are in your index. That's what git add does: it compresses (freeze-dries) the work-tree version of the file and writes that into the index.[3]

So, suppose you modify the work-tree version of README.md. Before git add README.md, the index copy of README.md matches the HEAD copy of README.md, and the work-tree copy is different. After git add README.md, the index copy of README.md matches the work-tree copy (except for being in the freeze-dried format). There are three copies at all times, but two of them match. Using git add replaces the index copy, so that it matches the work-tree copy. (The HEAD copy can't be changed.)
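The three copies can be inspected directly. The sketch below uses a scratch repository with made-up file contents ("version 1" and "version 2"). git show HEAD:README.md prints the copy in the current commit, and git show :README.md prints the copy in the index.

```shell
set -e                                  # stop at the first failing command
repo=$(mktemp -d)                       # throwaway repository
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "version 1" > README.md
git add README.md
git commit -q -m "Add README.md"

echo "version 2" > README.md            # change only the work-tree copy

head_copy=$(git show HEAD:README.md)    # frozen copy in the HEAD commit
index_copy=$(git show :README.md)       # copy currently in the index
worktree_copy=$(cat README.md)          # ordinary file in the work-tree
echo "before add: HEAD='$head_copy' index='$index_copy' work-tree='$worktree_copy'"

git add README.md                       # replace the index copy

index_after_add=$(git show :README.md)
head_after_add=$(git show HEAD:README.md)
echo "after add:  HEAD='$head_after_add' index='$index_after_add' work-tree='$worktree_copy'"
```

Before git add, the index copy agrees with HEAD. Afterwards it agrees with the work-tree, and the HEAD copy is untouched in both cases.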

This means that at all times, the index is ready to go: git commit simply packages the freeze-dried index files into a new commit. The new commit becomes the HEAD commit, by being added to the current branch. The new commit now has a full and complete (and frozen for all time) copy of every file, as seen in the index. Now the HEAD commit and the index match, and if the index matches the work-tree, all three copies match.

When you use git rm, Git removes the named file from both the index and the work-tree. The next git commit will not have that file, because it's not in the index.

If you then git checkout some other branch, Git now finds all the files in the frozen commit that is the tip of that other branch. Git copies all these frozen-format files into the index, so that they're ready to go into the next commit you make; and having updated the index copies, Git rehydrates those into the work-tree, so that you can see and use the files. Now the (newly-selected, different) HEAD commit, index, and work-tree again all match, and you're ready to work.

If, during switching from commit Z back to commit A, Git finds that commit Z has some file—to-be-deleted.txt perhaps—that isn't in commit A, Git removes to-be-deleted.txt from the index and from the work-tree. So now it's gone—but it's still there in commit Z. If you git checkout commit Z, Git sees that to-be-deleted.txt isn't in commit A, isn't in the index, and is in commit Z, so it copies the commit Z version of to-be-deleted.txt into the index and work-tree ... and now, again, HEAD, the index, and the work-tree all match.
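A small experiment along these lines, with hypothetical branch names branch_a and branch_z standing in for commits A and Z, shows the file leaving and re-entering the index and work-tree while commit Z itself never changes:

```shell
set -e                                  # stop at the first failing command
repo=$(mktemp -d)                       # throwaway repository
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

git checkout -q -b branch_a
echo "shared" > README.md
git add README.md
git commit -q -m "Commit A"

git checkout -q -b branch_z
echo "only in Z" > to-be-deleted.txt
git add to-be-deleted.txt
git commit -q -m "Commit Z"

git checkout -q branch_a                # Z -> A: the file is removed
index_on_a=$(git ls-files)              # what the index lists on branch_a
if [ -e to-be-deleted.txt ]; then present_on_a=yes; else present_on_a=no; fi

git checkout -q branch_z                # A -> Z: the file is copied back out
index_on_z=$(git ls-files)

echo "on branch_a: index has [$index_on_a], file in work-tree: $present_on_a"
echo "on branch_z: index has [$(echo $index_on_z)]"
```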

One key to keep in mind at all times is that Git is about commits. We will git checkout some branch name to switch branches, but that name identifies one particular commit. Git will then fill the index and work-tree (both of which are temporary areas!) from that commit. When we make a new commit, Git simply packages up whatever is in the index, adds our name and so on, writes out the new commit with its parent being the commit we checked out, and then updates the branch name to remember the hash ID of the new commit. So branch names move. The commits, once made, never change at all, and normally last forever.[4]
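The way a branch name moves can be watched with git rev-parse. In this sketch the file name notes.txt is made up, and the branch is whatever name git init picks by default:

```shell
set -e                                  # stop at the first failing command
repo=$(mktemp -d)                       # throwaway repository
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "First commit"
before=$(git rev-parse HEAD)            # commit the branch name points to now

echo "two" >> notes.txt
git add notes.txt
git commit -q -m "Second commit"

after=$(git rev-parse HEAD)             # the new commit
parent=$(git rev-parse HEAD^)           # its parent
branch=$(git symbolic-ref --short HEAD) # name of the current branch
branch_tip=$(git rev-parse "$branch")   # commit that name now identifies

echo "old tip:    $before"
echo "new tip:    $after"
echo "its parent: $parent"
```

The parent of the new commit is the old tip, and the branch name now resolves to the new commit. The old commit itself is unchanged.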


[1] We have to say mostly because there are a number of exceptions to this rule. For example, you can point Git to a different file, for special purposes, by setting the environment variable GIT_INDEX_FILE to the path name of a temporary index you'd like Git to use. Git will create that file if it does not exist, and then use it as the index. The git stash command, for instance, uses this to create commits from a temporary index, all without disturbing the (main or real) index.

[2] Technically, the index holds references to blob objects, which are how Git stores files in a frozen format. Unless/until you get around to using things like git hash-object and git update-index, though, it works just as well to think of the index as containing a frozen-format copy of each file.

[3] This is where git hash-object -w and git update-index come in: git add compresses and writes a new freeze-dried blob object, or discovers that an existing blob has the right content and therefore winds up re-using that existing, already-frozen blob object. That blob object has, or gets if it's new, a unique hash ID, and git add uses the same code as git update-index to write the correct hash ID into the index.
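For the curious, the two plumbing commands can be run by hand and compared with what git add produces. This is only an illustration, not a description of how git add is implemented internally. The path copy.md is made up, and it is staged without ever existing in the work-tree:

```shell
set -e                                  # stop at the first failing command
repo=$(mktemp -d)                       # throwaway repository
cd "$repo"
git init -q

echo "hello" > README.md

# Write the file content as a blob object and print its hash ID.
blob_id=$(git hash-object -w README.md)

# Let git add stage the same file, then read the hash ID recorded in the index.
git add README.md
staged_id=$(git ls-files -s README.md | awk '{print $2}')

# Stage a second path that refers to the same blob, using plumbing only.
git update-index --add --cacheinfo 100644,"$blob_id",copy.md
copy_id=$(git ls-files -s copy.md | awk '{print $2}')

echo "hash-object: $blob_id"
echo "git add:     $staged_id"
echo "copy.md:     $copy_id"
```

All three hash IDs come out the same, because the index entries refer to one blob identified by its content.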

We could equally ask "Why not build the new commit from the work-tree?" There's no really good answer to this:[5] other version control systems, that don't shove an index in your face all the time, actually do that, or something that looks like that. But Git shoves the index in your face, so you need to know about it.

[4] To get rid of a commit, you arrange things so that you can't find the commit. Since branch names identify the last commit, it's easy to remove a commit that's at the end: just make the branch name go backwards, to identify that commit's parent. But you can find every earlier commit by going backwards, one step at a time, so you can really only remove the tail end commits. (This is clearer in some other VCSes, such as Mercurial, that don't let commits be on multiple branches. Things get confusing in Git, where a commit is potentially on many branches at the same time.)
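One way to make the branch name go backwards is git reset --hard, used here as an assumption about how you might do it rather than the only option. The sketch also checks that the dropped commit object is still stored. It has just become hard to find from the branch name:

```shell
set -e                                  # stop at the first failing command
repo=$(mktemp -d)                       # throwaway repository
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "First commit"

echo "two" >> notes.txt
git add notes.txt
git commit -q -m "Tail-end commit"
old_tip=$(git rev-parse HEAD)           # remember the commit we are about to drop

git reset -q --hard HEAD~1              # move the branch name back to the parent

reachable=$(git rev-list --count HEAD)  # commits findable from the branch now
old_type=$(git cat-file -t "$old_tip")  # the old object is still in the repository

echo "commits reachable from the branch: $reachable"
echo "old tip $old_tip is still a $old_type object"
```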

[5] One can point out the various features offered as a result, such as git add -p, but that's kind of a weak ex-post-facto justification. The real question is whether these features are worth the initial complexity. I don't think they are (I think Git could offer them in some other way that doesn't make the index as in-your-face as it is), but that's both opinion and speculation, which isn't really appropriate for StackOverflow.

Upvotes: 5

Esha

Reputation: 151

git rm removes the file from your branch only.

A possible reason the file was deleted in other branches is that your branch was later merged into them. The merge would have carried the deletion over and removed the file there as well.
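This can be reproduced in a scratch repository. The names his_branch, my_branch, his_file.txt and his_notes.txt are made up for the sketch:

```shell
set -e                                  # stop at the first failing command
repo=$(mktemp -d)                       # throwaway repository
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

git checkout -q -b his_branch
echo "his work" > his_file.txt
git add his_file.txt
git commit -q -m "Add his_file.txt"

git checkout -q -b my_branch
git rm -q his_file.txt
git commit -q -m "his_file.txt deleted" # deletion exists on my_branch only

git checkout -q his_branch
echo "more work" > his_notes.txt
git add his_notes.txt
git commit -q -m "Add his_notes.txt"
before_merge=$(ls)                      # his_file.txt is still here

git merge -q --no-edit my_branch        # merging my_branch brings the deletion along
after_merge=$(ls)

echo "before merge: $(echo $before_merge)"
echo "after merge:  $(echo $after_merge)"
```

Until the merge, the file is untouched on his branch. It disappears there only because the merge result takes the deletion from my_branch, since his side never changed the file.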

Upvotes: 1
