Amelie

Reputation: 516

Git is dropping my .gitignore files at checkout

When I check out branch B from branch A, some files listed in branch A's .gitignore are removed if they are not also listed in branch B's .gitignore. I tried editing my .git/info/exclude file, but it made no difference. I also tried --assume-unchanged and --skip-worktree (keeping the files in .gitignore and adding them with git add -f), but then I can't check out if my file is modified. I really need to ignore my file. Do you have any tips to share?

Thanks

Upvotes: 2

Views: 1937

Answers (1)

torek

Reputation: 488519

Inconceivable

The problem boils down to this: Git does not think .gitignore means what you think it means.

Files listed in .gitignore are not actually ignored. If a file name matches a pattern in a .gitignore or in another exclude file (such as .git/info/exclude or the file named by core.excludesFile), then:

  • git status and other commands will throw it off the "files to complain about as untracked" list. That is, if it actually is untracked, and git status would be going to complain about it, it doesn't complain. This action is fairly "ignore"-y, but there's that whole "if it actually is untracked" thing: the fact that the pathname matches an ignore entry does not make the file become untracked.
  • git add with various "add all files" flags will skip it if it's currently untracked. (This is the only one of the three actions that is really totally "ignore"-y.)
  • Worst, from some points of view at least, is this: If Git is following some set of instructions that would cause it to overwrite the file, and it really is untracked right now, the fact that there is a matching ignore-entry tells Git: "Feel free to overwrite the file." In other words, this is not a list of files that are to be ignored, it's a list of files to be (safely) clobbered. (This is not the problem in this particular case, but it is an important aspect of .gitignore not meaning what people think it means.)

The reason you are getting hit here is because the files in question are actually tracked, despite being in .gitignore.
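One way to check whether this is what is happening is to ask Git which index entries also match an ignore rule. The following is a minimal sketch in a throwaway repository; the file name secret.txt and the demo identity are made up for illustration and are not taken from the question.

```shell
#!/bin/sh
# Sketch: an ignore rule does not untrack a file that is already in the index.
# The throwaway repository and the name "secret.txt" are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q

echo 'v1' > secret.txt
git add secret.txt
git commit -q -m 'track secret.txt'

# Add the ignore rule only after the file is already tracked.
echo 'secret.txt' > .gitignore

# List index entries (tracked files) that also match an ignore rule.
tracked_ignored=$(git ls-files -c -i --exclude-standard)
echo "tracked but ignored: $tracked_ignored"

# Changes to the file are still reported, because it is tracked.
echo 'v2' >> secret.txt
status=$(git status --porcelain -- secret.txt)
echo "status: $status"
```

Any path printed by the git ls-files command in your own repository is in the same situation as the files in the question: matched by an ignore rule, yet still tracked.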

Aside: the index, and what it means to be "really untracked"

The word "untracked", or the phrase "really untracked", appears several times above. Untracked files, in Git, are a little bit mysterious: what, precisely, makes a file "untracked"? The answer is remarkably simple—"remarkably" because this is Git. A file is untracked if and only if it is not in the index.

(There are the git add -N special index entries, but they are badly broken in many versions of Git, so it may be best to avoid them for several more years. The other possible complication is that "the index" itself is a complicated beast. Most of its complications are at least meant to be invisible and automatic, though, and in practice they mostly are. If you think of the index as "this is what I am going to put in my next commit" you will be OK. Remember also that the index is where extra information is stored during merges: if you are working on a conflicted merge, the conflict status is kept in the index.)
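The "in the index or not" rule can be tested directly. Here is a small sketch, assuming a scratch repository; is_tracked is a hypothetical helper name, and a.txt and b.txt are made-up files.

```shell
#!/bin/sh
# Sketch: "tracked" means "has an entry in the index".
# is_tracked is a hypothetical helper; a.txt and b.txt are made-up names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

is_tracked() {
    # ls-files --error-unmatch exits non-zero when the path has no index entry.
    git ls-files --error-unmatch -- "$1" >/dev/null 2>&1
}

echo one > a.txt
echo two > b.txt
git add a.txt            # a.txt now has an index entry; b.txt does not

if is_tracked a.txt; then a_state=tracked; else a_state=untracked; fi
if is_tracked b.txt; then b_state=tracked; else b_state=untracked; fi
echo "a.txt: $a_state"
echo "b.txt: $b_state"

# Removing the index entry (but not the work-tree file) makes a.txt untracked again.
git rm -q --cached a.txt
if is_tracked a.txt; then a_after=tracked; else a_after=untracked; fi
echo "a.txt after git rm --cached: $a_after"
```

Note that no ignore file is involved anywhere in this check: only the index decides.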

Switching from commit to commit makes Git write or remove files

When you ran git checkout B to switch from branch A to branch B, Git did two things:

  • Change the "current branch" (stored in the special HEAD file) from A to B. This actually happens last, if and only if the rest of the checkout succeeds.
  • But first, if the two different branches name different commits, git checkout must change the index and work-tree contents.

Your problem occurs during the index and work-tree update, which Git performs before it changes HEAD. Branch A names some commit (by its raw hash ID, 7fe3291... or whatever). Branch B also names some commit (by its raw hash ID). If branch B names the same commit as branch A, Git's job is very easy, because you are not asking Git to move from one commit to another, just to change its idea of the "current branch name". But if the two branches name two different commits, Git needs to adjust both the index and the work-tree.

Remember, the index is what will go into the next commit you make. If you are going to be on branch B now, the index had better match up, more or less anyway, with the existing commit. Likewise, the work-tree needs to match up, more or less.

Git will try to keep changed-but-not-staged files in their changed-but-not-staged state. (In fact, it will also try to keep changed-and-staged files in their changed-and-staged state. See the question "Git - checkout another branch when there are uncommitted changes on the current branch" for more about that.) But if a file is currently tracked (that is, it is in the index now), does not exist in the commit we're switching to, and has no uncommitted changes, Git must remove the file from the index, and in the process it also removes the file from the work-tree.
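Both outcomes are easy to reproduce. This is a rough sketch, assuming a scratch repository in which branch A tracks a file that branch B lacks; the name settings.local and the demo identity are invented for the example.

```shell
#!/bin/sh
# Sketch: checkout removes a clean tracked file that the target commit lacks,
# and refuses to switch when that file has uncommitted changes.
# "settings.local" and the branch contents are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q

git checkout -q -b B
echo base > README
git add README
git commit -q -m 'base'                  # branch B: no settings.local

git checkout -q -b A
echo 'debug=1' > settings.local
git add settings.local
git commit -q -m 'add settings.local'    # branch A: tracks settings.local

# Clean tracked file, absent from B: checkout removes it from the work-tree.
git checkout -q B
if [ -e settings.local ]; then after_clean=kept; else after_clean=removed; fi
echo "clean file after checkout B: $after_clean"

# Modified tracked file, absent from B: checkout refuses and stays on A.
git checkout -q A
echo 'debug=2' >> settings.local
if git checkout -q B 2>/dev/null; then dirty_checkout=switched; else dirty_checkout=refused; fi
branch=$(git rev-parse --abbrev-ref HEAD)
echo "dirty file: checkout $dirty_checkout, still on $branch"
```

The second half matches the "I can't checkout if my file is modified" symptom from the question.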

Clearly there is no issue with data loss here. The files are in the index and have no uncommitted changes. This means the index entries for those files match the committed (HEAD) versions of those files. (Specifically, the hash stored in the index matches the hash stored in the tree associated with the HEAD commit.) That, in turn, means that the contents of that file are safely saved away in the HEAD commit. If you want the contents back, you can, after switching to some other commit, simply extract the contents from the existing commit, i.e., the one that is now HEAD@{1}:

git show HEAD@{1}:path/to/file > path/to/file  # skips smudge filter

or:

git checkout HEAD@{1} -- path/to/file          # applies smudge filter

The git checkout variant, which applies the smudge filter and does any end-of-line CRLF manipulation, also writes the file into the index. The file was absent from the commit you just switched to (which is why Git removed it), but after this command it is in the index, and therefore is tracked.
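The difference between the two recovery commands shows up in the index. A minimal sketch, again with a hypothetical settings.local file that exists only on branch A, and assuming the default reflog settings of an ordinary (non-bare) repository:

```shell
#!/bin/sh
# Sketch: recover a file from the previous HEAD after switching branches.
# "settings.local" is a hypothetical file that only branch A tracks.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q

git checkout -q -b B
echo base > README
git add README
git commit -q -m 'base'                  # branch B: no settings.local
git checkout -q -b A
echo 'debug=1' > settings.local
git add settings.local
git commit -q -m 'add settings.local'    # branch A: tracks settings.local

git checkout -q B                        # removes settings.local; A's tip is now HEAD@{1}

# Option 1: git show writes the contents only; the index is not touched.
git show 'HEAD@{1}:settings.local' > settings.local
if git ls-files --error-unmatch -- settings.local >/dev/null 2>&1; then
    after_show=tracked
else
    after_show=untracked
fi
echo "after git show: $after_show"

# Option 2: git checkout by path restores the file and adds an index entry.
rm settings.local
git checkout 'HEAD@{1}' -- settings.local
if git ls-files --error-unmatch -- settings.local >/dev/null 2>&1; then
    after_checkout=tracked
else
    after_checkout=untracked
fi
content=$(cat settings.local)
echo "after git checkout: $after_checkout, contents: $content"
```

If the goal is to keep the file out of branch B's commits, the git show form is probably the better fit, since it leaves the file untracked.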

Side note on potential bugs

I don't know of any actual bugs in any actual Git versions, but I did my spot checks for this posting with Git 2.10.1. An obvious potential bug could occur when Git is checking whether there are unstaged changes to a file, before removing the file as part of switching to another commit. Remember, the test for safely removing or overwriting a file here is:

  1. Is this file tracked and unmodified? If so, safe.
  2. Is this file tracked and modified? If so, not safe.
  3. Is the file untracked and clobber-able? If so, safe.
  4. Otherwise (untracked but not clobber-able): not safe.
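Cases 3 and 4 of this list are the "clobber-able" distinction described earlier, and they can be reproduced in a scratch repository. This is only a sketch: the file name build.log is invented, and the ignore rule goes into .git/info/exclude purely to keep the example short.

```shell
#!/bin/sh
# Sketch: an untracked file blocks checkout unless an ignore rule marks it
# as safe to overwrite. "build.log" is a hypothetical file name.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q

git checkout -q -b A
echo base > README
git add README
git commit -q -m 'base'                  # branch A: no build.log
git checkout -q -b B
echo 'from B' > build.log
git add build.log
git commit -q -m 'B tracks build.log'    # branch B: tracks build.log
git checkout -q A                        # build.log is removed again

# Case 4: untracked and not ignored, so checkout refuses to overwrite it.
echo 'local notes' > build.log
if git checkout -q B 2>/dev/null; then case4=switched; else case4=refused; fi
echo "untracked, not ignored: checkout $case4"

# Case 3: untracked and ignored, so checkout overwrites it without complaint.
mkdir -p .git/info
echo 'build.log' >> .git/info/exclude
if git checkout -q B 2>/dev/null; then case3=switched; else case3=refused; fi
content=$(cat build.log)
echo "untracked, ignored: checkout $case3, contents now: $content"
```

In the second run the local contents are gone, which is the "feel free to overwrite" meaning of an ignore entry.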

Case 1 is where the potential bug would be. How shall we test "tracked but unmodified"? What if you have set the --assume-unchanged or --skip-worktree flags for the file? These index flags normally make Git think of the file as "unmodified". If Git obeys these flags during git checkout, it might overwrite or remove the file.

The (light) testing I did with 2.10.1 showed that Git puts these flags aside when testing whether it is OK to remove the file. That is, I modified a tracked file, but set the bits on that file in the index. Then I attempted to switch to a commit that lacks the file. Git gave me an error message.

If some other version(s) of Git accidentally obey the index flags here, they might remove the file even though it does not match the HEAD commit. In that case, there is no in-Git way to recover the contents.

Upvotes: 4
