JFortYork

Reputation: 139

How do I avoid tracking certain files completely?

I'm somewhere between novice and intermediate with Git, and I'm having trouble understanding something about the staging index. I'm an iOS developer, and when I work on an Xcode project and then run git status in my project folder, I frequently see files like info.plist or .DS_Store listed as not staged for commit.

I don't care about files like these and don't want to commit them. How can I tell git to completely ignore these files and not prompt me to stage them?

I tried setting up a .gitignore_global in my home directory that contains the following:

# Compiled source #
###################
*.com
*.class
*.dll
*.exe
*.o
*.so

# Packages #
############
# it's better to unpack these files and commit the raw source
# git has its own built in compression methods
*.7z
*.dmg
*.gz
*.iso
*.jar
*.rar
*.tar
*.zip

# Logs and databases #
######################
*.log
*.sql
*.sqlite

# OS generated files #
######################
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db

# XCode Related files #
######################
*.plist

But this doesn't seem to work. I've seen posts on SO about using git rm --cached to untrack files, but then it shows up as a file that is to be "deleted". I don't want to commit that change and alarm anyone.

What is the correct solution to tell git to permanently ignore files?

Upvotes: 2

Views: 112

Answers (2)

torek

Reputation: 490178

As Karol Dowbecki said, you probably want something like:

$ git config --global core.excludesfile ~/.gitignore_global

but that doesn't fix existing problems.

The problem

A commit is a Git object that stores files (as a snapshot) plus some metadata. Each commit has a unique hash ID: every Git in the world that has that commit has it under that hash ID, and no other commit can ever have that same hash ID. Note, too, that commits are frozen forever. Nothing, no power in the universe, can change an existing commit.[1]

Now, suppose that you have gotten some file you want to be ignored, into some existing commit. Never mind how it got in there, just consider that it is in there. If you git checkout that commit, Git will copy that file into your index and work-tree.

Your index is largely invisible, but it holds a copy of every file in the commit you just checked out. Git takes each file from the deep-freeze version stored in the commit, copies it (or technically, its blob hash) to your index, and then copies the file to your work-tree so that you can see and work on it. The index is also where you build the next commit you will make.

When you run git add, you tell your Git: take the version of the file that's in my work-tree now, and copy it into the index. This overwrites the previous copy—which was there, even though git status didn't say anything—with a new, different copy. If you run git commit now, the new commit has the updated file. It also has all the un-updated files that you didn't overwrite. That is how and why every commit is a full snapshot of all of your files: the git commit command actually makes commits from the index. The index holds all the files, always.


[1] You can stop using a commit, and make your Git forget that it ever existed. It still does exist, of course, unless and until every Git repo that has the commit drops it. After that, the hash ID (the true name of that commit) will just elicit a "say what? I don't know a commit by that hash ID" response from every Git in the world. To drop a commit, you must drop all of its descendants as well, because the hash ID of a commit is built in part from the hash IDs of every ancestor commit. See the Wikipedia article on Merkle trees for more.

Also, two different commits could share the same hash ID, provided that those two Gits never meet each other. (I like to think of these as doppelgänger commits.) If the two Gits do meet, they'll refuse to share the doppelgänger commit, and any other commit that has the doppelgänger commit as part of its history cannot go into the other Git either.


What does all this have to do with .gitignore and tracked files?

A tracked file is a file that's in the index.

That's it. That's all there is to it. A file is tracked if and only if it's in the index right now. A file that is in the work-tree, but not in the index, is untracked.

So, suppose .DS_Store is in your work-tree right now (because Finder has written or created it). Is .DS_Store in your index right now? If so, it's tracked. If not, it's untracked.
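One way to put that question to Git directly is git ls-files, which lists what is in the index. The following is a minimal sketch, not part of the original answer: the throwaway repository and the file name tracked.txt are made up for illustration.

```shell
# Throwaway repository so nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q .
touch .DS_Store tracked.txt
git add tracked.txt          # only tracked.txt goes into the index

# git ls-files reads the index; --error-unmatch makes it exit non-zero
# when the named path is not in the index.
if git ls-files --error-unmatch .DS_Store >/dev/null 2>&1; then
    ds_store_state=tracked
else
    ds_store_state=untracked
fi

if git ls-files --error-unmatch tracked.txt >/dev/null 2>&1; then
    txt_state=tracked
else
    txt_state=untracked
fi

echo ".DS_Store is $ds_store_state, tracked.txt is $txt_state"
```

In a real project, only the git ls-files --error-unmatch line is needed; the rest is scaffolding.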

Listing a file in .gitignore tells Git that if the file is untracked, Git should:

  • not complain about this, and
  • not auto-add the file when you use an en-masse git add of many files (* or . or whatever).

In fact, if the .DS_Store file is untracked and you have it listed in a .gitignore, and you run git add .DS_Store explicitly, even that won't add it: it will tell you that the file is ignored and you should use --force if you really want it added.
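That behavior can be reproduced in a scratch repository. This is a sketch under the assumption that .DS_Store is untracked and listed in a per-repository .gitignore; the temporary directory is only there to keep the experiment away from real work.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
printf '.DS_Store\n' > .gitignore
touch .DS_Store

# Explicitly adding an ignored, untracked file fails; Git prints a hint
# about -f on stderr, which is discarded here.
if git add .DS_Store 2>/dev/null; then
    add_result=added
else
    add_result=refused
fi

# The ignored file is also left out of the status listing.
status_output=$(git status --porcelain)

echo "git add .DS_Store was $add_result"
printf '%s\n' "$status_output"
```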

But ... what if .DS_Store is already in the index? In this case, listing it in a .gitignore has no effect. It's already tracked. It's too late! You can git rm it to remove it from your index right now: with --cached the work-tree copy is left alone, and without it the work-tree copy is deleted as well. Then it's no longer tracked, and it won't be in the next commit.

That doesn't remove it from any existing commits that have it. Nothing can do that. Those commits, found by their hash IDs, will have .DS_Store in them forever.
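A small experiment illustrates both halves of this: git rm --cached takes the file out of the index and out of the next commit, while the earlier commit still contains it. The repository, identity settings, and commit messages below are placeholders for illustration only.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"          # placeholder identity
git config user.email "example@example.com"

# Commit .DS_Store by accident (-f in case a global ignore rule already lists it).
touch .DS_Store
git add -f .DS_Store
git commit -q -m "Accidentally commit .DS_Store"
first_commit=$(git rev-parse HEAD)

# Ignore it from now on and take it out of the index, keeping the work-tree copy.
printf '.DS_Store\n' > .gitignore
git rm -q --cached .DS_Store
git add .gitignore
git commit -q -m "Stop tracking .DS_Store"

in_index=$(git ls-files .DS_Store)                        # empty: no longer tracked
in_old_commit=$(git ls-tree --name-only "$first_commit")  # the old snapshot is unchanged

echo "index entry: '$in_index'"
echo "first commit still contains: $in_old_commit"
```

The second commit is the one that shows .DS_Store as "deleted" to collaborators; that deletion record is how the file stops being part of later snapshots.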

What you can do about this

You can make new and improved (but different, and different-hash-ID) commits that are like the old ones but don't have the file. Because commits encode their ancestor hash IDs, you have to do this for pretty much every commit in your repository. Then you can get yourself and everyone else in the world that uses this Git repository or a clone of it, to switch from the old commits to the new-and-improved ones. That's fairly painful, though it's a one-time pain for each user of this Git repository or clone-thereof. The git filter-branch command can do this, or there are tools like The BFG that are faster and more user-friendly, i.e., less painful, that can do it.
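As a rough sketch of the git filter-branch route, the commands below rewrite every commit in a scratch repository so that none of them contains .DS_Store. The repository contents are invented for the demonstration, and FILTER_BRANCH_SQUELCH_WARNING is set only to skip the warning pause that recent Git versions add. Try this on a fresh clone, never on the only copy of a repository.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"          # placeholder identity
git config user.email "example@example.com"

echo junk > .DS_Store
echo code > main.txt
git add -f .DS_Store main.txt
git commit -q -m "First commit, with .DS_Store"
echo more >> main.txt
git commit -q -am "Second commit"
old_tip=$(git rev-parse HEAD)

# Re-create every commit, removing .DS_Store from each snapshot's index.
# --ignore-unmatch keeps the filter from failing on commits without the file.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch \
    --index-filter 'git rm -q --cached --ignore-unmatch .DS_Store' \
    -- --all >/dev/null 2>&1

new_tip=$(git rev-parse HEAD)
# Commits reachable from HEAD that touch .DS_Store; expected to be none now.
remaining=$(git log --format=%H -- .DS_Store)

echo "old tip: $old_tip"
echo "new tip: $new_tip"
```

The old commits are not destroyed: filter-branch keeps backup references under refs/original/, and every other clone still has them until its owner switches over.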

Your other option is to leave these problematic commits alone. They're still there, in your history, but you can git checkout each branch tip commit, fix it up so that the unwanted files aren't in the index right now, and then git commit to make a new commit that becomes the new tip commit of that branch. Now each branch's tip commit doesn't have the unwanted files, and git checkout of that branch won't try to create the file.

The problem this leaves behind is that if you ever use one of these historic commits—perhaps as part of git merge, or perhaps via git checkout—they'll have the unwanted files. How messy is that?

  • Well, for git merge, maybe not messy at all: if all your branch tips don't have the file, and you git merge some other branch-tip with this branch-tip, and the merge base does have the file, your Git will say: Ah, you and they both deleted .DS_Store, so that's all good, we'll keep .DS_Store deleted in the new merge commit.

  • But for git checkout: kind of messy. If you git checkout one of these old commits, Git will want to put, say, .DS_Store into your index and work-tree. In some cases (e.g., there's no .DS_Store file in your work-tree right now because you removed it and Finder hasn't gotten around to creating it again) Git will be able to create the file. In others, you can use git checkout -f to forcibly overwrite the file. (How this affects whatever uses the file depends on the file and the whatever. For Finder and .DS_Store, at least, it's not too terrible.)

    But note that as soon as you switch away from this commit—with git checkout master for instance—your Git will compare what's in the index to what's in the commit you're moving to. That other commit doesn't have a .DS_Store. So your Git now removes .DS_Store from the index and from the work-tree.

    This removal is allowed if the .DS_Store in the work-tree matches the one in the index, or if you use git checkout -f. But it's also allowed if .DS_Store is marked ignored. For .DS_Store, that's not that big a deal: Finder will just create a new one. For other files, and other programs, it might be a big deal. Some files are pretty important. You ought to be able to mark them as ignored but precious: don't save with new commits, but don't remove from work-tree either. But you can't, at least not with today's Git.
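The round trip described in the checkout case can be tried out in a scratch repository. This sketch uses .git/info/exclude as a stand-in for a global ignore file (an assumption made so the example does not touch any real configuration); the file contents and commit messages are invented.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"          # placeholder identity
git config user.email "example@example.com"
branch=$(git symbolic-ref --short HEAD)      # master or main, depending on setup

# Historic commit that contains the unwanted file.
echo committed > .DS_Store
git add -f .DS_Store
git commit -q -m "Old commit with .DS_Store"
old=$(git rev-parse HEAD)

# Newer tip commit without it; the file is now ignored via info/exclude.
git rm -q --cached .DS_Store
git commit -q -m "Stop tracking .DS_Store"
printf '.DS_Store\n' >> .git/info/exclude
echo "local state" > .DS_Store               # ignored, untracked work-tree copy

git checkout -q "$old"       # overwrites the ignored file with the committed one
git checkout -q "$branch"    # the tip has no .DS_Store, so Git removes it

if [ -e .DS_Store ]; then
    outcome=kept
else
    outcome=removed
fi
echo ".DS_Store was $outcome"
```

The "local state" content never comes back, which is harmless for .DS_Store but is exactly the hazard for an ignored file that matters.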

It's up to you whether to do anything. If you replace all your old bad commits, it's painful but the result is good. If you do nothing, you need to be a little bit careful. Since ignore rules are mostly in committed .gitignore files, the lack of ability to mark ignored files as "precious" doesn't bite very often. However, using global ignore rules, it does bite. Cave canem stultus![2]


[2] Taking "git" in its sense of "idiot, stupid person" and running it through Google Translate. It's Git itself being insulted here, not the user.

Upvotes: 0

Karol Dowbecki

Reputation: 44980

A global .gitignore file must be registered with the global core.excludesfile option; otherwise Git won't apply it:

$ git config --global core.excludesfile ~/.gitignore_global
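To confirm that the setting took effect, git check-ignore -v reports which ignore file and pattern match a given path. The sketch below points HOME at a temporary directory purely so that the demonstration does not modify a real global configuration; in normal use, only the git config and git check-ignore lines are needed.

```shell
# Sandbox: a fake home directory so the real ~/.gitconfig is left alone.
export HOME=$(mktemp -d)
unset XDG_CONFIG_HOME

printf '.DS_Store\n' > ~/.gitignore_global
git config --global core.excludesfile ~/.gitignore_global

repo=$(mktemp -d)
cd "$repo"
git init -q .
touch .DS_Store

# Show the configured path, then ask Git why .DS_Store is ignored.
configured=$(git config --global --get core.excludesfile)
match=$(git check-ignore -v .DS_Store)

echo "core.excludesfile = $configured"
echo "$match"
```

git check-ignore prints nothing for a file that is already tracked, which is a quick hint that the problem is the index rather than the ignore rules.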

Upvotes: 2
