Leo

Reputation: 121

What is the difference between `git reset --hard HEAD` vs `git clean --force -d` vs `git checkout -- .` to discard local changes

What are the differences between git reset --hard HEAD vs git clean --force -d vs git checkout -- . followed by git pull, to discard local changes and get the latest from the Git server repository?

Upvotes: 2

Views: 1418

Answers (1)

torek

Reputation: 490068

As Ryan commented, git clean is very different from the other two (which are pretty close, but not identical).

For comparing, in gory detail, git checkout -- . and git reset --hard HEAD, see my answer to What is the difference between "git checkout -- ." and "git reset HEAD --hard"? Pay close attention to the descriptions of the index since that will inform you about the next part.
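As a rough illustration of why the index matters for these two commands, here is a minimal sketch in a throwaway repository. The file name, contents, and the Demo identity are made up for the demo. It stages one edit, makes a further unstaged edit, and then runs each command in turn:

```shell
set -e
repo=$(mktemp -d)                     # scratch repository (hypothetical location)
cd "$repo"
git init -q
echo "committed" > file.txt
git add file.txt
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "initial"

echo "staged edit" > file.txt
git add file.txt                      # the index now holds "staged edit"
echo "unstaged edit, work-tree only" > file.txt

git checkout -- .                     # copies index -> work-tree; index untouched
after_checkout=$(cat file.txt)
staged_after_checkout=$(git diff --cached --name-only)

git reset --hard -q HEAD              # resets both index and work-tree to HEAD
after_reset=$(cat file.txt)
staged_after_reset=$(git diff --cached --name-only)

echo "after checkout: $after_checkout (staged: $staged_after_checkout)"
echo "after reset:    $after_reset (staged: ${staged_after_reset:-none})"
```

In this sketch the staged edit survives git checkout -- . (only the unstaged edit is lost), while git reset --hard HEAD throws away both.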

Remember, although you cannot see the index directly,[1] it's what goes into the next commit. As such it is crucial that you understand that Git makes commits using the index, not the work-tree. Usually, it's sufficient to "see" the index by comparing it to the HEAD commit: wherever a file in the index is exactly the same as that same file in the HEAD commit, git status says nothing; but wherever a file is different, git status prints something.

(Note: git status also, separately, compares the index to the work-tree. Where those are different, it says something—except for untracked-but-ignored files. Only work-tree files can be untracked, by definition, so there's no question about whether index files are untracked. When summarizing the two git diffs it ran, git status can either do it one at a time, in the default long-form output, or all at once, in the --short output.)
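The two comparisons can be run by hand. The sketch below uses a scratch repository with made-up file names and a made-up Demo identity, and assumes default status settings. It shows git diff --cached (HEAD vs index) and plain git diff (index vs work-tree) next to the combined --short summary:

```shell
set -e
repo=$(mktemp -d)                     # scratch repository (hypothetical location)
cd "$repo"
git init -q
echo "one" > staged.txt
echo "one" > unstaged.txt
git add .
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "initial"

echo "two, staged" > staged.txt
git add staged.txt                    # differs between HEAD and index
echo "two, not staged" > unstaged.txt # differs between index and work-tree
touch new.txt                         # untracked: not in the index at all

head_vs_index=$(git diff --cached --name-only)
index_vs_worktree=$(git diff --name-only)
short=$(git status --short)

echo "HEAD vs index:      $head_vs_index"
echo "index vs work-tree: $index_vs_worktree"
printf '%s\n' "$short"
```

In the --short output, the first status column reflects the HEAD-vs-index comparison and the second reflects index-vs-work-tree, with ?? marking the untracked file.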


[1] Actually, you can see the index: run git ls-files --stage and Git will spill out the whole thing.[2] This is actually quite useful for debugging. With a large repository with many files, though, it prints way too much for everyday use, and git status is a much better tool.

[2] Actually you need to add --debug to get the whole thing, including the --assume-unchanged and --skip-worktree flags, and even then Git hides the special undo entries from you.
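For a feel of what the index dump looks like, here is a small sketch in a scratch repository (the README content is arbitrary). Each --stage line gives the mode, the blob hash, the stage number, and the path:

```shell
set -e
repo=$(mktemp -d)                     # scratch repository (hypothetical location)
cd "$repo"
git init -q
echo "hello" > README
git add README                        # README is now in the index

stage_line=$(git ls-files --stage)    # mode, blob hash, stage number, path
printf '%s\n' "$stage_line"

git ls-files --debug                  # adds per-entry stat data and flags
```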


git clean removes only (some or all) untracked files

In Git, an untracked file is actually remarkably simple to define: it's a file that is not in the index. That's almost all there is to it, though for git clean purposes, we also need one more item, namely, whether the file is ignored too.

A file cannot be ignored if it is tracked (is in the index). Such files are known to Git, so git clean will never touch them: that's not its job; its job is to remove some or all of the untracked files. Only an untracked file can be ignored, so an untracked file is either untracked-but-not-ignored, or untracked-and-also-ignored.

By default, git clean will remove—or pretend to remove, depending on options like --dry-run vs --force—only those untracked files that are not also ignored.

With the -X (uppercase X) option, git clean will remove (or as always, pretend to remove) only those untracked files that are ignored.

With the -x (lowercase x) option, git clean bypasses all the "ignore" rules, which means that all untracked files automatically fall into the untracked-and-not-ignored category. Thus, git clean -f -x will remove all untracked files, even those that are normally ignored.
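The three modes can be compared safely with --dry-run (-n). This sketch assumes a scratch repository with a made-up *.log ignore rule and made-up file names. Note that .gitignore is added to the index so that it is not itself an untracked file:

```shell
set -e
repo=$(mktemp -d)                     # scratch repository (hypothetical location)
cd "$repo"
git init -q
echo "*.log" > .gitignore
echo "demo" > README
git add .gitignore README             # both tracked, so git clean ignores them

touch notes.txt                       # untracked, not ignored
touch debug.log                       # untracked and ignored

default_mode=$(git clean -n)          # untracked-but-not-ignored only
ignored_only=$(git clean -n -X)       # untracked-and-ignored only
all_untracked=$(git clean -n -x)      # every untracked file

printf 'default:\n%s\n-X:\n%s\n-x:\n%s\n' \
    "$default_mode" "$ignored_only" "$all_untracked"
```

Replacing -n with -f in any of these would perform the removal for real.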

With -d, git clean will also remove directories. By definition, directories are never tracked,[3] so all directories are effectively untracked, but they're untracked directories, not untracked files. Git uses a special short-cut treatment for a subdirectory that contains nothing but untracked files (or that is completely empty), though: instead of enumerating every (untracked) file in that directory, Git just considers this an "untracked directory".[4] The git clean command normally leaves these alone:

$ mkdir tt
$ cd tt
$ git init
Initialized empty Git repository in ...
$ echo for testing git clean > README
$ git add README
$ touch untr
$ mkdir sub
$ touch sub/subfile
$ git status --short
A  README
?? sub/
?? untr

The double question mark output from git status --short indicates an untracked file or untracked directory. Since sub is a directory with an untracked file in it, it shows up as an untracked directory. Running git clean -f (or git clean -n) shows that Git removes (or would remove) untr, which is an untracked file that is not ignored; but Git does not remove either sub/subfile or sub itself:

$ git clean -f
Removing untr

Adding -d to the git clean options makes Git remove both sub/subfile and sub:

$ git clean -df
Removing sub/

(Removing the entire directory implies removing all of its contents first, as required by POSIX.)

Adding pathname arguments to git clean restricts its cleaning to the given pathnames, which is pretty straightforward.
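A short sketch of the pathname restriction, again using --dry-run in a scratch repository (the docs directory and file names are invented for the example). Here docs contains a tracked file, so it is not an "untracked directory" and -d is not needed:

```shell
set -e
repo=$(mktemp -d)                     # scratch repository (hypothetical location)
cd "$repo"
git init -q
mkdir docs
echo "guide" > docs/guide.txt
git add docs/guide.txt                # docs/ now holds a tracked file

touch scratch.tmp docs/scratch.tmp    # two untracked files

everything=$(git clean -n)            # considers the whole work-tree
only_docs=$(git clean -n -- docs)     # considers only paths under docs

printf 'no pathname:\n%s\nwith docs:\n%s\n' "$everything" "$only_docs"
```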

Note that there is yet another special corner case for a directory that is not ignored but contains another Git repository (whether as a regular repository, or as a submodule of the current repository): git clean -df or git clean -dfx will not remove this sub-repository, but git clean -dff or git clean -dffx will.
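The nested-repository case can be checked in a scratch setup like the one below (the nested directory name is arbitrary). The sketch looks only at what is left on disk, since the messages git clean prints for skipped repositories may vary between Git versions:

```shell
set -e
repo=$(mktemp -d)                     # scratch repository (hypothetical location)
cd "$repo"
git init -q
git init -q nested                    # an untracked directory holding its own repository
touch nested/file

git clean -d -f -q                    # single -f: nested repository is left alone
if [ -d nested/.git ]; then kept_after_one_f=yes; else kept_after_one_f=no; fi

git clean -d -f -f -q                 # double -f: nested repository is removed
if [ -e nested ]; then gone_after_two_f=no; else gone_after_two_f=yes; fi

echo "kept after -df:  $kept_after_one_f"
echo "gone after -dff: $gone_after_two_f"
```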


[3] What this really means is that you cannot add a directory to the index. If you try hard enough, using the plumbing commands, you can trick Git into storing an entry with the right mode and name, but under a number of conditions, Git changes this entry's mode from "directory" to "gitlink", after which things go quite badly awry. (Gitlink entries exist to store submodule information, and are normally found in a Git index.)

[4] Git secretly does store stat information about (at least some) directories in the index, as a performance hack. The gist of this is that if Git has found that some directory such as sub contains nothing but untracked files or (recursively) directories with nothing but untracked files, Git can classify the thing as "to skip in future work-tree scans". This, plus Git's special ignore rules that prevent explicit "unignoring" of files within ignored directories, allows Git to avoid scanning the directory for additional untracked-but-not-ignored files if the directory itself is not modified since the last scan. (This same idea, namely that if the directory itself is unmodified, no new files can possibly have been added to it, is even applicable to directories containing tracked files, although I don't remember offhand if Git uses the fact.)

Upvotes: 7
