Parsa

Reputation: 3236

Git: went to old commit, now I can't come back

I used the following to move to an old commit which has some code I have deleted:

git checkout xxx .

But now I want to come back to where I was with my work. I had not stashed or committed my latest code. I tried the following:

git checkout xxx/xxxx .

xxx/xxxx is the name of the branch I have been working on.

But my files have not changed, they still contain all the old code and none of my new code!

Any help would be very much appreciated.

Upvotes: 0

Views: 386

Answers (2)

torek

Reputation: 489748

I'm afraid par's answer is probably the right one here. If, however, you git added the files—even without git commit-ing the result—you may be in luck. If this is all "TL;DR", you can skip to the last section below.

For clarity, let's start by noting that there are essentially three or four forms of git checkout:[1]

  • git checkout branchname
  • git checkout commit-ish
  • git checkout commit-ish -- path [ path ... ]
  • git checkout -- path [ path ... ]

The first form, where you give git checkout a branch-name and no path arguments, tries to put you on the branch. For instance, git checkout master will put you on branch master (or fail and do nothing at all).

The second form, where you give git checkout something that resolves to a specific commit ID but—because of the first form—is not an ordinary local branch name, checks out the specified commit and gives you a "detached HEAD". (Or, again, it can fail and do nothing at all.) The way this works internally is that you leave whatever branch you were on before, and are now on a new, temporary, anonymous (unnamed) branch that ends at the commit you just checked out.

This second form is the usual way to look at old commits, or rebuild them, or to check out a tag and build that, or similar. For instance you might git checkout c0ffee1 (a hash ID) or git checkout v2.7.2 (a tag). But it's not what you did.
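As a minimal sketch of the first two forms, the following builds a throwaway repository and compares them. The branch name main, the tag v0.1, and the file name are placeholders chosen for the demo, not anything from the question.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the initial branch explicitly
git config user.name "Demo"; git config user.email "demo@example.com"

echo one > file.txt; git add file.txt; git commit -q -m "first"
echo two > file.txt; git commit -q -am "second"
git tag v0.1 HEAD~1                        # hypothetical tag on the older commit

# Form 1: a branch name and no paths keeps HEAD attached to that branch.
git checkout -q main
on_branch=$(git symbolic-ref --short HEAD)

# Form 2: a tag (or hash) and no paths detaches HEAD at that commit.
git checkout -q v0.1
if git symbolic-ref -q HEAD >/dev/null; then detached=no; else detached=yes; fi
content_at_tag=$(cat file.txt)

# With nothing uncommitted, getting back is just another form-1 checkout.
git checkout -q main
echo "on_branch=$on_branch detached=$detached content_at_tag=$content_at_tag"
```

git symbolic-ref -q HEAD exits non-zero when HEAD is detached, which is why it serves as the check here.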

The third and fourth forms are very different in their actions. If I were in charge of the Git world they would not be spelled git checkout at all, because they do not take you to another commit. They have no effect on HEAD. If you were on a branch, you stay on that branch. If you were in detached HEAD mode with a specific commit, you stay in detached HEAD mode with that same specific commit. But now we get into the first complication, because now we must talk about the index and work-tree.

The first two forms of git checkout are relatively safe, because they make sure you do not lose files from the index and work-tree. You can request that they do overwrite files using the -f / --force flag. The third and fourth forms of git checkout are not safe, though. Git considers each path argument a request to overwrite the given files or directories.
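The difference in safety can be seen in a scratch repository. This is only a sketch: the file name and contents are made up, and $old stands in for the xxx in the question.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"; git config user.email "demo@example.com"

echo old > file.txt; git add file.txt; git commit -q -m "old"
old=$(git rev-parse HEAD)
echo new > file.txt; git commit -q -am "new"

echo "uncommitted work" > file.txt

# Form 2 (no paths): Git refuses, because switching would clobber the edit.
if git checkout -q "$old" 2>/dev/null; then safe_form=switched; else safe_form=refused; fi
after_safe=$(cat file.txt)

# Form 3 (with a path): Git overwrites the file without asking.
git checkout "$old" -- file.txt
after_path=$(cat file.txt)
branch=$(git symbolic-ref --short HEAD)    # HEAD has not moved

echo "safe_form=$safe_form after_safe='$after_safe' after_path='$after_path' branch=$branch"
```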

The commit-ish argument is anything that Git can use to find a commit. This can be a raw hash ID, like c0ffee1, or a tag, like v2.7.2, or even a branch name like master. Note that we split off branch names as a special (first) form of git checkout, but that applies only when there are no path arguments. When you do give some path arguments, a branch name like master stops being special.

Incidentally, the -- before path is optional. The reason it exists is to allow you to use a path that resembles an option. For instance, if you have a file named -f or --force, you need to be able to tell Git to refer to the file -f or --force, rather than using the -f or --force option. If you have a file named master, you need to be able to tell Git that you are referring to the file named master, not the branch named master. If you leave out the --, Git makes its best guess about whether you mean a branch name, a commit ID, an option, or a file name. In your case, you used ., which is definitely a file name, so git checkout xxx . became the third form. And of course, . means "the entire current directory including all its files and subdirectories".
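To illustrate what the -- buys you, here is a small sketch in which a file deliberately shares its name with the branch. Both names (main) are invented for the demo.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"; git config user.email "demo@example.com"

# A file that happens to be called "main", just like the branch.
echo v1 > main; git add main; git commit -q -m "add a file named main"
echo edited > main

# Without "--", a lone argument that names a branch is taken as the branch.
# We are already on it, so the edit is left alone.
git checkout -q main
kept=$(cat main)

# With "--", everything after it is a path: the file is restored from the index.
git checkout -- main
restored=$(cat main)

echo "kept=$kept restored=$restored"
```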


[1] The exact number of forms depends on how you decide to count these. You can, for instance, collapse the first two into one form and the last two into one other form, or keep the first two separate and combine the third and fourth. There is also git checkout -m, git checkout -p, git checkout --ours, and git checkout --theirs, all of which are a little bit different from these main four. But don't worry about them until you start using them.


Commits, the index, and the work-tree

Commits are Git's reason for existence. They record everything you ever did—or at least, committed—for all time, so that you can get these back. Each commit holds a complete snapshot of some set of files, arranged in a tree (directory full of files and sub-directories and so on). It also holds the ID of a previous (parent) commit, and some metadata like your name and email address and a time-stamp, and a commit log message. And that's pretty much it: a commit is a saved tree plus some metadata.

The work-tree is pretty obvious as well: it's where Git lets you work. Commits inside Git are in a format that only Git itself can use, but you need to use the computer to do real work, with ordinary files. So Git can fill the work-tree from a commit—though actually it has to use its index, as we'll see in a moment—and then you have regular files that regular programs, web servers, or whatever can all use. For many reasons, the work-tree can also hold files that you won't commit, and that you never intend to commit. You will keep these files around as "untracked" files (and Git will get whiny about them, and you'll want to shut it up).

Git's index, also called the staging area (as in git diff --staged) or sometimes the cache (as in git diff --cached: exactly the same as --staged), has several roles, but the interesting one here is that it is where you build the next commit you will make.

When you start out with a clone of some existing repository and check out some branch name, Git fills the index from the snapshot that goes with the latest commit on that branch. (This latest commit is called the tip commit.) So now the index matches the commit. Thus, if you were to make a new commit right now,[2] your new commit would have the same tree as your current commit, because you haven't changed the index.

When you run git add to update an existing file, Git simply replaces the index copy with the version from the work-tree. Now the next commit you make will have the new version. When you run git add to add a totally new file, Git copies that file from the work-tree into the index, and now the next commit will have the new file. If you use git rm on a file name, that removes the file from both the work-tree and the index, and now the next commit won't have the file.
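Here is a short sketch of those three operations, using git show :path to print the index copy of a file and git ls-files to list what the next commit would contain. The file names are placeholders.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo"; git config user.email "demo@example.com"

echo v1 > a.txt; echo keep > c.txt
git add a.txt c.txt; git commit -q -m "first"

echo v2 > a.txt
index_before=$(git show :a.txt)    # index still holds the committed copy
git add a.txt
index_after=$(git show :a.txt)     # index now holds the work-tree copy

echo hello > b.txt; git add b.txt  # a brand-new file enters the index
git rm -q c.txt                    # removed from both index and work-tree

next_commit_files=$(git ls-files)
echo "index_before=$index_before index_after=$index_after"
echo "$next_commit_files"
```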

(Incidentally, this allows us to state precisely what it means for a work-tree file to be "untracked". A file is untracked if and only if it is not in the index. That's it—that's all there is to it! Now, when Git gets whiny about an untracked file, you can add the file's name to a file named .gitignore, and that will make Git shut up about it. It won't actually make the file untracked: that's determined by the file not being in the index. It mainly just makes Git shut up about it, and also not add it automatically when you use one of Git's "add many files at once" short-cuts.)
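A quick way to check this for yourself, sketched with made-up file names: git ls-files lists index entries, so an untracked file never shows up there, whether or not it is ignored.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo"; git config user.email "demo@example.com"

echo tracked > a.txt; git add a.txt; git commit -q -m "add a.txt"
echo scratch > notes.tmp           # never added, so it has no index entry

tracked_entry=$(git ls-files -- a.txt)
untracked_entry=$(git ls-files -- notes.tmp)
status_before=$(git status --porcelain -- notes.tmp)

# Ignoring the file silences the complaint; it does not change its status
# as an untracked file, because it is still absent from the index.
echo 'notes.tmp' > .gitignore
status_after=$(git status --porcelain -- notes.tmp)
still_untracked=$(git ls-files -- notes.tmp)

echo "before='$status_before' after='$status_after'"
```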


[2] Git will try to keep you from making a new commit that exactly matches the HEAD commit. You can, however, force it to allow the commit, using --allow-empty. "Empty" is a funny way to spell it, because the new commit isn't empty at all, it's just identical, at least in terms of saved work-tree. (New merge commits also are always allowed even if they match HEAD.)


When does git checkout overwrite the index and/or work-tree?

If we go back to what I called the third and fourth forms of git checkout, we'll see that one of them has a commit-ish argument and the other doesn't. This gets into something that should, maybe, be more of a hidden implementation detail—but Git has a habit of letting implementation details show right through to the user.

For git checkout to copy a file from a commit to the work-tree, it must first write the file into the index. Hence the third form, git checkout commit-ish -- path, finds the version of the file path associated with the given commit-ish, copies that to the index, and then copies the index version to the work-tree.

The fourth form, however, has no commit-ish argument: git checkout -- path. In this case, Git copies the version of the file from the index into the work-tree. Most of the time, the version in the index is the same as the version in the work-tree, so most of the time, this doesn't do anything. If, however, you've modified the work-tree version, and then decide you want to discard your modifications, you can extract the index version.

The index version may be the same as the current commit (HEAD) version. In that case, git checkout -- path and git checkout HEAD -- path both copy the HEAD version to the work-tree—but the one with an explicit HEAD copies the HEAD version to the index first, and the result is only the same because the HEAD and index versions were the same anyway.
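The two commands only differ visibly when the index and HEAD disagree. A sketch of that case, with an invented file f.txt holding a different value at each of the three levels:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo"; git config user.email "demo@example.com"

echo committed > f.txt; git add f.txt; git commit -q -m "c"
echo staged > f.txt; git add f.txt   # index now differs from HEAD
echo edited > f.txt                  # work-tree differs from both

# Fourth form: index -> work-tree. The staged copy comes back.
git checkout -- f.txt
from_index=$(cat f.txt)

# Third form with HEAD: HEAD -> index -> work-tree. The staged copy is replaced too.
git checkout HEAD -- f.txt
from_head=$(cat f.txt)
index_now=$(git show :f.txt)

echo "from_index=$from_index from_head=$from_head index_now=$index_now"
```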

For completeness, I'll mention that the first two git checkout forms—the "safe" ones—will also overwrite the index and work-tree, but without going into a lot of detail here, Git tries hard to overwrite only those entries that are "safe" to overwrite, so that you won't lose uncommitted work. See Git - checkout another branch when there are uncommitted changes on the current branch for (much) more detail.

Summary

There are, at all times, up to three interesting versions of each file:

  • the one in the current (HEAD) commit;
  • the one in the index, which will go into the next commit; and
  • the one in the work-tree.

Using git checkout commit-ish -- path, as in git checkout xxx ., leaves the HEAD version unchanged but copies the xxx (committed) version to the index and work-tree. If there were other versions in the index and work-tree, they are now gone. If those versions matched some committed version, you can get them back. If not, Git can't help you ... probably. But see the last section!
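The three versions can be inspected directly: git show HEAD:path prints the committed copy, git show :path prints the index copy, and cat shows the work-tree copy. The sketch below replays the situation from the question in a scratch repository, where $old and main stand in for xxx and xxx/xxxx.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"; git config user.email "demo@example.com"

echo v1 > f.txt; git add f.txt; git commit -q -m "old"
old=$(git rev-parse HEAD)
echo v2 > f.txt; git commit -q -am "tip"

git checkout "$old" -- .           # the "git checkout xxx ." step
head_ver=$(git show HEAD:f.txt)    # HEAD is untouched
index_ver=$(git show :f.txt)       # index now holds the old copy
work_ver=$(cat f.txt)              # and so does the work-tree

# Anything that was committed comes back the same way.
git checkout main -- .
restored=$(cat f.txt)
leftover=$(git status --porcelain)

echo "head=$head_ver index=$index_ver work=$work_ver restored=$restored"
```

Only versions that exist in some commit can be restored like this, which is why the uncommitted edits in the question did not return.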

The special secret recovery method

There's one unusual exception to the "gone forever" rule, though it's painful to use. When you git add a file to the index, Git actually puts a copy of the file in the repository. The index contains only the 40-character SHA-1 hash for the file. This means that if you git add-ed a file, it's saved away in the repository itself. If you overwrite the index version with yet another version, that actually copies the "yet another" version into the repository, and puts the new hash into the index. The intermediate version doesn't get removed! Well, not yet.

These hashed, but never committed, files can be recovered, up until the point where Git "garbage-collects" them. By default, that gives you at least 14 days from the time you git add-ed them. The command that will recover them is git fsck --lost-found.

The problem with this recovery method is that the files' names are gone. What git fsck --lost-found does is find Git objects (commits, trees, tags, and "blobs", which is what Git calls the stored files) that have no references to them. When you git add-ed the file, Git stored the contents of the file as a new blob, and then wrote the hash ID of the blob object into the index, using the index to hold the file's name. When you then overwrote the index entry for the path, you lost the name, and the repository blob object became unreferenced. The --lost-found option makes git fsck copy the original file's contents into .git/lost-found/other/, storing it under the hash ID, since the name is gone. You can then look through every such file to find the one(s) you want, and move them out of the lost-and-found area to get them back.
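A sketch of the whole recovery, run in a scratch repository so nothing real is at risk. The file contents and the grep pattern are invented for the demo. In a real repository you would search .git/lost-found/other/ for some text you remember from the lost file.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo"; git config user.email "demo@example.com"

echo v1 > f.txt; git add f.txt; git commit -q -m "old"
old=$(git rev-parse HEAD)
echo v2 > f.txt; git commit -q -am "tip"

echo "precious uncommitted work" > f.txt
git add f.txt                      # stores a blob, even though nothing is committed
git checkout "$old" -- .           # index entry overwritten; the blob is now unreferenced

# Write every dangling blob's contents into .git/lost-found/other/<hash>.
git fsck --lost-found >/dev/null 2>&1

# The name is gone, so search by content instead.
recovered=$(grep -rl "precious uncommitted work" .git/lost-found/other/)
cp "$recovered" f.txt.recovered
cat f.txt.recovered
```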

Upvotes: 2

Parsa

Reputation: 3236

If you check out over your work without having previously stashed or committed your code, that code will be lost and irretrievable.

Upvotes: 0
