john doe

Reputation: 9690

Rollback commit of a single file git reset

I am using Git in a terminal. I have a few HTML files in a folder, and I created a repository using git init.

I added index.html to the staging area using:

git add index.html

Now I want to remove index.html from the staging area. I tried git reset --HEAD index.html and similar commands, but nothing seems to take the file out of the staging area.

I want to simply remove the index.html file from the staging area.

Here is the output after running git reset HEAD. As you can see index.html is missing.

On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)

    landscape.css
    max-640px.css
    min-640px.css
    portrait.css
    result.html
    styles.css

Here is git log output:

$ git log
commit d70e697ed08fbe6709ba3f13d202bade793e7b1a
Author: johnson 
Date:   Tue May 30 19:36:12 2017 -0500

Upvotes: 0

Views: 829

Answers (1)

torek

Reputation: 489748

Before I start here, let's mention that the very first commit is special. The reason is that until you make the first commit, there is no current commit. For Git's own internal purposes, Git sometimes pretends that the current commit does exist anyway, and that it's truly empty: that it has no files at all. (In fact, there's a special semi-secret Git thing in every repository, for just such purposes.)
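The "semi-secret thing" is presumably the empty tree, an object whose ID Git can always compute. The sketch below is only an illustration, run in a throwaway repository (the mktemp directory and the index.html contents are assumptions for the demo). It shows the empty tree's ID and how a staged file is reported as added even though no commit exists yet.

```shell
set -e
repo=$(mktemp -d)          # throwaway directory, just for this demo
cd "$repo"
git init -q

# Hashing zero bytes as a tree object gives the ID of the empty tree.
# In a SHA-1 repository this is 4b825dc642cb6eb9a060e54bf8d69288fbee4904.
empty_tree=$(git hash-object -t tree /dev/null)
echo "$empty_tree"

# There is no commit yet, but Git compares the staging area against
# "nothing", so the staged file is listed as added (A).
echo hello > index.html
git add index.html
staged=$(git diff --cached --name-status)
echo "$staged"
```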

The staging area[1] sits between the current commit (HEAD) and the work-tree:

current commit:        staging area:         work-tree:
  file1                  file1                 file1
  file2                  file2                 file2
   ...                    ...                   ...
  fileN                  fileN                 fileN

The staging area (always) holds all the files that will go into the next commit. If you have a current commit—which you do/did; you show it above—the staging area starts out full of all the files that are in that commit. (So does the work tree: as you can see, there are three copies of every file! Usually two of them—the current commit and staging area copies—are secretly compressed and/or secretly shared though. Git can do this because those two copies are under Git's control. It's only the work-tree copy that has the normal form of any file on your computer; Git can't keep these in fancy, compressed, de-duplicated form.)

In other words, as long as you have a current (HEAD or @) commit, the staging area is pretty much guaranteed to be loaded with lots of files. It's just that the files stored there are the same as the ones stored in the current commit! So git status, which compares HEAD-vs-stage, says nothing about them.

(Note: git status then also compares the staging area to the work-tree. We'll see an example below, where git status --short prints two question marks to indicate an untracked file. That's because the file will be in the work-tree, but not in either the current commit or the staging area.)
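The two comparisons that git status makes can be run separately with git diff. This is a small sketch in a scratch repository; the file names tracked.txt and untracked.txt and the identity settings are made up for the illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo one > tracked.txt
git add tracked.txt
git commit -q -m "initial"

echo "changed content" > tracked.txt   # modify the work-tree copy only
echo new > untracked.txt               # exists only in the work-tree

head_vs_stage=$(git diff --cached --name-only)   # HEAD vs staging area
stage_vs_tree=$(git diff --name-only)            # staging area vs work-tree
short_status=$(git status --short)

echo "HEAD vs stage:      [$head_vs_stage]"
echo "stage vs work-tree: [$stage_vs_tree]"
echo "$short_status"
```

The first comparison is empty because the staged copy still matches the commit. The second lists tracked.txt, and the short status marks untracked.txt with two question marks.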

When you run git reset --mixed commit path (and --mixed is the default),[2] this tells Git: please make the staged copy of path match the commit copy. Since this was git reset HEAD index.html, this means: if the file isn't in the HEAD commit, remove it from the staging area; if it is in the HEAD commit, copy it from the HEAD commit, back to the staging area.
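Both cases can be seen side by side in a scratch repository. This is a sketch under assumed names (committed.txt is in HEAD, newfile.txt is not); it is not taken from your repository.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo v1 > committed.txt
git add committed.txt
git commit -q -m "initial"

echo "v2, a longer line" > committed.txt   # change a file that is in HEAD
echo fresh > newfile.txt                   # a file that is not in HEAD
git add committed.txt newfile.txt
before=$(git diff --cached --name-only)    # both are staged changes

# Path-limited reset: make the staged copies match HEAD again.
git reset -q HEAD -- committed.txt newfile.txt
after=$(git diff --cached --name-only)     # nothing staged any more
in_stage=$(git ls-files)                   # what the staging area holds now

echo "staged before: $before"
echo "staged after:  [$after]"
echo "in staging area: $in_stage"
```

After the reset, committed.txt is still in the staging area (as the HEAD version), newfile.txt has been dropped from it, and both work-tree files are left untouched.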

Since index.html is in the HEAD commit, you're stuck with it being in that commit forever. But you're not stuck with it in the staging area: you can, instead, use git rm index.html to remove it. Of course, git rm index.html removes it from your work-tree as well! Fortunately there's an option to tell git rm: remove this from the staging area, but don't touch the work-tree. This is spelled --cached:

git rm --cached index.html

Now the staging area has no copy of index.html:

current commit:        staging area:         work-tree:
  foo.css                foo.css               foo.css
  index.html              ---                  index.html
  README                 README                README

This isn't what you did (using git revert makes a new commit that undoes a previous commit), but there's still an important point here that would apply if you had done this. The git status command compares the HEAD commit to the staging area. The file index.html is (or would have been, before the revert) in the HEAD commit. After git rm --cached, it is not in the staging area. So git status reports the file as deleted (a staged deletion), while the untouched work-tree copy shows up as untracked.
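Here is a minimal sketch of that situation in a scratch repository (the file contents and identity settings are placeholders): index.html is committed, then removed from the staging area only.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo '<html></html>' > index.html
git add index.html
git commit -q -m "initial"

# Remove the staged copy only; the work-tree file is left alone.
git rm -q --cached index.html
status=$(git status --short)
echo "$status"
```

The short status lists index.html twice: once with D in the first column (deleted in the staging area relative to HEAD) and once with two question marks (present in the work-tree but untracked).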


[1] Just because it's Git, the staging area has two more names: it's also called the cache and, most commonly, the index. Of course the name index is confusingly similar to index.html. :-) So for the rest of this answer I am trying to stick with the phrase staging area.

[2] The reason this is --mixed is that there's a lower degree of git reset, --soft, that does not touch the staging area. There's also a higher degree, --hard, that writes on both staging area and work-tree.


When you reverted, you made a new commit

If you use git revert to "undo" a commit, that just makes a new commit that does the opposite of the reverted commit. The commit you reverted was the first commit ever, and the first commit, by definition, consists of adding all the files that are in that commit. So reverting the first commit means: remove all the files.

Git's commits don't actually store changes, but rather snapshots. Now that you have two commits, the HEAD (or current) commit is the second commit, which—compared to the first commit—removes all the files:

$ mkdir tt
$ cd tt
$ git init
Initialized empty Git repository in ...
$ echo test revert of first commit > README
$ echo la la la > file
$ git add README file
$ git commit -m initial
[master (root-commit) 2cff9aa] initial
 2 files changed, 2 insertions(+)
 create mode 100644 README
 create mode 100644 file

So far, so good: we have an initial commit, which (compared to the nothingness that precedes the initial commit) adds all the files.

$ git revert HEAD
[editor session, snipped]
[master 92f4eea] Revert "initial"
 2 files changed, 2 deletions(-)
 delete mode 100644 README
 delete mode 100644 file

The revert undoes the effect of the initial commit, so it removes all the files. Git compares "previous commit" to "current commit" and sees that README and file are deleted.

$ ls
$ 

The work-tree is now empty. All the files are gone! This is because we've reverted them.

Don't worry though! All the files are still there, too! They are in that first commit. We just have to get them back. Let's do that now:

$ git log --all --decorate --oneline --graph
* 92f4eea (HEAD -> master) Revert "initial"
* 2cff9aa initial

(Only the --oneline option does much here, but the rest are often a good idea. I actually have this as a Git alias. Remember this as: when using git log, get help from A DOG: All Decorate Oneline Graph.)

We need all the files back from the first commit (2cff9aa), so:

$ git checkout 2cff9aa -- .
$ ls
README  file

There's one fly in the ointment: both files are now in the staging area, and will hence go into the next commit:

$ git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

    new file:   README
    new file:   file

(This happens because git checkout copies the files from the specified commit, to the staging area, before finally copying them from the staging area to the work-tree.)
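One way to watch this happen is to list the staging area with git ls-files after a path-limited checkout. The sketch below uses a scratch repository and, to avoid an editor session, removes the file with git rm instead of reverting; that substitution is mine, not part of the transcript above.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo readme > README
echo la la la > file
git add README file
git commit -q -m "initial"
first=$(git rev-parse HEAD)

git rm -q file                      # drop it from stage and work-tree
git commit -q -m "remove file"

# Copy the file out of the first commit: commit -> stage -> work-tree.
git checkout -q "$first" -- file
in_stage=$(git ls-files)                      # contents of the staging area
pending=$(git diff --cached --name-status)    # HEAD vs staging area

echo "$in_stage"
echo "$pending"
```

The file is back in the work-tree, and because it also went into the staging area, it is listed as a pending addition relative to HEAD.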

We want to get rid of one of them (file, in this case). Now we can use either git rm --cached or git reset HEAD file. The current commit, named HEAD or @ or 92f4eea, does not have the file file in it. In fact, the current commit has no files in it! It's a truly empty commit: checking it out makes your work-tree empty.

Let's reset (or rm --cached) now:

$ git reset file
$ git status --short
A  README
?? file
$ ls
README  file

The A status means "added", and the two question marks mean the file file is untracked (we'd see this spelled out explicitly in the longer default git status output). So now we can commit:

$ git commit -m 'this should have been the first commit'
[master 0945e1d] this should have been the first commit
 1 file changed, 1 insertion(+)
 create mode 100644 README
$ git log --oneline
0945e1d (HEAD -> master) this should have been the first commit
92f4eea Revert "initial"
2cff9aa initial

There's a bit of a problem here now, perhaps best shown by illustration. Note that we have our file file safely in the work-tree right now. Let's check out the initial commit again, to see what was in it:

$ git checkout 2cff9aa
error: The following untracked working tree files would be overwritten by checkout:
    file
Please move or remove them before you switch branches.
Aborting

Suppose we do manage to get that commit checked out, e.g., using --force:

$ git checkout --force 2cff9aa
Note: checking out '2cff9aa'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at 2cff9aa... initial
$ ls
README  file

Now let's go back to the tip commit of master, getting back on branch master and out of this "detached HEAD" mode.

$ git checkout master
Previous HEAD position was 2cff9aa... initial
Switched to branch 'master'
$ ls
README

Wait, where did our file named file go?

This is the trap we have created: there is a commit, 2cff9aa, that has the file committed in it. If we ever, somehow, manage to make that the current commit, we'll have that version of file in the staging area and work-tree too. Then, when we move off that commit—say, back to master—Git will remove the file, both from the staging area and from the work-tree!

There is no way around this. If the file is in a commit, it will go into the staging area (and work-tree) when you check it out. It will then get removed (from those two same places) when you move off that commit to a commit that doesn't have the file.

Git will, as we saw above, warn you when you're about to move to that commit. This gives you a chance to save your data. You need to know that you must do that (save your data), or avoid checking out that particular commit.

The only other option is to rewrite your repository to get rid of that commit entirely. In most cases—most repositories, with a lot of history in them, and perhaps many clones and so on—this isn't worth it. In a new repository with only one commit, it might be worth it!
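For the one-commit case, one possible way to do the rewrite is to unstage the file and amend the commit. This is a sketch in a scratch repository with assumed file names, not a recipe to run blindly on shared history: amending replaces the commit with a new one, so only do it if nobody else has the old one.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo readme > README
echo la la la > file
git add README file
git commit -q -m "initial"              # oops: 'file' should not be in here

# Take 'file' out of the staging area, then replace the commit.
git rm -q --cached file
git commit -q --amend -m "initial, without file"

commits=$(git rev-list --count HEAD)         # still a single commit
in_commit=$(git ls-tree -r --name-only HEAD) # files in the rewritten commit
status=$(git status --short)

echo "commits: $commits"
echo "in commit: $in_commit"
echo "$status"
```

The history still has one commit, it now contains only README, and file stays in the work-tree as an untracked file. The original commit is no longer reachable from the branch, though Git keeps it around for a while (via the reflog) before it is eventually pruned.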

Upvotes: 1
