Reputation: 935

What does `git commit file.txt` do if file.txt is not staged?

I have a folder in which I have added all but one file (file.txt) to the staging area. At this point, what happens when I run the command git commit file.txt?

Does git automatically add it to the staging area before committing it? What exactly gets committed? Just the file or all the files?

Upvotes: 0

Answers (2)

torek

Reputation: 489313

Git's documentation here is not very good. What's missing is an explanation of precisely what Git does with git commit in the first place, which one needs before git commit --only file.txt and git commit --include file.txt can make sense.

You probably already know that Git commits are numbered, with random-looking hash IDs rather than simple sequential numbers. Each commit's hash ID is actually a cryptographic checksum of the full contents of that commit, which is the key reason why nothing in a commit can ever actually be changed, not even by Git itself.
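As a rough check of the checksum claim, the sketch below re-hashes the raw commit object and compares the result with the commit's ID. It assumes a throwaway repository created with `mktemp`; the file name and identity are placeholders, not anything from the question.

```shell
# Throwaway repository; the identity and file name are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo hello > file.txt
git add file.txt
git commit -q -m "first"

commit_id=$(git rev-parse HEAD)
# Feed the raw commit contents back through hash-object, which prepends
# the same "commit <size>" header Git uses when it computes object IDs.
rehashed=$(git cat-file commit HEAD | git hash-object -t commit --stdin)
echo "$commit_id"
echo "$rehashed"
```

Both lines should print the same ID, which is why changing any byte of a commit would necessarily produce a different commit.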

You may not know—but should know—that each commit saves a complete snapshot of every file that Git knows about. That is, when you run git commit, Git makes one of these snapshots. That snapshot becomes the main data of the commit. The commit also contains metadata, including your name and email address, and the commit number—the hash ID—of the previous commit.
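To see the snapshot and metadata for yourself, `git cat-file -p` prints a commit's contents. This is a minimal sketch in a scratch repository; the names and messages are made up for illustration.

```shell
# Throwaway repository; the identity and file name are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first"
first=$(git rev-parse HEAD)

echo two >> file.txt
git add file.txt
git commit -q -m "second"

# Prints a "tree" line (the snapshot), a "parent" line (the previous
# commit's hash ID), author and committer lines, then the log message.
git cat-file -p HEAD
parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')
```

The `parent` line of the second commit should hold the hash ID of the first one.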

The interesting part for our discussion here is the phrase files that Git knows about. What files are those? We will get to this in just a moment, but think about the implication of the two facts above:

1. no commit can change;
2. a commit holds a snapshot.

This means that the files inside a commit are read-only. They're stored in a special Git-only format, compressed and de-duplicated. The de-duplication takes care of the fact that many commits just re-use the same files over and over. The Git-only format is frozen forever, so it's safe to re-use the files, which literally can't be changed. But that also means that these files can't be used to do new work.
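One way to observe the de-duplication, assuming a scratch repository with two hypothetical files that have identical contents: both index entries end up pointing at the same blob ID.

```shell
# Throwaway repository; a.txt and b.txt are placeholder names.
repo=$(mktemp -d) && cd "$repo"
git init -q
echo "same content" > a.txt
cp a.txt b.txt
git add a.txt b.txt

# Each line shows mode, blob hash ID, stage number, and path.
git ls-files --stage
blob_a=$(git rev-parse :a.txt)   # blob ID recorded in the index for a.txt
blob_b=$(git rev-parse :b.txt)   # blob ID recorded in the index for b.txt
```

Since the IDs match, Git stores the content once, however many files or commits refer to it.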

To actually do work, you need a regular, ordinary, read/write copy of each file. Git will extract the committed files from a commit, and put these extracted files into these regular read/write copies. These ordinary-file copies live in what Git calls your working tree or work-tree.

You might think these would be the files Git knows about, but that's not actually the case. Other systems do something like this, but not Git. Instead, Git keeps a third copy of each file, in what Git calls, variously, the index, the staging area, or—rarely these days—the cache. When you check out some commit, Git populates both its index—which holds files in the frozen, de-duplicated, and Git-only format—and your work-tree. Since it's already de-duplicated, all the "copies" in the index that match the files in the current commit use no space.¹

The git add command, which you normally need to use, tells Git: make the index copy of this file match the work-tree copy. This means that at this time, Git compresses the work-tree copy down into the Git-ified, ready-to-commit form, and if that's a duplicate, de-duplicates it. Either way, the file is now ready to be committed, and is in the index: if it was in the index before, now a different version of the file is in the index, and if it wasn't in the index before, now it is.
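A small sketch of that behavior, using a hypothetical file.txt in a scratch repository: editing the work-tree copy leaves the index copy alone until `git add` runs again.

```shell
# Throwaway repository; file.txt is a placeholder name.
repo=$(mktemp -d) && cd "$repo"
git init -q
echo v1 > file.txt
git add file.txt
staged_v1=$(git rev-parse :file.txt)     # blob ID of the index copy

echo v2 > file.txt                       # change only the work-tree copy
still_v1=$(git rev-parse :file.txt)      # index copy has not moved

git add file.txt                         # make the index copy match again
staged_v2=$(git rev-parse :file.txt)
worktree_v2=$(git hash-object file.txt)  # blob ID the work-tree copy hashes to
git ls-files --stage
```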

So, a typical git commit, with no extra options or arguments, just packages up whatever is in the index right then to use as the new snapshot. It also gathers up everything it will need for metadata, such as your name, the current time, and a log message. The commit command then packages all of these up to make a new commit.

The files that Git knows about are precisely those files that are in Git's index. In effect, the index acts as your proposed next commit. This is why Git calls it the staging area: the next commit will take a snapshot of whatever is in Git's index.
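To illustrate "the index is the proposed next commit", the sketch below writes the index out as a tree with `git write-tree` and then checks that a plain `git commit` uses exactly that tree. The repository and file names are placeholders.

```shell
# Throwaway repository; the identity and file names are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo one > a.txt
echo two > b.txt
git add a.txt                      # b.txt is deliberately left out of the index

index_tree=$(git write-tree)       # tree object built from the current index
git commit -q -m "only what is in the index"
commit_tree=$(git rev-parse 'HEAD^{tree}')
committed_files=$(git ls-tree -r --name-only HEAD)
echo "$committed_files"
```

Only a.txt is in the commit; b.txt exists in the work-tree but was never in the index, so Git does not "know about" it.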

¹Index "copies" of files do use a bit of space: some bytes to hold the file's name, the file's mode, its cache data, and the internal blob hash ID. The length is variable and depends on the file's name.

Extra options to `git commit`

The phrase the index implies that there's exactly one Git index. That's ... almost true. 😀 There is in fact one primary index, or more precisely, one primary index per work-tree (because you can add more work-trees using git worktree add). But when you run git commit, Git can create a temporary index, or even two of them.

The way Git creates this temporary index or two depends on the options you supply. The command:

git commit --only file.txt

or:

git commit file.txt

(which means the same thing) will:

1. create a temporary index holding the current commit, as if you just checked out that commit;²
2. add, to that temporary index, the work-tree copy of file.txt;
3. create a second temporary index holding the current index, and add to that temporary index the work-tree copy of file.txt;
4. use the first temporary index to create a new commit; and
5. if that works, replace the normal index ("the" index / staging-area) with the second temporary index.
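The steps above can be roughly emulated with plumbing commands and the `GIT_INDEX_FILE` environment variable. This is only a sketch of the idea, not what `git commit` literally runs internally: the temporary index path is a made-up name, and the second temporary index is collapsed into a plain `git add` on the main index at the end.

```shell
# Throwaway repository; the identity, file names, and tmp path are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo a > a.txt
echo f > file.txt
git add a.txt file.txt
git commit -q -m "base"

echo "a staged" >> a.txt
git add a.txt                      # staged change that should be held back
echo "f edited" >> file.txt        # unstaged change we want to commit

tmp_index="$repo/.git/only-index.tmp"             # hypothetical file name
GIT_INDEX_FILE="$tmp_index" git read-tree HEAD    # temp index = current commit
GIT_INDEX_FILE="$tmp_index" git add file.txt      # add work-tree file.txt to it
tree=$(GIT_INDEX_FILE="$tmp_index" git write-tree)
parent=$(git rev-parse HEAD)
new_commit=$(git commit-tree "$tree" -p "$parent" -m "file.txt only")
git update-ref HEAD "$new_commit"                 # move the branch to the new commit
git add file.txt                   # main index now also holds the new file.txt
rm -f "$tmp_index"

changed=$(git diff-tree --no-commit-id --name-only -r HEAD)
still_staged=$(git diff --cached --name-only)
echo "committed: $changed"
echo "still staged: $still_staged"
```

The new commit changes only file.txt, while the earlier `git add a.txt` is still sitting in the main index, waiting for a later commit.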

The end result is that the new commit contains the same files as the previous commit, except with file.txt replaced or added. If that works, Git proceeds as if you had run git add file.txt, because the second temporary index is equal to the result of running git add file.txt. If you tell Git not to make the commit after all—there are several ways to do this, including with a pre-commit hook—Git throws away both temporary index files and it seems as though you never had Git run git add file.txt at all.
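The abort case can be tried with a pre-commit hook that always fails. This sketch assumes a scratch repository and that no global `core.hooksPath` setting redirects hooks elsewhere; the hook body and file name are illustrative only.

```shell
# Throwaway repository; the identity and file name are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo f > file.txt
git add file.txt
git commit -q -m "base"

# A hook that rejects every commit.
mkdir -p .git/hooks
printf '#!/bin/sh\nexit 1\n' > .git/hooks/pre-commit
chmod +x .git/hooks/pre-commit

echo "edited" >> file.txt          # unstaged change
head_before=$(git rev-parse HEAD)
git commit -q -m "will be rejected" file.txt || echo "commit aborted by hook"
head_after=$(git rev-parse HEAD)

staged=$(git diff --cached --name-only)   # what the main index holds beyond HEAD
unstaged=$(git diff --name-only)          # what is only in the work-tree
```

After the rejected commit, the change to file.txt should still show up as unstaged, as though `git add file.txt` had never been run.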

When using git commit --include, Git only makes one temporary index, instead of two. The temporary index starts out as a copy of the main index, and then Git does the git add using the temporary index, and tries committing using the temporary index. If all goes well, the temporary index becomes the main index. If not, Git deletes the temporary index and the setup looks like Git never ran git add.
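For comparison, here is a minimal sketch of `--include` in a scratch repository with placeholder file names: the commit picks up both the already-staged a.txt and the named file.txt.

```shell
# Throwaway repository; the identity and file names are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo a > a.txt
echo f > file.txt
git add a.txt file.txt
git commit -q -m "base"

echo "a staged" >> a.txt
git add a.txt                      # staged
echo "f edited" >> file.txt        # not staged

git commit -q --include -m "staged a.txt plus file.txt" -- file.txt
changed=$(git diff-tree --no-commit-id --name-only -r HEAD)
echo "$changed"
leftover=$(git status --porcelain)
```

Both paths are listed as changed in the new commit, and nothing is left staged or modified afterward.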

Note that git commit -a is equivalent to running git commit --include with a list of every file that Git knows about. That is, Git makes this temporary index, and then does a git add -u with it and tries the commit.
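One visible consequence of the `git add -u` behavior is that `git commit -a` skips files that are not yet in the index. A short sketch, with made-up file names in a scratch repository:

```shell
# Throwaway repository; the identity and file names are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo t > tracked.txt
git add tracked.txt
git commit -q -m "base"

echo "edited" >> tracked.txt       # known to Git, modified, not staged
echo new > untracked.txt           # never added, so not in the index

git commit -q -a -m "all tracked changes"
changed=$(git diff-tree --no-commit-id --name-only -r HEAD)
untracked=$(git ls-files --others)
echo "committed: $changed"
echo "still untracked: $untracked"
```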

²If you don't have a current commit—as is the case in a new, empty repository—Git creates an empty temporary index.

That seems awfully complicated

Unfortunately, it is kind of complicated. But that's what Git really does, and we need to know all the bits and pieces here to explain why the results are what they are, including in the cases where you abort the commit after all.

If it helps to remember it all, though, just remember that git commit normally uses the index, but when using --include, --only, or -a, it makes one or two temporary index files, commits from those, and then, if all goes well, makes it look as though it never did any of that. Then consult the documentation to remember in more detail what goes in each of these temporary index files.

Upvotes: 0

erik258

Reputation: 16304

As https://git-scm.com/docs/git-commit explains, the commit records the current changes to file.txt (regardless of whether they're staged), and the state of every other file, staged or unstaged, is left unchanged.

Because any staged changes to the named file are included in the commit, the committed file(s) will no longer have staged changes afterward.

The relevant passage from the documentation reads:

After staging changes to many files, you can alter the order the changes are recorded in, by giving pathnames to git commit. When pathnames are given, the command makes a commit that only records the changes made to the named paths:

$ edit hello.c hello.h
$ git add hello.c hello.h
$ edit Makefile
$ git commit Makefile

This makes a commit that records the modification to Makefile. The changes staged for hello.c and hello.h are not included in the resulting commit. However, their changes are not lost — they are still staged and merely held back. After the above sequence, if you do:

$ git commit

this second commit would record the changes to hello.c and hello.h as expected.

The commit will contain the changes made to file.txt. In my test below, c1.txt had both staged changes and later unstaged changes, and both were committed as part of my commit. This makes sense because the commit records the file's current contents: once committed, there's no distinction between changes that had been staged and changes that had not.

$ git init ./
Initialized empty Git ...
$ echo a > a1.txt
$ echo b > b1.txt
$ echo c > c1.txt
$ git add --all
$ git status
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

    new file:   a1.txt
    new file:   b1.txt
    new file:   c1.txt

$ echo 'more c' >> c1.txt
$ git commit c1.txt

(editor comes up)

[master (root-commit) 2849446] c1.txt
 1 file changed, 2 insertions(+)
 create mode 100644 c1.txt
$ git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

    new file:   a1.txt
    new file:   b1.txt

Upvotes: 1
