AJJ

Reputation: 2074

Git tracked, untracked, staged, indexed meaning?

Can someone clarify the meaning of these terms? Are tracked files any files that have, at some point, been added to the stage? Is the "index" the same as the "stage"? Are all staged files tracked, but the reverse is not necessarily true (namely, files that were once staged and committed, but aren't part of the current stage to be committed)? How do I know which files are tracked? How do I know which files are staged?

Upvotes: 3

Views: 1747

Answers (2)

Toby

Reputation: 10144

It may be clearer to show this than to describe it.

Note that the information in the answer from @torek is correct.

Recall from there that index, stage and cache are all synonyms in git.

## No new files

$ git status
On branch master
nothing to commit, working tree clean  
# So git shows no files or changes

## New file that is not tracked
$ touch foo
$ git status
On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        foo

nothing added to commit but untracked files present (use "git add" to track)
# So git realises there is a new file, but it is not tracking it

## Tracked but not staged
$ git add --intent-to-add foo  # shorthand equivalent flag is -N
$ git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        new file:   foo

no changes added to commit (use "git add" and/or "git commit -a")
# Now git is tracking the file, but no changes are staged for commit yet

## Tracked and staged
$ git add foo
$ git status
On branch master
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        new file:   foo

# Now the file is still tracked and the change is staged

From this, hopefully you can see the difference between untracked, tracked, and staged.

Upvotes: 1

torek

Reputation: 488123

There are three things to consider here: the current commit (known variously as HEAD or @), the index, and the work-tree.

The index is also called the staging area and the cache. These names reflect its various functions, because the index does more than just hold the contents of the proposed next commit. Its use as a cache is mostly invisible, though: you just use Git, and the cache tricks that make Git go fast are all done under the hood with no manual intervention necessary. So you only need to remember "cached" because some commands use --cached, e.g., git diff --cached and git rm --cached. Some of these have additional names (git diff --staged), and some don't.

Git is not very consistent about where it uses each of these terms, so you must simply memorize them. One issue seems to be that for many users, "the index" is mysterious. This is probably because you can't see it directly, except using git ls-files (which is not a user-friendly command: it's meant for programming, not for daily use).

Note that the work-tree (also called the working tree and sometimes the work directory or working directory) is quite separate from the index. You can see, and modify, files in the work-tree quite easily.

I once thought "tracked" was more complicated, but it turns out that "tracked" quite literally means "is in the index". A file is tracked if and only if git ls-files shows that it will be in the next commit.

You cannot see files in the index so easily—but you can copy from the work-tree, into the index, easily, using git add:

git add path/to/file.txt

copies the file from the work-tree into the index. If it was not already in the index (was not tracked), it is now in the index (is tracked).
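As a rough sketch of that transition, the script below checks whether a path is in the index before and after git add. The throwaway repository and the name file.txt are placeholders invented for the demo; the check relies on git ls-files --error-unmatch, which exits non-zero when the path is not in the index.

```shell
# Demo only: a throwaway repository and a made-up file name.
repo=$(mktemp -d)
cd "$repo" && git init -q .

echo "hello" > file.txt

# --error-unmatch makes ls-files fail if the path is not in the index.
if git ls-files --error-unmatch file.txt >/dev/null 2>&1; then
    before=tracked
else
    before=untracked
fi

git add file.txt    # copy the work-tree file into the index

if git ls-files --error-unmatch file.txt >/dev/null 2>&1; then
    after=tracked
else
    after=untracked
fi

echo "before add: $before, after add: $after"
```

Note that no commit is involved: being in the index is enough for the file to count as tracked.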


Hence:

Are tracked files any files that have, at some point, been added to the stage?

No! Tracked files are files that are in the index right now. It does not matter what has happened in the past, or what is in any existing commit. If some path path/to/file.txt is present in the index right now, that file is tracked. If not, it is not tracked (and is potentially also ignored).

If path/to/file.txt is in the index now, and you take it out, the file is no longer tracked. It may or may not be in any existing commits, and it may or may not still be in the work-tree.
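One way to take a file out of the index while leaving the work-tree copy alone is git rm --cached. The sketch below assumes a throwaway repository and a placeholder file.txt, and records what Git reports afterwards.

```shell
# Demo only: a throwaway repository and a made-up file name.
repo=$(mktemp -d)
cd "$repo" && git init -q .

echo "hello" > file.txt
git add file.txt                    # file.txt is now in the index (tracked)

git rm -q --cached file.txt         # remove it from the index only

index_entries=$(git ls-files)       # what the index holds now
status_line=$(git status --short)   # how git status describes the file

echo "index: [$index_entries]"
echo "status: $status_line"
ls file.txt                         # the work-tree copy is still there
```

Here the index ends up empty and git status --short reports the file with ??, its marker for untracked paths.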

Is the "index" the same as the "stage"?

Yes, more or less. Various documentation and people are not very consistent about this.

Are all staged files tracked, but the reverse is not necessarily true (namely, files that were once staged and committed, but aren't part of the current stage to be committed)?

This question doesn't quite make sense, since "the staging area" is the index. I think staged doesn't have a perfectly-defined meaning, but I would define it this way. A file is staged if:

  • it is not in @ / HEAD, but is in the index, or
  • it is in both @ / HEAD and the index, but differs between the two.

Equivalently, you could say "when some path is being called staged, that means that if I make a new commit right now, the new commit's version of that file will be different from the current commit's version." Note that if you have not touched a file in any way, so that it's in the current commit and in the index and in the work-tree, but all three versions match, the file is still going to get committed. It's just neither "staged" nor "modified".
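To illustrate that last point, here is a small sketch in a throwaway repository. The file names and the demo identity passed via -c are placeholders. Only edited.txt is staged, yet the new commit still contains untouched.txt.

```shell
# Demo only: throwaway repository, made-up file names and identity.
repo=$(mktemp -d)
cd "$repo" && git init -q .
commit() { git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"; }

echo "one" > untouched.txt
echo "two" > edited.txt
git add untouched.txt edited.txt
commit "first"

echo "two, but changed" > edited.txt
git add edited.txt

before_commit=$(git status --short)   # mentions edited.txt only
echo "$before_commit"

commit "second"
git ls-tree -r --name-only HEAD       # lists both files in the new commit
```

The second commit is a full snapshot of the index, so untouched.txt is carried along even though git status never mentioned it.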

How do I know which files are tracked?

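While git ls-files can tell you, the usual way to find out is indirect: you run git status. It lists untracked files explicitly, so any work-tree file that is neither listed there nor ignored is tracked.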
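If you do want the direct answer, a sketch along these lines separates the three groups with git ls-files. The repository, the file names, and the *.log ignore rule are all invented for the demo.

```shell
# Demo only: throwaway repository with made-up files and ignore rule.
repo=$(mktemp -d)
cd "$repo" && git init -q .

echo "a" > tracked.txt && git add tracked.txt
echo '*.log' > .gitignore && git add .gitignore
echo "b" > untracked.txt     # never added
echo "c" > debug.log         # matches the ignore rule

tracked=$(git ls-files)                                        # in the index
untracked=$(git ls-files --others --exclude-standard)          # not in index, not ignored
ignored=$(git ls-files --others --ignored --exclude-standard)  # not in index, ignored

printf 'tracked:\n%s\n' "$tracked"
printf 'untracked:\n%s\n' "$untracked"
printf 'ignored:\n%s\n' "$ignored"
```

The --exclude-standard flag makes ls-files honor the usual ignore files, which is what git status does by default.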

How do I know which files are staged?

Assuming the definition above, you must ask Git to diff the current commit (HEAD / @) and the index. Whatever is different between them is "staged". Running git status will do this diff for you, and report the names of such files (without showing detailed diffs).

To get the detailed diffs, you can run git diff --cached, which compares HEAD vs index. This also has the name git diff --staged (which is a better name—but, perhaps just to be annoying, --staged is not available as an option to git rm!).
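For just the names, git diff --cached also accepts --name-status. The sketch below sets up both cases from the definition above in a throwaway repository; the file names and the demo identity are placeholders.

```shell
# Demo only: throwaway repository, made-up file names and identity.
repo=$(mktemp -d)
cd "$repo" && git init -q .

echo "v1" > old.txt
git add old.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

echo "version 2" > old.txt && git add old.txt   # in HEAD and index, but different
echo "brand new" > new.txt && git add new.txt   # in the index, not in HEAD
echo "wip" > scratch.txt                        # untracked: never added

staged=$(git diff --cached --name-status)       # HEAD vs index, names only
printf '%s\n' "$staged"
```

The output marks new.txt with A (added) and old.txt with M (modified); scratch.txt does not appear because it is not in the index at all.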

Because there are three copies of every file, you need two diffs to see what is going on:

  • compare HEAD vs index: git diff --cached
  • compare index vs work-tree: git diff

Running git status runs both of these diffs for you, and summarizes them. You can get an even shorter summary with git status --short, where you will see things like:

 M a.txt
M  b.txt
MM c.txt

The first column is the result of comparing HEAD vs index: a blank means the two match, an M means HEAD and index differ. The second column is the result of comparing index vs work-tree: a blank means the two match, an M means they differ. The two Ms in a row mean all three versions of c.txt are different. You can't see the one in the index directly, but you can git diff it!
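A hedged sketch that reproduces those three states in a throwaway repository follows. The file contents and the demo identity are made up; only the file names match the example above.

```shell
# Demo only: throwaway repository, made-up contents and identity.
repo=$(mktemp -d)
cd "$repo" && git init -q .

for f in a b c; do echo "v1" > "$f.txt"; done
git add a.txt b.txt c.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

echo "second version" > a.txt                     # work-tree differs from index
echo "second version" > b.txt && git add b.txt    # index differs from HEAD
echo "second version" > c.txt && git add c.txt    # index differs from HEAD ...
echo "third version, unstaged" > c.txt            # ... and work-tree differs from index

short=$(git status --short)
printf '%s\n' "$short"
```

Running git diff --cached and plain git diff in the same repository shows the detail behind the first and second columns respectively.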

Upvotes: 5
