SoftTimur

Reputation: 5490

Show different files between local and remote

I have a local folder which is linked to a remote git repository.

Sometimes we are instructed to check the difference between local and remote.

But the difference can be large. Sometimes I just want to see which files are different/added/removed, without going into the details.

Does anyone know how I could quickly get that information?

Edit 1:

Additionally, I just realised that I need to eventually git add new local files, and locally make a new git commit before git fetch and git diff. Am I correct?

Upvotes: 1

Views: 1774

Answers (2)

torek

Reputation: 487775

Your question mentions folders and files. Git doesn't store folders, and doesn't really work with files either—not at the level of talking to another Git on a different machine, anyway. Git works, instead, with commits. When you have your Git call up some other Git, on some other machine, your Git gets from that Git any commits they have, that you don't, that you should have. That's a git fetch operation. (Or, your Git gives to that Git any commits that you have, that they don't, that you want them to have: that's a git push operation. But it sounds like the one you care about here is "get from them", i.e., fetch.)

So, having had your Git call up their Git and get commits from them, your job is now to compare these commits. Each commit contains files. It only has files, not folders, but the files often have names that may force your OS to create folders. For instance, the file named dir/sub/file.ext probably requires your OS to first create the folder dir, then within dir, the folder sub. (That's your OS's problem, not Git's: as far as Git is concerned, it's just a file with a long name. Git will work around your OS's fixation on "folders" as needed here.)

Every commit has a full, complete snapshot of all of that commit's files. So if you compare commit C123 with commit C456, for instance, you can see what files are common to both, which ones are unique to each, and what the differences are between any common-to-both files. The git diff command does just that, with a lot of options:

git diff --name-only <commit#1> <commit#2>

tells you which files (by full names like a/b/c.ext) are different in the two commits, or are only in one of the two. Or:

git diff --name-status <commit#1> <commit#2>

tells you which files are different, and in what way: for instance, a file might need to be Added (shown as A) to commit#1 for commit#1 to start coming closer to commit#2. Another file might just need to be Modified (shown as M) to change its content a bit. What git diff prints here are, in effect, instructions: if you make these changes to the left-side commit, you'll get the right-side commit. The default for git diff is to provide complete instructions: everything you must do.
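To see both forms side by side, here is a minimal sketch in a throwaway repository. The file names and the identity settings are made up for illustration, and --no-renames is added only so that the output cannot be folded into rename entries:

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository, hypothetical contents
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

# First commit: two files, one with a "folder-like" long name.
mkdir -p dir/sub
echo one > dir/sub/file.ext
echo two > old.txt
git add .
git commit -q -m "first"
first=$(git rev-parse HEAD)

# Second commit: modify one file, delete one, add one.
echo changed > dir/sub/file.ext
git rm -q old.txt
echo three > new.txt
git add .
git commit -q -m "second"
second=$(git rev-parse HEAD)

# File names only, then names with a status letter (M, A, D).
names=$(git diff --no-renames --name-only "$first" "$second")
status=$(git diff --no-renames --name-status "$first" "$second")
printf '%s\n' "$names" "$status"
```

Swapping the two commit arguments flips the direction of the instructions: the file that shows as A here would show as D, and vice versa.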

Regarding your edit

Additionally, I just realised that I need to eventually git add new local files, and locally make a new git commit before git fetch and git diff. Am I correct?

That might be wisest, but it's probably not necessary.

I mentioned above about Git working with commits, and the diffs shown above all use commits. Commits contain frozen snapshots of all of your (committed) files. All commits are completely read-only: once made, no part of any commit can ever be changed. The files stored inside commits are in a special, read-only, Git-only frozen format that only Git can use. (This format makes comparing the commits faster and easier, and likewise for other things that Git does. It also makes the storage of commits more compact, because once a file is frozen forever, each new commit can just re-use the old commits' files, if they match. It's impossible to change them, so if you make a new commit, and 97 out of 100 files are the same, Git can just re-use those 97 files. The "copy" in the new commit is really just a reference to the existing, shared, already-frozen copy.)
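One way to observe the sharing described above is to ask Git for the internal ID of a file as stored in two different commits. This is only a sketch in a throwaway repository with made-up file names; it shows the IDs matching for an unchanged file, which is what lets Git re-use the stored copy:

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository, hypothetical contents
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo "same" > unchanged.txt
echo "v1" > changed.txt
git add .
git commit -q -m "first"

echo "v2" > changed.txt
git add changed.txt
git commit -q -m "second"

# <commit>:<path> names the frozen copy of that file in that commit.
old_unchanged=$(git rev-parse HEAD~1:unchanged.txt)
new_unchanged=$(git rev-parse HEAD:unchanged.txt)
old_changed=$(git rev-parse HEAD~1:changed.txt)
new_changed=$(git rev-parse HEAD:changed.txt)

echo "unchanged.txt: $old_unchanged -> $new_unchanged"
echo "changed.txt:   $old_changed -> $new_changed"
```

The two IDs printed for unchanged.txt are identical, while the two for changed.txt differ.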

Stuff that's frozen for all time is great for archival. But Git isn't just an archiver. Its data is in this Git-only format: nothing else on your computer can use the files. So it's totally useless for getting any actual work done. What this means is that Git must, to let you get work done, copy the frozen files out of some commit, into some sort of working area. The files in this work area—the area Git calls a working tree or work-tree or similar—are just ordinary everyday files. They're not frozen, and they're in the ordinary everyday format that your computer uses, whatever that is. (They're in folders and everything!)

This process, of copying out the frozen files into your work-tree, also copies the frozen files to Git's index. When you git add and git commit, what you are doing is copying the work-tree files back into the index (git add) and then making a new commit from whatever is in the index (git commit).

The index can get complicated (during merges), but it can be described pretty simply: it's where you build up your next commit. The files that are in it[1] are in the frozen format, ready to go into a new commit, but unlike the files in a commit, they aren't frozen. You can overwrite them, you can put all-new files in, and you can delete files from the index. So you can copy each work-tree file into the index to make it ready to be committed. This process of copying a work-tree file back into the index (replacing the old copy if there was one, creating an all-new file if not) is called staging the file, and hence the index is also called the staging area.

So: to work with files, you have them copied out from commits to the index and the work-tree. You can view and modify files in your work-tree, and you can re-copy files from the work-tree back into the index, staging them for committing.
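The three copies (commit, index, work-tree) can be made visible with a small experiment. This is a sketch in a throwaway repository; notes.txt and its contents are invented for the example:

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository, hypothetical contents
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "initial"

# Edit the work-tree file and stage it: the index copy becomes "version 2".
echo "version 2" > notes.txt
git add notes.txt

# Edit again without staging: only the work-tree copy changes.
echo "version 3" > notes.txt

committed=$(git show HEAD:notes.txt)   # frozen copy in the current commit
staged=$(git show :notes.txt)          # copy currently in the index
worktree=$(cat notes.txt)              # ordinary file in the work-tree
echo "commit: $committed | index: $staged | work-tree: $worktree"
```

Running git commit at this point would record "version 2", because that is what the index holds.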


[1] Technically, what's in the index is a reference to a frozen-format file, just like in commits. This shows up if you start looking at the internal details of the index, using git ls-files --stage or git update-index. But for the most part, you can just think of the index as having its own separate copy of each file: that mental model works fine until you get into this low level.


git status

The git status command runs two git diffs: one compares the current commit (frozen, remember) to the index. Whatever is different here is staged for commit. That is, if you ran git commit right now, Git would make the new commit from whatever is in the index. All the files that match the HEAD (current) commit are not very interesting: they're exactly the same. So git status says nothing about them. For any file that's new, or deleted, or different, though, git status lists that file's name.

That's what git diff --name-only does, and that's what git status does for this step: it compares, and whatever is different, prints out the file's name. Instead of comparing two commits, though, it compares one commit—specifically, the one that is HEAD right now—and the index.

But git status then goes on to run a second git diff. This time, it compares the files in the index to the files in your work-tree. For everything that is the same, it says nothing. For everything that is different, it prints the file names.

Again, this is the same thing that git diff --name-only would do, except that instead of comparing two commits, it's comparing the index and your work-tree.

Because your work-tree is an ordinary set of folders, you can create files (and subfolders) that aren't in the index. These are your untracked files. Since they are not in the index, they will not be in the next commit. If you run git add on them, Git copies them from the work-tree into the index, and now they are in the index and will be in the next commit.

If you don't copy these into the index, they'll just remain untracked files. They won't participate in git diff. Git will just whine about them: I found these untracked files, what should I do with them? A .gitignore file tells Git to shut up about untracked files. It has no effect on tracked files—files that are in the index—because those files are in the index. But you can put files into the index, and take them out of the index, any time you like.
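The two comparisons that git status makes can be reproduced by hand. The sketch below uses a throwaway repository with invented file names, and uses git ls-files --others --exclude-standard as one way to list untracked files that are not ignored:

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository, hypothetical contents
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo a > staged.txt
echo b > unstaged.txt
git add .
git commit -q -m "initial"

echo a2 > staged.txt
git add staged.txt                # now differs between HEAD and the index
echo b2 > unstaged.txt            # differs between the index and the work-tree
echo c > untracked.txt            # not in the index at all

head_vs_index=$(git diff --cached --name-only)       # "staged for commit"
index_vs_worktree=$(git diff --name-only)            # "not staged for commit"
untracked=$(git ls-files --others --exclude-standard)

echo "HEAD vs index:      $head_vs_index"
echo "index vs work-tree: $index_vs_worktree"
echo "untracked:          $untracked"
git status --short                # the same information in compact form
```

Note that untracked.txt shows up in neither git diff: it only appears in the untracked listing.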

How all this goes with your original problem

If you wish to compare two commits, and your local changes are not committed yet, you'll need to make a commit first. That lets you adjust the index contents as needed, and then you have two commits, where the comparison is easy: you just give git diff both hash IDs, or names for the two hash IDs.

But, if you're willing to use the index or the work-tree as the source of one of the two sets of files, you can do that too. When using the work-tree as one set of files, Git will only believe that those files that are also in the index are part of that work-tree: the other files are all untracked and hence do not take part in the git diff. And now you have one other problem: git diff --cached <hash> compares with the index on the right, and the given commit on the left, for instance. But there's a -R (reverse the sides) flag, if you want the comparison to go the other way.
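Here is a sketch of comparing a commit against the index and against the work-tree, without making a new commit. The repository and file names are made up; --cached, -R, and --name-status are the flags discussed above:

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository, hypothetical contents
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo base > kept.txt
git add kept.txt
git commit -q -m "base"
base=$(git rev-parse HEAD)

echo new > added.txt
git add added.txt                 # staged, but not committed
echo x > untracked.txt            # never added, so it stays out of every diff

# Commit on the left, index on the right: added.txt must be Added.
forward=$(git diff --cached --name-status "$base")
# Same comparison with the sides reversed: added.txt must be Deleted.
reverse=$(git diff --cached -R --name-status "$base")
# Commit versus work-tree: only files known to the index take part.
worktree_names=$(git diff --name-only "$base")

printf '%s\n' "$forward" "$reverse" "$worktree_names"
```

In a real repository, the commit argument would typically be something like origin/master after a git fetch, rather than the local base commit used here.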

If, after getting a diff, you decide you'd like to send a commit to the other Git, you'll probably want to just make a commit. Making a commit is generally quite cheap: it's just a matter of packaging up whatever is already in the index, and adding the metadata that goes with a new commit.

Upvotes: 1

bk2204

Reputation: 76409

You'll definitely need to fetch the remote repository in order to do that. There's no way to avoid that. Once you've done that, you can do a comparison. There are a couple of ways to do that.

One is to do git diff --stat master origin/master and see the stat output:

$ git diff --stat master origin/master
 Documentation/RelNotes/2.26.0.txt | 44 +++++++++++++++-----------------------------
 1 file changed, 15 insertions(+), 29 deletions(-)

You can also view only the names of the files with git diff --name-only. If you want to see specifically which files have been added or removed, you can use the --diff-filter option to list only the files that have changed in the given way: for example, --diff-filter=A lists only added files and --diff-filter=D only deleted ones.

Finally, if you want something more like git status -s output, you can use git diff --name-status.
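Putting the pieces together, the following sketch simulates the whole workflow on one machine. The "upstream" directory stands in for the real remote, and all file names are invented; in practice you would only run the git fetch and git diff lines in your existing clone, with whatever branch names you actually use:

```shell
set -e
top=$(mktemp -d)
cd "$top"

# "upstream" plays the role of the remote repository (hypothetical setup).
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master
git config user.email "you@example.com"
git config user.name "Example"
echo one > a.txt
echo two > b.txt
git add .
git commit -q -m "initial"
cd ..

git clone -q upstream local

# New work lands on the remote after the clone was made.
cd upstream
echo changed > a.txt
git rm -q b.txt
echo three > c.txt
git add .
git commit -q -m "remote changes"
cd ../local

# Fetch first, then compare the local branch with its remote-tracking name.
git fetch -q origin
summary=$(git diff --no-renames --name-status master origin/master)
added=$(git diff --no-renames --name-only --diff-filter=A master origin/master)
deleted=$(git diff --no-renames --name-only --diff-filter=D master origin/master)

printf '%s\n' "$summary"
echo "added: $added, deleted: $deleted"
git diff --stat master origin/master
```

The --no-renames flag is there only to keep the A and D entries from being paired up as renames; drop it if you would rather see renames reported as such.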

Upvotes: 2
