Hashim Aziz

Reputation: 6092

What do the remotes in the output of git log mean?

Yet another dawn, yet another day spent trying to learn the shockingly counter-intuitive, jargon-infested mess that is Git. When I run the command git log on my local development server, I get the following output:

$ git log
commit 07cfffea573a70a59b2be7a6ff8d7dee3e831e (HEAD -> master, live/master)
Author: My Name <[email protected]>
Date:   Mon Jul 6 17:02:16 2020 +0000

    manual changes to functions.php permissions

commit 3f507d8211be744b1c5a997c81d13b77d968be (central/master)
Author: My Name <[email protected]>
Date:   Thu Mar 12 00:14:22 2020 +0000

    Contact Form 7 update

commit 41c64ed4fbde1924557e7886901ab8078d088a
Author: My Name <[email protected]>
Date:   Wed Mar 11 19:11:53 2020 +0000

    Added PHP error log for Contact Form 7

In simple terms, what exactly are those remotes there signifying? No question that I've found seems to have answered this yet, and it took one look at man git log to know it wasn't written for a Git beginner.

To me, the output seems to be saying that those remotes are currently stuck on the respective commits that they appear beside, but I know this can't be the case because I've already pushed the latest commit from live directly to central (and confirmed on GitHub that the changes were made) and yet central is still shown next to the old commit.

What exactly do the remotes here mean?

Upvotes: 0

Views: 357

Answers (2)

torek

Reputation: 488273

TL;DR: the answer to your question from the comments:

Does it mean that I have to fetch/pull from both my remotes to keep my local repository up to date, even if the content of the two remotes is the same (due to pushing the changes of one to the other)?

is "yes, you must run git fetch for each remote". You might want to define a group of remotes and/or run git remote update, which by default runs git fetch for all remotes. Or, git fetch --all also updates from all remotes. (The group stuff gives you much greater control; you can use git remote update group to update a specific group of remotes.)
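As a rough illustration of those three options, here is a throwaway sketch that builds everything in a temporary directory. The remote names central and live are taken from the question; the group name deploy, the paths, and the demo identity are made up for the example.

```shell
#!/bin/sh
# Sketch only: refresh remote-tracking names from several remotes at once.
# "central" and "live" are the remote names from the question; the group
# name "deploy" and all paths are invented for this demo.
set -e
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Two bare repositories stand in for the two servers.
git init -q --bare "$tmp/central.git"
git init -q --bare "$tmp/live.git"

# A local repository with one commit, pushed to both.
git init -q "$tmp/local"
git -C "$tmp/local" symbolic-ref HEAD refs/heads/master
git -C "$tmp/local" commit -q --allow-empty -m "first commit"
git -C "$tmp/local" remote add central "$tmp/central.git"
git -C "$tmp/local" remote add live "$tmp/live.git"
git -C "$tmp/local" push -q central master
git -C "$tmp/local" push -q live master

# Option 1: contact every configured remote.
git -C "$tmp/local" fetch -q --all

# Option 2: git remote update, which fetches from all remotes by default.
git -C "$tmp/local" remote update >/dev/null 2>&1

# Option 3: define a group, then update only the remotes in that group.
git -C "$tmp/local" config remotes.deploy "central live"
git -C "$tmp/local" remote update deploy >/dev/null 2>&1

# List the remote-tracking names this repository now holds.
tracking=$(git -C "$tmp/local" for-each-ref --format='%(refname:short)' refs/remotes)
echo "$tracking"
```

In a real repository only the fetch or remote update line is needed; the rest is scaffolding so the sketch can run on its own.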

Long

To make sense of this stuff, you need to realize that:

  • A Git repository stores commits (and other Git objects) in one database—usually the big one—and some set of names, such as branch names, tag names, and remote-tracking names, in a second (usually much smaller) database.

  • Commits are uniquely numbered by their hash IDs, and all Gits (all Git repositories, that is) that have the same commit always use the same hash ID for it: a commit that is 100% identical in two Gits has the same hash ID in both.

  • Branch names—but not any other names—have a special property: they always, by definition, contain the raw hash ID of the last commit that is part of that branch.

  • To make all of the above work, when you get "on" a branch and then make a new commit, Git automatically replaces the stored hash ID in the branch name with the hash ID of the newly-created commit (which has a hash ID that has never appeared in your own Git repository before, and won't be in any other Git repository either, but which every future Git that gets that commit will agree is the hash ID for that commit).

  • In general—there are some exceptions, but we'll leave those aside for now—we only ever add new commits to a repository. This makes it easier to think about the various cases, so it's a helpful simplification.
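The points about branch names can be checked in a scratch repository. This is a minimal sketch, assuming a POSIX shell and a temporary directory; the commit messages and the demo identity are placeholders. It reads the hash ID stored under master before and after a new commit.

```shell
#!/bin/sh
# Sketch only: a branch name holds the hash ID of the branch's last commit,
# and Git moves the name when you commit while "on" that branch.
set -e
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$tmp/repo"
git -C "$tmp/repo" symbolic-ref HEAD refs/heads/master

git -C "$tmp/repo" commit -q --allow-empty -m "first"
before=$(git -C "$tmp/repo" rev-parse master)

git -C "$tmp/repo" commit -q --allow-empty -m "second"
after=$(git -C "$tmp/repo" rev-parse master)

# The new commit links back to the old one, so nothing was lost:
# the name simply points one commit further along.
parent=$(git -C "$tmp/repo" rev-parse master~1)

echo "master was: $before"
echo "master is:  $after"
echo "parent of master: $parent"
```

The first and third lines printed should show the same hash ID, since the old tip is now the parent of the new one.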

Then, to that we add this:

  • Each Git repository is normally independent of any other Git repository.

  • Occasionally, though, you'll connect one Git to another Git. When you do, one Git will present some or all of its names to the other Git. These are the fetch and push operations. One Git will offer to the other Git some set of commits that one of the two doesn't have yet:

    • For git fetch, the other Git lists its names and hash IDs, and your Git gets commits it doesn't have yet but wants / needs.

    • For git push, your Git lists hash IDs, and the other Git takes those if it needs them. Then your Git asks politely (regular push) or commands (force-push) the other Git to set some of its name(s) to some hash ID(s). Typically those are branch names: we have a new commit, or maybe n new commits, for branch master or branch develop or whatever, so we send them over, then ask them to update their master or develop or whatever.

The remote-tracking names, like origin/master, central/master, live/master, and so on are your Git's memory of some other Git's branch names. These memories can only be updated when your Git contacts their Git. For internal reasons, your Git only updates the specific names that you git push, when you git push; but git fetch by default updates everything, unless you tell it some specific set of names it should update.

If there is only one other Git that you ever contact and they never update their names except by your commands, your Git always has the right information. That's the common case with a private GitHub storage-repo for your own personal repository, for instance: nobody else ever adds to it. It only gets new commits from you. So your git fetch never really has to do any work: their branch names are always a result of some earlier git push of yours.

If other people can update the other Git (or Gits, plural), though, you'll need to run git fetch to contact each other Git and find out what new commit(s) have been added to their branch(es) since the last contact your particular Git made to that particular Git. Your git fetch remote will obtain the new commits—with their unique, but common-across-all-Gits, hash IDs—from that other Git and then update your remote/branchname remote-tracking branch or branches as appropriate.

If you have remotes alice and barney and run git fetch alice, you'll get any new commits, then update all your alice/* names. A subsequent git fetch barney might not actually get any new commits—maybe you already got them from alice for instance—but will then update all your barney/* names.
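Here is a small sketch of that alice and barney scenario, which is also what the question describes with central and live. Everything is built in a temporary directory; the repository called other is a stand-in for whoever else pushes to the two remotes, and all paths are invented for the demo.

```shell
#!/bin/sh
# Sketch only: a remote-tracking name moves only when you fetch from
# (or push to) that particular remote.
set -e
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q --bare "$tmp/alice.git"
git init -q --bare "$tmp/barney.git"

# "other" plays the people who push to alice and barney.
git init -q "$tmp/other"
git -C "$tmp/other" symbolic-ref HEAD refs/heads/master
git -C "$tmp/other" commit -q --allow-empty -m "commit A"
a=$(git -C "$tmp/other" rev-parse HEAD)
git -C "$tmp/other" push -q "$tmp/alice.git" master
git -C "$tmp/other" push -q "$tmp/barney.git" master

# "mine" is your repository; fetch once from each remote.
git init -q "$tmp/mine"
git -C "$tmp/mine" remote add alice "$tmp/alice.git"
git -C "$tmp/mine" remote add barney "$tmp/barney.git"
git -C "$tmp/mine" fetch -q alice
git -C "$tmp/mine" fetch -q barney

# Commit B lands on both remotes without your involvement.
git -C "$tmp/other" commit -q --allow-empty -m "commit B"
b=$(git -C "$tmp/other" rev-parse HEAD)
git -C "$tmp/other" push -q "$tmp/alice.git" master
git -C "$tmp/other" push -q "$tmp/barney.git" master

# Fetching from alice brings in B and moves alice/master only.
git -C "$tmp/mine" fetch -q alice
alice_now=$(git -C "$tmp/mine" rev-parse alice/master)
barney_stale=$(git -C "$tmp/mine" rev-parse barney/master)

# Fetching from barney has no new commits to transfer, but it
# still moves barney/master.
git -C "$tmp/mine" fetch -q barney
barney_now=$(git -C "$tmp/mine" rev-parse barney/master)

echo "A=$a B=$b"
echo "after fetch alice:  alice/master=$alice_now barney/master=$barney_stale"
echo "after fetch barney: barney/master=$barney_now"
```

Between the two fetches, barney/master still names commit A even though barney's own master is already at B. That is the same kind of stale memory as central/master in the question.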

It is possible to drop commits from branches

Note that one can drop a commit from a branch using git branch -f or git reset. That is, given something like:

... <-F <-G <-H   <-- master

where the name master identifies the commit whose hash is H, and commit H leads back to commit G and so on, we can git checkout master; git reset --hard HEAD~1 to shove H out of the way:

            H   [abandoned]
           /
... <-F <-G   <-- master

Commit H remains in this repository for at least a little while (how long depends on many other factors we won't go into here), but now master identifies commit G instead of commit H.

If this Git repository is someone else's remote—e.g., if we're alice at the moment—and you already got commit H from alice so that your alice/master names commit H, then when you run:

git fetch alice

you'll see that your alice/master is force-updated. This tells you that Alice did something that got rid of a commit. You'll still have commit H in your repository too, for a while, though it may or may not be easy to find; but your alice/master is now retracted by one commit as well, so that it identifies commit G.
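The following sketch reproduces that situation in a temporary directory. The repository layout and the commit messages G and H mirror the diagram above; the paths and the demo identity are made up. The second fetch is left noisy on purpose, since its summary line is where Git should report the forced update.

```shell
#!/bin/sh
# Sketch only: Alice drops commit H with git reset, and your next fetch
# moves alice/master backwards.
set -e
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Alice's repository: ... <-G <-H   <-- master
git init -q "$tmp/alice"
git -C "$tmp/alice" symbolic-ref HEAD refs/heads/master
git -C "$tmp/alice" commit -q --allow-empty -m "G"
git -C "$tmp/alice" commit -q --allow-empty -m "H"
g=$(git -C "$tmp/alice" rev-parse HEAD~1)
h=$(git -C "$tmp/alice" rev-parse HEAD)

# Your repository picks up H; alice/master names H.
git init -q "$tmp/mine"
git -C "$tmp/mine" remote add alice "$tmp/alice"
git -C "$tmp/mine" fetch -q alice
before=$(git -C "$tmp/mine" rev-parse alice/master)

# Alice shoves H out of the way.
git -C "$tmp/alice" reset -q --hard HEAD~1

# Fetch again, without -q so the update summary is visible.
git -C "$tmp/mine" fetch alice
after=$(git -C "$tmp/mine" rev-parse alice/master)

# H is still stored in your repository, just no longer named by alice/master.
if git -C "$tmp/mine" cat-file -e "$h"; then still_have_h=yes; else still_have_h=no; fi

echo "alice/master before: $before"
echo "alice/master after:  $after"
echo "commit H still present locally: $still_have_h"
```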

Tags don't fit this pattern

Tag names in Git should never move. That's a constraint we place on human beings, who don't always obey. When and whether your Git updates a tag name that your Git remembers is a little tricky. Tags are different from branches in two ways: first, they're not supposed to move, unlike branch names; and second, your Git will often just copy some other Git's tag name to your Git, without asking and without this kind of fancy remote-tracking stuff. There's no alice/v1.2, there's just tag v1.2.

If tags are never deleted and never move, you'll never have to worry about any of this: you either have a tag, and it's right, or you just don't have it yet, and you can get it from any other Git.
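To see the difference, here is a short sketch in which a hypothetical alice repository carries the tag v1.2 from the example above. The paths and the demo identity are invented. After a plain fetch, it checks where the tag ended up in your repository.

```shell
#!/bin/sh
# Sketch only: a fetched tag is copied under its own name,
# not under a per-remote name such as alice/v1.2.
set -e
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Alice has one commit on master and tags it v1.2.
git init -q "$tmp/alice"
git -C "$tmp/alice" symbolic-ref HEAD refs/heads/master
git -C "$tmp/alice" commit -q --allow-empty -m "release"
git -C "$tmp/alice" tag v1.2

# You fetch from alice; tags that point into the fetched history come along.
git init -q "$tmp/mine"
git -C "$tmp/mine" remote add alice "$tmp/alice"
git -C "$tmp/mine" fetch -q alice

tags=$(git -C "$tmp/mine" tag -l)

# There is no remote-tracking version of the tag.
if git -C "$tmp/mine" show-ref --verify -q refs/remotes/alice/v1.2; then
    namespaced=yes
else
    namespaced=no
fi

echo "local tags: $tags"
echo "refs/remotes/alice/v1.2 exists: $namespaced"
```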

Upvotes: 1

philb

Reputation: 2990

In simple terms, what exactly are those remotes there signifying?

The annotations next to the commits in the output of git log always correspond to your local "refs" (i.e. branches or tags). Here, live/master and central/master are remote-tracking branches; you can list those using git branch --remote (or -r for short).

Remote-tracking branches are local references. They are updated when you:

  1. push to the corresponding branch on the respective remote, i.e. git push central master pushes your local master branch to the master branch on the central remote and, if the push succeeds, updates refs/remotes/central/master to point to the same commit
  2. fetch from a remote, i.e. git fetch central master fetches the master branch on the central remote and updates refs/remotes/central/master. Similarly, git fetch central fetches all branches on the central remote. And since git pull runs git fetch under the hood, remote-tracking branches are also updated when pulling.
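The sketch below recreates a git log picture close to the one in the question, using two bare repositories in a temporary directory as stand-ins for central and live. The paths, commit messages, and demo identity are assumptions for illustration. It then shows central/master catching up after a push.

```shell
#!/bin/sh
# Sketch only: the names git log prints next to commits are local refs,
# and refs/remotes/central/master moves when you push to central.
set -e
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q --bare "$tmp/central.git"
git init -q --bare "$tmp/live.git"

git init -q "$tmp/local"
git -C "$tmp/local" symbolic-ref HEAD refs/heads/master
git -C "$tmp/local" remote add central "$tmp/central.git"
git -C "$tmp/local" remote add live "$tmp/live.git"

# The first commit goes to both remotes.
git -C "$tmp/local" commit -q --allow-empty -m "first"
first=$(git -C "$tmp/local" rev-parse HEAD)
git -C "$tmp/local" push -q central master
git -C "$tmp/local" push -q live master

# The second commit goes to live only.
git -C "$tmp/local" commit -q --allow-empty -m "second"
second=$(git -C "$tmp/local" rev-parse HEAD)
git -C "$tmp/local" push -q live master

# central/master is still your Git's memory of the older commit.
central_before=$(git -C "$tmp/local" rev-parse refs/remotes/central/master)
git -C "$tmp/local" log --oneline --decorate

# Pushing to central updates the remote-tracking name as well.
git -C "$tmp/local" push -q central master
central_after=$(git -C "$tmp/local" rev-parse refs/remotes/central/master)
git -C "$tmp/local" log --oneline --decorate

# The remote-tracking branches themselves:
git -C "$tmp/local" branch -r
```

In the first log listing, central/master should sit next to the older commit, much like in the question; in the second, it should have moved up to the newest one.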

Regarding the Git man pages: it's true that some are not easy to read for beginners. I recommend that you start with these resources:

Some pages from the official documentation:

Upvotes: 0
