kevcoder

Reputation: 955

What does git "behind or ahead by X commits" really mean?

So git status returned

Your branch is ahead of 'origin/Dev-Branch' by 5 commits.
  (use "git push" to publish your local commits)

I did that and now git status returns

Your branch is up to date with 'origin/Dev-Branch'.

but git log shows that my last commit was from the day before: i.e. nothing got pushed to origin today

How does Git calculate the behind and ahead commits messages?

Upvotes: 3

Views: 4616

Answers (5)

pkamb

Reputation: 35052

nothing got pushed to origin today

Your commits were pushed to origin today, when you ran the git push command.

But those commits were made yesterday. The timestamps on the commits show the time at which they were committed, NOT when they were pushed.

The time when the branch was pushed might be recorded on GitHub, might have generated some emails, or might be stored in your local git reflog, but it isn't recorded in the commits themselves. Your branches do not store the timestamps of when they were created or pushed.
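As a rough illustration (a toy model, not how Git stores anything), the sketch below treats a commit as an immutable record whose timestamp is fixed when the commit is made. The Commit class, the push helper, and the dates are all invented for this example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Commit:
    message: str
    committed_at: datetime  # fixed once, at commit time


# Hypothetical dates, chosen only for illustration.
yesterday = datetime(2020, 1, 1, 17, 30)
today = datetime(2020, 1, 2, 9, 0)

local_commits = [Commit("fix parser", yesterday)]
remote_commits = []


def push(local, remote, pushed_at):
    """Copy missing commits to the remote.

    pushed_at is deliberately unused: the push time is not written
    into the commit, which is the point of this sketch.
    """
    for commit in local:
        if commit not in remote:
            remote.append(commit)


push(local_commits, remote_commits, pushed_at=today)
print(remote_commits[0].committed_at)  # still yesterday's timestamp
```

The commit arrives on the remote today, but the only date it carries is the one from yesterday.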

How does Git calculate the behind and ahead commits messages?

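If your remote and local branches are pointing to the same commit, they are "up to date". Otherwise, your local branch is "ahead" by the number of commits it has that the remote branch lacks, and "behind" by the number of commits the remote branch has that yours lacks.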

When you're working on your computer, making commits locally, your local branch is "ahead" of the remote branch. After you push those local changes to the remote repo, your branches will be "up to date".

Upvotes: 0

tmaj

Reputation: 35135

Working with git is a dance between the following trinity:

  1. Dev-Branch local, currently checked out
  2. Dev-Branch on the remote repository
  3. (remotes/)origin/Dev-Branch - local copy of the remote's Dev-Branch

The ahead/behind message compares items 1 and 3.

Item 3 is what gets updated when you do a fetch or a successful push (and git pull = fetch & merge).

Upvotes: 0

torek

Reputation: 490038

It helps a lot to understand Git if you keep in mind these items:

  • Git is all about commits. Branches—or branch names—only matter in terms of letting you (and Git) find hash IDs. The hash IDs are the real names of the commits. The hash IDs look random, so we need names to find them. A branch name, such as master or dev, holds one hash ID: the hash ID of the last commit in that branch.

    These hash IDs are universal! They are always the same, in every Git repository everywhere. A Git repository either has that commit, which has that hash ID ... or it doesn't have that commit at all. No hash ID can ever be re-used for a different commit.[1]

  • Git is called distributed, but you might think of it better as replicated. (Technically, though, distributed is a better word. Use whatever you need to keep it straight in your head.)

  • Every repository has (normally anyway) a complete copy of all of the commits it has ever seen. The first time you git clone some other repository, you get a full copy of all of its commits. After that, though, the two clones can drift apart, except when you make one call up the other.

  • Clones only talk to each other when you connect them, by having one repository call up the other, usually via an https://... or ssh://... URL, though you normally hide this URL under a simple name like origin.

    You do the connecting with git fetch ("get commits") and git push ("give commits"). The git pull command is a distraction here: it really just means run git fetch, then run a second Git command. It's the git fetch part that makes your Git talk to another Git.

So, to update your clone with anything new they have obtained or made, you run git fetch. Your Git calls up their Git and your two Gits have repository intercourse, and since the direction was "get new stuff from them", whatever they have, you now have it too. But your Git remembers what they have using your Git's remote-tracking names. Your Git asks them about their master. They might say: my master is commit a123456.... If you don't already have that commit (and any earlier ones that go with it), your Git has them send that commit (and any earlier ones they have that you need too). Once your Git has the commit, your Git sets your origin/master—not your master!—to remember that their master says a123456.

To update their clone with anything new that you have obtained or made (obviously you'll have to have made this yourself, or obtained it from somewhere that's not them), you have your Git call up their Git and say: Do you have commit b789abc...? If not, you give them that commit, and any others that you need to give them to complete the job. Then your Git says: Now, please, if you will, set your master to b789abc....[2] Note that they do not set a remote-tracking name! They don't have a brian/master or kevin/master or anything like that; they just have their master.

If their Git comes back and says OK, I set my master to b789abc..., well, now your Git knows that their master means that hash ID. So your Git updates your origin/master to remember that their master remembers b789abc....

This brings us to what remote-tracking names like origin/master are all about: These are your Git's memory of what hash IDs their Git is remembering. These hash IDs can be out of date! Running git fetch has your Git get anything new from their Git, and update your remote-tracking names, so now your Git has the right information. If it's been a while since you ran git fetch, your Git may be out of date.[3]
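The fetch and push conversations above can be sketched with a small toy model. This is only an illustration under simplifying assumptions: a repository is a set of commit IDs plus a table of names, parent links are ignored, and the other side always accepts a push. The IDs a123456 and b789abc are the ones used in the text; c0ffee1 and the helper names are invented.

```python
# Toy model: a repository is a set of commit IDs plus a table of names.
theirs = {"commits": {"a123456"}, "refs": {"master": "a123456"}}
mine = {"commits": set(), "refs": {}}


def fetch(local, remote, remote_name="origin"):
    """Get their commits, then record their branch tips under origin/*."""
    local["commits"] |= remote["commits"]
    for branch, commit_id in remote["refs"].items():
        local["refs"][f"{remote_name}/{branch}"] = commit_id


def push(local, remote, branch, remote_name="origin"):
    """Give them our commits, ask them to move their branch, remember it.

    Simplification: this model sends everything and is never refused.
    """
    commit_id = local["refs"][branch]
    remote["commits"] |= local["commits"]
    remote["refs"][branch] = commit_id
    local["refs"][f"{remote_name}/{branch}"] = commit_id


fetch(mine, theirs)
after_first_fetch = dict(mine["refs"])   # only origin/master, no master

# Someone else adds a commit over there; our memory is now stale.
theirs["commits"].add("c0ffee1")
theirs["refs"]["master"] = "c0ffee1"
stale = mine["refs"]["origin/master"]    # still a123456

fetch(mine, theirs)                      # memory refreshed to c0ffee1

# We make b789abc locally and push it.
mine["commits"].add("b789abc")
mine["refs"]["master"] = "b789abc"
push(mine, theirs, "master")

print(after_first_fetch, stale, mine["refs"], theirs["refs"])
```

Note that theirs never gains an origin/master entry: only the side that does the calling keeps remote-tracking names.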


[1] This uniqueness constraint is actually slightly relaxed: if two Gits will never, ever, have Git-sex with each other, one of the two can re-use the hash ID that the other one is using for a different internal object. Aside from this exception, all of Git is crucially dependent on the uniqueness of hash IDs. They are what makes all of the magic work. That's why they look so random, even though they're actually just cryptographic checksums that are rigidly computed: they must be unique.

[2] Your git push will do this even if they already have b789abc.... The git push command consists of two parts: send commits if/as needed, which all works by the unique hash IDs, followed by requests or commands to the other Git: set branch name X to hash ID H1, set branch name Y to hash ID H2, and so on.

[3] How long is "a while"? That depends on how active the other Git repository is. Maybe they get new commits daily. Maybe it's just yearly. Or maybe they get thousands of new commits an hour and if it's been even half a second, why, that's practically forever!


Next, you need to understand that commits connect to each other

In Git, a commit—the thing with the unique hash ID—is:

  • a snapshot of all your files (not changes to the files, but a full snapshot)
  • plus some metadata:
    • your name and email address, and the date-and-time stamp of when you made this commit: this is the committer data
    • the same repeated an extra time, as the author data (this can be different if you copy a commit)
    • your log message, in which you tell others, or yourself next month/year, why you made this commit
    • crucially for Git, the raw hash ID of some previous commit(s).

This last bit is how Git history exists. A commit is a snapshot, but has the hash ID of the previous snapshot. That is, if we have some big ugly hash ID—let's just call it H for "hash"—that locates one commit, that commit has inside it the hash ID of a previous commit. Let's call that second hash ID G. Then:

... <-G <-H

H comes after G, but points to earlier commit G. Of course G has some big ugly hash ID saved inside it, too, so G points to F:

... <-F <-G <-H

and F points back yet again, and so on.

With a chain like this, we can work all the way back from any commit to the very first commit. That commit doesn't point back to an earlier commit, because it can't: there is no earlier commit. Git calls this a root commit. Let's say there are just eight commits, A through H, all in a nice linear row, and that the name master holds the hash ID of the last commit H:

A--B--C--D--E--F--G--H   <-- master

We say that master points to H. (I switched the arrows between commits to lines because they're easier to draw, especially in the next few drawings! But they still all point backwards. Keep in mind that Git works backwards; every once in a while, that's useful to know. Here, it doesn't matter that much.)

Now let's make some more commits, but make them like this, on two different branches br1 and br2:

          I--J   <-- br1
         /
...--G--H   <-- master
         \
          K   <-- br2

The name br1 holds the hash ID of commit J: br1 points to J. The name br2 points to K.

One of the tricky things about Git here is that now, commits up through H are on all three branches. (Other version control systems do not do this.) If we make a new commit on master now, it gets another new, unique hash ID, and the name master moves to point to it:

          I--J   <-- br1
         /
...--G--H--L   <-- master
         \
          K   <-- br2
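The remark that commits up through H are on all three branches can be checked with a small sketch. It assumes the history really is just A through L as drawn, and the parents table and reachable helper are made up for this illustration: a commit is "on" a branch if you can reach it by walking backwards from the branch's tip.

```python
# Parent links for the drawing above (assuming A is the root commit).
parents = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["D"], "F": ["E"],
    "G": ["F"], "H": ["G"], "I": ["H"], "J": ["I"], "K": ["H"], "L": ["H"],
}
branches = {"br1": "J", "master": "L", "br2": "K"}


def reachable(tip):
    """All commits found by following parent links backwards from tip."""
    seen, todo = set(), [tip]
    while todo:
        commit = todo.pop()
        if commit not in seen:
            seen.add(commit)
            todo.extend(parents[commit])
    return seen


on_all = set.intersection(*(reachable(tip) for tip in branches.values()))
only_br1 = [name for name, tip in branches.items() if "I" in reachable(tip)]

print("".join(sorted(on_all)))  # ABCDEFGH
print(only_br1)                 # ['br1']
```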

When you add commits to a repository, none of the existing commits change, at all, in any way. (They literally can't change because their unique hash ID is just a checksum of their contents. If you change anything about any commit, all you get is a new and different, unique, commit with a new, different, unique hash ID.) But the branch names move! The branch name always, by definition, points to the last commit in the branch.
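To make the "checksum of their contents" point concrete, here is a minimal sketch of how an object ID is derived, assuming a repository that uses Git's default SHA-1 object format. The author name, email address, timestamps, and the git_object_id and commit_body helpers are invented for the example.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    # Git hashes a small header, "<kind> <size>" plus a NUL byte,
    # followed by the raw object body.
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def commit_body(committer_time: int) -> bytes:
    # 4b825dc... is the ID of Git's empty tree; the rest is made up.
    return (
        "tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
        "author A U Thor <author@example.com> 1577900000 +0000\n"
        f"committer A U Thor <author@example.com> {committer_time} +0000\n"
        "\n"
        "Initial commit\n"
    ).encode()


first = git_object_id("commit", commit_body(1577900000))
second = git_object_id("commit", commit_body(1577900001))

print(git_object_id("blob", b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(first != second)             # True
```

Changing the committer timestamp by a single second gives a different ID, so what you get is a different commit rather than a modified one.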

Now we can draw your situation

Let's go to your Git repository, or one close to it. I have no idea how many commits you have total—probably way more than the 26 uppercase letters I can use—but let's just draw it this way:

...--G   <-- origin/Dev-Branch
      \
       H--I--J--K--L   <-- Dev-Branch (HEAD)

The HEAD here indicates that this is the branch you have checked out right now. If you have lots of branches—or even just two—we need to know which one you're "on", so that git status can say on branch Dev-Branch, and so that when you make a new commit, Git knows which branch name to move.

The origin/Dev-Branch we drew in is the remote-tracking name. Your Git has talked with their Git—the one over at origin, the name that holds the URL your Git uses to talk to them—at some point and they said my Dev-Branch names commit G so your Git has your origin/Dev-Branch pointing to (shared) commit G.

Meanwhile, your Dev-Branch points to commit L.

Commits always point backwards. New commits point back to whatever commit you had out when you made them, so L points back to K, which points back to J, and so on.

How many commits are there, if you start counting at L, and stop when you reach the commit that origin/Dev-Branch names?
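If you'd rather let a program do the counting, the walk can be sketched like this. It is a toy that only works for a straight-line history in which the stopping commit really is on the chain; the parent table mirrors the drawing, and count_back is a made-up helper.

```python
# Each commit points back to its single parent, as in the drawing.
parent = {"L": "K", "K": "J", "J": "I", "I": "H", "H": "G"}


def count_back(start, stop):
    """Count commits from start, walking backwards, until stop is reached."""
    count, commit = 0, start
    while commit != stop:
        count += 1
        commit = parent[commit]
    return count


ahead = count_back("L", "G")  # Dev-Branch is L, origin/Dev-Branch is G
print(ahead)  # 5
```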

Ahead 5, behind 1

Now, suppose you were to run git fetch and they had a new commit (let's call it N, skipping over M for some reason) that came right after commit G. You would end up with this in your repository:

...--G-----------N   <-- origin/Dev-Branch
      \
       H--I--J--K--L   <-- Dev-Branch (HEAD)

That's because your Git would ask their Git about their Dev-Branch and they'd say "oh, that's commit N". Your Git would get commit N and then see that you already have commit G and be done with that phase, and then your Git would update your origin/Dev-Branch to point to N.

Now if you have your git status count commits, how many commits are there that are on your Dev-Branch that aren't shared? How many commits are there on your origin/Dev-Branch that aren't shared? (Note that shared here means between these two names. So commits G-and-earlier are shared, but H-and-later aren't. We don't worry about what's really in the other Git, just what our Git remembers about their Git.)
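One way to picture the two counts is as set differences between what each name can reach. The sketch below is a simplified model, not Git's actual implementation: it treats G as if it were the root commit (the earlier history is elided in the drawing), and the helper names are invented.

```python
# Parent links for the drawing above; G's own history is left out.
parents = {
    "G": [],
    "H": ["G"], "I": ["H"], "J": ["I"], "K": ["J"], "L": ["K"],
    "N": ["G"],
}


def reachable(tip):
    """All commits found by following parent links backwards from tip."""
    seen, todo = set(), [tip]
    while todo:
        commit = todo.pop()
        if commit not in seen:
            seen.add(commit)
            todo.extend(parents[commit])
    return seen


def ahead_behind(local_tip, upstream_tip):
    mine, theirs = reachable(local_tip), reachable(upstream_tip)
    return len(mine - theirs), len(theirs - mine)


# Dev-Branch points to L, origin/Dev-Branch points to N.
print(ahead_behind("L", "N"))  # (5, 1)
```

Commits H through L are reachable only from Dev-Branch, and N only from origin/Dev-Branch, which gives "ahead 5, behind 1".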

Suppose this were the actual situation in their repository (they have commit N). Even if you didn't / don't have N in your own repository, you could now run git push. Your Git would call up their Git and send them your H-I-J-K-L chain, and now they would have the same drawing we have here (but using their names, not yours). Then your Git would ask them to change their Dev-Branch to point to commit L:

...--G-----------N   <-- Dev-Branch [in origin]
      \
       H--I--J--K--L   <-- proposed new Dev-Branch

If they were to move their Dev-Branch name to point to L, what happens to commit N? The answer is: nothing actually happens to it, but now, they don't have a name for it, and they can't find it any more. The arrows only go backwards: there is no way to go from G to N, only from N to G. So if you do this, they'll just say no, I won't move my Dev-Branch. (They would call that a non-fast-forward.)

At this point, you would need to make a merge commit in your own repository, or otherwise make sure that they won't lose their commit N. Here's what such a merge might look like:

...--G--------------N   <-- origin/Dev-Branch
      \              \
       H--I--J--K--L--M   <-- Dev-Branch (HEAD)

Your new merge commit M would refer back to your existing commit L, but also to their (now also your) commit N. (If you didn't have N yet, you'd need to git fetch first, so as to get N.)
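The check the other Git applies can be sketched as a reachability test: moving a name is a fast-forward only if the commit it points to now can still be reached from the proposed new commit. This is a simplified model of that default rule (a real server may add its own policies), G is treated as the root, and is_fast_forward is a made-up helper name.

```python
# Parent links including the merge commit M, which has two parents.
parents = {
    "G": [],
    "H": ["G"], "I": ["H"], "J": ["I"], "K": ["J"], "L": ["K"],
    "N": ["G"],
    "M": ["L", "N"],
}


def reachable(tip):
    """All commits found by following parent links backwards from tip."""
    seen, todo = set(), [tip]
    while todo:
        commit = todo.pop()
        if commit not in seen:
            seen.add(commit)
            todo.extend(parents[commit])
    return seen


def is_fast_forward(current_tip, proposed_tip):
    """True if moving the name would not lose the commit it names now."""
    return current_tip in reachable(proposed_tip)


# Their Dev-Branch currently points to N.
print(is_fast_forward("N", "L"))  # False: N would become unfindable
print(is_fast_forward("N", "M"))  # True: M leads back to N
```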

Once git push succeeds, update your drawings

Let's go back to this drawing:

...--G   <-- origin/Dev-Branch
      \
       H--I--J--K--L   <-- Dev-Branch (HEAD)

You run git push (or git push origin Dev-Branch). Your Git calls up their Git, gives them commits H-I-J-K-L if they don't have them—if they do have some, your Git sends over whichever ones they still need—and then asks them to set their Dev-Branch to point to commit L. They say OK, I did, so your Git updates your origin/Dev-Branch to remember that they accepted your request:

...--G
      \
       H--I--J--K--L   <-- Dev-Branch (HEAD), origin/Dev-Branch

Now, when git status counts commits, it will find out how many commits are on your Dev-Branch that aren't shared with your origin/Dev-Branch (your Git's memory of their Dev-Branch), and how many commits are on your origin/Dev-Branch that aren't shared with your own Dev-Branch. Since the two names match up exactly, you're no commits ahead, and no commits behind.

This is all based on whatever information you have locally. It does not matter what their Git has at this point. What their Git has matters when you run git fetch and when you run git push, but not when you run git status.
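If you want the same two numbers outside of git status, git rev-list can count both sides of a three-dot range. The wrapper below is a sketch that assumes git is on your PATH and that it runs inside the repository; parse_counts and ahead_behind are names made up for this example. Like git status, it reads only local data and never contacts origin.

```python
import subprocess
from typing import Tuple


def parse_counts(output: str) -> Tuple[int, int]:
    """Parse the two whitespace-separated numbers rev-list prints."""
    ahead, behind = (int(field) for field in output.split())
    return ahead, behind


def ahead_behind(branch: str, upstream: str) -> Tuple[int, int]:
    """Count commits only on branch (ahead) and only on upstream (behind)."""
    result = subprocess.run(
        ["git", "rev-list", "--left-right", "--count",
         f"{branch}...{upstream}"],
        capture_output=True, text=True, check=True,
    )
    return parse_counts(result.stdout)


# Example call, inside a clone that has these names:
# ahead_behind("Dev-Branch", "origin/Dev-Branch")
```

Running git fetch first refreshes origin/Dev-Branch, so the counts then reflect what the other Git had at that moment.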

When you go to look at the various commits, such as L, Git shows you what's in that commit. None of the metadata can be changed. None of the snapshot can be changed. The metadata says the commit was made yesterday, so that's what you'll see.

(To show the snapshot, Git will actually retrieve both the snapshot in L and the snapshot in its immediate parent, K. Then Git will compare the two snapshots to see what changed. What changed is more useful, in general, than what all the contents are, when you want to look at a commit like this. But each commit is still a full snapshot.)

Upvotes: 7

Gabriel Luci

Reputation: 41008

Remember that you have a local copy of the entire git repository. When you commit something, nothing is sent to the server - the commit is saved to your local copy of the repository. You can make as many commits as you want, and no one will see them until you run git push.

On to your questions:

Your branch is ahead of 'origin/Dev-Branch' by 5 commits.

This means there are 5 commits in your local copy of the repository that are not in "origin/Dev-Branch". In other words, 5 commits that you haven't pushed yet.

After you push, your local branch and "origin/Dev-Branch" point to the same commit, so neither is ahead of the other.

git log only shows you commits, not what was or was not pushed. So if your last commit was yesterday, then that's accurate. It doesn't change the date of the commit when you push it.

Upvotes: 0

Ry-

Reputation: 225272

but git log shows that my last commit was from the day before: i.e. nothing got pushed to origin today

That’s not what that timestamp means. You committed the day before, and pushed to origin today.

Git keeps track of where your current branch is in history and the last information it received about where the upstream – the corresponding branch on your remote (GitHub) – is. It can tell you how those two histories compare. Sometimes all it takes to put two repositories (e.g. yours and GitHub’s) in sync is to tell one of them about new commits. This process doesn’t create any new commits, so you won’t see any timestamps inside Git associated with it.

Upvotes: 0
