Francis

Reputation: 11

Git log different behaviour

I see different behaviour between my laptop and GitLab CI. When I run git log -1 --date=short index.html on my laptop, it displays the last commit that touched the file.

git log -1 --date=short index.html
commit 5df9780d5431fd8efe6ba968ea74090a0189a228
Author: test <[email protected]>
Date:   2021-03-31

    Commit for index.html

Whereas when I run the same command from .gitlab-ci.yml during a pipeline, I get the last commit for the whole repository, not the last commit for index.html.

git log -1 --date=short index.html
commit 55b13a77fb7e081dc4531e040f3c2c227c0e6412
Author: test <[email protected]>
Date:   2021-09-07

    Commit for README.md

Can somebody help me understand why? I have tried displaying many different options, and the two machines seem to be set up alike, except for how git log shows the history of a file. For completeness: I work only on the master branch, and no other branch has been created.

Upvotes: 1

Views: 929

Answers (1)

torek

Reputation: 489858

This is a combination of two things:

  • git log, asked for file history, fakes it from the only history that actually exists: the commits in the repository;
  • GitLab CI, asked to clone a repository, often (for good reasons) uses a shallow clone, which carries fewer commits = less history.

In particular, the command you ran—git log -1 --date=short index.html—tells git log:

  • Start at the current, or HEAD, commit. Put this commit into the to-be-visited queue.

  • Now, for each commit in the queue (in a loop, with commits being removed from the queue in priority order to loop over them—but initially there's only one commit anyway):

    • Examine the commit. Load its saved source tree, and the saved source tree of the immediately-previous or parent commit or commits. (Merge commits are those with two or more parents.) If there is no previous commit, e.g., this is the very first commit, load up an "empty tree" for the saved parent tree.

    • If, in comparing the file index.html as saved in this commit, to the file index.html as saved in the parents (or fake empty tree), there's some difference in this file, mark this commit as "to be printed out". This difference can be a change in the file (if it's in both saved snapshots), or a change in its existence (here in this commit, gone in the parent, for instance).

    • Place the parent or parents into the queue.

    • If we're to print this commit, print it with --date=short, and then quit (-1).

That's the end of the loop. So what happens is that we print the first commit that changes the one specified file in some way, e.g., creates it initially (it wasn't in the parent) or modifies it (it's in both commits but is different in the two snapshots).
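The loop above can be approximated with plumbing commands. This is only a rough sketch for a linear history, not how git log is implemented internally: it builds a throwaway repository (the file contents and the demo identity are made up for illustration), walks the commits with git rev-list, and uses git diff-tree --root so that a parentless commit is compared against the empty tree.

```shell
#!/bin/sh
set -eu

# Throwaway repository with two commits (all names here are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "demo"
git config user.email "demo@example.invalid"

echo one > index.html
git add index.html
git commit -q -m "Commit for index.html"
expected=$(git rev-parse HEAD)

echo readme > README.md
git add README.md
git commit -q -m "Commit for README.md"

# Walk from HEAD towards the root. Stop at the first commit whose diff
# against its parent (or against the empty tree, via --root) touches
# index.html. That early exit plays the role of -1.
found=""
for c in $(git rev-list HEAD); do
    changed=$(git diff-tree --root --no-commit-id --name-only -r "$c" -- index.html)
    if [ -n "$changed" ]; then
        found=$c
        break
    fi
done

builtin=$(git log -1 --format=%H -- index.html)
echo "manual walk: $found"
echo "git log -1:  $builtin"
```

In this full (non-shallow) repository both lines should name the older commit, the one that created index.html, because the newer commit only added README.md.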

Merge commits here are tricky and generally don't get printed by this process (you can modify this behavior with additional git log flags). They do, however, push multiple parents into the queue, which makes the priority-queue aspect matter. But merge commits are probably a red herring here. Instead, let's talk a bit more about shallow clones.

To make a shallow clone, a Git client (the clone operation is run as a client) asks a server to stop sending commits after some point. That is, the server says, e.g., that the latest commit is badc0ffee, so the client says give me that commit. The server then says that the commit before badc0ffee is deadcafe but the client says: Oh, I only want one commit, don't give me deadcafe after all.

So the server doesn't. To remember that history was cut off at this point, the client records the boundary in its shallow-grafts data (the file .git/shallow), marking badc0ffee as a commit whose parent is deliberately absent. Whenever Git then follows badc0ffee's reference to its parent deadcafe, it pretends that the commit doesn't exist and that there was no such reference.

The result is that git log -1, run on the client that made this one-commit shallow clone, thinks that the one commit in the shallow clone is the only commit ever. The file index.html exists in that last commit, and because there's no parent—the shallow-ness effectively effaces the parent—it was created in that commit. So it's different from the empty tree, and git log -1 will print out the hash ID and metadata from that last commit.

If we make a shallow clone that's two commits deep, we'll have both badc0ffee and deadcafe, but the parent of deadcafe—let's say it's cabbabb1e—will be cut off. If we run git log -1 index.html here, Git will first compare index.html in badc0ffee vs index.html in deadcafe. If those are different, git log will print commit badc0ffee (and then quit, because of the -1). If not, Git will move on to commit deadcafe. Since its parent cabbabb1e is cut off via the shallow clone, git log will believe that commit deadcafe created index.html and will print out that commit (and then quit).
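The one-commit case is easy to reproduce locally. The sketch below is an illustration under assumptions, not a copy of what GitLab's runner does: it creates a small origin repository with made-up contents, then clones it twice through a file:// URL (a plain local path would make Git ignore --depth), once with --depth 1 and once in full.

```shell
#!/bin/sh
set -eu

# Hypothetical origin repository: index.html first, README.md second.
work=$(mktemp -d)
git init -q "$work/origin"
cd "$work/origin"
git config user.name "demo"
git config user.email "demo@example.invalid"

echo one > index.html
git add index.html
git commit -q -m "Commit for index.html"
index_commit=$(git rev-parse HEAD)

echo readme > README.md
git add README.md
git commit -q -m "Commit for README.md"
readme_commit=$(git rev-parse HEAD)

# file:// forces the normal transport, so --depth is honoured locally.
git clone -q --depth 1 "file://$work/origin" "$work/shallow"
git clone -q "file://$work/origin" "$work/full"

shallow_answer=$(git -C "$work/shallow" log -1 --format=%H -- index.html)
full_answer=$(git -C "$work/full" log -1 --format=%H -- index.html)
is_shallow=$(git -C "$work/shallow" rev-parse --is-shallow-repository)

echo "full clone:    $full_answer"
echo "shallow clone: $shallow_answer"
echo "shallow?       $is_shallow"
```

The full clone should report the commit that created index.html, while the depth-1 clone should report the README.md commit, which matches the difference between the laptop and the pipeline in the question.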

Conclusion

You need to know what a shallow clone is, when to use it, and when not to use it. You may need to make your GitLab setup produce a less-shallow, or not-at-all shallow, clone to get whatever effect you desire here.
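One possible way to get full history inside a job is to deepen the clone before calling git log. The sketch below simulates that locally with a depth-1 clone of a made-up origin and git fetch --unshallow; whether this, or changing the clone depth in the project's CI settings (GitLab documents a GIT_DEPTH variable for this purpose), suits a given pipeline is an assumption to check against your own setup.

```shell
#!/bin/sh
set -eu

# Hypothetical origin repository: index.html first, README.md second.
work=$(mktemp -d)
git init -q "$work/origin"
cd "$work/origin"
git config user.name "demo"
git config user.email "demo@example.invalid"

echo one > index.html
git add index.html
git commit -q -m "Commit for index.html"
index_commit=$(git rev-parse HEAD)

echo readme > README.md
git add README.md
git commit -q -m "Commit for README.md"
readme_commit=$(git rev-parse HEAD)

# Stand-in for the CI checkout: a one-commit shallow clone.
git clone -q --depth 1 "file://$work/origin" "$work/ci"
cd "$work/ci"

before=$(git log -1 --format=%H -- index.html)

# Fetch the missing ancestors and drop the shallow boundary.
git fetch -q --unshallow

after=$(git log -1 --format=%H -- index.html)
is_shallow=$(git rev-parse --is-shallow-repository)

echo "before unshallow: $before"
echo "after unshallow:  $after"
echo "still shallow?    $is_shallow"
```

Before the fetch, the answer is the README.md commit; afterwards it should be the commit that actually created index.html. Fetching everything costs time on large repositories, which is the reason shallow clones are used in CI in the first place.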

Upvotes: 1
