Reputation: 8795

How can I get the latest tag from a commit ID?

I need to get the tags for a commit ID in a GitLab repository. Searching past solutions, I'm trying to follow this one: "How to find the tag associated with a given git commit?"

I cloned the repository onto my local PC and typed the following at the command prompt, inserting the appropriate commit ID:

>>git describe --exact-match <commit-id>

and

>>git describe --tags <commit-id>

But I get the following error with both:

>>fatal: <commit-id> is neither a commit nor blob

I'm a beginner, not sure what exactly I'm doing wrong. Any help would be appreciated.

Upvotes: 0

Answers (1)

torek

Reputation: 488183

Summary

Use git describe in the submodule.

Long

The problem here, as noted in comments, is that you're using a repository that has a submodule. The commit hash ID you are using is that of a commit in the submodule. To understand this properly, you need a proper mental model of a Git repository and a submodule:

A Git repository is, at its heart, a pair of databases. One of these two databases—usually the largest one by far—contains Git objects, each of which has a hash ID. The most interesting of these objects, at least to humans, tends to be the commit objects, for reasons we'll see in a moment.

The other database in a Git repository holds name-value pairs, with the names being things like branch and tag names. Each name holds one (1) hash ID as its value. That's all Git needs here.

Peculiarly, using git clone to copy a repository makes a new repository in which all the objects are copied as-is, but hardly any of the names are copied as-is. The new clone has its own (often different) names. Only the tag names tend to be copied as-is, which is part of what makes tag names different from branch names.

Meanwhile, a submodule in Git is made up of separate but linked bits of information about some other Git repository. We'll see why this is, in just a moment, but now it's time to look at commits.
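As a rough illustration of the two databases and of what cloning does to the names, here is a minimal sketch. Everything in it is made up for the demonstration: the temporary directory, the repository names origin-repo and clone, the branch feature, the tag v0.1, and the "demo" identity.

```shell
set -eu
tmp=$(mktemp -d)    # throwaway directory; all names below are hypothetical
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Build a tiny repository: one commit, one extra branch name, one tag name.
git init -q "$tmp/origin-repo"
git -C "$tmp/origin-repo" commit -q --allow-empty -m "initial"
git -C "$tmp/origin-repo" branch feature
git -C "$tmp/origin-repo" tag v0.1

# Clone it: the objects are copied, the names are mostly rewritten.
git clone -q "$tmp/origin-repo" "$tmp/clone"

echo "names in the original:"
git -C "$tmp/origin-repo" for-each-ref --format='%(objectname:short) %(refname)'
echo "names in the clone:"
git -C "$tmp/clone" for-each-ref --format='%(objectname:short) %(refname)'

# Record what happened to each kind of name in the clone.
if git -C "$tmp/clone" show-ref --verify --quiet refs/heads/feature; then
  feature_branch=copied
else
  feature_branch=absent
fi
if git -C "$tmp/clone" show-ref --verify --quiet refs/remotes/origin/feature; then
  remote_tracking=present
else
  remote_tracking=absent
fi
if git -C "$tmp/clone" show-ref --verify --quiet refs/tags/v0.1; then
  tag_name=copied
else
  tag_name=absent
fi
echo "branch: $feature_branch, origin/feature: $remote_tracking, tag: $tag_name"
```

The clone ends up with a remote-tracking name for feature rather than a branch of its own, while the tag name comes across unchanged.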

Commits are Git's raison d'être: if Git did not store commits, we would not use Git. Since these commits are in that object database—the part that gets copied during cloning—they have a hash ID. This hash ID is unique, in the sense of being a Universally Unique Identifier or UUID. This hash ID alone is supposed to allow Git to find the commit.¹ You do, however, have to have the commit, in your Git clone, to see it.

The raw commit itself is actually quite tiny, but indirectly (through more Git objects, which is why git clone copies the entire object database) the commit object allows Git to obtain and extract a full snapshot of the entire set of files that was committed, and is now saved forever as part of that commit. It also allows Git to work backwards through history from that point in time, i.e., to every previous commit. So: each commit gives access to a full snapshot and, through its links to earlier commits, to the entire history of the project up to that point.
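To see how small a raw commit is, and how it points at its snapshot and at the previous commit, something like the following sketch can help. The repository demo, the file file.txt, and the "demo" identity are all invented for the example.

```shell
set -eu
tmp=$(mktemp -d)    # throwaway directory; names below are hypothetical
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$tmp/demo"
cd "$tmp/demo"
echo hello > file.txt
git add file.txt
git commit -q -m "first"
echo world >> file.txt
git commit -q -am "second"

# The raw commit object: a tree line (the snapshot), a parent line
# (the previous commit), author/committer lines, and the message.
git cat-file -p HEAD

# The snapshot is reached through the tree object named in the commit.
git ls-tree HEAD

# The parent recorded inside the commit is what lets Git walk backwards.
recorded_parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')
parent=$(git rev-parse HEAD^)
count=$(git rev-list --count HEAD)
echo "parent: $recorded_parent, commits reachable from HEAD: $count"
```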

One can see how this is generally pretty useful, and the idea that cloning a repository gets you the entire history, often saved into a very small space (relatively speaking), is pretty amazing. Of course "relatively small" is just that (relative), so people have come up with various ways of avoiding getting the entire thing, including "shallow clones" (which have existed for many years) and a newer "partial clone" concept (which is just a few years old now and still full of teething issues). But that's what Git is about: a repository holds commits, and then the commits hold files-as-snapshots and chain together to make history.

This brings us to the problem. Let's say that someone has a repository—full of commits, i.e., history—that implements some library, and someone else wishes to use these commits with a higher-level program that uses the library. A repository cannot hold another repository. It only holds commits and names and other objects. A commit cannot hold another commit: it holds only files, and references to other commits in the same repository. So how do we have one repository refer to another?

Git's answer here is the submodule. A submodule reference within a commit contains only one piece of information: the unique hash ID of some other commit. This "other commit" is in some other Git repository. That is, we tell Git: Go clone another Git repository and find this commit hash ID. This of course requires that we store another piece of information, namely the URL of the repository that Git should go clone.

We end up, then, with two linked bits of information: Go clone this URL and check out this commit. The URL for the clone is stored in a file named .gitmodules; this file should exist in every commit in which the hash ID appears. The commit hash ID to check out goes into what Git calls a gitlink, which—like any ordinary file stored in a commit—has a path name path/to/submodule and a raw hash ID.²
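The two linked bits can be inspected directly. The sketch below is only an illustration: the repositories lib and super, the submodule path lib, and the "demo" identity are hypothetical, and the -c protocol.file.allow=always setting is there only because recent Git versions refuse to clone a submodule from a local path by default.

```shell
set -eu
tmp=$(mktemp -d)    # throwaway directory; names below are hypothetical
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A stand-in "library" repository with a single commit.
git init -q "$tmp/lib"
git -C "$tmp/lib" commit -q --allow-empty -m "library commit"
lib_hash=$(git -C "$tmp/lib" rev-parse HEAD)

# A superproject that refers to it as a submodule at path "lib".
git init -q "$tmp/super"
cd "$tmp/super"
git -c protocol.file.allow=always submodule --quiet add "$tmp/lib" lib
git commit -q -m "add lib submodule"

# Bit one: the URL, stored in the .gitmodules file.
cat .gitmodules
url=$(git config -f .gitmodules --get submodule.lib.url)

# Bit two: the gitlink, a tree entry with mode 160000 and type "commit".
git ls-tree HEAD lib
link_mode=$(git ls-tree HEAD lib | cut -d' ' -f1)
link_hash=$(git rev-parse HEAD:lib)
echo "url=$url mode=$link_mode hash=$link_hash"
```

The hash stored in the gitlink is the hash of a commit in the library repository, not of anything in the superproject's own object database.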

When Git is checking out a commit in the "superproject" (with superproject being defined as any Git repository in which some commit contains a gitlink and a .gitmodules file), Git records the desired hash ID in its index aka staging area.³ Once that's done, a "recursive" checkout, or running git submodule update --init if necessary, will clone (if necessary) and check out the corresponding submodule commit in the correct location within the working tree.

The upshot of all of this is that you have more than one repository, in the end. The "containing" repository—which doesn't actually contain the other repository, as that's literally impossible⁴—is the "superproject" and the "contained" repository is the "submodule". A submodule can itself be a superproject for its own submodule (which is where footnotes 3 and 4 can sometimes come in again: an occasional bug can still pop up here, though it's pretty rare now), and so on. But there are definitely at least the two repositories.

To do work within the submodule, you can simply maneuver into the appropriate directory—at least if you're using command-line Git tools—and run the usual Git commands:

git clone <superproject>       # clone the top level project
cd <repo>
git switch <branch>            # extract desired commit
git submodule update --init    # clone the submodule if necessary

git -C path/to/submodule describe --always

The final command here would print out a description of the submodule commit checked out by the git submodule update --init step. That commit exists in the submodule repository, not in the superproject repository, so only a git describe command that's run in the submodule (with -C path/to/submodule in this case) can describe it. The superproject can tell you the desired hash ID, but the tag names and other names that are used by git describe were not copied until the submodule repository was cloned, and they now exist in the names database for the submodule repository.
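Here is a small end-to-end sketch of the failure from the question and of the fix. It is an illustration under assumptions, not the asker's actual setup: the repositories lib and super, the tag v1.0, and the "demo" identity are invented, and -c protocol.file.allow=always is needed only because the submodule is cloned from a local path. Note that --tags is added so that lightweight tags are considered too; without it, git describe looks only at annotated tags.

```shell
set -eu
tmp=$(mktemp -d)    # throwaway directory; names below are hypothetical
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A stand-in library repository whose only commit is tagged v1.0.
git init -q "$tmp/lib"
git -C "$tmp/lib" commit -q --allow-empty -m "library release"
git -C "$tmp/lib" tag v1.0

# A superproject using that library as a submodule at path "lib".
git init -q "$tmp/super"
git -C "$tmp/super" -c protocol.file.allow=always submodule --quiet add "$tmp/lib" lib
git -C "$tmp/super" commit -q -m "add lib submodule"

# The superproject can tell us which submodule commit it wants ...
sub_hash=$(git -C "$tmp/super" rev-parse HEAD:lib)

# ... but it cannot describe that commit, since the commit object and
# the tag names live in the submodule repository.
if git -C "$tmp/super" describe --tags --exact-match "$sub_hash" 2>/dev/null; then
  in_super=found
else
  in_super=missing
fi

# Running the same command inside the submodule works.
tag=$(git -C "$tmp/super/lib" describe --tags --exact-match "$sub_hash")
echo "superproject: $in_super, submodule: $tag"
```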

¹In practice, it does do so, but see also How does the newly found SHA-1 collision affect Git?

²For ordinary files, the associated hash ID is that of an internal Git blob object. This particular bit of implementation detail stays pretty well hidden—as in, programmers don't need to be aware of it to use Git—until you hit submodules, when it leaps out and bites you on the nose.

³We haven't covered Git's index here, but you must know about it in order to use Git effectively. This is another place where some people complain—correctly, in my opinion—that Git is full of too many leaky abstractions. Leaky abstractions are somewhat unavoidable but Git elevates them to an art. 😈

⁴In the past, Git stored the submodule repository clone outside the superproject repository entirely (in the usual hidden .git folder, but inside the submodule's part of the working tree). This proved problematic, so Git now relocates the submodule clone to a secret compartment within the superproject clone (under the superproject's .git/modules directory), but it's still a separate repository, with all that this entails. You're not supposed to have to be aware of this, but see footnote 3: this detail may leak out unexpectedly.

Upvotes: 1
