Reputation: 943
I would like to split my application into several services, but keep them all in one repository. So I wanted to add one submodule for each service (I only have two at the moment). So my project hierarchy is:
- root
  |--rootDoc.txt
  |--.git
  |
  |---- sub1
  |     |--sub1.txt
  |     |--.git
  |---- sub2
        |--sub2.txt
        |--.git
Now I made the following changes:
- sub1.txt in the sub1 submodule
- sub2.txt in the sub2 submodule
Now I'd like to return the sub1 submodule to the state before the last changes in it but keep sub2 in its current state. If that is not possible for submodules, is there another solution for my problem or would I need to use two completely different repositories?
Edit: What I tried:
c:\dev\root\sub1>git log
commit a172db9a5f11738383d28e082db2c22d3f2d3e75 (HEAD -> master, origin/master, origin/HEAD)
Author: %me%
Date: Sun Dec 2 20:24:59 2018 +0100
updated sub2
commit 0becb718a4db9c73b03fa65e332f20c7715463cb
Author: %me%
Date: Sun Dec 2 20:23:40 2018 +0100
sub1 actual now
commit 85d68703bff1af2b95a7ee8d7926d7fd700b1da4
Author: %me%
Date: Sun Dec 2 20:10:50 2018 +0100
Added submodules
commit b3b67de3e54f1db7e56d516af2baaf50541f7ca2
Author: %me%
Date: Sun Dec 2 20:05:44 2018 +0100
initial commit
c:\dev\root\sub1>git checkout 85d68703bff1af2b95a7ee8d7926d7fd700b1da4
Note: checking out '85d68703bff1af2b95a7ee8d7926d7fd700b1da4'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:
git checkout -b <new-branch-name>
HEAD is now at 85d6870 Added submodules
After this checkout my sub2 is also changed, although I checked out from the sub1 directory (where the other submodule is located).
Upvotes: 1
Views: 323
Reputation: 488453
You can do what's in your title question ("return to previous commit in a single submodule"). Every submodule is an independent repository in its own right. What's not clear is what you have actually done. I suspect that you have made one repository with several sub-directories, and perhaps another two repositories that live under the one repository but are not submodules.
It's worth stepping back here and defining some terms. I'm not really thrilled with Git's terminology here ("submodule" and "superproject" are kind of clumsy) but I will stick with them.
A submodule is a Git repository.
A superproject is a Git repository.
Obviously this is not much help, 😀 so let's add some qualifiers:
A submodule is a Git repository that is currently being used by another Git repository which we call the superproject. There is exactly one superproject for this submodule Git repo.
A superproject is a Git repository that is currently using another Git repository as a submodule. There may be multiple submodules within this superproject.
(This leads to the possibility that some Git repository is simultaneously a submodule and a superproject. This is a bit of a nightmare and you should try to avoid it, but it does happen.)
Now, when a superproject makes demands on another Git repository that the superproject is using as a submodule, the way the superproject Git does this is (at least normally) to command the submodule Git to enter detached HEAD mode. Any Git repository can be in this state, but most normal repositories aren't, except when you're in the middle of a long rebase, or are using git checkout commit-or-tag to move to a specific historic commit. Normally, when developing, you're on a branch like master or develop, which is the opposite of "detached HEAD": here the name HEAD is figuratively attached to the branch name. So git checkout master attaches your HEAD to master, and git checkout develop attaches your HEAD to develop.
(HEAD, written in all-capitals like this, always, always refers to the current commit in the current Git repository. The underlying implementation of this is that the .git directory that holds the repository has a file named HEAD in it. This .git/HEAD file either contains a branch name, in which case you're on that branch, or it contains a commit hash ID, in which case you have a detached HEAD at that commit. Since Git stores this in a file, and the file systems on Windows and macOS are usually case-insensitive, it's often possible there to use head in all lower-case, but it's better to stick with the all-capitals version. If you want a shortcut that's easier to type, @ by itself also means HEAD.)
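To see the two forms of .git/HEAD for yourself, here is a minimal sketch. It assumes a POSIX shell and Git's default on-disk format; the throwaway repository and the "demo" identity are made up for illustration.

```shell
#!/bin/sh
set -eu

# Throwaway repository in a temporary directory (hypothetical, for illustration).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# Attached: .git/HEAD holds a symbolic reference to the current branch.
attached=$(cat .git/HEAD)
echo "attached: $attached"

# Detached: .git/HEAD holds a raw commit hash instead.
git checkout -q --detach
detached=$(cat .git/HEAD)
head_hash=$(git rev-parse HEAD)
echo "detached: $detached"
```

In the attached case the file holds a line starting with ref: refs/heads/ followed by the branch name; after detaching, it holds the same hash that git rev-parse HEAD prints.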
When you want to use a regular repository, in a system in which you start by cloning the repository (rather than creating it from scratch), you do this:
git clone <url> [<directory>]
e.g., git clone https://github.com/git/git.git to clone the Git repository for Git via GitHub. This creates a git directory wherever you are right now. If for some reason you wanted the clone to be put in /tmp/git you would use git clone https://github.com/git/git.git /tmp/git. So there are two key items that Git needs in order to make a clone:
The URL is typically an https:// or ssh:// style URL, listing some upstream host / server (or cloud system such as GitHub) and a path on that host / server. (Note that user@host:path/to/repo.git is just shorthand for ssh://user@host/path/to/repo.git. On hosts such as GitHub, the two mean exactly the same thing.)
The directory is where the clone goes. It is optional: if you leave it out, Git derives the name from the last component of the URL, as with git above.
The process of adding a submodule to an existing repository is much the same:
git submodule add <url> [<directory>]
The <url> here will also typically begin with https:// or ssh://. The <directory> is the path within your repository, i.e., the place to put the submodule.
The reason for this URL-and-path is that git submodule add will in fact run git clone for you. The clone it makes will be an ordinary Git repository, because a submodule is an ordinary Git repository. Git just needs to know "where do I get the clone from?" and "where should I put it within this repository?"
The other thing that git submodule add will do (the extra part that makes your current Git repository act as a superproject to that submodule) is to create or update a file named .gitmodules, and to add an entry to your superproject's index.
Note that the subproject does not have to know about its superproject, and in the bad old days, really didn't know anything about it. (In modern Git the subproject's .git directory gets migrated into the superproject's .git directory. The .git that would be found at the submodule is replaced with a file that points the submodule to its superproject's holding area.)
Anyway, the side effect of all of this is that the set of commits in a submodule is determined by the contents of the submodule alone. The superproject has no effect on it! The submodule is just a clone of some existing URL.
This is not the way you're trying to use submodules, but before we get to that, let's look at the rest of the normal operation of all of this. We have some superproject (a local Git repository that is perhaps a clone of some origin repository) where we make our superproject commits. Within this superproject, we have now created a file named .gitmodules that gives the URL and path of another Git repository. Let's say the path is dir/sub. If we run:
cd dir/sub
we find that we are now in the work-tree of a separate clone, that has its own origin/master and master and so on; but this clone has a detached HEAD. Running git log shows the detached-HEAD commit, then its parent(s) and their parent(s) and so on, as if history ends at whatever commit we have out as the detached HEAD. This is our submodule Git repository.
If we cd back up into the original repository:
cd - # or cd ../..
we're back in the main repository. Using the ordinary file system tools shows us that dir/sub exists now and is a directory. There is a file (or if your Git is older, a directory) named dir/sub/.git. If it's a file, it contains one line reading:
gitdir: ../../.git/modules/sub
Running git status shows two added files:
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
new file: .gitmodules
new file: dir/sub
But inspecting the index (which is a little tricky; I'll use git ls-files here) shows that dir/sub is not a directory at all:
$ git ls-files --stage dir/sub
160000 50298bbf97b317f17b3e1cf9287e912fb5de886e 0 dir/sub
Entries with mode 160000 are what Git calls a gitlink.
If you know that dir/sub is a gitlink, you can view its hash ID more directly using git rev-parse. The syntax :0:dir/sub means "dir/sub from the index (at slot zero)":
$ git rev-parse :0:dir/sub
50298bbf97b317f17b3e1cf9287e912fb5de886e
These tell us the same thing, except that if dir/sub weren't a submodule, we would be able to see that in the git ls-files --stage output.
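If you want to reproduce the gitlink entry in a sandbox, a sketch along these lines should work. The "lib" and "super" repositories and the demo identity are hypothetical, and the -c protocol.file.allow=always setting is an assumption about your environment: recent Git versions refuse to clone a submodule from a local path without it.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A stand-in "library" repository (hypothetical) with one commit.
git init -q "$work/lib"
git -C "$work/lib" commit -q --allow-empty -m "library v1"

# The superproject, which adds the library as a submodule at dir/sub.
git init -q "$work/super"
cd "$work/super"
git -c protocol.file.allow=always submodule --quiet add "$work/lib" dir/sub

# Mode 160000 marks a gitlink; the hash is the submodule's current commit.
entry=$(git ls-files --stage dir/sub)
echo "$entry"
mode=${entry%% *}
link_hash=$(git rev-parse :0:dir/sub)
sub_hash=$(git -C dir/sub rev-parse HEAD)
echo "index says:     $link_hash"
echo "submodule HEAD: $sub_hash"
```

The two hashes printed at the end should match, since git submodule add stages whatever commit the fresh clone has checked out.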
The general idea here is that, in your superproject, you use some sort of third-party library (say, Google gRPC) that you personally don't control in any way. Instead, you write your software and make it work with one particular version of that library:
$ (cd dir/sub; git checkout v3.2.1)
By checking out some particular tag in the submodule, you move the detached HEAD to that particular commit. Then you make any changes needed to your own project—your superproject—to make it work with v3.2.1 or whatever version that is:
$ ... make some changes ...
$ git add ... files ...
Having now updated your files, you now also update the gitlink entry that says that your superproject Git should git checkout the one particular commit that you have right now in your submodule:
$ git add dir/sub # update the gitlink to whatever hash v3.2.1 represents
Now when you make a new commit, the superproject commit continues to list the other repository (with its URL, whatever that is, and its path, dir/sub) in your .gitmodules, and this same commit declares: "This commit works with the submodule detached to <gitlink submodule hash>."
So, whenever someone runs git clone on your superproject, and then does a git checkout of that particular superproject commit, a subsequent:
$ git submodule update --init
will make sure that dir/sub has that particular gitlink-ed commit checked out, as the detached HEAD. Now your superproject and submodule are in sync, and you can build.
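The whole round trip (pin a commit, commit the superproject, clone it elsewhere, update the submodule) can be sketched as follows. Everything here is a made-up sandbox: the "lib", "super" and "clone" directories, the demo identity, and the two empty "library" commits standing in for tagged releases. The --init flag is there because a fresh clone has not registered its submodules yet, and -c protocol.file.allow=always is only needed because the sandbox uses local paths as URLs.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical library with two commits, standing in for two releases.
git init -q "$work/lib"
git -C "$work/lib" commit -q --allow-empty -m "library v1"
git -C "$work/lib" commit -q --allow-empty -m "library v2"

# Superproject: add the library, then pin it to the older commit.
git init -q "$work/super"
cd "$work/super"
git -c protocol.file.allow=always submodule --quiet add "$work/lib" dir/sub
git -C dir/sub checkout -q --detach HEAD~1
git add dir/sub
git commit -q -m "use library v1"

# Someone else clones the superproject; dir/sub starts out empty.
git clone -q "$work/super" "$work/clone"
cd "$work/clone"
git -c protocol.file.allow=always submodule --quiet update --init

pinned=$(git rev-parse HEAD:dir/sub)
checked_out=$(git -C dir/sub rev-parse HEAD)
subject=$(git -C dir/sub log -1 --format=%s)

# git symbolic-ref fails when HEAD is detached.
if git -C dir/sub symbolic-ref -q HEAD >/dev/null; then
    state=attached
else
    state=detached
fi
echo "submodule is $state at $checked_out ($subject)"
```

The clone ends up with the submodule detached at the commit the superproject recorded, not at the tip of the library's branch.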
In your case, you already have the submodule Git repositories. They may or may not have a suitable upstream repository. They exist at sub1 and sub2. I'll use, as my example, dir/sub again, though:
$ git submodule add ./dir/sub dir/sub
Adding existing repo at 'dir/sub' to the index
The URL here, ./dir/sub, is pretty useless to anyone else. (It has to start with ./ or ../ to be relative to the current working directory; Git refuses to add the submodule without the leading ./.)
At this point, the same thing happens as with a normal URL: Git has created or updated your .gitmodules to list the URL and path:
$ cat .gitmodules
[submodule "dir/sub"]
path = dir/sub
url = ./dir/sub
and put the hash ID that corresponds to the submodule's HEAD into the index to serve as the next committed gitlink entry:
$ (cd dir/sub; git rev-parse HEAD)
1fdcf14961c81d03496b359389058410f0169782
$ git rev-parse :0:dir/sub
1fdcf14961c81d03496b359389058410f0169782
$ git status --short
A  .gitmodules
A  dir/sub
Thus, if you now make a new commit at this point, the new commit will have the .gitmodules and index entries needed to make this Git repository attempt to manage (or clone, if it's missing) the other Git repository into dir/sub, based on the URL ./dir/sub.
This URL is of course entirely useless unless there's already a Git repository at dir/sub, but that's how we tell this Git that it is being the superproject to another Git repository at dir/sub. You can use Git this way, and as long as you already have another Git repository at dir/sub, your superproject Git will be OK with that and will command it. The command your superproject Git will issue to the submodule Git is: "Check out this one specific commit, as a detached HEAD."
Assume you go into the submodule and use git checkout to check out, or even create, some other commit, perhaps by doing git checkout of some branch name and then maybe working in the repository as usual and committing. Then you cd back to the superproject and run git status. Your Git will tell you that the submodule is modified (note the blank before the M here):
$ git status --short
 M dir/sub
This modification exists, but is not yet in your index, i.e., is not yet set up to be committed:
$ (cd dir/sub; git rev-parse HEAD)
860be47095f79afbf94c62d0c3936a9875905e16
$ git rev-parse :0:dir/sub
1fdcf14961c81d03496b359389058410f0169782
As you can see, the submodule is detached at 860be47095f79afbf94c62d0c3936a9875905e16, even though the index says that the next commit will contain a directive to use 1fdcf14961c81d03496b359389058410f0169782. This is exactly like any modified file in the same repository, except that you use git add here to tell Git "put the new hash ID in" rather than telling it "copy the work-tree contents in".
Hence, once we do git add, the --short status output will move the M one column to the left:
$ git add dir/sub
$ git status --short
M  dir/sub
because now the superproject's index entry for the submodule differs from the gitlink recorded in the superproject's HEAD commit, but does match the actual submodule as found in the work-tree. So now, if everything is ready and we want to tell our superproject Git to command the submodule Git to use 860be47095f79afbf94c62d0c3936a9875905e16 in the next commit we make, we're ready to make that commit now:
$ git commit
[edit a message, etc]
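Here is a small sandbox version of that sequence, capturing the short status before and after git add. As before, the "lib" and "super" repositories and the demo identity are invented for the example, and -c protocol.file.allow=always is only there because the submodule URL is a local path.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical library repository with two commits.
git init -q "$work/lib"
git -C "$work/lib" commit -q --allow-empty -m "library v1"
git -C "$work/lib" commit -q --allow-empty -m "library v2"

# Superproject that records the submodule at its newest commit (v2).
git init -q "$work/super"
cd "$work/super"
git -c protocol.file.allow=always submodule --quiet add "$work/lib" dir/sub
git commit -q -m "add submodule at v2"

# Move the submodule to its previous commit (v1) as a detached HEAD.
git -C dir/sub checkout -q --detach HEAD~1

before=$(git status --short dir/sub)   # work-tree differs from the index
git add dir/sub                        # record the new hash ID in the index
after=$(git status --short dir/sub)    # index differs from the last commit
printf 'before add: [%s]\nafter add:  [%s]\n' "$before" "$after"
```

The brackets in the printed lines make the position of the M visible: second column before git add, first column after.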
Again, the keys here are:
The superproject lists each submodule's URL and path in .gitmodules, as needed. A new clone of just the superproject obviously does not have any of the submodules cloned yet, so that's what the .gitmodules entries are good for: they provide the URL and the path!
You enter the submodule and choose the commit that becomes its HEAD: that gets the superproject Git the actual hash ID, and lets you git add that hash ID to the superproject's index, ready for the next commit you make in the superproject.
The superproject Git commands the submodule Git to git checkout, as a detached HEAD, the one specific commit hash ID that is in the superproject's index right now.
If you want to make your superproject command multiple submodules, you git submodule add all those submodules. To make sure those submodules get the right commit hash ID checked out as detached HEADs, you enter the submodules, put them on the right commits, and then git add the submodules in the superproject.
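Applied to the layout in the question, rolling back only sub1 while leaving sub2 alone might look like the sketch below. It builds its own sandbox, so the "sub1-origin" and "sub2-origin" repositories, the commit messages and the demo identity are all assumptions; in a real project you would run only the three commands in the middle (checkout inside sub1, then git add sub1 and git commit in the superproject).

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Two hypothetical service repositories, each with two commits.
for name in sub1 sub2; do
    git init -q "$work/$name-origin"
    git -C "$work/$name-origin" commit -q --allow-empty -m "$name: first"
    git -C "$work/$name-origin" commit -q --allow-empty -m "$name: second"
done

# The superproject ("root") uses both as submodules.
git init -q "$work/root"
cd "$work/root"
for name in sub1 sub2; do
    git -c protocol.file.allow=always submodule --quiet add \
        "$work/$name-origin" "$name"
done
git commit -q -m "Added submodules"
sub2_before=$(git rev-parse :0:sub2)

# Roll back only sub1, then record that choice in the superproject.
git -C sub1 checkout -q --detach HEAD~1
git add sub1
git commit -q -m "Pin sub1 to its previous commit"

sub1_subject=$(git -C sub1 log -1 --format=%s)
sub2_subject=$(git -C sub2 log -1 --format=%s)
sub2_after=$(git rev-parse :0:sub2)
echo "sub1 is at: $sub1_subject"
echo "sub2 is at: $sub2_subject"
```

Note that the checkout runs inside sub1's own repository. Running git checkout <hash> on a superproject commit instead moves the superproject, which changes what it records for every submodule at once.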
In modern Git, the git submodule command has some fairly fancy tricks to coordinate updating submodules using branch names found in the remote (origin, usually) for the submodule. The idea here is that if you are using, say, Google gRPC, and you want to upgrade, git submodule can replace several of the above steps (cd-ing into the submodule, running git fetch, running git checkout, and cd-ing back) with one step. But the actual design of submodules is still "detached HEAD as commanded by superproject": it's up to you to make sure that the superproject Git repository records the correct submodule hash IDs.
Upvotes: 5