Reputation: 49371
What are the conceptual differences between using git submodule and subtree?
What are the typical scenarios for each?
Upvotes: 434
Views: 146155
Reputation: 63
What are the conceptual differences between using git submodule and subtree?
Both submodule and subtree are ways to embed another Git repository into a Git repository. However, submodule is reference-based and subtree is copy-based.
What are the typical scenarios for each?
Let's say you need to embed repository A into your project B:
If A is stable (i.e., the referenced repository will not be deleted or moved) and you don't need to change A frequently during the development of B, then a submodule is better.
If B rarely needs to pull in updates from A, or if you need to change A frequently while developing B, then a subtree is better.
Upvotes: 1
Reputation: 326
The simplest way to think of subtrees and submodules is that a subtree is a copy of a repository that is pulled into a parent repository while a submodule is a pointer to a specific commit or branch in another repository.
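To make the "pointer" idea concrete, here is a minimal sketch that builds two throwaway repositories and inspects what the parent actually stores for a submodule. The repository names A and B, the vendor/A path, and the demo identity are assumptions made up for illustration; the protocol.file.allow setting is only there because the demo clones from a local path.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)

# Hypothetical library repository A with a single commit.
git init -q "$tmp/A"
echo "hello" > "$tmp/A/lib.txt"
git -C "$tmp/A" add lib.txt
git -C "$tmp/A" commit -q -m "A: initial commit"

# Hypothetical project repository B that embeds A as a submodule.
git init -q "$tmp/B"
cd "$tmp/B"
# protocol.file.allow is needed only because A is a local path in this demo.
git -c protocol.file.allow=always submodule --quiet add "$tmp/A" vendor/A
git commit -q -m "B: add A as a submodule"

# B's tree holds one "gitlink" entry for vendor/A: a mode and a commit hash,
# not A's files.
entry=$(git ls-tree HEAD vendor/)
entry_mode=$(echo "$entry" | awk '{print $1}')
pinned=$(echo "$entry" | awk '{print $3}')
a_head=$(git -C "$tmp/A" rev-parse HEAD)
echo "$entry"
```

The printed entry should show a special mode and the hash of A's commit, which is all that B records about A.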
Upvotes: 10
Reputation: 1776
Git submodule is useful when you want to keep the embedded repository's commit history separate from the main repository. However, using submodules can be complex and difficult to manage, especially when you need to update the embedded repository.
[Git Subtree and comparison with Submodule - Atlassian]
Git subtree is a solution that allows merging one repository into another as a subdirectory, but keeping the entire commit history. It is useful when you want to share a set of files between different projects without the need to maintain a separate repository. Using a subtree is simpler than using a submodule and is generally easier to manage.
In short, if you need to keep the shared repository's commit history separate from the main repository, git submodule might be the best choice. If you need to share a set of files between different projects without the need to maintain a separate repository, git subtree might be the best choice.
Let's compare the commands for sending and receiving updates:
1. Submodule
# push updates:
cd path/to/submodule
git add .
git commit -m "Submodule Update"
git push origin master
cd ..
git add submodule
git commit -m "Submodule ref update"
git push origin master
# These steps must be done in this order; it is easy to get into trouble otherwise.
# pull:
git submodule update --remote
2. Subtree
# push updates:
cd path/to/shared/repo
git add .
git commit -m "Subtree update"
git push origin master
# then, from the top level of the main repository (git subtree refuses to run from a subdirectory):
cd -
git subtree push --prefix=path/to/shared/repo shared-repo master
# pull:
git subtree pull --prefix=path/to/shared/repo shared-repo master
Upvotes: 7
Reputation: 1324417
What if I want the links to always point to the HEAD of the external repo?
You can make a submodule follow the HEAD of a branch of its remote repo with:
- git submodule add -b <branch> <repository> [<path>] (to specify a branch to follow)
- git submodule update --remote, which will update the content of the submodule to the latest HEAD from <repository>/<branch> (by default origin/master).
Your main project will still track the hashes of the HEAD of the submodule even if --remote is used, though.
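The following sketch walks through that behaviour on throwaway repositories. The names A and B, the vendor/A path, the branch name main, and the demo identity are all assumptions for illustration; protocol.file.allow is set only because the demo uses a local path as the remote.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)
allow="protocol.file.allow=always"   # needed only for local-path remotes

# Hypothetical external repository A with a branch named main.
git init -q "$tmp/A"
git -C "$tmp/A" symbolic-ref HEAD refs/heads/main
echo "v1" > "$tmp/A/lib.txt"
git -C "$tmp/A" add lib.txt
git -C "$tmp/A" commit -q -m "A: v1"

# Hypothetical main project B whose submodule follows A's main branch.
git init -q "$tmp/B"
cd "$tmp/B"
git -c "$allow" submodule --quiet add -b main "$tmp/A" vendor/A
git commit -q -m "B: add A, following main"

# A moves ahead by one commit.
echo "v2" > "$tmp/A/lib.txt"
git -C "$tmp/A" commit -q -am "A: v2"

# Fetch A's main and check out its newest commit in B's working tree.
git -c "$allow" submodule --quiet update --remote
checked_out=$(git -C vendor/A rev-parse HEAD)
recorded_before=$(git rev-parse HEAD:vendor/A)

# B's history still records the old hash until the new one is committed.
git add vendor/A
git commit -q -m "B: bump A to latest main"
recorded_after=$(git rev-parse HEAD:vendor/A)
a_head=$(git -C "$tmp/A" rev-parse HEAD)
```

In this sketch, --remote changes what is checked out in the working tree, while the hash recorded in B only changes with the explicit add and commit at the end.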
Plus, as noted by philb in the comments, git subtree lives in contrib/, as opposed to git submodule (a core command).
Upvotes: 262
Reputation: 3507
The conceptual difference is:
With git submodules you typically want to separate a large repository into smaller ones. The way of referencing a submodule is Maven-style: you reference a single commit from the other (submodule) repository. If you need a change within the submodule, you have to commit and push within the submodule, then reference the new commit in the main repository, and then commit and push the changed reference in the main repository. As a result, you need access to both repositories for a complete build.
With git subtree you integrate another repository into yours, including its history. So after integrating it, your repository is probably bigger (this is not a strategy for keeping repositories small). After the integration there is no connection to the other repository, and you don't need access to it unless you want to get an update. So this strategy is more for code and history reuse. I personally don't use it.
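As a rough illustration of the subtree side, the sketch below copies a throwaway repository into another one and then deletes the original. It assumes git subtree is installed (it ships separately from core git on some systems); the names A and B, the vendor/A path, the branch name main, and the demo identity are made up for the example.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)

# Hypothetical repository A to be integrated.
git init -q "$tmp/A"
git -C "$tmp/A" symbolic-ref HEAD refs/heads/main
echo "hello" > "$tmp/A/lib.txt"
git -C "$tmp/A" add lib.txt
git -C "$tmp/A" commit -q -m "A: initial commit"

# Hypothetical main repository B; subtree add needs an existing commit.
git init -q "$tmp/B"
cd "$tmp/B"
echo "project" > README
git add README
git commit -q -m "B: initial commit"

# Copy A's files and history into B under vendor/A.
git subtree add --prefix=vendor/A "$tmp/A" main

# A can now disappear; B still has both the files and A's commits.
rm -rf "$tmp/A"
tracked=$(git ls-files vendor/A)
a_commits=$(git log --format=%s | grep -c "^A: initial commit$")
```

Adding --squash to the subtree add command would collapse A's history into a single commit, which keeps B's log shorter at the cost of the detailed history.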
Upvotes: 166
Reputation: 1811
Submodule:
pushing the main repo to a remote doesn't push the submodule's files
Subtree:
pushing the main repo to a remote pushes the subtree's files
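The submodule half of this can be checked with a small experiment: push a main repo to a remote, clone it, and look at the submodule directory. This is a sketch on throwaway repositories; A, B, remote.git, clone, the vendor/A path, and the demo identity are hypothetical names, and protocol.file.allow is only needed because everything is a local path.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)
allow="protocol.file.allow=always"   # needed only for local-path remotes

# Hypothetical library repository A.
git init -q "$tmp/A"
echo "hello" > "$tmp/A/lib.txt"
git -C "$tmp/A" add lib.txt
git -C "$tmp/A" commit -q -m "A: initial commit"

# Hypothetical main repository B with A as a submodule.
git init -q "$tmp/B"
git -C "$tmp/B" symbolic-ref HEAD refs/heads/main
cd "$tmp/B"
git -c "$allow" submodule --quiet add "$tmp/A" vendor/A
git commit -q -m "B: add A as a submodule"

# Push B to a bare remote, then clone it as a teammate would.
git init -q --bare "$tmp/remote.git"
git push -q "$tmp/remote.git" main
git clone -q -b main "$tmp/remote.git" "$tmp/clone"

# Only the pointer travelled with the push: the submodule directory is empty.
before=$(ls -A "$tmp/clone/vendor/A" | wc -l)

# Fetching A's files is a separate step that needs access to A itself.
git -C "$tmp/clone" -c "$allow" submodule --quiet update --init
after=$(cat "$tmp/clone/vendor/A/lib.txt")
```

With a subtree, by contrast, the embedded files are ordinary tracked files of the main repo, so a plain clone would already contain them.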
Upvotes: 31