Nathan H

Reputation: 49371

Differences between git submodule and subtree

What are the conceptual differences between using git submodule and subtree?

What are the typical scenarios for each?

Upvotes: 434

Views: 146155

Answers (7)

dsyx

Reputation: 63

What are the conceptual differences between using git submodule and subtree?

Both submodule and subtree are ways to embed another Git repository into a Git repository. However, submodule is reference-based and subtree is copy-based.

What are the typical scenarios for each?

Let's say you need to embed repository A into your project B:

  • If A is stable (i.e., the referenced repository will not be deleted or moved) and you don't need to change A frequently during the development of B, then a submodule is better.

  • If B rarely needs to pull in upstream updates of A, or if you need to change A often while developing B, then a subtree is better.

Upvotes: 1

Pervaiz Iqbal

Reputation: 326

The simplest way to think of subtrees and submodules is that a subtree is a copy of a repository that is pulled into a parent repository while a submodule is a pointer to a specific commit or branch in another repository.
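To make the "pointer" part concrete, here is a minimal sketch that builds two throwaway repositories and inspects how the parent records the submodule. The names lib and app, the temp directory, and the demo identity are all hypothetical, and protocol.file.allow=always is only passed because newer Git versions refuse local-path submodule clones by default.

```shell
#!/bin/sh
# Sketch: show that a submodule is stored as a pointer to one commit.
# lib/app are hypothetical repository names; everything lives in a temp dir.
set -eu

work=$(mktemp -d)
cd "$work"

# Throwaway identity, passed via environment so no global config is touched.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# The repository that will be embedded.
git init -q lib
echo "hello from lib" > lib/lib.txt
git -C lib add lib.txt
git -C lib commit -q -m "lib: initial commit"
lib_head=$(git -C lib rev-parse HEAD)

# The parent repository.
git init -q app
cd app
echo "app" > README
git add README
git commit -q -m "app: initial commit"

# Newer Git blocks file-based submodule clones unless explicitly allowed.
git -c protocol.file.allow=always submodule --quiet add "$work/lib" lib
git commit -q -m "app: add lib as a submodule"

# The parent's tree holds a single "gitlink" entry for lib, not lib's files.
entry=$(git ls-tree HEAD lib)
entry_mode=$(echo "$entry" | awk '{print $1}')
entry_type=$(echo "$entry" | awk '{print $2}')
entry_sha=$(echo "$entry" | awk '{print $3}')
echo "$entry"
```

The printed entry should have mode 160000 and type commit, and its hash should be the commit that lib was at when it was added. The files themselves stay in lib's own object store.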

Upvotes: 10

pxeba

Reputation: 1776

[Git Submodule - Atlassian]

Git submodule is useful when you want to keep the embedded repository's commit history separate from the main repository. However, using submodules can be complex and difficult to manage, especially when you need to update the embedded repository.

[Git Subtree and comparison with Submodule - Atlassian]

Git subtree is a solution that allows merging one repository into another as a subdirectory, but keeping the entire commit history. It is useful when you want to share a set of files between different projects without the need to maintain a separate repository. Using a subtree is simpler than using a submodule and is generally easier to manage.

In short, if you need to keep the shared repository's commit history separate from the main repository, git submodule might be the best choice. If you need to share a set of files between different projects without the need to maintain a separate repository, git subtree might be the best choice.


Get/Update Workflow Comparison

Let's compare the commands for sending and receiving updates:

1. Submodule

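#push updates (assumes master is checked out inside the submodule):
cd path/to/submodule
git add .
git commit -m "Submodule Update"
git push origin master
cd -    # back to the top level of the main repository
git add path/to/submodule
git commit -m "Submodule ref update"
git push origin master
# The order matters: push the submodule commit first, then the main
# repository commit that references it. Otherwise others cannot fetch
# the commit the main repository points to.

#pull:
git submodule update --remote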

2. Subtree

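#push updates (run from the top level of the main repository;
#git subtree refuses to run from a subdirectory):
git add path/to/shared/repo
git commit -m "Subtree update"
git push origin master
#then publish the same changes to the shared repository
#(shared-repo is a remote, or URL, pointing at it):
git subtree push --prefix=path/to/shared/repo shared-repo master

#pull (also from the top level):
git subtree pull --prefix=path/to/shared/repo shared-repo master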

Upvotes: 7

VonC

Reputation: 1324417

What if I want the links to always point to the HEAD of the external repo?

You can make a submodule follow the HEAD of a branch of its remote repository with:

  • git submodule add -b <branch> <repository> [<path>] (to specify the branch to follow)
  • git submodule update --remote, which updates the content of the submodule to the latest HEAD of <repository>/<branch> (by default origin/master). Even with --remote, your main project still records a specific commit hash of the submodule.
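A rough sketch of those two commands in action, under assumptions: lib, app, and the branch name stable are hypothetical, the repositories are throwaway ones in a temp directory, and protocol.file.allow=always is only there because newer Git blocks local-path submodule transports by default.

```shell
#!/bin/sh
# Sketch: follow a branch with "submodule add -b" and "update --remote".
# lib/app and the branch "stable" are hypothetical names.
set -eu

work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Upstream repository with a branch called "stable".
git init -q lib
git -C lib symbolic-ref HEAD refs/heads/stable
echo v1 > lib/lib.txt
git -C lib add lib.txt
git -C lib commit -q -m "lib: v1"

# Parent repository that follows lib's "stable" branch.
git init -q app
cd app
echo "app" > README
git add README
git commit -q -m "app: initial commit"
git -c protocol.file.allow=always submodule --quiet add -b stable "$work/lib" lib
git commit -q -m "app: add lib, following stable"

# -b stores the branch name in .gitmodules.
tracked_branch=$(git config -f .gitmodules submodule.lib.branch)

# Upstream moves forward.
echo v2 > "$work/lib/lib.txt"
git -C "$work/lib" commit -q -am "lib: v2"
new_head=$(git -C "$work/lib" rev-parse HEAD)

# --remote checks out the tip of the tracked branch inside the submodule...
git -c protocol.file.allow=always submodule --quiet update --remote lib
checked_out=$(git -C lib rev-parse HEAD)

# ...but the parent's last commit still records the old hash
recorded_after=$(git ls-tree HEAD lib | awk '{print $3}')

# until the new pointer is committed explicitly.
git add lib
git commit -q -m "app: bump lib to latest stable"
recorded_final=$(git ls-tree HEAD lib | awk '{print $3}')

echo "branch=$tracked_branch checked_out=$checked_out recorded=$recorded_final"
```

So --remote changes what gets checked out, but the parent keeps pinning an exact hash. Following a branch still means committing the bumped pointer each time.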


Plus, as noted by philb in the comments, git subtree lives in Git's contrib/ directory, whereas git submodule is a core command.

Upvotes: 262

Niklas P

Reputation: 3507

The conceptual difference is:

With git submodules you typically want to split a large repository into smaller ones. Referencing a submodule is Maven-style: you reference a single commit from the other (submodule) repository. If you need a change within the submodule, you have to commit and push within the submodule, then reference the new commit in the main repository, and then commit and push the changed reference in the main repository. That way you need access to both repositories for a complete build.

With git subtree you integrate another repository in yours, including its history. So after integrating it, the size of your repository is probably bigger (so this is no strategy to keep repositories smaller). After the integration there is no connection to the other repository, and you don't need access to it unless you want to get an update. So this strategy is more for code and history reuse - I personally don't use it.
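For comparison, a small sketch of the subtree case, assuming git subtree is installed (it is a contrib command, so the script checks for it first and does nothing otherwise). The names lib and app and the demo identity are hypothetical.

```shell
#!/bin/sh
# Sketch: after "git subtree add", the other repository's files and
# history are part of the parent repository. lib/app are hypothetical.
set -eu

# git subtree is a contrib command and may be missing from this install.
have_subtree=no
if command -v git-subtree >/dev/null 2>&1 ||
   [ -x "$(git --exec-path)/git-subtree" ]; then
    have_subtree=yes
fi

entry_type=
lib_history_included=no

if [ "$have_subtree" = yes ]; then
    work=$(mktemp -d)
    cd "$work"
    export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
    export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

    # The repository to integrate.
    git init -q lib
    echo "hello from lib" > lib/lib.txt
    git -C lib add lib.txt
    git -C lib commit -q -m "lib: initial commit"
    lib_head=$(git -C lib rev-parse HEAD)
    lib_branch=$(git -C lib symbolic-ref --short HEAD)

    # The parent repository.
    git init -q app
    cd app
    echo "app" > README
    git add README
    git commit -q -m "app: initial commit"

    # Merge lib's branch into the subdirectory lib/, keeping its history.
    git subtree add --prefix=lib "$work/lib" "$lib_branch"

    # lib/ is now an ordinary directory (tree) in the parent's history...
    entry_type=$(git ls-tree HEAD lib | awk '{print $2}')

    # ...and lib's commit is an ancestor of the parent's HEAD.
    if git merge-base --is-ancestor "$lib_head" HEAD; then
        lib_history_included=yes
    fi
    echo "type=$entry_type history_included=$lib_history_included"
else
    echo "git subtree is not installed; skipping the demo"
fi
```

Nothing in the parent remembers where lib came from. That is why later git subtree pull and push calls have to name the repository and the prefix again.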

Upvotes: 166

Matt Rek

Reputation: 1811

sub-module
pushing a main repo to a remote doesn't push sub-module's files

sub-tree
pushing a main repo to remote pushes sub-tree's files

Upvotes: 31

Feng

Reputation: 5333

submodule is link;

subtree is copy

Upvotes: 504
