Mukunda Modell

Reputation: 974

My project uses over 100 git submodules. Which submodule alternative can handle a large number of repositories gracefully?

I've been researching git subtree and other alternatives to git submodules. My project has well over 100 submodules and it's very unwieldy to manage them all.

Can anyone recommend a workflow that works well with a large number of repositories that need to be kept in sync?

Upvotes: 8

Views: 7466

Answers (3)

vaughan

Reputation: 7472

It is far better to use a monorepo. The one sensible reason for submodules would be if you need different packages to have different access privileges.

If that is the case, split the code into separate monorepos based on access privileges, then use https://github.com/ingydotnet/git-subrepo to bring all of those monorepos together in a single monorepo.

Upvotes: 1

Mac Lara

Reputation: 72

I've had the same issue, not with 100 submodules but with about 15-20. I built a CLI to assist with commit, push, pull, rebase, checkout, etc. I also used hard linking within my applications, so the CLI handles that too, but hard linking is not necessary. The CLI is written in Go and has releases for a wide range of OS platforms.

For my applications, my workflow usually has a ".boiler" folder where all my submodules go. I hard link files within .boiler to the "src" of my application, so when I edit a linked file, the edit also lands in the source file, which lives in the git submodule.

Here's the link to the CLI, with install instructions. Of course, you can also just download the release and put it in any directory that's on your PATH.

https://github.com/ml27299/lit-cli

Upvotes: 2

ivan.sim

Reputation: 9288

If your project has over 100 git submodules of components and dependencies, managing them will be unwieldy no matter which approach you use :-) I suggest looking for ways to script and automate as many parts as possible. Trust me, the novelty of playing with and chaining git commands wears off very quickly for most people, especially when deadlines are approaching. There is already a very good answer here comparing the different approaches to managing git sub-projects.

Regarding workflow, I would first separate the repositories that are under your control from those that aren't, i.e. 3rd party repositories.

For 3rd party repositories which don't change often (either via merges or upstream PRs), you can still use submodules. Typically, you will point these submodules at a stable tag. Syncing them is just a matter of running (or scripting) git submodule update --recursive --remote. If these 3rd party dependencies can be specified in a package management tool like Bundler (for Ruby projects), that will help simplify your subproject management.
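As a rough illustration of scripting that sync step, here is a minimal Python sketch. The function name `sync_submodules` and the `dry_run` switch are made up for this example; the only git invocation is the command quoted above.

```python
import subprocess

# The exact command from the paragraph above, as an argument list.
SYNC_COMMAND = ["git", "submodule", "update", "--recursive", "--remote"]


def sync_submodules(repo_dir=".", dry_run=False):
    """Run the submodule sync in repo_dir and return the command used.

    With dry_run=True nothing is executed; the command is only printed,
    which is handy for checking a script before letting it loose.
    """
    if dry_run:
        print("would run in", repo_dir, ":", " ".join(SYNC_COMMAND))
    else:
        # check=True raises CalledProcessError if git exits non-zero,
        # so a failed sync cannot go unnoticed in a larger script.
        subprocess.run(SYNC_COMMAND, cwd=repo_dir, check=True)
    return list(SYNC_COMMAND)
```

Calling `sync_submodules("path/to/superproject")` from a nightly job or a pre-build step is one way to keep the 3rd party submodules current without anyone typing the command by hand.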

For repositories that you own and that change often, gitslave and git-subtree are two alternatives, depending on your team's preferences.

gitslave multiplexes git operations across multiple repositories. In other words, when you branch, merge, commit, push, pull, etc., each command will be run on the parent project and all slaves in turn. This requires the team to work in a top-down manner, starting from the super-project and going down to the slaves.

git-subtree uses Git's subtree merge functionality to achieve a similar effect to submodules, by actually storing the files in the main repository and merging changes directly into that repository. The end result is a canonical repository with the option of including all the subprojects' history. In a way, this allows team members to focus more on the subtrees they are responsible for, but it requires extra work to merge back to the parent tree.
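To make the subtree workflow a bit more concrete, the sketch below builds the three `git subtree` commands it usually involves: `add` to import a subproject, `pull` to merge upstream changes, and `push` to send local changes back. The helper names are hypothetical, and the prefix, URL, and branch in the usage note are placeholders, not anything from a real project.

```python
import subprocess


def subtree_command(action, prefix, repository, ref, squash=True):
    """Build a `git subtree` command line as an argument list.

    action:     "add", "pull", or "push"
    prefix:     directory in the main repository holding the subproject
    repository: URL or remote name of the subproject
    ref:        branch or tag to add/pull from, or branch to push to
    """
    if action not in ("add", "pull", "push"):
        raise ValueError(f"unsupported action: {action}")
    cmd = ["git", "subtree", action, f"--prefix={prefix}"]
    # --squash folds the subproject's history into a single commit.
    # It applies to add and pull only, so it is skipped for push.
    if squash and action in ("add", "pull"):
        cmd.append("--squash")
    cmd += [repository, ref]
    return cmd


def run_subtree(action, prefix, repository, ref, squash=True):
    """Execute the command in the current repository; raises on failure."""
    subprocess.run(subtree_command(action, prefix, repository, ref, squash),
                   check=True)
```

For example, `run_subtree("pull", "libs/foo", "https://example.com/foo.git", "main")` would merge the latest upstream `main` into `libs/foo`. Passing `squash=False` keeps the subproject's full history in the main repository instead.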

As a developer, my preference is to work at the lower sub-project level (to do my "red, green, refactor" cycle), and touch the parent projects only when necessary. But regardless of whether you choose a top-down or bottom-up workflow, try to identify the repetitive, error-prone steps in your branching and merging strategy, and script them as much as possible.
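One generic way to script such steps, whichever tool you pick, is a small helper that runs the same git command in every repository and reports which ones failed. This is only a sketch: the function names and the idea of passing in a list of directories are assumptions for illustration, and it relies on nothing beyond git's standard `-C <path>` option.

```python
import subprocess


def run_in_each(repo_dirs, git_args, dry_run=False):
    """Run `git <git_args>` in every directory of repo_dirs.

    Returns a dict mapping each directory to the command's exit code,
    so that one failing repository does not stop the whole run.
    """
    results = {}
    for repo in repo_dirs:
        # `git -C <path>` runs the command as if started in <path>.
        cmd = ["git", "-C", str(repo), *git_args]
        if dry_run:
            print(" ".join(cmd))
            results[str(repo)] = 0
        else:
            results[str(repo)] = subprocess.run(cmd).returncode
    return results


def failed(results):
    """Return the sorted list of directories whose command exited non-zero."""
    return sorted(repo for repo, code in results.items() if code != 0)
```

For instance, `failed(run_in_each(dirs, ["checkout", "-b", "release-1.2"]))` would create the same branch everywhere and list the repositories that need manual attention (the branch name here is just a placeholder).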

Upvotes: 10
