PRinCEKtd

Reputation: 270

How to set git submodules

The scenario

We have a large project with a lot of sub-projects in sub-directories and interdependencies. We are planning to outsource a part of the project.

Our project structure looks like this (each entry is a subfolder name):

MAIN
|_Custom
|   |_Source
|   |   |_CustA (Contains multiple projects each in own directory)
|   |   |_CustB (Contains multiple projects each in own directory)
|   |
|   |_Dll
|   |  |_Debug
|   |  |_Release 
|   |
|   |_Lib
|      |_Debug
|      |_Release
|
|_Dll
|   |_Debug
|   |_Release
|
|_Lib
|  |_Debug
|  |_Release
|
|_Plugins
|  |_Dll
|  |  |_Debug
|  |  |_Release
|  |
|  |_Source
|     |_PluginA
|     |_PluginB
|
|_Source
   |_Module1
   |   |_M1A
   |   |_M1B
   |
   |_Module2
       |_M2A

As mentioned above, the 'Custom' part is what we would like to outsource. These custom projects depend on the dll and lib files in the Main/Custom/Dll, Main/Custom/Lib, Main/Dll, Main/Lib and Main/Plugins/Dll folders to run. The trouble is that, irrespective of the root drive, the Main/Dll, Main/Lib and Main/Plugins/Dll folders must keep exactly the same hierarchy and position inside the Main folder.

That is, suppose CustA has a project under it that depends on some dll from Main. All projects under CustA must set their output paths so that the exe and dll files go to Main/Custom/Dll and the lib outputs go to Main/Custom/Lib. Such an exe (suppose it is inside Release) must look for the referenced main dll using the relative path "..\..\..\Dll\Release", which points to the Main/Dll/Release folder, and similarly for any main-plugin dlls. It cannot reference the dll or lib from some other arbitrarily set path.

The requirement:

When Main is cloned by our own people, they must get all the source, dlls and other files under Main. For Custom, however, the folders Main/Custom/Source, Main/Custom/Dll and Main/Custom/Lib must be created (they may contain an empty placeholder file in case git does not allow empty folders), but the specific custom modules (like CustA with its subdirectories and its output exe, dll and lib files) must not be cloned. The custom modules (CustA, CustB, CustC...) must be explicitly pulled or cloned one by one as needed under the Main/Custom/Source folder to get their source code, and then built to obtain their exe, dll and lib files.

On the other hand, when outsourcing, it must be easy to set up at their end as well. They should be able to clone Main in such a way that they obtain the Main/Dll, Main/Lib and Main/Plugins/Dll folders with their exe, dll, lib and other output files, but not the source code inside Main/Source and Main/Plugins/Source. Also, as outsourcing will be done per Custom module, a developer who has been assigned CustA must be able to get the source code for all the projects under CustA easily, but must not be able to clone or pull Custom/Source/CustB.
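To illustrate the placeholder idea: git tracks files rather than directories, so an empty folder only survives a clone if it contains at least one tracked file. The following is a minimal sketch, assuming a throwaway repository in a temp directory and the file name `.gitkeep` (a common convention, not something git itself requires).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository standing in for Main (hypothetical location).
demo="$(mktemp -d)"
git init -q "$demo/Main"
cd "$demo/Main"

# Each folder that must exist after a clone gets an empty tracked file.
# ".gitkeep" is only a naming convention; any file name would work.
for dir in Custom/Source Custom/Dll/Debug Custom/Dll/Release \
           Custom/Lib/Debug Custom/Lib/Release; do
    mkdir -p "$dir"
    touch "$dir/.gitkeep"
done

git add .
git -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q -m "Add empty Custom folder skeleton"

# List the tracked placeholder files.
git ls-files
```

A clone of this repository recreates the Custom/Source, Custom/Dll and Custom/Lib folders without any custom module inside them.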

What I have already tried:

The whole Main and everything inside it currently is backed up to an SVN repository on our own server machine. But we are looking to migrate to git, and use Nulabs-backlog for project issue tracking and management.

I did some research and created a copy of our project structure with dummy files and was able to create a test repo with all (I mean ALL) the files and subfolders, but this did not allow the restricted access like I mentioned above.

I understand that I can partition the whole project into multiple smaller repositories and then use the git submodule feature to reference specific repositories from other repos. So I created separate repositories from Main/Plugins/Source and Main/Source, and, for the custom modules, one repository each for Main/Custom/Source/CustA, Main/Custom/Source/CustB, Main/Custom/Source/CustC etc., and uploaded them to the remote. Then I created a repository for the Main folder itself and added the Main/Dll, Main/Lib and Main/Plugins/Dll folders to it. Here, the #/Source modules show up as submodules, which at first seems OK. When I push this main repo to the remote, the remote also shows CustA, CustB..., Main/Source, Main/Plugins/Source etc. as submodules, while the Dll and Lib folders show the correct files.

But I cannot understand how to clone these properly.

The problem:

When I clone the Main repo from the remote, the clone does recreate the outsource scenario: the source folders contain nothing because they are submodules, while the Dll and Lib folders are properly populated. But when I try to pull the Main/Source or Main/Plugins/Source folders explicitly, it does not work. Git does not let me set a remote for the source folders, as these folders are themselves part of the main repo. It also does not let me run a pull after I delete these empty source folders, recreate them and point their remote at the actual submodule repo URL.
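For reference, a submodule folder is normally populated with `git submodule update --init -- <path>` rather than by pulling inside it or setting a remote on it by hand. Below is a minimal local sketch of that behaviour. The repository names (`source-repo`, `main-repo`, `outsourced-clone`, `dev-clone`) and file contents are invented for the demo, and `protocol.file.allow=always` is only needed because the demo uses local paths as submodule URLs, which recent Git versions block by default.

```shell
#!/usr/bin/env bash
set -euo pipefail

export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work="$(mktemp -d)"
cd "$work"

# Stand-in for the separate source repository (hypothetical content).
git init -q source-repo
echo "int main() { return 0; }" > source-repo/main.cpp
git -C source-repo add main.cpp
git -C source-repo commit -q -m "Add source"

# Stand-in for Main: Dll is tracked directly, Source is a submodule.
git init -q main-repo
mkdir -p main-repo/Dll/Release
echo "binary placeholder" > main-repo/Dll/Release/core.dll
git -C main-repo -c protocol.file.allow=always \
    submodule --quiet add "$work/source-repo" Source
git -C main-repo add Dll
git -C main-repo commit -q -m "Add binaries and Source submodule"

# A plain clone leaves the submodule folder empty (the outsource view).
git clone -q main-repo outsourced-clone

# Initialising the submodule by path fills it in (the in-house view).
git clone -q main-repo dev-clone
git -C dev-clone -c protocol.file.allow=always \
    submodule --quiet update --init -- Source

ls dev-clone/Source
```

Note that submodules do not restrict access on their own. Whether a given developer can initialise `Source` or a particular custom module depends on the read permissions set on each repository on the server.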

Is the partitioning that I did wrong? Or is the clone step wrong? If so, how can I set up git properly to allow the above requirements?

Upvotes: 0

Views: 132

Answers (1)

PRinCEKtd

Reputation: 270

Finally I managed to figure out things with a lot of googling and other ideas I found on SO itself.

I went with a simple straightforward Git working tree. Basically, I partitioned the whole project structure into 3 parts.

Part 1: Stores the actual folder-subfolder structure as well as the Dlls, pdb, lib and header files. These files are necessary for the development of the 'Custom' part as well as for the 'Main' application itself. Thus, this repo allows access to both our local team as well as to the outsourced team.

Part 2: Stores the actual code (cpp) and csproj, sln and related files related to the development of the 'Main' application itself. This repo allows access only to our own local team.

Part 3: Stores the 'Custom' modules. These modules are divided, each module (CustA, CustB, CustC...) to its own repo. Our local team has access to all of these repos. The outsourced team has access only to the repo containing the module that has been assigned to it.

Each of these repos already has a master and a 'Develop' branch. The actual development is done on a custom branch derived from the 'Develop' branch and merged back only when fully completed. The 'Develop' branch is synced to master on each release, so that the master branch always contains only stable release code.

I wrote 2 simple bash scripts to allow easy setup of the development repo.

The script for the local team asks for a root folder and, if a custom module is being developed, the name of that module and the URL of its repo. It then creates an empty Git repo on the local machine where it is run, pulls the 'Part 1' repo (which is needed in every development scenario) and continues with pulling the 'Part 2' repo into the correct folders. If a custom module name and URL are provided, it goes on to pull the appropriate 'Part 3' repo.

Next, as the 'Part 1' repo is not generally modified, the script navigates to its root folder and renames '.git' to something else to prevent accidental commits or pushes, so that Part 1 no longer acts as a repo. When you need to change or update this repo, you have to explicitly rename the folder back to '.git', pull to bring it up to the latest version, make the changes and push. Of course, there is a small catch: after finishing, we need to remember to rename the '.git' folder to something else again, and to take care that the proper header files are pushed to Part 1 while the code is pushed to Part 2.
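The rename trick can be sketched as two small helpers. This is only an illustration, not the actual script: the function names `park_repo` and `unpark_repo` and the folder name `.git-parked` are made up here, since the text above only says the folder is renamed to "something else".

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical name for the renamed metadata folder.
PARKED_NAME=".git-parked"

# Hide the repository metadata so git commands no longer see a repo here.
park_repo()   { mv "$1/.git" "$1/$PARKED_NAME"; }

# Restore the metadata before pulling or pushing Part 1.
unpark_repo() { mv "$1/$PARKED_NAME" "$1/.git"; }

# Demo on a throwaway repository standing in for the Part 1 checkout.
root="$(mktemp -d)/Main"
git init -q "$root"

park_repo "$root"
if git -C "$root" rev-parse --git-dir >/dev/null 2>&1; then
    echo "still recognised as a repository"
else
    echo "parked: git does not see a repository here"
fi

unpark_repo "$root"
git -C "$root" rev-parse --git-dir
```

The check after parking assumes the temp directory is not itself inside another Git working tree. An update of Part 1 would then be `unpark_repo`, pull, change, push, `park_repo`, in that order.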

The 'Part 2' repo and any 'Part 3' repos will automatically be switched to the 'Develop' branch, and the master branch is deleted from the local machine. Of course the master branch will be brought back whenever we sync with the remote.

For the outsourced team, the script does not pull the 'Part 2' repo. It asks for a root folder and a custom module name and URL, sets up the 'Part 1' and 'Part 3' repos, renames the '.git' folder of the 'Part 1' repo and switches to the Develop branch of 'Part 3'. If a branch name is also passed as an argument to the script, it derives a new branch from Develop and switches to that branch. Here, too, we will give the outsourced team a guide with rules to follow before any commit or push, so that they know how and when to re-pull the 'Part 1' repo, remember not to commit or push to 'Part 1', and follow the other rules.
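A rough sketch of what the outsourced-team flow could look like is given below. It is not the actual script: the function name `setup_outsourced`, its argument order, the `.git-parked` folder name and the local demo repositories are all assumptions made for illustration. It uses plain `git clone` instead of init plus pull, and `--branch Develop` so that only a local Develop branch is created and no local master exists to delete.

```shell
#!/usr/bin/env bash
set -euo pipefail

# setup_outsourced ROOT PART1_URL MODULE_NAME MODULE_URL [NEW_BRANCH]
# (hypothetical helper; names and argument order are illustrative)
setup_outsourced() {
    local root="$1" part1_url="$2" module="$3" module_url="$4"
    local new_branch="${5:-}"
    local module_dir="$root/Custom/Source/$module"

    # Part 1: folder skeleton plus dll, lib and header files.
    git clone -q "$part1_url" "$root"

    # Part 3: the single assigned module, checked out on Develop.
    # With --branch, no local master branch is created.
    mkdir -p "$root/Custom/Source"
    git clone -q --branch Develop "$module_url" "$module_dir"

    # Optionally derive a working branch from Develop.
    if [ -n "$new_branch" ]; then
        git -C "$module_dir" checkout -q -b "$new_branch"
    fi

    # Rename Part 1's .git (assumed name) to block accidental commits.
    mv "$root/.git" "$root/.git-parked"
}

# Local stand-ins for the remote repositories (demo only).
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
demo="$(mktemp -d)"

git init -q "$demo/part1"
mkdir -p "$demo/part1/Dll/Release"
echo "placeholder" > "$demo/part1/Dll/Release/core.dll"
git -C "$demo/part1" add .
git -C "$demo/part1" commit -q -m "Binaries"

git init -q "$demo/custA"
echo "// CustA" > "$demo/custA/custa.cpp"
git -C "$demo/custA" add .
git -C "$demo/custA" commit -q -m "CustA source"
git -C "$demo/custA" branch Develop

setup_outsourced "$demo/Main" "$demo/part1" CustA "$demo/custA" feature-x
git -C "$demo/Main/Custom/Source/CustA" rev-parse --abbrev-ref HEAD
```

With real remotes, the two URL arguments would be the 'Part 1' repo and the assigned module's repo. A developer without read access to another module's repo cannot clone it, whatever URL is passed.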

Upvotes: 1
