Adam L.

Reputation: 89

Is there a way to split up a Git fetch into multiple smaller fetches?

I need to VPN in to our Git server to pull changes and the VPN connection is quite slow (~200kbps). I'm trying to pull a few months worth of changes, but it's 3GB of files and the VPN connection keeps disconnecting before it finishes fetching all the changes.

I'm wondering if there's a way to only pull half the changes at a time so that I could split it into 2 batches?

Upvotes: 2

Views: 489

Answers (1)

torek

Reputation: 487993

The key to splitting up a big fetch is that fetch brings in commits. One fetch operation either succeeds completely, or fails entirely if the network connection flakes out in the middle. But if your git fetch wants to bring in, say, 16384 commits totaling 3 GB of data, and that much won't make it across in one go, you can break it up:

  • First, bring in 8192 commits that bring in 1.5 GB of data;
  • then bring in the remaining 8192 commits that bring in the other 1.5 GB of data.

If that's not small enough, continue breaking up the commits into smaller and smaller sets of commits.

There's one major flaw with this plan, though. If 16383 of the commits bring in, say, 500 MiB of files, then one of those commits—the 16384th—brings in 2.5 GiB of files. You can't break that one up.

Also, you might not be able to pick commits this way anyway, as many servers won't let you run git fetch by raw hash ID. Two! There are two major flaws with this plan... insert Monty Python Spanish Inquisition sketch here.

Seriously, if you have the right kind of access, you can have someone place branch names or tag names against various commits, and break up the large batch of commits this way. That gets you down to the one possible major flaw.
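A minimal sketch of that batching workflow. The "server" here is a throwaway local repository reached over file://, standing in for your real remote, and the batch-1 branch name is invented for the demo; on a real server, someone with access would create it for you:

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)

# Stand-in "server" with four commits on main.
git init -q -b main "$tmp/server"
for i in 1 2 3 4; do
    echo "$i" > "$tmp/server/file"
    git -C "$tmp/server" add file
    git -C "$tmp/server" -c user.name=demo -c user.email=demo@example.com \
        commit -qm "commit $i"
done

# Someone with server access marks the halfway point with a branch name.
git -C "$tmp/server" branch batch-1 main~2

# Client side: fetch the halfway branch first, then the branch tip. The
# second fetch only transfers objects the first one didn't bring over.
git init -q "$tmp/client"
git -C "$tmp/client" remote add origin "file://$tmp/server"
git -C "$tmp/client" fetch -q origin batch-1   # first half of the history
git -C "$tmp/client" fetch -q origin main      # the remaining commits
```

If even one batch is too big for the connection, add more branch names (batch-2, batch-3, ...) at finer-grained points in the history.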

Edit: As jthill notes in a comment, you can also work this from the opposite direction: run git fetch with a --depth option (--depth=1 tries to get just the last commit on each branch name, --depth=2 the last two, and so on). Then you can run additional fetch operations with --deepen, and once you have enough, run git fetch --unshallow to get everything else. This is probably the easiest approach to work from your end alone.
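A sketch of the shallow-then-deepen sequence, again against a throwaway local repository over file:// (the paths are invented for the demo; you would point the remote at your real server instead):

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)

# Stand-in "server" with four commits on main.
git init -q -b main "$tmp/server"
for i in 1 2 3 4; do
    echo "$i" > "$tmp/server/file"
    git -C "$tmp/server" add file
    git -C "$tmp/server" -c user.name=demo -c user.email=demo@example.com \
        commit -qm "commit $i"
done

# Client side: start shallow, deepen in small steps, then complete the
# history. Each step is a separate, restartable network operation.
git init -q "$tmp/client"
git -C "$tmp/client" remote add origin "file://$tmp/server"
git -C "$tmp/client" fetch -q --depth=1 origin main    # just the tip commit
git -C "$tmp/client" fetch -q --deepen=2 origin main   # two more commits
git -C "$tmp/client" fetch -q --unshallow origin main  # everything else
```

The tradeoff is the same one-big-commit caveat as before: --depth and --deepen count commits, not bytes, so one huge commit still has to come over in a single fetch.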

Alternatively, have someone run git bundle and make a bundle file. Then, use some restartable transfer protocol to send the file over. Once you have the whole file, run git fetch against the bundle file. A bundle file simply splits git fetch into its various separate parts:

  • aggregating the objects that are required for transfer (git bundle does this part);
  • transferring the bundle file (you do this part yourself); and
  • extracting the bundle file into commits (git fetch knows how to do this).
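The three steps above can be sketched end to end, with a throwaway local repository standing in for the server and a plain cp standing in for your restartable transfer:

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)

# Stand-in "server" with four commits on main.
git init -q -b main "$tmp/server"
for i in 1 2 3 4; do
    echo "$i" > "$tmp/server/file"
    git -C "$tmp/server" add file
    git -C "$tmp/server" -c user.name=demo -c user.email=demo@example.com \
        commit -qm "commit $i"
done

# Step 1, server side: pack everything reachable from main into one file.
git -C "$tmp/server" bundle create "$tmp/server/repo.bundle" main

# Step 2: transfer the file. In real life, use something restartable such
# as rsync --partial so a dropped VPN connection can resume mid-file.
cp "$tmp/server/repo.bundle" "$tmp/repo.bundle"

# Step 3, client side: check the file arrived intact, then fetch from it
# as if it were a remote.
git init -q -b main "$tmp/client"
git -C "$tmp/client" bundle verify "$tmp/repo.bundle"
git -C "$tmp/client" fetch -q "$tmp/repo.bundle" main:refs/remotes/origin/main
```

Because the transfer step is just an ordinary file copy, a dropped connection only costs you the partially transferred file, not the whole fetch.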

There are a bunch of questions and answers on StackOverflow about git bundle; see, e.g., How to use git-bundle for keeping development in sync?

Upvotes: 4
