Gregg Leichtman

Reputation: 131

Fetch/Pull Part of Very Large Repository?

This is probably obvious and has been asked many times in different ways before, but I have not been able to find the answer after searching for some time.

Assume the following:

How do I efficiently pull or fetch the last committed versions of, say, DIR001/subdir2/fileB1 ... DIR001/subdir2/fileBN from the remote repository and nothing else?

AND

How do I fetch just the last committed version of a single file out of DIR001/subdir2/fileB1 ... DIR001/subdir2/fileBN from the remote repository and nothing else?

AND

How do I efficiently pull or fetch a previously committed version of a subset of said files and nothing else?

Maybe fetch/pull is not the correct command for this.

Upvotes: 10

Views: 5433

Answers (1)

VonC

Reputation: 1323753

The answer to "Partial cloning" can help you start experimenting with shallow clones.
But it will be limited:

  • to a certain depth, and/or to certain branches,
  • but not to certain files or directories (you can get a single file or directory through sparse checkout, but you still have to get the full repo first!),
  • nor, before Git 2.5, to a certain commit.
    (Git 2.5 (Q2 2015) supports fetching a single commit! See "Pull a specific commit from a remote git repository").
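
To make the points above concrete, here is a minimal sketch you can run locally without touching a real server. It builds a throwaway "remote" repo (the big-repo and partial directory names, the demo identity, and the file contents are all made up for illustration), then combines a depth-1 clone with the classic core.sparseCheckout setting, and finally fetches an older commit by its SHA-1. Note the assumptions: a depth-1 clone still downloads every file of the tip commit (sparse checkout only limits what lands in the working tree), and fetching a commit by SHA-1 only works if the server allows it, which this sketch enables through uploadpack.allowReachableSHA1InWant on the throwaway source repo.

```shell
set -eu

# Hypothetical identity so commits succeed on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
src="$work/big-repo"

# Build a throwaway "remote" with two directories and two commits.
git init -q "$src"
mkdir -p "$src/DIR001/subdir2" "$src/DIR002"
echo "v1" > "$src/DIR001/subdir2/fileB1"
echo "other" > "$src/DIR002/fileX"
git -C "$src" add .
git -C "$src" commit -q -m "first"
old=$(git -C "$src" rev-parse HEAD)
echo "v2" > "$src/DIR001/subdir2/fileB1"
git -C "$src" commit -q -am "second"

# Server-side opt-in for fetching a reachable commit by SHA-1 (Git 2.5+).
git -C "$src" config uploadpack.allowReachableSHA1InWant true

# Shallow clone: only the tip commit, nothing written to the work tree yet.
# file:// is needed; --depth is ignored for plain local-path clones.
git clone -q --depth 1 --no-checkout "file://$src" "$work/partial"
cd "$work/partial"

# Sparse checkout: only materialize DIR001/subdir2/ in the work tree.
git config core.sparseCheckout true
mkdir -p .git/info
echo "DIR001/subdir2/" > .git/info/sparse-checkout
git read-tree -mu HEAD

tip_count=$(git rev-list --count HEAD)
echo "commits in shallow clone: $tip_count"

# Fetch one older commit by SHA-1 and read a single file from it.
git fetch -q --depth 1 origin "$old"
old_content=$(git show FETCH_HEAD:DIR001/subdir2/fileB1)
echo "fileB1 at $old: $old_content"
```

Against a real remote you would replace the file:// URL with the actual one and skip the setup part; whether the SHA-1 fetch is accepted depends on how that server is configured.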

The real solution would be to separate the huge remote repo into submodules though.
See "What are Git limits" or "Git style backup of binary files" for illustrations of this kind of situation.


Update April 2015:

Git Large File Storage (LFS), announced by GitHub in April 2015, would make pull/fetch much more efficient.

The project is git-lfs (see git-lfs.github.com), and it can be tested with a server supporting it, such as lfs-test-server:
You store only metadata in the Git repo, and the large files elsewhere.
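
As a rough illustration of what "metadata only" means: once a file pattern is tracked by LFS, what Git itself stores for a matching file is a small text pointer along these lines. The oid and size values below are placeholders for illustration only, not taken from any real repository.

```
version https://git-lfs.github.com/spec/v1
oid sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393
size 12345
```

The actual content is downloaded from the LFS server only when the file is checked out, which is why clones and fetches of the Git history itself stay small.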

https://cloud.githubusercontent.com/assets/1319791/7051226/c4570828-ddf4-11e4-87eb-8fc165e5ece4.gif

Upvotes: 6
