Kevin MOLCARD

Reputation: 2218

How to handle a large git repository?

I am currently using git for a large repository (around 12 GB, each branch having a size of 3 GB). This repository contains lots of binary files (audio and images).

The problem is that clone and pull can take a lot of time. In particular, the "Resolving deltas" step can take a very long time.

What is the best way to solve this kind of problem?

I tried disabling delta compression, as explained here, using the delta option in .gitattributes, but it does not seem to improve the clone duration.

Thanks in advance

Kevin

Upvotes: 14

Views: 4881

Answers (2)

soda

Reputation: 523

Follow these steps.

1. Set up Git LFS on your local machine (the git-lfs package itself must already be installed) by running the following command.

git lfs install

2. Now add the file type you want LFS to manage for you.

git lfs track "*.mp4"

3. Now you are all set. Go ahead and add, commit and push your files (including the .gitattributes file that the track command updates) and there should be no warning.

Upvotes: 0

VonC

Reputation: 1323723

Update April 2015: Git Large File Storage (LFS) (by GitHub).

It uses git-lfs (see git-lfs.github.com) and has been tested with a server that supports it, lfs-test-server.
With it, only metadata is stored in the git repo, and the large files are stored elsewhere.

https://cloud.githubusercontent.com/assets/1319791/7051226/c4570828-ddf4-11e4-87eb-8fc165e5ece4.gif
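To make "metadata only" concrete, here is a rough sketch of the small text pointer that ends up in the git repo in place of the large file. git-lfs generates this itself; the script only mimics the pointer layout by hand, and the file name and contents are made up for illustration. It assumes a system with `sha256sum` (GNU coreutils).

```shell
# Sketch: build by hand the kind of pointer git-lfs stores in place of a large file.
# The file name and its contents are hypothetical.
workdir=$(mktemp -d)
printf 'pretend this is a large audio file\n' > "$workdir/track.wav"

# The content hash and the byte size identify the real file, which lives elsewhere.
oid=$(sha256sum "$workdir/track.wav" | cut -d' ' -f1)
size=$(wc -c < "$workdir/track.wav" | tr -d ' ')

# Three short lines of text are all that git has to version.
pointer=$(printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %s' "$oid" "$size")
printf '%s\n' "$pointer"
```

Whatever the size of the real file, the pointer stays a few lines long, which is why clones of the git repo itself stay fast.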


Original answer (2012)

One solution, for large binary files that don't change much, is to store them in a separate repository made for binaries (like a Nexus repository), and version only a text file which declares which version you need.
Using an "artifact repository" is easier than storing binary elements in a source repo (made for comparing versions and merging between branches, which isn't of much use for said binaries).
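As a minimal sketch of that idea: below, a plain directory stands in for the artifact repository (a real setup would talk to Nexus or similar), and the manifest name `binaries.txt`, its "name version sha256" layout and the artifact name are all assumptions made up for the example, not a Nexus convention.

```shell
# Stand-in "artifact repository": a plain directory instead of a real Nexus server.
store=$(mktemp -d)
project=$(mktemp -d)

# Publish one binary under a versioned path (name and version are hypothetical).
mkdir -p "$store/sounds/1.2.0"
printf 'binary payload v1.2.0\n' > "$store/sounds/1.2.0/sounds.bin"
sum=$(sha256sum "$store/sounds/1.2.0/sounds.bin" | cut -d' ' -f1)

# The only thing versioned in git: a small text manifest, "name version sha256".
printf 'sounds 1.2.0 %s\n' "$sum" > "$project/binaries.txt"

# Fetch step: read the manifest, copy each artifact, verify its checksum.
fetched=0
while read -r name version expected; do
  cp "$store/$name/$version/$name.bin" "$project/$name.bin"
  actual=$(sha256sum "$project/$name.bin" | cut -d' ' -f1)
  if [ "$actual" = "$expected" ]; then
    fetched=$((fetched + 1))
  else
    echo "checksum mismatch for $name $version" >&2
  fi
done < "$project/binaries.txt"
echo "fetched $fetched artifact(s)"
```

Switching to another version of the binaries is then a one-line change in the manifest, which diffs and merges like any other text file.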

The other solution, more git-centric, is git-annex:

git-annex allows managing files with git, without checking the file contents into git.
While that may seem paradoxical, it is useful when dealing with files larger than git can currently easily handle, whether due to limitations in memory, time, or disk space.

It is however not compatible with Windows.

A more generic solution could be git-media, which also allows you to use Git with large media files without storing the media in Git itself.

Finally, the easiest solution is to isolate those binaries in their own git submodule, as you mention in your question. It isn't very satisfactory, and the initial clone will still take time, but subsequent updates of the parent repo will be fast.
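A rough sketch of the submodule layout, using two throwaway local repositories. The names `assets` and `parent`, the file `logo.png` and the demo identity are hypothetical, and the `protocol.file.allow=always` setting is only there because this demo clones from a local path; a real setup would point at a remote URL.

```shell
root=$(mktemp -d)

# Repository holding only the binaries (hypothetical name: assets).
git init -q "$root/assets"
printf 'fake image bytes\n' > "$root/assets/logo.png"
git -C "$root/assets" add logo.png
git -C "$root/assets" -c user.name=demo -c user.email=demo@example.com commit -q -m "Add binaries"

# Parent repository that references the binaries as a submodule.
git init -q "$root/parent"
# protocol.file.allow is needed only because the submodule URL is a local path.
git -C "$root/parent" -c protocol.file.allow=always submodule --quiet add "$root/assets" assets
git -C "$root/parent" -c user.name=demo -c user.email=demo@example.com commit -q -m "Add assets submodule"

# The parent records only the submodule URL and a commit id, not the binaries.
git -C "$root/parent" submodule status
```

Someone who only needs the sources can clone the parent without `--recurse-submodules` and skip the heavy download entirely.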

Upvotes: 12
