Reputation: 2102
I am trying to migrate a project from GitLab to GitHub. The repository size is 685.83 MB, and it contains a few .dat, .csv, .exe, and .pkl
files ranging from just over 100 MB up to 3383.40 MB. The push fails with the errors below.
GitLab to GitHub migration steps:
$ git clone --mirror [email protected]:test/my-repo.git
$ cd my-repo.git
$ git remote set-url --push origin [email protected]:test/my-repo.git
$ git push
Error
remote: error: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
remote: error: File Src/project/label/file1.dat is 476.32 MB; this exceeds GitHub Enterprise's file size limit of 100.00 MB
remote: error: File Src/models/label/file2.dat is 2431.49 MB; this exceeds GitHub Enterprise's file size limit of 100.00 MB
remote: error: File test/test1/label/model/file3.exe is 1031.94 MB; this exceeds GitHub Enterprise's file size limit of 100.00 MB
remote: error: File test/test2/usecase/filemarker/file3.csv is 997.02 MB; this exceeds GitHub Enterprise's file size limit of 100.00 MB
remote: error: File src/msg/sports/model.pkl is 3383.40 MB; this exceeds GitHub Enterprise's file size limit of 100.00 MB
remote: error: File test/movie/maker/marker.dat is 1373.45 MB; this exceeds GitHub Enterprise's file size limit of 100.00 MB
remote: error: File project/make/level/project/realmaker.csv is 1594.83 MB; this exceeds GitHub Enterprise's file size limit of 100.00 MB
remote: error: File src/moderm/network/test.pkl is 111.07 MB; this exceeds GitHub Enterprise's file size limit of 100.00 MB
Git LFS/BFG Method:
$ git clone --mirror gitlab-heavy-repo
$ cd gitlab-heavy-repo.git
$ java -jar bfg-1.12.5.jar --convert-to-git-lfs '*.dat' --no-blob-protection
$ java -jar bfg-1.12.5.jar --convert-to-git-lfs '*.exe' --no-blob-protection
$ java -jar bfg-1.12.5.jar --convert-to-git-lfs '*.csv' --no-blob-protection
$ java -jar bfg-1.12.5.jar --convert-to-git-lfs '*.pkl' --no-blob-protection
$ git reflog expire --expire=now --all && git gc --prune=now
$ git lfs install
$ git remote set-url origin [email protected]:some-org/githubheavy-repo.git
$ git push
Even after the above process, the push fails with the same error. It seems Git LFS has a 2 GB limit, so I tried to remove the larger files from the repository completely. I followed the method below to remove them.
1) git clone gitlab-heavy-repo
2) cd gitlab-heavy-repo
3) git filter-branch --force --index-filter "git rm --cached --ignore-unmatch Src/project/label/file1.dat" --prune-empty --tag-name-filter cat -- --all
4) git reflog expire --expire=now --all
5) git gc --prune=now
6) git push origin --force --all
7) git push origin --force --tags
8) rm -rf .git/refs/original/
I repeated the same steps for each of the large files listed above. But now the GitLab repository storage size shows 1.9 GB,
whereas initially it was only 685.83 MB.
What am I doing wrong? Thanks in advance.
Upvotes: 6
Views: 6949
Reputation: 14569
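Add all files above 100 MiB to .gitignore. Run this from the repository root; the sed call turns the leading ./ that find prints into /, because .gitignore patterns starting with ./ do not match anything:
find . -path ./.git -prune -o -type f -size +100M -print | sed 's|^\./|/|' >> .gitignore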
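A slightly more defensive variant of the command above could look like the sketch below. The function name ignore_large_files is hypothetical; it assumes it is run from the root of the working tree and that file names contain no newlines or glob characters. It skips the .git directory and uses git check-ignore to avoid appending a path that an existing rule already covers.

```shell
# ignore_large_files [LIMIT]
# Appends every working-tree file larger than LIMIT (find -size syntax,
# default +100M) to .gitignore, anchored at the repository root.
# Hypothetical helper; run it from the top of the working tree.
ignore_large_files() {
  limit="${1:-+100M}"
  find . -path ./.git -prune -o -type f -size "$limit" -print |
    sed 's|^\./|/|' |
    while IFS= read -r path; do
      # --no-index makes check-ignore test tracked files as well.
      # Exit status 0 means an existing rule already matches this path.
      git check-ignore -q --no-index -- "${path#/}" ||
        printf '%s\n' "$path" >> .gitignore
    done
}
```

Calling ignore_large_files with no argument uses the 100 MiB threshold; running it a second time should leave .gitignore unchanged.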
If you have not committed the files yet:
Untrack every file that matches a rule in .gitignore (this removes them from the index without deleting them from disk):
On Linux:
git ls-files -ci --exclude-standard -z | xargs -0 git rm --cached
On macOS (this only defines an alias; run apply-gitignore afterwards):
alias apply-gitignore="git ls-files -ci --exclude-standard -z | xargs -0 git rm --cached"
On Windows:
for /F "tokens=*" %a in ('git ls-files -ci --exclude-standard') do @git rm --cached "%a"
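For Linux and macOS, the same step can be wrapped in a small function, sketched below. The name untrack_ignored is hypothetical. The only addition over the one-liners above is an early return when nothing matches, since GNU xargs would otherwise run git rm with no paths and fail.

```shell
# untrack_ignored
# Removes from the index every tracked file that matches an ignore rule.
# The files themselves stay on disk. Hypothetical helper name.
untrack_ignored() {
  # Nothing to do if no tracked file matches an ignore rule.
  [ -n "$(git ls-files -ci --exclude-standard)" ] || return 0
  # -z / -0 keep paths with spaces intact.
  git ls-files -ci --exclude-standard -z |
    xargs -0 git rm --cached --quiet --
}
```

Running git ls-files -ci --exclude-standard on its own first gives a preview of what would be untracked.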
If you have committed the files:
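You'll need to clean them from the commit history.
Run the following command to remove a file from all previous commits.
Warning! Rewriting history is dangerous: it changes commit hashes, so you will have to force-push, and everyone with an existing clone will need to re-clone or reset.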
On Linux and macOS:
git filter-branch --prune-empty -d ~/tmp/scratch \
--index-filter "git rm --cached -f --ignore-unmatch PATH/TO/FILE" \
--tag-name-filter cat -- --all
On Windows:
git filter-branch --prune-empty -d /tmp/scratch \
--index-filter "git rm --cached -f --ignore-unmatch PATH/TO/FILE" \
--tag-name-filter cat -- --all
(Replace PATH/TO/FILE with path to the actual file)
Greg explains this command better in his answer here
If you need to run the command above for a folder instead of a file, add an -r switch after git rm in the second line:
... \
--index-filter "git rm -r --cached -f --ignore-unmatch PATH/TO/FOLDER" \
...
git rm can take multiple arguments, so you can add multiple paths in the second line:
... \
--index-filter "git rm -r --cached -f --ignore-unmatch FILE1 FILE2 FOLDER1 FOLDER2" \
...
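To find which paths to pass to git rm, or to check afterwards that nothing oversized is still reachable from any branch or tag, the history can be scanned for large blobs. The sketch below is one way to do that; the function name list_large_blobs and its byte-threshold argument are hypothetical, and it assumes paths contain no newlines.

```shell
# list_large_blobs [THRESHOLD_BYTES]
# Prints "<size-in-bytes> <path>" for every blob reachable from any ref
# whose size exceeds the threshold (default 100 MiB), largest first.
# Hypothetical helper name.
list_large_blobs() {
  threshold="${1:-104857600}"
  # rev-list prints "<sha> <path>"; cat-file keeps the path as %(rest).
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk -v min="$threshold" '
      $1 == "blob" && $2 > min {
        size = $2
        sub(/^[^ ]+ [^ ]+ /, "")   # drop type and size, keep the full path
        print size, $0
      }' |
    sort -rn
}
```

If list_large_blobs prints nothing after the rewrite, no blob above the threshold is reachable from the local refs. Files that were deleted in a later commit still show up here as long as an older commit references them.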
Upvotes: 6