Reputation: 3219
I created a Git repo that will be stored exclusively locally, and I wonder whether I really need Git LFS for binaries. As far as I can see, the .gitattributes is properly configured:
*.psd binary
And yes, the files land in .git/objects/..., but they are compressed and don't take much space. So, to sum up: what are the benefits of Git LFS in a local repository if I never push to or pull from a remote repo?
Thanks!
Upvotes: 10
Views: 9030
Reputation: 42491
To add to the excellent answer already provided by @Schwern and address OP's comment.
Atlassian, one of the two main companies behind this extension (the other is GitHub), publishes documentation on Git LFS.
The idea is that the binaries are downloaded from the "remote" repository lazily during the checkout process rather than during cloning or fetching.
Technically git lfs stores "lazily" evaluated pointers to the binaries.
This makes a lot of sense because git has a "commitment" to be able to provide you access to the state of the code base after every commit, so the following situation is possible:
1. Commit A adds a.bin (a.bin is in version 1).
2. A later commit changes a.bin (a.bin is in version 2 now).
3. Git still has to be able to provide a.bin in version 1.
This is true even if you've decided to remove a.bin and commit that removal: it should still be possible to access the file-system state after "commit A".
So, at least locally, there is no point in storing version 1 if you explicitly don't need it.
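The situation above can be reproduced in a throwaway repository. This is only a minimal sketch with plain Git (no LFS); the temporary directory, the file contents, the identity settings and the commit messages are assumptions made up for illustration.

```shell
#!/bin/sh
set -eu

# Throwaway repository in a temporary directory (hypothetical location).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Commit A: a.bin in version 1.
printf 'version 1' > a.bin
git add a.bin
git commit -q -m "commit A: add a.bin"
commit_a="$(git rev-parse HEAD)"

# A later commit: a.bin in version 2.
printf 'version 2' > a.bin
git commit -q -am "change a.bin"

# Remove the file and commit the removal.
git rm -q a.bin
git commit -q -m "remove a.bin"

# a.bin is gone from the working tree, but version 1 is still stored in history.
restored="$(git show "$commit_a:a.bin")"
echo "a.bin at commit A: $restored"
```

Every version that was ever committed stays retrievable this way, which is why the repository keeps growing as binaries change.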
One more note, to directly address the question and clarify: yes you have to enable git lfs support locally, but in addition, you also have to enable git lfs support on your remote repo (I did that with Bitbucket once, I'm sure its competitors support that as well).
Upvotes: 3
Reputation: 301
It depends on your workflow and the facilities you have access to.
Git stores versions of files as blobs. When Git packs them (for example during git gc), these blobs are delta compressed, so only the differences between versions are stored. Therefore, the repository size increases only marginally with each new version.
The situation is different if the versioned file is a binary or a file where a single change restructures the whole file. In that case, Git stores a full copy of each version, and the repository grows rapidly.
Git does a good job of diff compressing even big files. I've found that the compression of large files can be excellent (size of the versioned file in .git/ after running git commit or git gc):
type | change | file size | as git-lfs blob | as git blob | after git gc |
---|---|---|---|---|---|
Vectorworks (.vwx) | added geometry | 28.8 MB | 28.8 MB | 26.5 MB | 1.8 MB |
GeoPackage (.gpkg) | added geometry | 16.9 MB | 16.9 MB | 3.7 MB | 3.5 MB |
Affinity Photo (.afphoto) | toggled layers | 85.8 MB | 85.6 MB | 85.6 MB | 0.8 MB |
FormZ (.fmz) | added geometry | 66.3 MB | 66.3 MB | 66.3 MB | 66.3 MB |
Photoshop (.psd) | toggled layers | 25.8 MB | 25.8 MB | 15.8 MB | 15.4 MB |
Movie (mp4) | trimmed | 13.1 MB | 13.1 MB | 13.2 MB | 0 MB |
| | delete a file | -13.1 MB | 0 MB | 0 MB | 0 MB |
If you don't have a remote to push to, it is better not to use Git-LFS, because Git-LFS seems to apply no additional compression to the files it versions (see above).
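A measurement of this kind can be repeated with plain Git. The following is only a sketch under stated assumptions: it uses about 1 MB of random data as a stand-in for a real design file, a made-up file name (big.bin) and a hypothetical helper size_kb, and it compares the size of .git/objects before and after git gc. Real file formats will behave differently, as the table shows.

```shell
#!/bin/sh
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
# Make sure nothing gets packed automatically before we measure.
git config gc.auto 0

# About 1 MB of incompressible data as a stand-in for a binary file.
head -c 1000000 /dev/urandom > big.bin
git add big.bin
git commit -q -m "add big.bin"

# A small change: append a few bytes and commit again.
printf 'a small change' >> big.bin
git commit -q -am "change big.bin"

# Hypothetical helper: size of the object store in KB.
size_kb() { du -sk .git/objects | cut -f1; }

before_kb="$(size_kb)"
git gc --quiet --prune=now
after_kb="$(size_kb)"
echo "before gc: ${before_kb} KB, after gc: ${after_kb} KB"
```

Before git gc both versions sit in .git/objects as separate loose objects; after packing, the second version can be stored as a delta against the first.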
Also one important lesson learnt here is that Git's diff compression method doesn't work with real binary files like .fmz. These would be the best candidates for putting under Git-LFS versioning.
For other file types that seem to be non-textual, but their structure is text-like (.vwx or .afphoto) the diff method performs well. In a single user scenario, where overall repository size and not committing speed is critical, I wouldn't put these under Git-LFS versioning because the Git blob size is significantly smaller than the LFS blob, thus saving space at the local and the remote.
Git-LFS provides a solution to this problem by storing older versions of large binary files outside the repository (on the remote) and replacing them with pointer files. If an older version is needed, the client pulls it from the remote. Therefore, a designer who pulls the latest state from the remote will only download the latest state and the pointer files.
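To make the pointer file less abstract, here is a rough sketch of what one contains according to the Git LFS pointer format (version 1): a spec URL, the SHA-256 of the real content, and its size in bytes. The file name design.psd and its contents are made up, and the script only imitates the format by hand; it does not call git-lfs.

```shell
#!/bin/sh
set -eu

workdir="$(mktemp -d)"
# Stand-in for a large binary (hypothetical file name and contents).
printf 'pretend this is a large binary' > "$workdir/design.psd"

# sha256sum is common on Linux, shasum -a 256 on macOS.
if command -v sha256sum >/dev/null 2>&1; then
  oid="$(sha256sum "$workdir/design.psd" | cut -d' ' -f1)"
else
  oid="$(shasum -a 256 "$workdir/design.psd" | cut -d' ' -f1)"
fi
size="$(wc -c < "$workdir/design.psd" | tr -d ' ')"

# The three lines that Git would version instead of the binary itself.
pointer="$(printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %s\n' "$oid" "$size")"
echo "$pointer"
```

Git only ever sees this small text file; the content it points to lives in the LFS store and, once pushed, on the remote.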
Therefore, Git-LFS is only useful if you have access to a remote hosted on an LFS-enabled server. If there is no server to push the blobs to, the LFS-tracked blobs stay in the local repo, so the advantage of reduced local storage consumption is lost.
Usually, the remote is an LFS-enabled git provider, which can be too expensive for some projects. However, there are also solutions to host a Git-LFS remote locally.
Natively, Git-LFS transfers data over HTTPS only. Therefore, you need a separate Git-LFS server for storing the large files. However, there is no official server implementation for local hosting, although there are unofficial options such as Git-LFS Folderstore.
Git-LFS Folderstore provides a way to manage a Git-LFS remote locally. It works on a local machine and on a network drive. If you are on Mac OS X, you can set it up by copying the lfs-folderstore executable to /usr/local/bin and then:
# Create a remote repository on a volume (attached drive or NAS)
cd path/to/remote
# Create a bare Git repository named origin
git init --bare origin
# Add the remote to the local repository
cd path/to/local/repository
git remote add origin <path/to/remote/origin>
# Enable Git-LFS in the local repository
git lfs install
# Track the psd file type (this writes the rule to .gitattributes)
git lfs track "*.psd"
# Configure LFS in the local repository to use lfs-folderstore
git config --add lfs.customtransfer.lfs-folder.path lfs-folderstore
git config --add lfs.standalonetransferagent lfs-folder
git config --add lfs.customtransfer.lfs-folder.args "/Volumes/path/to/remote/origin"
# Stage the tracking rules and commit the changes
git add .gitattributes
git commit -am "commit message"
# Push media to the remote
git push origin master
Use nested quotes ("'<path/to/remote/origin>'") if your remote path contains spaces.
You can reduce the size of your Git repository by calling the Git garbage collector, git gc. It won't touch the Git-LFS blobs, though.
Git-LFS will only remove blobs from the local repository (.git/lfs/objects/) if they have been pushed to a remote AND the commit containing the blobs is older than the "recent" window (3 days). Here are the commands if you want to do it manually:
# remove lfs duplicates
# https://github.com/git-lfs/git-lfs/blob/main/docs/man/git-lfs-dedup.1.ronn
git lfs dedup
# clean old local lfs files (>3 days of commit)
# https://github.com/git-lfs/git-lfs/blob/main/docs/man/git-lfs-prune.1.ronn
git lfs prune
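Before pruning for real, it may help to see how much space the local LFS store takes and what would be removed. The sketch below assumes git-lfs is installed and uses the --dry-run and --verbose options of git lfs prune; the helper name lfs_store_kb is made up for this example.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: size in KB of the local LFS store of a repository
# (defaults to the current directory); prints 0 if there is no LFS store.
lfs_store_kb() {
  store="${1:-.}/.git/lfs/objects"
  if [ -d "$store" ]; then
    du -sk "$store" | cut -f1
  else
    echo 0
  fi
}

echo "local LFS store: $(lfs_store_kb) KB"

# Only attempt the dry run when git-lfs is available and we are inside a repository.
if git lfs version >/dev/null 2>&1 && git rev-parse --git-dir >/dev/null 2>&1; then
  # Lists what prune would delete without deleting anything.
  git lfs prune --dry-run --verbose
fi
```

In a repository without a remote, the dry run should list little or nothing, since only objects that have been pushed are eligible for pruning.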
Upvotes: 20
Reputation: 164919
git-lfs stores old versions of file contents in the cloud while keeping their history on disk. This has two main benefits.
1. It reduces the size of a git clone of a repository.
2. It reduces the disk space the local repository takes up.
Obviously number 1 doesn't apply if the repository is never shared.
If these binaries are really large, and if you change them frequently, they may begin to impact your available free disk space. If so, git-lfs can be of benefit by offloading the old copies of the binaries to cloud storage.
Fortunately, you can always retroactively apply git-lfs later using the BFG Repo Cleaner if the local repo gets too large.
As far as I can see, the .gitattributes is properly configured as in:
*.psd binary
This is a separate issue from git-lfs.
If the file is marked as binary, Git will assume it cannot usefully diff or merge the contents. Every time you change the file Git will store a complete copy of the file. This will obviously eat up a lot more disk space.
Even if the file is "binary" (i.e. not plain text), Git may be able to store only the change if you don't mark it as binary. However, if the file is already compressed, this effectively randomizes the file contents and makes diffing impossible. Many image formats are compressed.
Alexander Gogl did some experiments in their answer and it seems Git will store the whole .psd file.
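To see what the binary setting resolves to on a given installation, git check-attr can be asked directly. A small sketch, assuming a throwaway repository and a made-up file name (design.psd; the file does not need to exist): in Git's documentation, binary is a macro for -diff -merge -text, while delta compression in packs is governed by a separate delta attribute, so the effect on storage is worth measuring rather than assuming.

```shell
#!/bin/sh
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init -q
printf '*.psd binary\n' > .gitattributes

# List every attribute Git resolves for the (hypothetical) file name.
git check-attr -a -- design.psd

# Query two attributes individually: diff (affected by the binary macro)
# and delta (not mentioned in this .gitattributes).
diff_attr="$(git check-attr diff -- design.psd)"
delta_attr="$(git check-attr delta -- design.psd)"
echo "$diff_attr"
echo "$delta_attr"
```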
Upvotes: 9