Reputation: 3787
A project that I am working on has grown organically, and the size, number of files, type of files etc. in the repo have grown way too much. I have searched for several optimizations to git, and nothing seems to perfectly fit my situation. Here is what I want.
Manually track files - When I edit a file, I will manually run git add <file-name>. Git's assume-unchanged won't help, since I would have to do a --no-assume-unchanged before every add.
Git commit should only add the files I have staged in the index and not worry about any other file. I have seen git taking too much time even after using core.ignoreStat.
A sparse checkout should not download the entire repository first (it is a very big repository, even if I use --depth 1). (However, it may not be possible with git)
My repository is such that, although there are a lot of directories, I only work in a small set of directories for a time, and then in another set at a later time. Rarely are all the directories required at once. It would be good if there were a command, say git hide <directory>, which hides the directory in the working tree and relieves git from tracking it until I need it again.
I am already using core.ignoreStat, status.showUntrackedFiles, and commit.status. Here is my git config.
user.email=xxx@xxx
core.repositoryformatversion=0
core.filemode=true
core.bare=false
core.logallrefupdates=true
core.ignorestat=true
core.showuntrackedfiles=no
remote.git_ch.url=file:////home/xxx/git_server/linux-namespaces.git
remote.git_ch.fetch=+refs/heads/*:refs/remotes/git_ch/*
branch.master.remote=git_ch
branch.master.merge=refs/heads/master
status.showuntrackedfiles=no
commit.status=false
The repository is still too slow.
Additionally, can you suggest the possible reason for it being so slow, out of these?
Even git add a.txt && git commit -m "a.txt", where a.txt is a small file, takes ages to complete.
There are several git extensions like git annex, Google's git repo, etc. Will using any of these be of help, or will it be better to switch to another VCS?
I am using Ubuntu Gnome 16.04.1.
Upvotes: 1
Views: 1280
Reputation: 1323753
Note: Git 2.13 might help, as it improves support for repositories with a very large index:
See commit b460139, commit b2dd1c5, commit c3a0082, commit de6ae5f, commit c0441f7, commit b968372 (06 Mar 2017), and commit 77d6797, commit 0d59ffb, commit 6a5e6f5, commit e77cf4e, commit fcdbd95, commit e6a1dd7, commit 72dcb7b, commit 13c0e4c, commit 66f9e7a, commit b8923bf, commit 6cc1053, commit 4392531, commit cef4fc7, commit 1f44b09 (27 Feb 2017) by Christian Couder (chriscool).
(Merged by Junio C Hamano -- gitster -- in commit 94c9b5a, 17 Mar 2017)
The revised git update-index documentation now includes a Split Index section:
Split index
This mode is designed for repositories with very large indexes, and aims at reducing the time it takes to repeatedly write these indexes.
In this mode, the index is split into two files, $GIT_DIR/index and $GIT_DIR/sharedindex.<SHA-1>. Changes are accumulated in $GIT_DIR/index, the split index, while the shared index file contains all index entries and stays unchanged.
All changes in the split index are pushed back to the shared index file when the number of entries in the split index reaches a level specified by the splitIndex.maxPercentChange config variable.
Each time a new shared index file is created, the old shared index files are deleted if their modification time is older than what is specified by the splitIndex.sharedIndexExpire config variable.
To avoid deleting a shared index file that is still used, its modification time is updated to the current time every time a new split index based on the shared index file is either created or read from.
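To see what this looks like on disk, here is a minimal sketch that enables split index mode in a throwaway repository and lists the resulting shared index file. The temporary directory and the file name a.txt are placeholders chosen for illustration; git update-index --split-index is the switch documented on the same manual page.

```shell
#!/bin/sh
set -e

# Throwaway repository (hypothetical path, created only for this demo).
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Stage one small file so the index has at least one entry.
echo "hello" > a.txt
git add a.txt

# Switch the index to split mode: .git/index becomes the split index
# and a .git/sharedindex.<SHA-1> file is written next to it.
git update-index --split-index

# Collect the shared index file(s) that now exist.
shared_files=$(ls .git/sharedindex.*)
echo "$shared_files"
```

On a real repository you would run only the git update-index --split-index line from the top of the working tree.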
So you now (Git 2.13, Q2 2017) have the configurations:
splitIndex.maxPercentChange:
When the split index feature is used, this specifies the percent of entries the split index can contain compared to the total number of entries in both the split index and the shared index before a new shared index is written.
The value should be between 0 and 100.
If the value is 0 then a new shared index is always written, if it is 100 a new shared index is never written.
By default the value is 20, so a new shared index is written if the number of entries in the split index would be greater than 20 percent of the total number of entries.
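The threshold rule quoted above can be worked through with a little arithmetic. This is only a sketch of the documented rule, not Git's actual code; the function name needs_new_shared_index and the entry counts are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: prints "yes" when, according to the documented rule,
# a new shared index would be written, and "no" otherwise.
# Arguments: entries in the split index, total entries, maxPercentChange.
needs_new_shared_index() {
    split_entries=$1
    total_entries=$2
    max_percent=$3

    # 0 means "always write a new shared index", 100 means "never".
    if [ "$max_percent" -eq 0 ]; then echo yes; return; fi
    if [ "$max_percent" -eq 100 ]; then echo no; return; fi

    # Otherwise compare the split share against the allowed percentage.
    if [ $((split_entries * 100)) -gt $((total_entries * max_percent)) ]; then
        echo yes
    else
        echo no
    fi
}

needs_new_shared_index 150 1000 20   # 15% in the split index: prints "no"
needs_new_shared_index 250 1000 20   # 25% in the split index: prints "yes"
```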
And:
splitIndex.sharedIndexExpire:
When the split index feature is used, shared index files that were not modified since the time this variable specifies will be removed when a new shared index file is created.
The value "now" expires all entries immediately, and "never" suppresses expiration altogether.
The default value is "2.weeks.ago".
Note that a shared index file is considered modified (for the purpose of expiration) each time a new split-index file is either created based on it or read from it.
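As a rough illustration, both settings can be set per repository with git config. The values 30 and 1.week.ago below are arbitrary examples, not recommendations, and the temporary repository is a placeholder. core.splitIndex is not quoted above; it is the config switch that turns the feature on without calling git update-index by hand.

```shell
#!/bin/sh
set -e

# Throwaway repository (hypothetical path, created only for this demo).
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Turn split index mode on through configuration.
git config core.splitIndex true

# Example values only: write a new shared index once more than 30% of the
# entries live in the split index, and expire unused shared index files
# after one week instead of the default two.
git config splitIndex.maxPercentChange 30
git config splitIndex.sharedIndexExpire "1.week.ago"

# Read the settings back.
git config --get core.splitIndex
git config --get splitIndex.maxPercentChange
git config --get splitIndex.sharedIndexExpire
```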
With Git 2.27 (Q2 2020), the code that refreshes the last access and modified time of on-disk packfiles and loose object files has been updated.
See commit 312cd76 (14 Apr 2020) by Luciano Miguel Ferreira Rocha.
(Merged by Junio C Hamano -- gitster -- in commit 51a68dd, 28 Apr 2020)
freshen_file(): use NULL times for implicit current-time
Signed-off-by: Luciano Miguel Ferreira Rocha
Update freshen_file() to use a NULL times, semantically equivalent to the currently setup, with an explicit actime and modtime set to the "current time", but with the advantage that it works with other files not owned by the current user.
Fixes an issue on shared repos with a split index, where eventually a user's operation creates a shared index, and another user will later do an operation that will try to update its freshness, but will instead raise a warning:
$ git status
warning: could not freshen shared index '.git/sharedindex.bd736fa10e0519593fefdb2aec253534470865b2'
Upvotes: 3
Reputation: 437
Adding an SSD will definitely do the job. I am facing the exact same problem. A colleague has a computer with an SSD, and his computer is far quicker at every single git action. I tried everything you listed, but the real problem is low I/O performance. Git uses a lot of tiny files (look at your .git directory) to manage the different versions of files, so poor I/O latency slows everything down.
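If you want a rough idea of how many small files are involved, a sketch like the following counts them. It uses a throwaway repository with a single staged file (both placeholders); in practice you would run only the last few commands inside your own repository.

```shell
#!/bin/sh
set -e

# Throwaway repository (hypothetical path, created only for this demo).
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Staging a file writes one loose object under .git/objects.
echo "hello" > a.txt
git add a.txt

# Git's own statistics on loose objects and packs.
git count-objects -v

# Number of loose objects, taken from the "count:" line.
loose_count=$(git count-objects -v | sed -n 's/^count: //p')

# Total number of files git keeps under .git.
file_count=$(find .git -type f | wc -l | tr -d ' ')

echo "files under .git: $file_count, loose objects: $loose_count"
```

The larger these numbers, the more a disk with slow random access will hurt.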
Upvotes: 0