Reputation: 1201
As a DevOps admin, what are the ways to check Git performance in my environment?
After every major change, such as a Git upgrade, I want to run a test that shows how Git is doing.
How can I achieve this?
Upvotes: 1
Views: 4879
Reputation: 3938
While git-hyperfine looks interesting, my impression is that it's mainly a tool for Git developers. As a user, I think it's easier to just stick with vanilla hyperfine. E.g.
$ hyperfine --warmup 3 -L rev /usr/bin/git,/opt/homebrew/Cellar/git/2.42.0/bin/git '{rev} status'
Benchmark 1: /usr/bin/git status
Time (mean ± σ): 87.3 ms ± 1.9 ms [User: 32.1 ms, System: 327.4 ms]
Range (min … max): 84.9 ms … 93.7 ms 32 runs
Benchmark 2: /opt/homebrew/Cellar/git/2.42.0/bin/git status
Time (mean ± σ): 76.0 ms ± 3.6 ms [User: 28.9 ms, System: 301.0 ms]
Range (min … max): 72.7 ms … 94.7 ms 38 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Summary
/opt/homebrew/Cellar/git/2.42.0/bin/git status ran
1.15 ± 0.06 times faster than /usr/bin/git status
The main point is to use an actual benchmarking tool, not time, which doesn't report standard deviations and therefore doesn't give comparable results unless the difference is huge.
As for the test cases to use: I'm afraid you're on your own on that, since it depends heavily on your use cases and repositories.
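As a starting point, though, you could benchmark the handful of commands your users run most often against one of your real repositories, before and after the upgrade, and compare the reported means. A minimal sketch (the repository path and the chosen commands are assumptions to adapt to your environment):

$ cd /path/to/representative/repo
$ hyperfine --warmup 3 \
    'git status' \
    'git log --oneline -1000' \
    'git diff HEAD~100 --stat'

Each quoted command becomes its own benchmark, so one run gives you a comparable baseline for all of them.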
Upvotes: 1
Reputation: 1324278
The 2022 answer is to use avar/git-hyperfine, a wrapper around sharkdp/hyperfine, a command-line benchmarking tool.
Illustration:
Git 2.38 (Q3 2022) allows large objects read from a packstream to be streamed straight into a loose object file, without having to keep them in-core as a whole.
The performance improvement is measured with git-hyperfine.
See commit aaf8122, commit 2b6070a, commit 97a9db6, commit a1bf5ca (11 Jun 2022) by Han Xin (chiyutianyi).
See commit 3c3ca0b, commit 21e7d88 (11 Jun 2022) by Ævar Arnfjörð Bjarmason (avar).
(Merged by Junio C Hamano -- gitster -- in commit 73b9ef6, 14 Jul 2022)
unpack-objects: use stream_loose_object() to unpack large objects

Helped-by: Ævar Arnfjörð Bjarmason
Helped-by: Derrick Stolee
Helped-by: Jiang Xin
Signed-off-by: Han Xin
Signed-off-by: Ævar Arnfjörð Bjarmason
Make use of the stream_loose_object() function introduced in the preceding commit to unpack large objects.
Before this we'd need to malloc() the size of the blob before unpacking it, which could cause OOM with very large blobs.

We could use the new streaming interface to unpack all blobs, but doing so would be much slower, as demonstrated e.g. with this benchmark using git-hyperfine:

rm -rf /tmp/scalar.git &&
git clone --bare https://github.com/Microsoft/scalar.git /tmp/scalar.git &&
mv /tmp/scalar.git/objects/pack/*.pack /tmp/scalar.git/my.pack &&
git hyperfine \
  -r 2 --warmup 1 \
  -L rev origin/master,HEAD -L v "10,512,1k,1m" \
  -s 'make' \
  -p 'git init --bare dest.git' \
  -c 'rm -rf dest.git' \
  './git -C dest.git -c core.bigFileThreshold={v} unpack-objects </tmp/scalar.git/my.pack'
Here we'll perform worse with lower core.bigFileThreshold settings with this change in terms of speed, but we're getting lower memory use in return:

Summary
  './git -C dest.git -c core.bigFileThreshold=10 unpack-objects </tmp/scalar.git/my.pack' in 'origin/master' ran
    1.01 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1k unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
    1.01 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1m unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
    1.01 ± 0.02 times faster than './git -C dest.git -c core.bigFileThreshold=1m unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
    1.02 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
    1.09 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1k unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
    1.10 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
    1.11 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=10 unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
A better benchmark to demonstrate the benefit is this one, which creates an artificial repo with a 1, 25, 50, 75 and 100MB blob:

rm -rf /tmp/repo &&
git init /tmp/repo &&
(
  cd /tmp/repo &&
  for i in 1 25 50 75 100
  do
    dd if=/dev/urandom of=blob.$i count=$(($i*1024)) bs=1024
  done &&
  git add blob.* &&
  git commit -mblobs &&
  git gc &&
  PACK=$(echo .git/objects/pack/pack-*.pack) &&
  cp "$PACK" my.pack
) &&
git hyperfine \
  --show-output \
  -L rev origin/master,HEAD -L v "512,50m,100m" \
  -s 'make' \
  -p 'git init --bare dest.git' \
  -c 'rm -rf dest.git' \
  '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold={v} unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum'
Using this test we'll always use >100MB of memory on origin/master (around ~105MB), but max out at e.g. ~55MB if we set core.bigFileThreshold=50m.

The relevant "Maximum resident set size" lines were manually added below the relevant benchmark:

'/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=50m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master' ran
  Maximum resident set size (kbytes): 107080
    1.02 ± 0.78 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master'
  Maximum resident set size (kbytes): 106968
    1.09 ± 0.79 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=100m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master'
  Maximum resident set size (kbytes): 107032
    1.42 ± 1.07 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=100m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
  Maximum resident set size (kbytes): 107072
    1.83 ± 1.02 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=50m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
  Maximum resident set size (kbytes): 55704
    2.16 ± 1.19 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
  Maximum resident set size (kbytes): 4564
This shows that, if you have enough memory, this new streaming method is slower the lower you set the streaming threshold, but the benefit is more bounded memory use.
An earlier version of this patch introduced a new "core.bigFileStreamingThreshold" instead of re-using the existing "core.bigFileThreshold" variable.
As noted in a detailed overview of its users in this thread, using it has several different meanings.

Still, we consider it good enough to simply re-use it.
While it's possible that someone might want to e.g. consider objects "small" for the purposes of diffing but "big" for the purposes of writing them, such use-cases are probably too obscure to worry about.
We can always split up "core.bigFileThreshold" in the future if there's a need for that.
Upvotes: 1
Reputation: 1324278
Another way to test Git performance is by relying on the not-so-old perf folder: the "Performance testing framework" introduced in 2012 with Git 1.7.10, with commit 342e9ef:
- The 'run' script lets you specify arbitrary build dirs and revisions.
- It lets you specify which tests to run; or you can also do it manually.
- Two different sizes of test repos can be configured, and the scripts just copy one or more of those.
So... make perf
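Or, more selectively, a minimal sketch of using that framework from a Git source checkout (p0001-rev-list.sh is one of the perf scripts shipped in t/perf; the two revisions are placeholders for your old and new builds):

$ cd t/perf
$ ./run v2.41.0 v2.42.0 -- p0001-rev-list.sh

The run script builds each listed revision and prints a comparison table for the given test.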
Git 2.14 (Q3 2017) is still adding to that framework, with a test showing that runtimes of the wildmatch() function used for globbing in Git grow exponentially in the face of some pathological globs.
See commit 62ca75a, commit 91de27c (11 May 2017) by Ævar Arnfjörð Bjarmason (avar).
(Merged by Junio C Hamano -- gitster -- in commit 140921c, 30 May 2017)
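If you want to check for that particular regression after an upgrade, the perf script added there can be run on its own (a sketch, assuming p0100-globbing.sh is the test those commits introduced; the revisions are again placeholders):

$ cd t/perf
$ ./run v2.13.0 v2.14.0 -- p0100-globbing.sh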
Upvotes: 0
Reputation: 61
Yes, as @DevidN pointed out, it depends on various parameters like configuration and network. I had the same question when we were migrating from SVN to Git and wanted stats after the migration.
I used 'time' in combination with different Git commands and wrote a script to monitor all those commands from the server (a sketch of such a script follows the example output below).
E.g.:
$ time git clone http://####@#########.git
Cloning into '#####'...
remote: Counting objects: 849, done.
remote: Compressing objects: 100% (585/585), done.
remote: Total 849 (delta 435), reused 0 (delta 0)
Receiving objects: 100% (849/849), 120.85 KiB | 0 bytes/s, done.
Resolving deltas: 100% (435/435), done.
Checking connectivity... done.
real 0m4.895s
user 0m0.140s
sys 0m0.046s
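A minimal sketch of such a monitoring script, assuming a POSIX shell with GNU date and bc available (the repository URL and command list are placeholders to adapt to your environment):

#!/bin/sh
# Time a set of representative Git commands and append the results to a log.
REPO_URL=http://example.com/your/repo.git
CLONE_DIR=/tmp/git-perf-clone
LOG=git-perf.log

rm -rf "$CLONE_DIR"
for cmd in \
    "git clone $REPO_URL $CLONE_DIR" \
    "git -C $CLONE_DIR status" \
    "git -C $CLONE_DIR log --oneline -100"
do
    start=$(date +%s.%N)            # wall-clock start
    sh -c "$cmd" >/dev/null 2>&1
    end=$(date +%s.%N)              # wall-clock end
    elapsed=$(echo "$end - $start" | bc)
    printf '%s\t%s\t%ss\n' "$(date -Iseconds)" "$cmd" "$elapsed" >>"$LOG"
done
rm -rf "$CLONE_DIR"

Run it after each Git upgrade and diff the log against the previous run to spot regressions.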
Upvotes: 1