I am trying to achieve a combination of a shallow clone, a partial (blobless) clone, and a sparse checkout.
The best I have been able to come up with, after experimenting with a lot of different combinations of git commands, reading through man git, random posts online, and LLM tools as a last resort, is the script below, which still ends up with the behaviour where a network fetch is done for each historical commit.
Is what I'm after possible in git? I am operating in an environment where cloning the full repositories in question can take 20+ minutes each, and certain files can be hundreds of GB. I only want the files that have certain extensions/match certain patterns, so sparse checkout is my best bet to get only the files matching those conditions. However, I also want the full blob data for the files I DO decide to sparsely fetch, back to a certain date. Ideally, I want to fetch that historical blob data after I have already done the sparse checkout, because I need the initial sparse checkout to determine the date from which I want all historical blob data.
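To make the pattern-matching part concrete, here is a minimal offline sketch of what I mean (the repo name, file names, and contents are invented for illustration; a throwaway local file:// remote stands in for the real repository): a blobless, no-checkout clone combined with a non-cone sparse-checkout pattern of *.md, so only matching blobs are ever downloaded.

```shell
#!/bin/sh
# Sketch: blobless clone + non-cone sparse checkout by extension.
# Uses a throwaway local "server" repo so it runs without network.
tmp=$(mktemp -d) && cd "$tmp"

git init -q -b main server
git -C server config user.email test@example.com
git -C server config user.name test
# Partial-clone filters must be enabled on the serving side.
git -C server config uploadpack.allowfilter true
mkdir server/docs
echo readme > server/README.md
echo guide  > server/docs/guide.md
echo huge   > server/big.bin            # stands in for a huge binary
git -C server add . && git -C server commit -qm init

# Blobless, no-checkout clone: no file contents downloaded yet.
git clone -q --no-checkout --filter=blob:none "file://$tmp/server" client
cd client
# gitignore-style pattern: '*.md' matches at any depth in non-cone mode.
git sparse-checkout set --no-cone '*.md'
git checkout -q main                    # fetches only the *.md blobs
ls -R                                   # README.md and docs/guide.md; no big.bin
```

The big.bin blob is never transferred: `git rev-list --objects --missing=print HEAD` still reports it as missing after the checkout.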
Here is my progress so far - any tips on where my problem is in failing to achieve my goal would be much appreciated!
The use of README.md below is just to test whether I have achieved my goal; this particular file has no specific significance.
echo "cloning"
git clone --shallow-since="Sun Jun 2 00:05:53 2024 +0200" --no-checkout --filter=blob:none https://github.com/juice-shop/juice-shop.git
cd juice-shop
echo "setting up sparse checkout"
echo "README.md" | git sparse-checkout set --no-cone --stdin
git rev-parse --verify origin/master   # sanity check: the remote branch exists
git read-tree -mu master               # populate index and worktree, honouring the sparse patterns
git update-ref HEAD master
echo "starting blame"
git blame --line-porcelain --date=iso -M -C --since="Sun Jun 2 00:05:53 2024 +0200" -- README.md # this still triggers a network request for each historical commit; note --since must come before the "--", otherwise blame treats it as a pathspec
cd -
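For the per-commit fetch problem, the closest I can sketch (again against a throwaway local file:// remote instead of juice-shop, so it runs offline) is to enumerate the missing blobs for the path up front with git rev-list --missing=print and fetch them in one batch before running blame. This relies on the server permitting exact-OID requests (uploadpack.allowAnySHA1InWant); hosts that support partial clone, such as GitHub, do.

```shell
#!/bin/sh
# Sketch: prefetch the missing historical blobs for one path in a single
# batch, so that a later `git blame` needs no per-commit network fetches.
tmp=$(mktemp -d) && cd "$tmp"

git init -q -b main server
git -C server config user.email test@example.com
git -C server config user.name test
git -C server config uploadpack.allowfilter true
# Needed so arbitrary blob OIDs can be requested directly.
git -C server config uploadpack.allowanysha1inwant true
echo one  > server/README.md && git -C server add . && git -C server commit -qm c1
echo two >> server/README.md && git -C server add . && git -C server commit -qm c2

git clone -q --filter=blob:none "file://$tmp/server" client
cd client
# List objects reachable through README.md's history that are not yet local;
# --missing=print marks each missing one with a leading '?'.
git rev-list --objects --missing=print HEAD -- README.md \
  | sed -n 's/^?//p' > missing.txt
# One batch fetch (exact-OID refspecs) instead of one lazy fetch per blob.
git fetch -q origin $(cat missing.txt)
git blame README.md      # now completes with no further network traffic
```

Recent Git (2.49+) also ships a purpose-built command, `git backfill`, which batch-downloads the missing blobs of a partial clone and, with `--sparse`, restricts itself to paths matched by the sparse-checkout; if your Git is new enough, that may replace the rev-list/fetch step above.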