Sally Richter

Reputation: 271

How to clone/fetch a repo getting only the history

Is it possible to download a repository's commits, branches, and tags, excluding blobs and trees? I would like to be able to view the history and whatnot without downloading the files (this is for the Chromium repo, which is multiple gigs). Obviously I will not be able to see which files were affected by a commit, but that's fine.

Upvotes: 14

Views: 1679

Answers (4)

VonC

Reputation: 1323223

We are building ghuser.io (enhanced GitHub profile pages) and any way to get the commit history without files would help us tremendously to scale.

Then you would need to set up a mirror server with GVFS (Git Virtual File System) / VFS For Git support.

Between June 2016 (when the OP asked this question) and now (Q4 2018), Microsoft proposed VFS For Git (Feb. 2017; so named since issue 72 is soon to be resolved). It allows you to develop with terabyte-sized repos(!) without having the files downloaded.

GitHub itself should support it soon.

See more at gvfs.io, a domain name which has since been renamed to reflect the new "VFS For Git" name: https://vfsforgit.org.
(Microsoft/VFSForGit.WWW issue 9 is closed, Nov. 28th 2018)

Note: (Feb. 2021), the certificate issue regarding https://vfsforgit.org finally got resolved: see microsoft/VFSForGit issue 1705.

Upvotes: 5

Gary van der Merwe

Reputation: 9523

The "Partial Clone" feature was added in git 2.19.

Documentation here: https://www.git-scm.com/docs/partial-clone

In order to use it:

  • You need git >= 2.19 on both the server and the client.
  • On the server, enable the feature: git config --global uploadpack.allowFilter true
  • On the client, clone with a filter that omits all trees and blobs: git clone --filter=tree:0 REMOTE_URL
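
The client-side step above could be scripted roughly as follows. This is only a sketch: it assumes git >= 2.19 is on the PATH and that the server has uploadpack.allowFilter enabled. The helper names treeless_clone_cmd and clone_history_only are hypothetical, and --no-checkout is an addition not in the steps above, included on the assumption that a checkout would otherwise make git fetch the trees and blobs of HEAD on demand.

```python
import subprocess


def treeless_clone_cmd(remote_url, dest):
    """Build the argv for a partial clone that omits all trees and blobs."""
    return [
        "git", "clone",
        "--filter=tree:0",  # ask the server to leave out trees (and so blobs)
        "--no-checkout",    # assumption: avoid on-demand fetches for a work tree
        remote_url, dest,
    ]


def clone_history_only(remote_url, dest):
    """Run the clone; raises CalledProcessError if git exits non-zero."""
    subprocess.run(treeless_clone_cmd(remote_url, dest), check=True)
```

After such a clone, plain git log in the destination directory should work from the local commit objects, while commands that need file contents are likely to trigger further fetches from the server.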

Upvotes: 0

Rohit Shedage

Reputation: 25840

You can achieve this with the GitHub APIs.

https://developer.github.com/v3/repos/commits/#list-commits-on-a-repository
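
As a rough illustration of using that endpoint, here is a minimal sketch that pages through the commit list and yields each commit's SHA and subject line. It assumes unauthenticated access to api.github.com and the per_page/page query parameters; the function names iter_commits and fetch_json are hypothetical, and the fetch argument exists only so the paging logic can be exercised without network access.

```python
import json
import urllib.request

# Assumed URL shape for the "list commits" endpoint, 100 commits per page.
API = "https://api.github.com/repos/{owner}/{repo}/commits?per_page=100&page={page}"


def fetch_json(url):
    """Default fetcher: GET the URL and decode the JSON response body."""
    req = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def iter_commits(owner, repo, fetch=fetch_json):
    """Yield (sha, subject line) for each commit, one page at a time."""
    page = 1
    while True:
        batch = fetch(API.format(owner=owner, repo=repo, page=page))
        if not batch:  # an empty page means the listing is exhausted
            return
        for item in batch:
            subject = item["commit"]["message"].split("\n", 1)[0]
            yield item["sha"], subject
        page += 1
```

This only walks the history reachable from the default branch, and for a repository the size of Chromium it means a very large number of requests.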

Upvotes: -1

torek

Reputation: 487755

No, or at least, not using any ordinary access. Some sites offer web access, through which you can obtain the contents of every commit object without also obtaining tree and blob objects, but the normal process of receiving objects or thin packs is either truncated at the commit level (via --depth) or is complete.

You can of course see all visible tags with git ls-remote as well as through any sensible web interface (it would be weird to provide something like GitHub's fancy API if you didn't provide the tags that way :-) ).
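
For instance, the tag listing could be collected with a short script like the one below. It is a sketch that assumes git is on the PATH and that git ls-remote prints one "object-id, tab, ref-name" pair per line; the names parse_ls_remote and remote_tags are hypothetical.

```python
import subprocess


def parse_ls_remote(output):
    """Map each ref name to its object id from `git ls-remote` output."""
    refs = {}
    for line in output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        object_id, ref = line.split("\t", 1)
        refs[ref] = object_id
    return refs


def remote_tags(remote_url):
    """Ask the remote for its tags without cloning anything."""
    result = subprocess.run(
        ["git", "ls-remote", "--tags", remote_url],
        check=True, capture_output=True, text=True,
    )
    return parse_ls_remote(result.stdout)
```

For annotated tags the output also contains a second entry whose name ends in ^{}, giving the commit the tag points at; the sketch keeps both entries as they are.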

Note that traversing all commits via a web API may be tremendously slow, either due to having to stop and wait (if you program it synchronously rather than as a streaming process) or due to rate-limiting software on the host (GitHub and Bitbucket both seem to do rate limiting).
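
To make the rate-limiting cost concrete, here is a small sketch of how a client might decide how long to pause between requests. It assumes the X-RateLimit-Remaining and X-RateLimit-Reset response headers that GitHub sends (other hosts may use different names), and it assumes the headers arrive as a plain dict with exactly that capitalization; seconds_to_wait is a hypothetical helper.

```python
def seconds_to_wait(headers, now):
    """Return how long to sleep before the next request (0.0 if none needed).

    `headers` is a plain dict of response headers; `now` is the current
    time in seconds since the Unix epoch.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0  # quota left, no need to pause
    # Assumption: the reset header holds the epoch second when quota refills.
    reset = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset - now)
```

A caller would pass the result to time.sleep() after each response, which is exactly the stop-and-wait behaviour that makes a full traversal slow.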

Upvotes: 9
