g_inherit

Reputation: 20431

Download a single folder or directory from a GitHub repository

How can I download only a specific folder or directory from a remote Git repository hosted on GitHub?

Say the example GitHub repository lives here:

git@github.com:foobar/Test.git

Its directory structure:

Test/
├── foo/
│   ├── a.py
│   └── b.py
└── bar/
    ├── c.py
    └── d.py

I want to download only the foo folder and not clone the whole Test project.

Upvotes: 2041

Views: 1818695

Answers (30)

matt

Reputation: 519

i thought i would add a more complete solution here .. at the end :)

none of the other answers actually describe how to do it


download a zipped directory right here in the browser by running this code snippet

the path can be empty, ( therefore it will download all files in the repo )

<style>
body     {display:flex;flex-direction:column;font-family:arial;margin:0}
#title   {margin:0;font-size:12px;display:flex}
section  {display:grid;grid-template-columns:auto 1fr;gap:3px;align-items:center;margin:3px}
input    {font-size:16px;padding:2px 5px}
button   {cursor:pointer;width:200px;font-size:16px;padding:2px}
#output  {display:none;font-size:16px;margin:0}
</style>
<div id=title>download a directory from a github repository<span style=flex:1></span><span><a href='https://github.com/javascript-2020/stackoverflow' style=text-decoration:none>try : javascript-2020 : stackoverflow : main</a></span></div>
<section>
      <div>owner</div><input class=owner>
      <div>repo</div><input class=repo>
      <div>branch</div><input class=branch>
      <div>path</div><input class=path>
</section>
<button onclick=download()>download</button>
<pre id=output></pre>

<script type=module>

        import jszip from 'https://cdn.jsdelivr.net/npm/jszip/+esm';
        var $   = sel=>document.querySelector(`.${sel}`).value;
        
        window.download=async function(){
        
              var owner   = $('owner');
              var repo    = $('repo');
              var branch  = $('branch');
              var path    = $('path');
              
              if(path[0]=='/')path    = path.slice(1);
              if(path && path.slice(-1)!='/')path  += '/';

              var url     = `https://api.github.com/repos/${owner}/${repo}/git/trees/${branch}?recursive=true`;
              var json    = await fetch(url).then(res=>res.json());
              var file    = `${path.split('/').filter(Boolean).at(-1)||repo}.zip`;                            
              var zip     = new jszip();                    
              
              await Promise.all(json.tree.map(async item=>{
              
                    if(!item.path.startsWith(path))return;
                    
                    var fn    = item.path.slice(path.length);
                    if(item.type=='tree'){
                          zip.folder(fn);
                    }else{
                      var url     = `https://raw.githubusercontent.com/${owner}/${repo}/${branch}/${item.path}`;
                      var blob    = await fetch(url).then(res=>res.blob());
                          zip.file(fn,blob);
                    }
                    
              }));
                        
              var blob                = await zip.generateAsync({type:'blob'});
              output.style.display    = 'block';
              var url                 = window.URL.createObjectURL(blob);
              var a                   = document.createElement('a');
              a.href                  = url;
              a.download              = file;
              a.textContent           = file+' ( right-click - save link as )';
              output.append(a,'\n');
              
        }//download

</script>

and a link to a github webpage for a more complete example ( work in progress )

javascript-2020.github.io : download a directory from a GitHub repository





as per the documentation REST API endpoints for Git trees : Get a tree

a complete list of files for a repo can be downloaded in json format by adding ?recursive=true to the endpoint

https://api.github.com/repos/${owner}/${repo}/git/trees/${branch}?recursive=true

each file can be downloaded individually from raw.githubusercontent.com

https://raw.githubusercontent.com/${owner}/${repo}/${branch}/${path}

the whole process only requires 1 github api request, which helps with the github api rate limits ( 60 requests per hour anonymous, 5000 per hour authorised )

there is however a delay between updating a repo and those changes being reflected at raw.githubusercontent.com, this can be 5-10mins
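The filtering step can be separated from the network calls, which makes it easier to check. The following is a minimal sketch, assuming the JSON shape returned by the Get a tree endpoint (a `tree` array of entries with `path` and `type`, plus a `truncated` flag). `planDownloads` is a hypothetical helper name, not part of any library.

```javascript
// Sketch only: `planDownloads` is a hypothetical helper, not a GitHub or jszip API.
// `treeJson` is the parsed response of
//   GET https://api.github.com/repos/{owner}/{repo}/git/trees/{branch}?recursive=true
function planDownloads(treeJson, owner, repo, branch, path) {
  // The API sets `truncated` to true when the listing was cut short,
  // in which case a download based on it could silently miss files.
  if (treeJson.truncated) {
    throw new Error('tree listing was truncated; the folder may be incomplete');
  }

  // Normalise the folder path: no leading slash, one trailing slash,
  // or an empty string for the repository root.
  let prefix = path.replace(/^\/+/, '');
  if (prefix && !prefix.endsWith('/')) prefix += '/';

  return treeJson.tree
    // Keep only files ("blob" entries) that live under the requested folder.
    .filter(item => item.type === 'blob' && item.path.startsWith(prefix))
    .map(item => ({
      // File name relative to the requested folder.
      name: item.path.slice(prefix.length),
      // Encode each path segment so spaces and '#' survive in the URL.
      url: `https://raw.githubusercontent.com/${owner}/${repo}/${branch}/` +
           item.path.split('/').map(encodeURIComponent).join('/'),
    }));
}
```

Each returned `url` can then be fetched and written to a zip entry or a file named `name`, as in the snippets in this answer.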

individual files can also be downloaded from, ( changes reflected immediately )

https://api.github.com/repos/${owner}/${repo}/contents/${path}

to use authorized requests with the github api, an http header needs to be added to the request

authorization : bearer ${token}

a github api token can be generated at github.com : Fine-grained personal access tokens, you'll need to add the contents permission to the token

given a successful request this url will return a json object with the file data base64 encoded in the content field, such as

    
    (async ()=>{
    
          var owner     = 'javascript-2020';
          var repo      = 'tmp';
          var branch    = 'main';
          var path      = 'myfile.js';
          
          var token     = '';
          var headers   = {};
          if(token){
                headers   = {authorization:`bearer ${token}`};
          }
          
          var url       = `https://api.github.com/repos/${owner}/${repo}/contents/${path}`;                                
          var json      = await fetch(url,{headers}).then(res=>res.json());
          var txt       = window.atob(json.content);
          console.log(txt);
    
    })();
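One caveat with the decoding step: `atob` returns one character per byte, so a file containing non-ASCII UTF-8 text comes out garbled. A minimal sketch of a UTF-8-aware decode follows, assuming an environment with the global `atob` and `TextDecoder` (browsers and recent Node.js). `decodeContent` is a hypothetical helper name.

```javascript
// Sketch only: `decodeContent` is a hypothetical helper.
// `base64` is the `content` field from the contents endpoint; it may contain
// line breaks, which atob ignores.
function decodeContent(base64) {
  // atob gives a "binary string": one character per decoded byte.
  const binary = atob(base64);
  // Turn that string back into the raw bytes.
  const bytes = Uint8Array.from(binary, ch => ch.charCodeAt(0));
  // Interpret the bytes as UTF-8 text.
  return new TextDecoder('utf-8').decode(bytes);
}
```

In the example above, this would replace the `window.atob(json.content)` call with `decodeContent(json.content)`.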
    

this structure can then be coupled with npm : jszip to create a zip file containing the files in the folder

<script type=module>

        import jszip from 'https://cdn.jsdelivr.net/npm/jszip/+esm';

        var owner   = 'javascript-2020';
        var repo    = 'stackoverflow';
        var branch  = 'main';
        var path    = '';

        if(path[0]=='/')path    = path.slice(1);
        if(path && path.at(-1)!='/')path  += '/';

        var file    = `${path.split('/').filter(Boolean).at(-1)||repo}.zip`;
        var zip     = new jszip();
        var url     = `https://api.github.com/repos/${owner}/${repo}/git/trees/${branch}?recursive=true`;
        var json    = await fetch(url).then(res=>res.json());

        await Promise.all(json.tree.map(async item=>{

              if(!item.path.startsWith(path))return;

              var fn    = item.path.slice(path.length);
              if(item.type=='tree'){
                    zip.folder(fn);
              }else{
                    var url     = `https://raw.githubusercontent.com/${owner}/${repo}/${branch}/${item.path}`;
                    var blob    = await fetch(url).then(res=>res.blob());
                    zip.file(fn,blob);
              }

        }));

        var blob      = await zip.generateAsync({type:'blob'});
        var url       = window.URL.createObjectURL(blob);
        var a         = document.createElement('a');
        a.href        = url;
        a.download    = file;
        a.click();

</script>

for those wishing to do all this in nodejs, the following will write directly to disk

//  download-repo-dir.node.mjs

import fs from 'fs';

var owner   = 'javascript-2020';
var repo    = 'stackoverflow';
var branch  = 'main';
var path    = '';

if(path[0]=='/')path    = path.slice(1);
if(path && path.slice(-1)!='/')path  += '/';

var file    = `${path.split('/').filter(Boolean).at(-1)||repo}/`;
fs.mkdirSync(file,{recursive:true});    //  recursive:true avoids an error if the directory already exists
var url     = `https://api.github.com/repos/${owner}/${repo}/git/trees/${branch}?recursive=true`;
var json    = await fetch(url).then(res=>res.json());

json.tree.forEach(async item=>{

      if(!item.path.startsWith(path))return;
      
      var fn    = item.path.slice(path.length);
      if(item.type=='tree'){
            fs.mkdirSync(file+fn,{recursive:true});
      }else{
            var fh        = fs.createWriteStream(file+fn);
            //  close the file once the response body has been fully written
            var stream    = new WritableStream({write:data=>fh.write(data),close:()=>fh.end()});
            var url       = `https://raw.githubusercontent.com/${owner}/${repo}/${branch}/${item.path}`;
            fetch(url).then(res=>res.body.pipeTo(stream));
      }
      
});




to get jszip working in nodejs this seems to work

    var sandbox   = {};
    sandbox.cjs   = txt=>Promise.resolve(eval(`(()=>{var exports={},module={};${txt};return module.exports})()`));
    var url       = 'https://raw.githubusercontent.com/stuk/jszip/main/dist/jszip.min.js';
    var JSZip     = await fetch(url).then(res=>res.text().then(sandbox.cjs));
    console.log(JSZip);

there is also the node --experimental-network-imports flag node.js : HTTPS and HTTP imports

//test.node.mjs

    import JSZip from 'https://cdn.jsdelivr.net/npm/jszip/+esm';
    console.log(JSZip);

and then run with

node --experimental-network-imports --no-warnings .\test.node.mjs




not really to do with the question but i include a url to download an entire repo

https://github.com/${owner}/${repo}/archive/refs/heads/${branch}.zip

find me in the stackoverflow javascript chat room if anything isn't working and i'll update it


Upvotes: 0

git clone --filter downloads only the required folders

E.g., to clone only the objects required for subdirectory small/ of the repository https://github.com/cirosantilli/test-git-partial-clone-big-small-no-bigtree (notably ignoring subdirectory big/, which contains large files), I can do:

git clone -n --depth=1 --filter=tree:0 \
  https://github.com/cirosantilli/test-git-partial-clone-big-small-no-bigtree
cd test-git-partial-clone-big-small-no-bigtree
git sparse-checkout set --no-cone /small
git checkout

The --filter option was added together with an update to the remote protocol, and it truly prevents objects from being downloaded from the server.

I have covered this in more detail at: How do I clone a subdirectory only of a Git repository?

Tested on git 2.30.0 in January 2021.

Upvotes: 53

nick

Reputation: 19824

Update April 2021: there are a few tools created by the community that can do this for you:

Note: if you're trying to download a large number of files, you may need to provide a token to these tools to avoid rate limiting.


Original (manual) approach: Checking out an individual directory is not supported by Git natively, but GitHub can do this via Subversion (SVN). If you checkout your code with Subversion, GitHub will essentially convert the repository from Git to Subversion on the backend, and then serve up the requested directory.

Update November 2024: Subversion support was removed on January 8, 2024: https://github.blog/news-insights/product-news/sunsetting-subversion-support/. The rest of this answer is outdated and describes how the feature used to work.

Here's how you can use this feature to download a specific folder. I'll use the popular JavaScript library Lodash as an example.

  1. Navigate to the folder you want to download. Let's download /test from master branch.


  2. Modify the URL for subversion. Replace tree/master with trunk.

    https://github.com/lodash/lodash/tree/master/test

    https://github.com/lodash/lodash/trunk/test

  3. Download the folder. Go to the command line and grab the folder with SVN.

    svn checkout https://github.com/lodash/lodash/trunk/test
    

You might not see any activity immediately because GitHub takes up to 30 seconds to convert larger repositories, so be patient.

Full URL format explanation:

  • If you're interested in master branch, use trunk instead. So the full path is trunk/foldername
  • If you're interested in foo branch, use branches/foo instead. The full path looks like branches/foo/foldername
  • Pro tip: You can use svn ls to see available tags and branches before downloading if you wish

That's all! GitHub supports more Subversion features as well, including support for committing and pushing changes.

Upvotes: 1808

Minhas Kamal

Reputation: 22206

Go to DownGit → Enter Your URL → Download!

You can directly download or create download link for any GitHub public directory or file from DownGit:


DownGit


You may also configure properties of the downloaded file—detailed usage.


Disclaimer: I fell into the same problem as the question-asker and could not find any simple solution. So, I developed this tool for my own use first, and then opened it for everyone :)

Upvotes: 1159

rocktimsaikia

Reputation: 175

github-dlr is a command-line tool created specifically for this job. Here is how it can be used:

github-dlr <github_path>

# Basic Example
github-dlr https://github.com/linuxdotexe/nordic-wallpapers/tree/master/dynamic-wallpapers/Coast

More options are listed in the README of the project.

Upvotes: 3

Kazi Mahbubur Rahman

Reputation: 155

I have developed a tool that might be exactly what you need:

  1. Visit: https://techhelpbd.com/gitdown

  2. Paste your GitHub folder link

  3. Then you can easily download your GitHub folder

Give it a try and let me know how it works for you

Upvotes: 7

Rainb

Reputation: 2465

After many frustrating attempts, I arrived at this. It lets you download any directory or file from any branch or reference of any Git repository to any target you want, even if you are already inside a Git repository. It first tries git archive, in case the server supports it; GitHub does not, so it falls back to a shallow Git clone that filters blobs and then checks out the requested path. The work happens in a temporary directory, and the files are then moved to the target.

# Function to download specific files or directories from a Git repository without history
function download_from_git() {
  local git_remote="$1"
  local git_ref="$2"
  local git_path="$3"
  local target="$4"

  # Create a temporary directory next to the current directory
  temp_dir=$(mktemp -d temp-git-repo-XXXXXXXX)

  # Attempt to download the specified file or directory using 'git archive'
  if git archive --remote="$git_remote" "$git_ref" "$git_path" | tar -x -C "$temp_dir"; then
    # Check if the git_path is a directory or a file
    git_path_exists=$(ls -d "$temp_dir/$git_path" 2>/dev/null)
    if [ -z "$git_path_exists" ]; then
      echo "Error: The specified Git path does not exist in the repository."
      return 1
    fi

    if [ -d "$temp_dir/$git_path" ]; then
      mkdir -p "$target"
      mv "$temp_dir/$git_path/"* "$target/"
    else
      mkdir -p "$(dirname "$target")"
      mv "$temp_dir/$git_path" "$target"
    fi
  else
    # Clone the repository with the specified branch or ref in the temporary directory
    git clone --depth 1 --branch "$git_ref" --filter=blob:none "$git_remote" "$temp_dir"
    cd "$temp_dir"

    git restore --source "$git_ref" -- "$git_path"

    cd - > /dev/null

    # Check if the git_path is a directory or a file
    git_path_exists=$(ls -d "$temp_dir/$git_path" 2>/dev/null)
    if [ -z "$git_path_exists" ]; then
      echo "Error: The specified Git path does not exist in the repository."
      return 1
    fi

    if [ -d "$temp_dir/$git_path" ]; then
      mkdir -p "$target"
      mv "$temp_dir/$git_path/"* "$target/"
    else
      mkdir -p "$(dirname "$target")"
      mv "$temp_dir/$git_path" "$target"
    fi
  fi

  # Clean up the temporary directory
  rm -rf "$temp_dir"
}

Usage

download_from_git "https://github.com/rustwasm/wasm-pack.git" "master" "npm" "./node-modules/wasm-pack/"

In this case you might notice that it fetches the whole latest HEAD instead of just these files. It is possible to fetch only the required files, but that downloads far more slowly. If you really want to download only the files, use:

    git clone -n --depth 1 --filter=blob:none "$git_remote" "$temp_dir"
    cd "$temp_dir"

    # Set the sparse-checkout to only include the required directory
    git sparse-checkout init --cone
    echo "$git_path" >> .git/info/sparse-checkout

    # Download the required objects
    git checkout "$git_ref" -- "$git_path"

instead.

Hopefully this saves someone some time.

Upvotes: 2

jabacchetta

Reputation: 50248

2019 Summary

There are a variety of ways to handle this, depending on whether or not you want to do this manually or programmatically.

There are four options summarized below. And for those that prefer a more hands-on explanation, I've put together a YouTube video: Download Individual Files and Folders from GitHub.

Also, I've posted a similar answer on Stack Overflow for those that need to download single files from GitHub (as opposed to folders).


1. GitHub User Interface

  • There's a download button on the repository's homepage. Of course, this downloads the entire repository, after which you would need to unzip the download and then manually drag out the specific folder you need.

2. Third-party Tools

  • There are a variety of browser extensions and web applications that can handle this, with DownGit being one of them. Simply paste in the GitHub URL to the folder (e.g., https://github.com/babel/babel-eslint/tree/master/lib) and press the Download button.

3. Subversion

  • GitHub does not support git-archive (the Git feature that would allow us to download specific folders). GitHub does, however, support a variety of Subversion features, one of which we can use for this purpose. Subversion is a version control system (an alternative to Git). You'll need Subversion installed. Grab the GitHub URL for the folder you want to download. You'll need to modify this URL, though. You want the link to the repository, followed by the word "trunk", and ending with the path to the nested folder. In other words, using the same folder link example that I mentioned above, we would replace "tree/master" with "trunk". Finally, open up a terminal, navigate to the directory that you want the content to be downloaded to, type in the following command (replacing the URL with the URL you constructed), and press Enter: svn export https://github.com/babel/babel-eslint/trunk/lib

4. GitHub API

  • This is the solution you'll need if you want to accomplish this task programmatically. And this is actually what DownGit is using under the hood. Using GitHub's REST API, write a script that does a GET request to the content endpoint. The endpoint can be constructed as follows: https://api.github.com/repos/:owner/:repo/contents/:path. After replacing the placeholders, an example endpoint is: https://api.github.com/repos/babel/babel-eslint/contents/lib. This gives you JSON data for all of the content that exists in that folder. The data has everything you need, including whether or not the content is a folder or file, a download URL if it's a file, and an API endpoint if it's a folder (so that you can get the data for that folder). Using this data, the script can recursively go through all content in the target folder, create folders for nested folders, and download all of the files for each folder. Check out DownGit's code for inspiration.
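The recursive walk described in option 4 can be sketched as follows. This is a minimal illustration, not DownGit's actual code. It assumes each entry of the contents response carries `type` ("file" or "dir"), `path`, `url`, and `download_url`, and the names `listFolder` and `githubFetchJson` are hypothetical.

```javascript
// Sketch only: `listFolder` and `githubFetchJson` are hypothetical names.
// `fetchJson` is any function that takes a URL and resolves to parsed JSON,
// which keeps the traversal separate from the network layer.
async function listFolder(apiUrl, fetchJson) {
  const entries = await fetchJson(apiUrl);
  const files = [];
  for (const entry of entries) {
    if (entry.type === 'dir') {
      // A folder entry points at its own contents endpoint: recurse into it.
      files.push(...await listFolder(entry.url, fetchJson));
    } else if (entry.type === 'file') {
      // A file entry carries a direct download URL.
      files.push({ path: entry.path, downloadUrl: entry.download_url });
    }
    // Other entry types (symlinks, submodules) are skipped in this sketch.
  }
  return files;
}

// One possible fetcher, using the global fetch of browsers and recent Node.js.
const githubFetchJson = url => fetch(url).then(res => {
  if (!res.ok) throw new Error(`GitHub API returned ${res.status}`);
  return res.json();
});

// Usage (makes one API request per folder):
// listFolder('https://api.github.com/repos/babel/babel-eslint/contents/lib', githubFetchJson)
//   .then(files => console.log(files));
```

Because this issues one request per folder, deeply nested directories use up the anonymous rate limit quickly, so an authenticated fetcher may be needed.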

Upvotes: 23

Yogesh Chawla

Reputation: 1603

Our team wrote a Bash script to do this, because we didn't want to have to install SVN on our bare-bones server.

https://github.com/ojbc/docker/blob/master/java8-karaf3/files/git-download.sh

It uses the GitHub API and can be run from the command line like this:

git-download.sh https://api.github.com/repos/ojbc/main/contents/shared/ojb-certs

Upvotes: 4

aesede

Reputation: 5703

I work with CentOS 7 servers on which I don't have root access, nor Git, SVN, etc. (nor want to!), so I made a Python script to download any GitHub folder: https://github.com/andrrrl/github-folder-downloader

Usage is simple: copy the relevant part of the URL from a GitHub project. Say the project is https://github.com/MaxCDN/php-maxcdn/ and you only want the folder that contains the source files; then you need to do something like:

python gdownload.py "/MaxCDN/php-maxcdn/tree/master/src" /my/target/dir/

(This will create the target folder if it doesn't exist.)

It requires the lxml library, which can be installed with easy_install lxml. If you don't have root access (like me), you can create a .pydistutils.cfg file in your $HOME directory with these contents:

[install]
user=1

And easy_install lxml will just work (ref: https://stackoverflow.com/a/33464597/591257).

Upvotes: 3

Kino

Reputation: 7293

Two options for this feature:

Option 1: GitZip Browser Extension

Chrome Extension, Edge Extension, Firefox Addon

Usage:

  1. Browse any GitHub repository page.
  2. Two ways to download:
    1. Choose the items:
      1. By default, you can double-click on items or check the checkbox in front of items.
      2. Click the download button at the bottom-right of the page.
    2. In the context menu:
      1. Click "GitZip Download" → "Whole Repository" or "Current Folder".
      2. Move the mouse cursor on the item and click "GitZip Download" → "Selected Folder/File".
      3. Click "GitZip Download" → "Checked Items" after doing 2-1-1.
  3. Watch the progress dashboard and wait for the browser to trigger the download.
  4. Get the ZIP file.

Get a token:

  1. Click the GitZip Extension icon on your browser.
  2. Click the "Normal" or "Private" link beside "Get Token".
  3. Authorize GitZip permissions on the GitHub authentication page.
  4. Go back to the repository page where you started.
  5. Continue to use.

Option 2: GitHub / GitHub Pages

http://kinolien.github.io/gitzip uses the GitHub API together with the JSZip and FileSaver.js libraries.

Step 1: Input the GitHub URL in the field at the top-right.
Step 2: Press Enter or click Download for downloading the ZIP file directly or click search for viewing the list of subfolders and files.
Step 3: Click the "Download Zip File" or "Get File" button to get the files.

In most cases, it works fine, except when the folder contains more than 1,000 files, because of a GitHub Trees API limitation (refer to GitHub API#Contents).

It can also support private as well as public repositories and raise the rate limit, if you have a GitHub account and use the "get token" link on this site.

Upvotes: 696

محسن عباسی

Reputation: 2444

To export a directory from GitHub, replace "/tree/master/" in the directory's URL with "/trunk/".

For example, to export the directory from the following URL:

https://github.com/liferay/liferay-plugins/tree/master/portlets/sample-hibernate-portlet

run the following command:

svn export https://github.com/liferay/liferay-plugins/trunk/portlets/sample-hibernate-portlet

Upvotes: 4

Tommie C.

Reputation: 13181

There is nothing wrong with other answers, but I just thought I'd share step-by-step instructions for those wandering through this process for the first time.

How to download a single folder from a GitHub repository (Mac OS X):

~ To open Terminal just click spotlight and type terminal then hit Enter

  1. On a Mac, you likely already have SVN (to test just open terminal and type "svn" or "which svn" ~ without the quote marks)
  2. On GitHub: Locate the GitHub path to your Git folder (not the repository) by clicking the specific folder name within a repository
  3. Copy the path from the address bar of the browser
  4. Open Terminal and type: svn export
  5. Next paste in the address (e.g.): https://github.com/mingsai/Sample-Code/tree/master/HeadsUpUI
  6. Replace the words tree/master with the word trunk
  7. Type a space (just press the spacebar, do not type the word "space"), then the destination folder for the files (in this example, I store the target folder inside the Downloads folder for the current user): ~/Downloads/HeadsUpUI
  8. The final terminal command shows the full command to download the folder (compare the address to step 5): svn export https://github.com/mingsai/Sample-Code/trunk/HeadsUpUI ~/Downloads/HeadsUpUI

BTW - If you are on Windows or some other platform, you can find a binary download of Subversion (SVN) at http://subversion.apache.org

~ If you want to check out the folder rather than simply download it, try using the SVN help (tldr: replace export with checkout)

Regarding the comment on resuming an interrupted download/checkout: I would try running svn cleanup followed by svn update. Please search Stack Overflow for additional options.

Upvotes: 35

janos

Reputation: 124804

If you have Subversion (SVN), you can use svn export to do this:

svn export https://github.com/foobar/Test.git/trunk/foo

Notice the URL format:

  • The base URL is https://github.com/
  • /trunk appended at the end

Before you run svn export, it's good to first verify the content of the directory with:

svn ls https://github.com/foobar/Test.git/trunk/foo

Upvotes: 196

zeeawan

Reputation: 6905

Another specific example:

Say I want to download the 'Pro iOS Geo' folder from the URL https://github.com/alokc83/APRESS-Books-Source-Code-/tree/master/%20Pro%20iOS%20Geo

and I can do so via

svn checkout https://github.com/alokc83/APRESS-Books-Source-Code-/trunk/%20Pro%20iOS%20Geo

Note trunk in the path.

Yes, using export instead of checkout would give a clean copy without the extra .svn working-copy metadata.

svn export https://github.com/alokc83/APRESS-Books-Source-Code-/trunk/%20Pro%20iOS%20Geo

If tree/master is not there in the URL, then fork it and it will be there in the forked URL.

Upvotes: 9

Mohammed Jafar

Reputation: 543

Anyone working on a specific folder only needs to clone that particular folder. To do so, follow the steps below, which use a sparse checkout.

  1. Create a directory.

  2. Initialize a Git repository (git init)

  3. Enable sparse checkouts. (git config core.sparsecheckout true)

  4. Tell Git which directories you want (echo 2015/brand/May >> .git/info/sparse-checkout, where 2015/brand/May is the folder you want to work on)

  5. Add the remote (git remote add -f origin https://jafartke.com/mkt-imdev/DVM.git)

  6. Fetch the files (git pull origin master)

Upvotes: 34

pcv

Reputation: 2181

If you need to do it programmatically and you don't want to rely on SVN, you can use the GitHub API to download all the contents recursively.

For inspiration, here's my Ruby gist.

Upvotes: 1

RobW

Reputation: 10621

For a generic Git repository:

If you want to download files, not clone the repository with history, you can do this with git-archive.

git-archive makes a compressed ZIP or tar archive of a Git repository. Some things that make it special:

  1. You can choose which files or directories in the Git repository to archive.
  2. It doesn't archive the .git/ folder, or any untracked files in the repository it's run on.
  3. You can archive a specific branch, tag, or commit. Projects managed with Git often use this to generate archives of versions of the project (beta, release, 2.0, etc.) for users to download.

An example of creating an archive of the docs/usage directory from a remote repository you're connected to with ssh:

# In a terminal
git archive --format tar --remote ssh://server.org/path/to/Git HEAD docs/usage > /tmp/usage_docs.tar

More information in this blog post and the Git documentation.

Note on GitHub repositories:

GitHub doesn't allow git-archive access. ☹️

Upvotes: 104

johnny

Reputation: 4164

If you truly just want to "download" the folder and not "clone" it (for development), the easiest way to get a copy of the most recent version of the repository (and therefore a folder/file within it) is to download a ZIP archive. This does not require cloning the whole repository or even installing Git in the first place. It works for any repository, fork, branch, commit, etc.: go to the desired repository/fork/branch/commit on GitHub (e.g. http(s)://github.com/<user>/<repo>/commit/<Sha1> for a copy of the files as they were after a specific commit) and select the Downloads button near the upper-right.

This archive format contains none of the git-repo magic, just the tracked files themselves (and perhaps a few .gitignore files if they were tracked, but you can ignore those :p)—that means that if the code changes and you want to stay on top, you'll have to manually redownload it, and it also means you won't be able to use it as a Git repository...

I am not sure if that's what you're looking for in this case (again, "download"/view vs "clone"/develop), but it can be useful nonetheless...

Upvotes: 18

michel-lind

Reputation: 9766

You cannot; unlike Subversion (SVN), where each subdirectory can be checked out individually, Git operates on a whole-repository basis.

For projects where finer-grained access is necessary, you can use submodules—each submodule is a separate Git project, and thus can be cloned individually.

It is conceivable that a Git front-end (e.g., GitHub's web interface, or GitWeb) could choose to provide an interface for you to extract a given folder, but to my knowledge none of them do that (though they do let you download individual files, so if the folder does not contain too many files, that is an option)

GitHub actually offers access via SVN, which would allow you to do just this (as per comment). See Improved SVN is here to stay, and old SVN is going away for latest instructions on how to do this.

Upvotes: 25

Asenar

Reputation: 7020

If the directory you want to download is a separate library, it's better to create another Git repository for it, and then use the Git submodule feature.

Of course, you have to be the owner of the initial repository you want.

Upvotes: -9

Pinecone

Reputation: 421

For whatever reason, the svn solution does not work for me, and since I have no need of svn for anything else, it did not make sense to spend time trying to make it work, so I looked for a simple solution using tools I already had. This script uses only curl and awk to download all files in a GitHub directory described as "/:user/:repo/contents/:path".

A call to the GitHub REST API "GET /repos/:user/:repo/contents/:path" returns a JSON array that includes a "download_url" link for each file in the directory.

This command-line script calls that REST API using curl and sends the result through AWK, which filters out all but the "download_url" lines, erases quote marks and commas from the links, and then downloads the links using another call to curl.

curl -s https://api.github.com/repos/:user/:repo/contents/:path | awk \
     '/download_url/ { gsub("\"|,", "", $2); system("curl -O " $2); }'

Upvotes: 7

R.M. Reza

Reputation: 1009

I have created a simple application that supports downloading directories, files, and repositories (private or public).

Application: https://downdir.vercel.app/

GitHub: https://github.com/renomureza/downdir

Upvotes: 4

nsrCodes

Reputation: 1138

Just add ss to the start of the GitHub URL: (github.com -> ssgithub.com)

I built this simple webpage that does this for you, so just:

  1. Navigate to the directory/file you want to download on GitHub
  2. Add ss to the start of the URL in the address bar

Clicking on Download should zip just the contents of that directory and download them to your device.


Upvotes: 33

angordeyev

Reputation: 869

Download the Git repository folder into the current directory and delete the Git files.

#!/bin/bash

# Uses bash features (function keyword, local), so it needs bash rather than plain sh.
function download_git_folder() {
  local repo_url=$1
  local branch=$2
  local repo_subfolder_path=$3

  git init
  git remote add -f origin "${repo_url}"
  git config core.sparseCheckout true
  echo "${repo_subfolder_path}" >> .git/info/sparse-checkout
  git pull origin "${branch}"
  mv "${repo_subfolder_path}"/* ./

  # Remove the Git metadata and the top-level folder of the subfolder path
  local root_subfolder=${repo_subfolder_path%%/*}
  rm -rf ./.git "./${root_subfolder}"
}

Usage

download_git_folder "git@github.com:foobar/Test.git" "master" "bar"

Upvotes: 1

Avinash Thakur
Avinash Thakur

Reputation: 2039

After trying all the answers, the best solution for me was:

GitHub's Visual Studio Code-based editor.

Pros:

  1. Doesn't require any extra tools such as SVN, or any API tokens.
  2. No limit on the size of the content.
  3. Saves as a directory or file, not as an archive.

Instructions

  1. Go to any repository. (example: https://github.com/RespiraWorks/Ventilator/tree/master/software)

  2. Press . (period) or replace .com with .dev in the URL to open the repository in GitHub's web-based editor.

  3. In the Explorer pane (left side or press Ctrl+Shift+E), right-click on the required file/folder and select Download.

  4. In the Select Folder dialog box, choose the directory on your disk under which you want the selected file/folder to exist.

Note

I tried other solutions, such as the one in the accepted answer, but:

  1. I don't want to install and learn SVN only for this.

  2. Other tools like Download Directory, Refined GitHub, GitZip, DownGit either require API tokens or cannot download large directories.

Other options

  • Visual Studio Code with Remote Repositories extension to open the repository and download the file/folder.

Upvotes: 113

ETeddy
ETeddy

Reputation: 171

This answer is for a special case when you want a certain file from a repository.

A short answer can be found here. You should change the URL to this format:

https://raw.github.com/user/repository/branch/file.name

To explain it simply, take the URL of the file's page on GitHub, add raw. before github.com, and delete /blob from the path. For example, suppose you want to grab the CSV file at this address:

https://github.com/CSSEGISandData/COVID-19/blob/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv

You should change the URL to this one:

https://raw.github.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv
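If you need to do this for many files, the rewrite is easy to script. Here is a minimal Python sketch of it; blob_to_raw_url is just a name I picked for illustration, and it assumes the input looks like https://github.com/user/repository/blob/branch/path/to/file.

```python
from urllib.parse import urlsplit, urlunsplit


def blob_to_raw_url(blob_url):
    """Turn a github.com "blob" page URL into the matching raw-file URL."""
    parts = urlsplit(blob_url)
    segments = parts.path.strip("/").split("/")
    # Expected path shape: user/repository/blob/branch/path/to/file
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a GitHub blob URL: {blob_url}")
    # Drop the "blob" segment and keep everything else in order.
    raw_path = "/" + "/".join(segments[:2] + segments[3:])
    return urlunsplit((parts.scheme, "raw.github.com", raw_path, "", ""))
```

Passing the CSV page address from the example above returns the raw.github.com address shown right after it.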

Upvotes: -2

Gray Programmerz
Gray Programmerz

Reputation: 609

As a different approach, you can also download GitHub folders without SVN, Git, or any API.

GitHub supports raw links, which you can exploit to download only those files and folders which you need.

Below is a summary of what I found:

Mechanism

  • Crawl all hyperlinks (<a> elements) on the webpage and collect their href values.

  • If an href value contains /tree/master/ or /tree/main/, it is a folder link: https://github.com/graysuit/GithubFolderDownloader/tree/main/GithubFolderDownloader

  • Otherwise, if the href value contains /blob/master/ or /blob/main/, it is a file link: https://github.com/graysuit/GithubFolderDownloader/blob/main/GithubFolderDownloader.sln

  • Then replace github.com with raw.githubusercontent.com and remove /blob from the file link: https://raw.githubusercontent.com/graysuit/GithubFolderDownloader/main/GithubFolderDownloader.sln

  • The result is a raw link, which you can download directly.
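The mechanism above can be sketched in a few lines of Python with the standard library HTML parser. This is only an illustration of the steps, not the code of the C# tool: LinkCollector and classify_links are hypothetical names, and the sketch assumes the page HTML is already in hand and that its href values are absolute github.com URLs.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> element in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def classify_links(html, branches=("master", "main")):
    """Split a page's links into folder links and raw file links."""
    collector = LinkCollector()
    collector.feed(html)
    folders, raw_files = [], []
    for href in collector.hrefs:
        if any(f"/tree/{branch}/" in href for branch in branches):
            # Folder link: keep as-is so it can be crawled in turn.
            folders.append(href)
        elif any(f"/blob/{branch}/" in href for branch in branches):
            # File link: switch host and drop /blob to get the raw link.
            raw = href.replace("github.com", "raw.githubusercontent.com", 1)
            raw_files.append(raw.replace("/blob/", "/", 1))
    return folders, raw_files
```

Folder links returned by classify_links would be fetched and fed through the same function again, and each raw file link can be downloaded as-is.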

Tool

Based on the above research, I created a minimalist tool in C# that can grab folders: graysuit/GithubFolderDownloader

Note: I am the author. Feel free to comment if anything is missing or unclear.

Upvotes: 2

Meir Gabay
Meir Gabay

Reputation: 3316

This is how I do it with Git v2.25.0, and it was also tested with v2.26.2. This trick doesn't work with v2.30.1.

TLDR

git clone --no-checkout --filter=tree:0 https://github.com/opencv/opencv
cd opencv

# Requires Git 2.25.x to 2.26.2
git sparse-checkout set data/haarcascades

You can use Docker to avoid installing a specific version of Git:

git clone --no-checkout --filter=tree:0 https://github.com/opencv/opencv
cd opencv

# Requires Git 2.25.x to 2.26.2
docker run --rm -it -v $PWD/:/code/ --workdir=/code/ alpine/git:v2.26.2 sparse-checkout set data/haarcascades

Full solution

# Bare minimum clone of OpenCV
git clone --no-checkout --filter=tree:0 https://github.com/opencv/opencv

Output:

...
Resolving deltas: 100% (529/529), done.

# Downloaded only ~7.3 MB, takes ~3 seconds

And:

# du = disk usage, -s = summary, -h = human-readable
du -sh opencv

Output:

7.3M    opencv/

And:

# Set target directory
cd opencv
git sparse-checkout set data/haarcascades

Output:

...
Updating files: 100% (17/17), done.
# Takes ~10 seconds, depending on your specifications

And:

# View downloaded files
du -sh data/haarcascades/

Output:

9.4M    data/haarcascades/

And:

ls data/haarcascades/

Output:

haarcascade_eye.xml                      haarcascade_frontalface_alt2.xml      haarcascade_licence_plate_rus_16stages.xml  haarcascade_smile.xml
haarcascade_eye_tree_eyeglasses.xml      haarcascade_frontalface_alt_tree.xml  haarcascade_lowerbody.xml                   haarcascade_upperbody.xml
haarcascade_frontalcatface.xml           haarcascade_frontalface_default.xml   haarcascade_profileface.xml
haarcascade_frontalcatface_extended.xml  haarcascade_fullbody.xml              haarcascade_righteye_2splits.xml
haarcascade_frontalface_alt.xml          haarcascade_lefteye_2splits.xml       haarcascade_russian_plate_number.xml

Upvotes: 12

admin
admin

Reputation: 169

Try it.

https://github.com/twfb/git-directory-download

usage: gitd [-h] [-u URL] [-r] [-p] [--proxy PROXY]

Optional arguments:
  -h, --help         Show this help message and exit
  -u URL, --url URL  GitHub URL, split by ",", example: "https://x, http://y"
  -r, --raw          Download from a raw URL
  -p, --parse        Download by parsing HTML
  --proxy PROXY      Proxy configuration. Example: "socks5://127.0.0.1:7891"

Example:

  1. download by a raw URL: gitd -u "https://github.com/twfb/git-directory-download"
  2. download by a raw URL: gitd -r -u "https://github.com/twfb/git-directory-download"
  3. download by parsing: gitd -p -u "https://github.com/twfb/git-directory-download"
  4. download by raw URL with a proxy: gitd -r -u "https://github.com/twfb/git-directory-download" --proxy "socks5://127.0.0.1:7891"

Upvotes: 3
