songei2f

Reputation: 659

Pull All Gists from Github?

Is there an API call, or any script I have failed to turn up, that would pull all my Gists from GitHub to an outside git repo, or just return me a list of their names? I know each one is a separate git repo, so I assume the best I can do is get the latter, then script pulling all of them onto my local box.

EDIT 1: I know about pulling and pushing git repos from one service to another, I am specifically looking for people who have the 411 on collecting an authoritative list of all Gists I have, private and public. I also thought this might be useful to others. It is not so much about migration, but a backup strategy . . . of sorts.

EDIT 2: So, it appears this might not be possible. I apparently did not Google hard enough to find the updated GitHub/Gist API. The other API calls work with simple curl commands, but not the v1 API for Gists. Still, the API says TBD for listing all private and public Gists, so I think that puts the kibosh on the whole thing unless an enlightened soul hooks a brotha up.

$ curl http://github.com/api/v2/json/repos/show/alharaka
{"repositories":[{"url":"https://github.com/alharaka/babushka","has_wiki":true,"homepage":"http:
... # tons of more output
$ echo $?
0
$ 

This one does not work so hot.

$ curl https://gist.github.com/api/v1/:format/gists/:alharaka
$ echo $?
0
$

EDIT 3: Before I get asked, I noticed there is a difference in the API versioning; this "brilliant hack" did not help either. Still very cool though.

$ curl https://gist.github.com/api/v2/:format/gists/:alharaka # Notice v2 instead of v1
$ echo $?
0
$

Upvotes: 28

Views: 8801

Answers (12)

hlorand

Reputation: 1406

You can do this with the GitHub CLI and some Bash scripting. The goal is to download every gist in separate directories with readable names.

  1. Install the GitHub CLI (sudo apt install gh or brew install gh), then log in with gh auth login.
  2. Define a function that slugifies strings; it is useful when creating folders for the gists. See the slugify() function below. For example, slugify "hello world" becomes hello-world.
  3. Loop through all the gists and clone each one into a separate folder with a readable name.
# function that creates a slug from a text
slugify(){ echo "$1" | iconv -t ascii//TRANSLIT | sed -r 's/[^a-zA-Z0-9]+/-/g' | sed -r 's/^-+|-+$//g' | tr A-Z a-z; }

# initialize a counter and list every gist in reverse order,
# then clone each of them into a directory named: COUNTER-gist-description

cnt=0; gh gist list --limit 1000 | cut -f1,2 | tac | while read id name; do ((cnt++)); gh gist clone "$id" "$cnt-$(slugify "$name")"; done

The result:

1-my-first-gist/
2-my-second-gist/
3-my-third-gist/
...

Upvotes: 4

And what about the GitHub CLI?

brew install gh

gh auth login

gh gist list [flags]

  Options:

    -L, --limit int   Maximum number of gists to fetch (default 10)
    --public          Show only public gists
    --secret          Show only secret gists

gh gist clone <gist> [<directory>] [-- <gitflags>...]
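
Putting the two together: a minimal sketch that clones every gist via the CLI (assuming gh is installed and authenticated; gh gist list prints tab-separated rows with the gist ID in the first column):

import subprocess

# List up to 1000 gists; each row is tab-separated with the ID first
out = subprocess.check_output(['gh', 'gist', 'list', '--limit', '1000'])

for line in out.decode().splitlines():
    gist_id = line.split('\t')[0]
    subprocess.call(['gh', 'gist', 'clone', gist_id])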

Upvotes: 0

Bryan C Guner

Reputation: 1

I use this and it works like a charm!


# first: mkdir user && cd user && cp /path/to/get_gists.py .
# python3 get_gists.py user
import requests
import sys
from subprocess import call

user = sys.argv[1]

r = requests.get('https://api.github.com/users/{0}/gists'.format(user))

for i in r.json():
    call(['git', 'clone', i['git_pull_url']])

    description_file = './{0}/description.txt'.format(i['id'])
    with open(description_file, 'w') as f:
        f.write('{0}\n'.format(i['description']))


Upvotes: 0

juanMSFT

Reputation: 11

March 2021 update (Python3)

If a user has a ton of gists with the same file name, this works great, since every file is saved under a random UUID.

import requests, json, time, uuid, os

headers = {"content-type" : "application/json"}
url = 'https://api.github.com/users/ChangeToYourTargetUser/gists?per_page=100&page='

os.makedirs('./data', exist_ok=True)

for page in range(1, 100):  # GitHub API pages start at 1
    print('page: ' + str(page))
    r = requests.get(url + str(page), headers=headers)
    data = r.json()
    if not data:  # stop once a page comes back empty
        break

    # Getting metadata (one file per page, so earlier pages are not overwritten)
    metadata_file = './data/my_gist_list_{}.json'.format(page)
    prettyJson = json.dumps(data, indent=4, sort_keys=True)
    with open(metadata_file, 'w') as f:
        f.write(prettyJson)

    print('Metadata obtained as {}'.format(metadata_file))

    # Downloading the first file of each gist under a random UUID,
    # so identical file names never collide
    counter = 0
    for i in data:
        time.sleep(1.1)  # be gentle with the rate limit
        files_node = i['files']
        file_name = [k for k in files_node][0]
        r = requests.get(files_node[file_name]['raw_url'])
        with open('./data/{}'.format(str(uuid.uuid4())), 'w') as f:
            f.write(r.text)
        print('Downloaded {}'.format(file_name))
        counter += 1

    print('{} files successfully downloaded.'.format(counter))

Upvotes: 0

HVS

Reputation: 2487

If all you need to do is download all gists from a particular user, then this simple Python script will help.

The gist information for a particular user is exposed via the API:

"https://api.github.com/users/" + username + "/gists"

You can simply loop through the JSON exposed by the API, get the list of gists, and either clone them or download them using the raw URL specified. The simple script below loops through the JSON, pulls out the file name and raw URL, and downloads all gists into a local folder.

import os
import requests

# Replace username with the correct username
username = "some-user"
url = "https://api.github.com/users/" + username + "/gists"

resp = requests.get(url)
gists = resp.json()

os.makedirs("../folder", exist_ok=True)  # make sure the target folder exists

for gist in gists:
    for file in gist["files"]:
        fname = gist["files"][file]["filename"]
        furl = gist["files"][file]["raw_url"]
        print("{}:{}".format(fname, furl))  # this lists out all gists

        # Use this to download all gists
        pyresp = requests.get(furl)

        with open("../folder/" + fname, "wb") as pyfile:
            for chunk in pyresp.iter_content(chunk_size=1024):
                if chunk:
                    pyfile.write(chunk)
        print("{} downloaded successfully".format(fname))

Upvotes: 0

Chris Arndt

Reputation: 2158

Based on the hint in this answer, I wrote this simple Python script, which does the trick for me.

This is very minimal code, with hardly any error checking, and it clones all the user's gists into the current directory.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Clone all gists of GitHub username given on the command line."""

import subprocess
import sys
import requests

if len(sys.argv) > 1:
    gh_user = sys.argv[1]
else:
    print("Usage: clone-gists.py <GitHub username>")
    sys.exit(1)

req = requests.get('https://api.github.com/users/%s/gists' % gh_user)

for gist in req.json():
    ret = subprocess.call(['git', 'clone', gist['git_pull_url']])
    if ret != 0:
        print("ERROR cloning gist %s. Please check output." % gist['id'])

See https://gist.github.com/SpotlightKid/042491a9a2987af04a5a for a version that handles updates as well.
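
The update handling in that version essentially comes down to pulling when a clone directory already exists. A minimal sketch of the idea (an illustration, not the exact code from the linked gist):

import os
import subprocess

def clone_or_update(gist):
    """Clone a gist, or pull if it was cloned on a previous run."""
    if os.path.isdir(gist['id']):
        # The clone directory exists already: update it in place
        subprocess.call(['git', 'pull'], cwd=gist['id'])
    else:
        subprocess.call(['git', 'clone', gist['git_pull_url']])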

Upvotes: 5

Fedir RYKHTIK

Reputation: 9974

Here is an adaptation for API v3 of nicerobot's script, which was initially written for API v1:

#!/usr/bin/env python
# Clone or update all a user's gists
# curl -ks https://raw.github.com/gist/5466075/gist-backup.py | USER=fedir python
# USER=fedir python gist-backup.py

import json
import urllib
from subprocess import call
from urllib import urlopen
import os
import math
USER = os.environ['USER']

perpage=30.0
userurl = urlopen('https://api.github.com/users/' + USER)
public_gists = json.load(userurl)
gistcount = public_gists['public_gists']
print "Found gists : " + str(gistcount)
pages = int(math.ceil(float(gistcount)/perpage))
print "Found pages : " + str(pages)

f=open('./contents.txt', 'w+')

for page in range(pages):
    pageNumber = str(page + 1)
    print "Processing page number " + pageNumber
    pageUrl = 'https://api.github.com/users/' + USER  + '/gists?page=' + pageNumber + '&per_page=' + str(int(perpage))
    u = urlopen (pageUrl)
    gists = json.load(u)
    startd = os.getcwd()
    for gist in gists:
        gistd = gist['id']
        gistUrl = 'git://gist.github.com/' + gistd + '.git' 
        if os.path.isdir(gistd):
            os.chdir(gistd)
            call(['git', 'pull', gistUrl])
            os.chdir(startd)
        else:
            call(['git', 'clone', gistUrl])
        if gist['description'] == None:
            description = ''
        else:
            description = gist['description'].encode('utf8').replace("\r",' ').replace("\n",' ')
        print >> f, gist['id'], gistUrl, description

Upvotes: 15

saranicole

Reputation: 2453

A version of @Fedir's script that accounts for GitHub pagination (if you have a few hundred gists):

#!/usr/bin/env python
# Clone or update all a user's gists
# curl -ks https://raw.github.com/gist/5466075/gist-backup.py | USER=fedir python
# USER=fedir python gist-backup.py

import json
import urllib
from subprocess import call
from urllib import urlopen
import os
import math
USER = os.environ['USER']

perpage=30.0
userurl = urlopen('https://api.github.com/users/' + USER)
public_gists = json.load(userurl)
gistcount = public_gists['public_gists']
print "Found gists : " + str(gistcount)
pages = int(math.ceil(float(gistcount)/perpage))
print "Found pages : " + str(pages)


for page in range(pages):
    pageNumber = str(page + 1)
    print "Processing page number " + pageNumber
    pageUrl = 'https://api.github.com/users/' + USER  + '/gists?page=' + pageNumber + '&per_page=' + str(int(perpage))
    u = urlopen (pageUrl)
    gists = json.load(u)
    startd = os.getcwd()
    for gist in gists:
        gistd = gist['id']
        gistUrl = 'git://gist.github.com/' + gistd + '.git' 
        if os.path.isdir(gistd):
            os.chdir(gistd)
            call(['git', 'pull', gistUrl])
            os.chdir(startd)
        else:
            call(['git', 'clone', gistUrl])

Upvotes: 5

sanusart

Reputation: 1527

In addition to Thomas Traum's answer a few answers up: it seems that a User-Agent header is a must now: http://developer.github.com/v3/#user-agent-required.

So I did an exercise of my own at https://github.com/sanusart/gists-backup. It is aware of paging, duplicate descriptions and missing descriptions too.
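
For reference, sending the header with the requests library is a one-liner; a sketch, not code from the linked repo:

import requests

# GitHub rejects API requests without a User-Agent header;
# any identifying string will do.
headers = {'User-Agent': 'gists-backup-script'}

resp = requests.get('https://api.github.com/users/sanusart/gists', headers=headers)
resp.raise_for_status()
print(len(resp.json()), 'gists on the first page')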

Upvotes: 3

Thomas Traum

Reputation: 277

I wrote a quick node.js script as an exercise. It downloads all gists and saves them with the same filenames as the original gists, in folders that match the gist descriptions: https://gist.github.com/thomastraum/5227541

var request = require('request')
    , path = require('path')
    , fs = require('fs')
    , url = "https://api.github.com/users/thomastraum/gists"
    , savepath = './gists';

request(url, function (error, response, body) {

    if (!error && response.statusCode == 200) {

        var gists = JSON.parse( body );
        gists.forEach( function(gist) {

            console.log( "description: ", gist.description );
            var dir = savepath + '/' + gist.description;

            fs.mkdir( dir, function(err){
                for(var file in gist.files){

                    var raw_url = gist.files[file].raw_url;
                    var filename = gist.files[file].filename;

                    console.log( "downloading... " + filename );
                    request(raw_url).pipe(fs.createWriteStream( dir + '/' + filename ));
                }
            });
        });

    }

});

Upvotes: 2

studiomohawk

Reputation: 420

This Ruby gem seems to help with your problem. I haven't tried it yet, but it looks promising.

First

gem install gisty

And you need to put

export GISTY_DIR="$HOME/dev/gists"

in your .bashrc or .zshrc. This dir is where your gists are saved.

You also need to run

git config --global github.user your_id
git config --global github.token your_token

to add the above config to your .gitconfig.

Usage

  • gisty post file1 file2 ...

    Posts file1 and file2 to your gists

  • gisty private_post file1 file2 ...

    Posts file1 and file2 privately

  • gisty sync

    Syncs all of your gists

  • gisty pull_all

    Pulls all gists to the local repo

  • gisty list

    Lists cloned local gist repos

Upvotes: 1

Koraktor

Reputation: 42893

Version 3 of the GitHub API allows this in a pretty simple way:

https://api.github.com/users/koraktor/gists

gives you a list of all Gists of the user, and each entry offers various URLs, including the API URL of the individual Gist, like

https://api.github.com/gists/921286

See the Gists API v3 documentation.
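
For instance, a minimal sketch with the requests library (the unauthenticated endpoint only returns public Gists; this is an illustration, not code from the answer):

import requests

# Each entry carries a git_pull_url that can be passed straight to `git clone`
resp = requests.get('https://api.github.com/users/koraktor/gists')
resp.raise_for_status()

for gist in resp.json():
    print(gist['id'], gist['git_pull_url'])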

Upvotes: 22
