Abdul Rehman

Reputation: 5684

Create a Zip Archive from a cloned GitHub repo in Python

I'm working on a project using Python (3.6) in which I need to create a zip archive containing all the files from a GitHub repo. The user provides the Git repo URL; I then need to clone the repo and create a zip archive of its files.

Here's what I have tried:

                ARCHIVE_NAME = func_obj.fname + '.zip'
                zip_archive = zipfile.ZipFile(ARCHIVE_NAME, "w")
                # Write files to the archive
                zip_archive.write(git.Repo.clone_from(func_obj.sc_github, to_path=os.path.join(ARCHIVE_NAME)))
                zip_archive.close()
                file_path = os.path.join(IGui.settings.BASE_DIR, ARCHIVE_NAME)

Here's the updated code, which clones the repo and generates the zip archive, but it has another problem, described below:

                ARCHIVE_NAME = func_obj.fname + '.zip'
                zip_archive = zipfile.ZipFile(ARCHIVE_NAME, "w")
                # Write files to the archive
                tempdir = tempfile.mkdtemp()
                # Ensure the file is read/write by the creator only
                saved_umask = os.umask(0o077)
                temppath = os.path.join(tempdir)
                print(temppath)
                git.Repo.clone_from(func_obj.sc_github, to_path=temppath)
                dirList = os.listdir(temppath)
                for file in dirList:
                    get_file = str(os.path.join(temppath, file))
                    print(get_file)
                    zip_archive.write(get_file)
                os.umask(saved_umask)
                shutil.rmtree(tempdir)

The problem is:

**For example, if temppath is /var/folders/lf/pc01_3zj38q0qv1vq9r6rxs00000gn/T/tmpca2fv8eg, then extracting the zip archive produces a var directory, which contains a folders directory, which contains an lf directory, and so on down to tmpca2fv8eg. Only inside that last directory are the repo files. I need the repo files to sit directly at the root of the zip archive, so that extracting it yields the files themselves with no enclosing directories.**

Help me, please!

Thanks in advance!

Upvotes: 0

Views: 1769

Answers (3)

Atto Allas

Reputation: 610

You're trying to write the git repo into the ZipFile directly from the git.Repo.clone_from call. This will not work: clone_from returns a Repo object rather than a file path, and the git library cannot clone straight into a zip file. Instead, choose a temporary path to clone the repo into, and then pass the files under that path to zip_archive.write.

What you want is:

tempPath = "/Users/abdul/temp/temp_zip" # you can change this, this is temporary

git.Repo.clone_from(func_obj.sc_github, to_path=os.path.join(tempPath))

files = os.listdir(tempPath)

for singleFile in files:
    zip_archive.write(os.path.join(tempPath, singleFile), singleFile)

# you can now delete the folder at tempPath

Instead of:

zip_archive.write(git.Repo.clone_from(func_obj.sc_github, to_path=os.path.join(ARCHIVE_NAME)))

A sample output from your git repo (https://github.com/arycloud/sis-testing.git):

[Screenshot: root directory of the zip file]

Note: this is the root directory of the zip file, with no directories in between. It was produced with this exact code:

import git, os, zipfile

zip_archive = zipfile.ZipFile("C:\\Users\\Attoa\\Desktop\\testos.zip", "w")

tempPath = "C:\\Users\\Attoa\\AppData\\Local\\Temp\\temp_zip\\" # you can change this, this is temporary

git.Repo.clone_from("https://github.com/arycloud/sis-testing.git", to_path=tempPath)

files = os.listdir(tempPath)

for singleFile in files:
    # the second argument (arcname) is the name stored inside the archive
    zip_archive.write(os.path.join(tempPath, singleFile), singleFile)

zip_archive.close() # finish writing the archive
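One caveat with the loop above: os.listdir only returns the top-level entries, and ZipFile.write on a directory stores the directory entry without its contents, so files inside subfolders would be missing from the archive. Below is a minimal sketch of a recursive variant, assuming you also want to leave the .git folder out. The function name zip_directory and the skip_dirs parameter are made up for illustration and are not part of any library.

```python
import os
import zipfile


def zip_directory(src_dir, archive_path, skip_dirs=(".git",)):
    """Zip every file under src_dir, storing paths relative to src_dir.

    zip_directory and skip_dirs are hypothetical names used for this sketch.
    """
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, dirs, files in os.walk(src_dir):
            # Prune skipped directories in place so os.walk does not descend into them.
            dirs[:] = [d for d in dirs if d not in skip_dirs]
            for name in files:
                full_path = os.path.join(root, name)
                # arcname is the path stored inside the archive; making it
                # relative to src_dir keeps the temp-directory prefix out.
                arcname = os.path.relpath(full_path, src_dir)
                archive.write(full_path, arcname)
```

You would call it as zip_directory(tempPath, ARCHIVE_NAME) after the clone has finished. Note that this sketch only adds files, so empty directories are not recorded in the archive.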

I hope this helps!

Upvotes: 1

Edmund Dipple

Reputation: 2444

Rather than creating the zip file yourself, you can download an archive of the repository directly from GitHub.

The URL you need to call is http://github.com/user/repository/archive/master.zip

You can do the same with tags and branch names, by replacing master in the URL above with the name of the branch or tag.
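A rough sketch of that download using only the standard library follows. The helper names github_archive_url and download_github_archive are hypothetical, and the code assumes the repo is public and that the branch or tag you pass actually exists (the default branch is not always called master).

```python
import shutil
import urllib.request


def github_archive_url(repo_url, ref="master"):
    """Build the archive URL from a repo URL such as https://github.com/user/repository.git

    Hypothetical helper: it only rewrites the string, it does not check the repo.
    """
    base = repo_url.rstrip("/")
    if base.endswith(".git"):
        base = base[:-len(".git")]
    return "{}/archive/{}.zip".format(base, ref)


def download_github_archive(repo_url, dest_path, ref="master"):
    """Download the zip archive of the given branch or tag to dest_path."""
    url = github_archive_url(repo_url, ref)
    # Stream the response to disk instead of loading it all into memory.
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path
```

One thing to be aware of: as far as I know, the archives GitHub generates wrap everything in a single top-level folder named after the repository and ref, so this does not put the files directly at the root of the zip.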

Upvotes: 0

Borys Serebrov

Reputation: 16182

An alternative way to create the repository archive is to use the git archive command, which supports the zip and tar formats:

import git
import tempfile
import os

tempdir = tempfile.mkdtemp()
temppath = os.path.join(tempdir)
print(temppath)
repo = git.Repo.clone_from(
    'https://github.com/serebrov/nodejs-typescript.git',
    to_path=temppath)

# open the output in binary mode; named fp to avoid confusion with the zipfile module
with open("archive.zip", "wb") as fp:
    repo.archive(fp, format='zip')

The resulting archive.zip contains the repository files:

(venv) $ unzip -l archive.zip 
Archive:  archive.zip
c55ff81ef2934670cb273b5fadd555d932081f2e
  Length      Date    Time    Name
---------  ---------- -----   ----
       18  2017-11-10 22:57   .gitignore
      552  2017-11-10 22:57   README.md
        0  2017-11-10 22:57   client/
      305  2017-11-10 22:57   client/client.ts
      146  2017-11-10 22:57   client/tsconfig.json
      586  2017-11-10 22:57   package.json
        0  2017-11-10 22:57   server/
      488  2017-11-10 22:57   server/app.ts
      195  2017-11-10 22:57   server/tsconfig.json
        0  2017-11-10 22:57   views/
      169  2017-11-10 22:57   views/index.html
---------                     -------
     2459                     11 files
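If you would rather check the layout from Python than with unzip -l, here is a small sketch. The helper name top_level_entries is made up for illustration; it simply collects the first path component of every entry, so an archive with the temp-directory problem from the question would show var instead of the repo's own files.

```python
import zipfile


def top_level_entries(archive_path):
    """Return the sorted first path components of all entries in the archive.

    Hypothetical helper for this sketch, not part of zipfile or GitPython.
    """
    with zipfile.ZipFile(archive_path) as archive:
        # Zip entry names always use "/" as the separator.
        return sorted({name.split("/", 1)[0] for name in archive.namelist()})
```

Calling top_level_entries("archive.zip") on the archive above should list names such as README.md, client and server rather than a chain of temp directories.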

Upvotes: 1
