user3932031

Reputation: 461

How to upload folder on Google Cloud Storage using Python API

I have successfully uploaded a single text file to Google Cloud Storage. But when I try to upload a whole folder, it gives a permission denied error.

filename = "d:/foldername"   #here test1 is the folder.


Error:
Traceback (most recent call last):
  File "test1.py", line 142, in <module>
    upload()
  File "test1.py", line 106, in upload
    media = MediaFileUpload(filename, chunksize=CHUNKSIZE, resumable=True)
  File "D:\jatin\Project\GAE_django\GCS_test\oauth2client\util.py", line 132, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "D:\jatin\Project\GAE_django\GCS_test\apiclient\http.py", line 422, in __init__
    fd = open(self._filename, 'rb')
IOError: [Errno 13] Permission denied: 'd:/foldername'

Upvotes: 16

Views: 24541

Answers (9)

Toldry

Reputation: 951

Another option is to use gsutil, the command-line tool for interacting with Google Cloud Storage:

gsutil cp -r ./my/local/directory gs://my_gcp_bucket/foo/bar

The -r flag tells gsutil to copy recursively. Link to the gsutil documentation.

Invoking gsutil in Python can be done like this:

import subprocess

# pass the command as an argument list so it runs without needing a shell
subprocess.check_call(['gsutil', 'cp', '-r', './my/local/directory', 'gs://my_gcp_bucket/foo/bar'])

Upvotes: -1

Tsvi Sabo

Reputation: 675

I just came across the gcsfs library, which provides a filesystem-like interface to Google Cloud Storage.

You can copy an entire directory into a GCS location like this:


import gcsfs

def upload_to_gcs(src_dir: str, gcs_dst: str):
    fs = gcsfs.GCSFileSystem()
    fs.put(src_dir, gcs_dst, recursive=True)
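
A hypothetical call, where the local directory and bucket path are placeholders:

upload_to_gcs("./my/local/directory", "my_gcp_bucket/foo/bar")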

Upvotes: 1

Deepanshu Yadav

Reputation: 11

Here is my recursive implementation. We need to create a file named gdrive_utils.py and write the following.


from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from apiclient.http import MediaFileUpload, MediaIoBaseDownload
import pickle
import glob
import os


# The following scopes are required for access to google drive.
# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/drive.metadata.readonly',
          'https://www.googleapis.com/auth/drive.metadata',
          'https://www.googleapis.com/auth/drive',
          'https://www.googleapis.com/auth/drive.file',
          'https://www.googleapis.com/auth/drive.appdata']


def get_gdrive_service():
    """
    Tries to authenticate using a token. If token expires or not present creates one.
    :return: Returns authenticated service object
    :rtype: object
    """
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'keys/client-secret.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)
    # return Google Drive API service
    return build('drive', 'v3', credentials=creds)


def createRemoteFolder(drive_service, folderName, parent_id):
    # Create a folder on Drive; returns the newly created folder's ID
    body = {
        'name': folderName,
        'mimeType': "application/vnd.google-apps.folder",
        'parents': [parent_id]
    }

    root_folder = drive_service.files().create(body = body, supportsAllDrives=True, fields='id').execute()
    return root_folder['id']


def upload_file(drive_service, file_location, parent_id):
    # Upload a file to Drive; returns the newly created file's ID
    body = {
        'name': os.path.split(file_location)[1],
        'parents': [parent_id]
    }

    media = MediaFileUpload(file_location,
                            resumable=True)

    file_details = drive_service.files().create(body = body,
                                                media_body=media,
                                                supportsAllDrives=True,
                                                fields='id').execute()
    return file_details['id']


def upload_file_recursively(g_drive_service, root, folder_id):

    files_list = glob.glob(root)
    if files_list:
        for file_contents in files_list:
            if os.path.isdir(file_contents):
                # create new _folder
                new_folder_id = createRemoteFolder(g_drive_service, os.path.split(file_contents)[1],
                                                   folder_id)
                upload_file_recursively(g_drive_service, os.path.join(file_contents, '*'), new_folder_id)
            else:
                # upload to given folder id
                upload_file(g_drive_service, file_contents, folder_id)

After that, use the following:

import os

from gdrive_utils import createRemoteFolder, upload_file_recursively, get_gdrive_service

g_drive_service = get_gdrive_service()
FOLDER_ID_FOR_UPLOAD = "<replace with folder id where you want upload>"
main_folder_id = createRemoteFolder(g_drive_service, '<name_of_main_folder>', FOLDER_ID_FOR_UPLOAD)

And finally, use this:

upload_file_recursively(g_drive_service, os.path.join("<your_path_>", '*'), main_folder_id)

Upvotes: 0

shrikanth singh

Reputation: 95

This solution can also be used on Windows systems. Simply provide the folder name to upload and the destination bucket name. Additionally, it can handle any level of subdirectories in a folder.

import os
from google.cloud import storage

storage_client = storage.Client()

def upload_files(bucketName, folderName):
    """Upload files to GCP bucket."""
    bucket = storage_client.get_bucket(bucketName)
    for path, subdirs, files in os.walk(folderName):
        for name in files:
            path_local = os.path.join(path, name)
            blob_path = path_local.replace('\\', '/')
            blob = bucket.blob(blob_path)
            blob.upload_from_filename(path_local)
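
A hypothetical call, where the bucket name and folder name are placeholders:

upload_files("my_gcp_bucket", "foldername")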

Upvotes: 0

Frederik Bode

Reputation: 2744

A version without a recursive function, which also handles top-level files (unlike the top answer):

import glob
import os 
from google.cloud import storage

GCS_CLIENT = storage.Client()
def upload_from_directory(directory_path: str, dest_bucket_name: str, dest_blob_name: str):
    rel_paths = glob.glob(directory_path + '/**', recursive=True)
    bucket = GCS_CLIENT.get_bucket(dest_bucket_name)
    for local_file in rel_paths:
        remote_path = f'{dest_blob_name}/{"/".join(local_file.split(os.sep)[1:])}'
        if os.path.isfile(local_file):
            blob = bucket.blob(remote_path)
            blob.upload_from_filename(local_file)
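
A hypothetical call, where the directory, bucket name, and destination prefix are placeholders:

upload_from_directory("my_folder", "my_gcp_bucket", "foo/bar")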

Upvotes: 11

Maor88

Reputation: 199

This works for me. It recursively copies all content from a local directory to a specific bucket-name/full-path in Google Cloud Storage:

import glob
import os

from google.cloud import storage

def upload_local_directory_to_gcs(local_path, bucket, gcs_path):
    assert os.path.isdir(local_path)
    for local_file in glob.glob(local_path + '/**'):
        if not os.path.isfile(local_file):
            upload_local_directory_to_gcs(local_file, bucket, gcs_path + "/" + os.path.basename(local_file))
        else:
            remote_path = os.path.join(gcs_path, local_file[1 + len(local_path):])
            blob = bucket.blob(remote_path)
            blob.upload_from_filename(local_file)


# example usage; the bucket name and paths below are placeholders
client = storage.Client()
bucket = client.get_bucket("my_gcp_bucket")
local_path = "./my/local/directory"
BUCKET_FOLDER_DIR = "foo/bar"

upload_local_directory_to_gcs(local_path, bucket, BUCKET_FOLDER_DIR)

Upvotes: 17

Ganesh Kharad

Reputation: 341

Refer to https://hackersandslackers.com/manage-files-in-google-cloud-storage-with-python/

from os import listdir
from os.path import isfile, join

...

def upload_files(bucketName):
    """Upload files to GCP bucket."""
    # localFolder, bucket and bucketFolder are set up in the omitted section above
    files = [f for f in listdir(localFolder) if isfile(join(localFolder, f))]
    for file in files:
        localFile = localFolder + file
        blob = bucket.blob(bucketFolder + file)
        blob.upload_from_filename(localFile)
    return f'Uploaded {files} to "{bucketName}" bucket.'

Upvotes: 0

Nikita Uchaev

Reputation: 1174

A folder is a cataloging structure containing references to files and directories. The library will not accept a folder as an argument.

As far as I understand, your use case is to upload to GCS while preserving a local folder structure. To accomplish that, you can use the os Python module and write a recursive function (e.g. process_folder) that takes a path as an argument. The following logic can be used for the function (see the sketch after the lists below):

  1. Use os.listdir() method to get a list of objects within the source path (will return both files and folders).
  2. Iterate over a list from step 1 to separate files from folders via os.path.isdir() method.
  3. Iterate over files and upload them with adjusted path (e.g. path+ “/“ + file_name).
  4. Iterate over folders making a recursive call (e.g. process_folder(path+folder_name)).

It’ll be necessary to work with two paths:

  1. Real system path (e.g. “/Users/User/…/upload_folder/folder_name”) used with os module.
  2. Virtual path for GCS file uploads (e.g. “upload”+”/“ + folder_name + ”/“ + file_name).
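
A minimal sketch of such a process_folder function, assuming the google-cloud-storage client library is used for the uploads; the bucket name and paths are placeholders:

import os

from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket("my_gcp_bucket")  # placeholder bucket name


def process_folder(local_path, gcs_path):
    """Recursively mirror local_path into the bucket under gcs_path."""
    for entry in os.listdir(local_path):
        real_path = os.path.join(local_path, entry)   # real system path, used with os
        virtual_path = gcs_path + "/" + entry         # virtual path for the GCS object name
        if os.path.isdir(real_path):
            process_folder(real_path, virtual_path)   # recursive call for subfolders
        else:
            bucket.blob(virtual_path).upload_from_filename(real_path)


process_folder("./upload_folder", "upload")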

Don’t forget to implement exponential backoff referenced at [1] to deal with 500 errors. You can use a Drive SDK example at [2] as a reference.

[1] - https://developers.google.com/storage/docs/json_api/v1/how-tos/upload#exp-backoff
[2] - https://developers.google.com/drive/web/handle-errors

Upvotes: 4

Daedalus Mythos

Reputation: 575

I assume that filename = "D:\foldername" alone is not enough information about the source code. I am also not sure this is even possible: via the web interface you can likewise only upload files or create folders into which you then upload the files.

You could save the folder's name, then create it (I've never used Google App Engine, but I guess that should be possible) and then upload the contents to the new folder.

Upvotes: 1
