Paul Holmes

Reputation: 133

Cannot download Excel file through Drive API

I am trying to download an Excel file using Drive API. Here is my code:

import io
import shutil

from googleapiclient.discovery import build
from googleapiclient.errors import HttpError
from googleapiclient.http import MediaIoBaseDownload

def downloadXlsx(vars, file, creds):
    try:
        service = build('drive', 'v3', credentials=creds)
        fileId = file['id']
        fileName = file['name']
        # request = service.files().get_media(fileId=fileId)
        request = service.files().get_media(fileId=fileId, acknowledgeAbuse=True)
        # request = service.files().get(fileId=fileId, supportsTeamDrives=True, fields='*').execute()

        # Download into memory chunk by chunk, then copy the buffer to disk.
        fh = io.BytesIO()
        downloader = MediaIoBaseDownload(fh, request)
        done = False
        while not done:
            status, done = downloader.next_chunk()
            print("Download %d%%." % int(status.progress() * 100))
        fh.seek(0)
        print('%s%s' % (vars.download_directory, fileName))
        with open('%s%s' % (vars.download_directory, fileName), 'wb') as f:
            shutil.copyfileobj(fh, f, length=131072)
    except HttpError as error:
        print(f'An error occurred: {error}')

Every time I run it, most files return this error:

An error occurred: <HttpError 403 when requesting https://www.googleapis.com/drive/v3/files/xxxxxxxxxxxxxxxxxxxx?acknowledgeAbuse=true&alt=media returned "This file has been identified as malware or spam and cannot be downloaded.". Details: "[{'domain': 'global', 'reason': 'cannotDownloadAbusiveFile', 'message': 'This file has been identified as malware or spam and cannot be downloaded.'}]">

I tried adding the acknowledgeAbuse=True flag but it doesn't change anything. Previously it would give me this error:

An error occurred: <HttpError 403 when requesting https://www.googleapis.com/drive/v3/files/1fr7NwhToKFgvbNgExl0QMgurLJlx8KmV?acknowledgeAbuse=true&alt=media returned "Only the owner can download abusive files.". Details: "[{'domain': 'global', 'reason': 'cannotDownloadAbusiveFile', 'message': 'Only the owner can download abusive files.', 'locationType': 'parameter', 'location': 'acknowledgeAbuse'}]">

But I no longer get this error, and I'm not sure why, as I haven't changed anything.

I tried using this line:

   request = service.files().get(fileId=fileId,   supportsTeamDrives=True, fields='*').execute()

That downloaded a file, but it was corrupted and could not be opened.

Anyway, does anyone have a clue how I can get around this? Maybe a different method I could try or a way to get the .get() to download the file properly? I don't know why it's saying I'm not the owner - if anyone has knowledge on how Drive API determines 'who' is executing the API that would be helpful.

Edit: I'm looking at the files.get method documentation here and it reads the following:

By default, this responds with a Files resource in the response body. If you provide the URL parameter alt=media, then the response includes the file contents in the response body. Downloading content with alt=media only works if the file is stored in Drive. To download Google Docs, Sheets, and Slides use files.export instead. For further information on downloading files, refer to Download files

It seems I need to specify alt=media somehow, but I'm not sure whether that is possible in my situation. Maybe that's what get_media refers to?
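
A minimal sketch of the difference, assuming that the Python client's get_media() is the variant of get() that sends alt=media, and that calling .execute() on it returns the raw file bytes. The helper names fetch_metadata, fetch_content and looks_like_xlsx are hypothetical, and service stands for the object returned by build('drive', 'v3', ...). Because an .xlsx file is a ZIP container, checking for a ZIP archive is a quick way to tell real file content from metadata that was written to disk by mistake, which may explain the "corrupted" file from the plain get() call:

```python
import io
import zipfile


def fetch_metadata(service, file_id):
    # files().get() without alt=media returns the file's metadata as a dict,
    # not the file contents.
    return service.files().get(fileId=file_id, fields='id, name, mimeType').execute()


def fetch_content(service, file_id):
    # Assumption: get_media() is the alt=media counterpart of get(), and
    # execute() on it returns the file contents as bytes.
    return service.files().get_media(fileId=file_id).execute()


def looks_like_xlsx(data):
    # An .xlsx file is a ZIP archive; JSON metadata saved with an .xlsx
    # extension fails this check.
    return isinstance(data, bytes) and zipfile.is_zipfile(io.BytesIO(data))
```

Running looks_like_xlsx on the bytes before writing them out would show whether the request returned a spreadsheet or something else.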

Upvotes: 1

Views: 680

Answers (2)

Paul Holmes

Reputation: 133

Fixed! It was a bug in the Google Drive API: https://issuetracker.google.com/issues/238551542

Upvotes: 3

Linda Lawton - DaImTo

Reputation: 116968

There is an optional parameter that you can send with your files.get request:

acknowledgeAbuse (boolean): Whether the user is acknowledging the risk of downloading known malware or other abusive files. This is only applicable when alt=media. (Default: false)

Try:

request = service.files().get(fileId=fileId,   supportsTeamDrives=True, fields='*', acknowledgeAbuse='true').execute()
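
Since the documentation quoted above says acknowledgeAbuse only applies with alt=media, here is a rough sketch of passing it to get_media() instead, on the assumption that get_media() is how the Python client sends alt=media. It also includes a small helper for reading the reason codes out of an error body. The names download_flagged and error_reasons are hypothetical, and the JSON error layout is an assumption based on the usual Drive error format:

```python
import json


def download_flagged(service, file_id):
    # Assumption: get_media() is the alt=media request, so this is where
    # acknowledgeAbuse is expected to take effect.
    request = service.files().get_media(fileId=file_id, acknowledgeAbuse=True)
    return request.execute()


def error_reasons(error_body):
    # Assumed body layout: {"error": {"errors": [{"reason": "..."}]}}.
    # Returns an empty list for anything that does not match.
    try:
        payload = json.loads(error_body)
    except (TypeError, ValueError):
        return []
    error = payload.get('error') if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return []
    return [e.get('reason') for e in error.get('errors', []) if isinstance(e, dict)]
```

Inside an except HttpError as error block, error_reasons(error.content) could be checked for 'cannotDownloadAbusiveFile' so that flagged files can be logged and skipped rather than aborting the whole run. Note that the question already reports this flag having no effect with get_media(), so this may not help in that case.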

Upvotes: 1
