idanshmu

Reputation: 5261

How to import a CSV file using Google Sheets API V4

Background

I'm developing a Python 2.7 script that analyzes data from an SQL table and, at the end, generates a CSV file.

Once the file is generated, I log into my Google Sheets account and use the import option to import my CSV file into the Google spreadsheet.

The manual labor is kinda stupid and I wish to add this ability to my script.

Google Sheets API V4

So, I followed this guide, Python Quickstart and was able to complete all the steps.

Then I followed the Google Sheets API reference and looked into Method: spreadsheets.create. If I understand correctly, it does not provide an option to import from a file.

It seems like there is no API for the import functionality.

Question

How to import a CSV file using Google Sheets API V4? Is there an example/reference that I'm missing?

Upvotes: 21

Views: 47050

Answers (6)

wescpy

Reputation: 11167

2024 answer: The Sheets API is primarily for document-oriented functionality, like data entry, applying numeric formulae, adding charts, creating pivot tables, cell formatting, resizing rows/columns, etc.

However, for file-level access such as uploading/downloading, importing/exporting, copying, moving, renaming, sharing, etc., developers are more likely to use the Drive API instead.

(Yes, you can still create a blank Sheet using the Sheets API as well as populate it, but that wasn't the OP's question, which is about importing CSV files, presumably in their entirety, not one row or cell at a time.)

There are 2 ways to bring a CSV file into Drive using the Drive API:

  1. Upload CSV as-is to Drive (do not convert for Sheets):
FILENAME = 'inventory.csv'
METADATA = {'name': FILENAME}
rsp = DRIVE.files().create(body=METADATA, media_body=FILENAME).execute()
if rsp:
    print('Uploaded %r to Drive (file ID: %s)' % (FILENAME, rsp['id']))
  2. Import CSV to Drive for Sheets (do conversion):
DST_FILENAME = 'inventory'
SRC_FILENAME = DST_FILENAME + '.csv'
SHT_MIMETYPE = 'application/vnd.google-apps.spreadsheet'
METADATA = {'name': DST_FILENAME, 'mimeType': SHT_MIMETYPE}
rsp = DRIVE.files().create(
        body=METADATA, media_body=SRC_FILENAME).execute()
if rsp:
    print('Imported %r as Sheets to %r (file ID: %s)' % (
            SRC_FILENAME, DST_FILENAME, rsp['id']))

The key difference between the two is whether you want to convert for Sheets. If you do, add the Sheets MIME type to the METADATA. Otherwise it'll just do a straight upload as CSV.
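
Since the two snippets differ only in the metadata, they can be folded into one helper. What follows is a minimal sketch, not code from the blog post: `build_metadata`, `upload_csv`, and the `convert` flag are hypothetical names, and `drive` is assumed to be a Drive API v3 service object like the `DRIVE` used above.

```python
import os

SHEETS_MIMETYPE = 'application/vnd.google-apps.spreadsheet'


def build_metadata(src_path, convert):
    """Return Drive file metadata for a plain upload or a Sheets import."""
    filename = os.path.basename(src_path)
    if not convert:
        # Straight upload: keep the original name, extension included.
        return {'name': filename}
    # Import: drop the extension and ask Drive to convert to a spreadsheet.
    return {'name': os.path.splitext(filename)[0], 'mimeType': SHEETS_MIMETYPE}


def upload_csv(drive, src_path, convert=True):
    """Upload src_path through a Drive v3 service object; return the file ID.

    `drive` is assumed to be built elsewhere, for example with
    googleapiclient.discovery.build('drive', 'v3', credentials=...).
    """
    rsp = drive.files().create(
        body=build_metadata(src_path, convert),
        media_body=src_path,
        fields='id',
    ).execute()
    return rsp['id']
```

Calling `upload_csv(DRIVE, 'inventory.csv')` would then import the file as a spreadsheet named `inventory`, while `convert=False` leaves it in Drive as a plain CSV.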

Full versions of both are available in my blog post (and linked to the GitHub repo), and all code is Python 2-3 compatible. There are versions that use the older/deprecated auth library (oauth2client) as well as the newer/current one (google.auth) in case you prefer one over the other. If I'm feeling frisky, one day I'd like to add equivalent samples that use service account auth. (There are also equivalent versions in Node.js if you don't do Python.)

If you're new to the Drive API, I created a few more examples of how to use it:

(*) - TL;DR: upload plain text file to Drive, import/convert to Google Docs format, then export that Doc as PDF. Post above uses Drive API v2; this follow-up post describes migrating it to Drive API v3, and here's a developer video combining both "poor man's converter" posts.

Upvotes: 0

Ufos

Reputation: 3305

I've spent a couple of hours trying to make any of the other answers work. The libraries do not explain authentication well and don't work with the Google-provided way of handling credentials. On the other hand, Sam's answer doesn't elaborate on the details of using the API, which can be confusing at times. So here is a full recipe for uploading CSVs to gSheets. It uses both Sam's and CapoChino's answers plus some of my own research.

  1. Authenticate/Setup. Generally, refer to the docs
    • Big blue button will get you credentials.json with no extra steps
    • quickstart.py can easily be adapted into authenticate.py
    • scopes should contain https://www.googleapis.com/auth/spreadsheets

Hopefully by now you have your credentials stored, so let's move on to the actual code.

  2. Recipe that should work out of the box:
import pickle
from googleapiclient.discovery import build

SPREADSHEET_ID = '1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms' # Get this one from the link in browser
worksheet_name = 'Sheet2'
path_to_csv = 'New Folder/much_data.csv'
path_to_credentials = 'Credentials/token.pickle'


# convenience routines
def find_sheet_id_by_name(sheet_name):
    # ugly, but works
    sheets_with_properties = API \
        .spreadsheets() \
        .get(spreadsheetId=SPREADSHEET_ID, fields='sheets.properties') \
        .execute() \
        .get('sheets')

    for sheet in sheets_with_properties:
        if 'title' in sheet['properties'].keys():
            if sheet['properties']['title'] == sheet_name:
                return sheet['properties']['sheetId']


def push_csv_to_gsheet(csv_path, sheet_id):
    with open(csv_path, 'r') as csv_file:
        csvContents = csv_file.read()
    body = {
        'requests': [{
            'pasteData': {
                "coordinate": {
                    "sheetId": sheet_id,
                    "rowIndex": 0,  # zero-based integer; adapt this if you need different positioning
                    "columnIndex": 0,  # zero-based integer; adapt this if you need different positioning
                },
                "data": csvContents,
                "type": 'PASTE_NORMAL',
                "delimiter": ',',
            }
        }]
    }
    request = API.spreadsheets().batchUpdate(spreadsheetId=SPREADSHEET_ID, body=body)
    response = request.execute()
    return response


# upload
with open(path_to_credentials, 'rb') as token:
    credentials = pickle.load(token)

API = build('sheets', 'v4', credentials=credentials)

push_csv_to_gsheet(
    csv_path=path_to_csv,
    sheet_id=find_sheet_id_by_name(worksheet_name)
)

A good thing about using batchUpdate directly is that it uploads thousands of rows in about a second. At a low level, gspread does the same and should be just as performant. There is also gspread-pandas.

p.s. the code is tested with python 3.5, but this thread seemed to be most appropriate to submit it to.
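
The hard-coded `rowIndex` and `columnIndex` in the recipe can also be derived from an A1-style cell reference. Below is a rough sketch for Python 3 using only the standard library. `a1_to_zero_based` and `build_paste_request` are illustrative helper names, not part of the Sheets API or of the recipe above, and the parser only handles plain references such as `A1` or `AB12` (no sheet name, no `$` markers).

```python
import re


def a1_to_zero_based(cell):
    """Convert an A1 reference such as 'C3' to zero-based (row, column)."""
    match = re.fullmatch(r'([A-Za-z]+)([1-9][0-9]*)', cell)
    if match is None:
        raise ValueError('not a plain A1 reference: %r' % cell)
    letters, digits = match.groups()
    column = 0
    for letter in letters.upper():
        # Column letters form a base-26 number with digits A=1 .. Z=26.
        column = column * 26 + (ord(letter) - ord('A') + 1)
    return int(digits) - 1, column - 1


def build_paste_request(sheet_id, csv_text, cell='A1', delimiter=','):
    """Build a batchUpdate body holding a single pasteData request."""
    row, column = a1_to_zero_based(cell)
    return {
        'requests': [{
            'pasteData': {
                'coordinate': {
                    'sheetId': sheet_id,
                    'rowIndex': row,
                    'columnIndex': column,
                },
                'data': csv_text,
                'type': 'PASTE_NORMAL',
                'delimiter': delimiter,
            }
        }]
    }
```

The returned dict could be passed as `body` to `API.spreadsheets().batchUpdate(...)` in place of the hand-written one in `push_csv_to_gsheet`.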

Upvotes: 22

CapoChino

Reputation: 79

I like Burnash's gspread library, but the import_csv function in his answer is limited. It always starts the paste at A1 of the first worksheet (tab) and deletes all other tabs.

I needed to paste starting at a particular tab and cell, so I took Sam Berlin's suggestion to use a PasteDataRequest. Here's my function:

import gspread


def pasteCsv(csvFile, sheet, cell):
    '''
    csvFile - path to csv file to upload
    sheet - a gspread.Spreadsheet object
    cell - string giving starting cell, optionally including sheet/tab name
      ex: 'A1', 'MySheet!C3', etc.
    '''
    if '!' in cell:
        (tabName, cell) = cell.split('!')
        wks = sheet.worksheet(tabName)
    else:
        wks = sheet.sheet1
    (firstRow, firstColumn) = gspread.utils.a1_to_rowcol(cell)

    with open(csvFile, 'r') as f:
        csvContents = f.read()
    body = {
        'requests': [{
            'pasteData': {
                "coordinate": {
                    "sheetId": wks.id,
                    "rowIndex": firstRow-1,
                    "columnIndex": firstColumn-1,
                },
                "data": csvContents,
                "type": 'PASTE_NORMAL',
                "delimiter": ',',
            }
        }]
    }
    return sheet.batch_update(body)

Note that I used a raw pasteData request rather than the higher-level update_cells method to take advantage of Google's automatic (correct) handling of input data that contains quoted strings, which may contain non-delimiter commas.
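
To see why the quoting matters, here is a small standard-library illustration for Python 3 (the sample row is made up). Splitting on commas breaks a quoted field apart, while a CSV-aware parser keeps it whole. The pasteData request sidesteps the problem by sending the raw text and leaving the parsing to Google.

```python
import csv
import io

csv_text = 'name,address\n"Smith, John","12 Main St, Apt 4"\n'

# Naive approach: every comma is treated as a delimiter.
naive_rows = [line.split(',') for line in csv_text.splitlines()]

# CSV-aware approach: commas inside double quotes stay part of the field.
parsed_rows = list(csv.reader(io.StringIO(csv_text)))

print(naive_rows[1])   # four fragments, with stray quote characters
print(parsed_rows[1])  # two fields, quotes removed
```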

Upvotes: 5

Burnash

Reputation: 3321

Another alternative to Sam Berlin's answer. If you're using Python, you can use the Drive API via gspread to import a CSV file. Here's an example:

import gspread

# Check how to get `credentials`:
# https://github.com/burnash/gspread

gc = gspread.authorize(credentials)

# Read CSV file contents (the with-block closes the file afterwards)
with open('file_to_import.csv', 'r') as f:
    content = f.read()

gc.import_csv('<SPREADSHEET_ID>', content)

Related question: Upload CSV to Google Sheets using gspread

Upvotes: 7

Jordie Bellé

Reputation: 81

As an alternative to Sam Berlin's answer, you can turn your CSV into a list of lists and use that as your POST payload.

Such a function looks something like this:

import csv


def preprocess(table):
    table.to_csv('pivoted.csv')  # I use Pandas but use whatever you'd like
    # csv.reader keeps quoted fields that contain commas in one piece,
    # and the with-block closes the file when done
    with open('pivoted.csv') as _file:
        master_array = [row for row in csv.reader(_file)]
    return master_array

That master array gets thrown into the following:

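body = {
    'values': newValues  # presumably the list of lists returned by preprocess()
}

# spreadsheetId, rangeName, values and start are defined elsewhere in my script
result2 = service.spreadsheets().values().update(
    spreadsheetId=spreadsheetId,
    range=rangeName + str(len(values) + start + 1),
    valueInputOption="USER_ENTERED",
    body=body).execute()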

It works just fine for me.

Upvotes: 0

Sam Berlin

Reputation: 3773

You have two options for importing a CSV file. You can use the Drive API to create a spreadsheet from a CSV, or you can use the Sheets API to create an empty spreadsheet and then use spreadsheets.batchUpdate with a PasteDataRequest to add CSV data.
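
The second option can be sketched as two calls, a spreadsheets.create followed by a spreadsheets.batchUpdate. This is an illustration under assumptions rather than a tested recipe: `create_and_paste` is a made-up name, `sheets` is assumed to be a Sheets API v4 service object (for example from `googleapiclient.discovery.build('sheets', 'v4', credentials=...)`), and the CSV is pasted at the top-left cell of the first tab.

```python
def create_and_paste(sheets, title, csv_text):
    """Create an empty spreadsheet, paste CSV text into it, return its ID."""
    created = sheets.spreadsheets().create(
        body={'properties': {'title': title}},
    ).execute()
    spreadsheet_id = created['spreadsheetId']
    # A new spreadsheet comes with one tab; paste into that one.
    first_sheet_id = created['sheets'][0]['properties']['sheetId']

    body = {
        'requests': [{
            'pasteData': {
                'coordinate': {
                    'sheetId': first_sheet_id,
                    'rowIndex': 0,
                    'columnIndex': 0,
                },
                'data': csv_text,
                'type': 'PASTE_NORMAL',
                'delimiter': ',',
            }
        }]
    }
    sheets.spreadsheets().batchUpdate(
        spreadsheetId=spreadsheet_id, body=body,
    ).execute()
    return spreadsheet_id
```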

Upvotes: 25
