LucSpan
LucSpan

Reputation: 1971

Get all records from 2 columns, starting from specific row

Set-up

Via gspread I have access to a Google sheet containing data.

Normally, I use df = pd.DataFrame(wsheet.get_all_records()) to dump all data into a pandas dataframe.


Issue

I only need the data of 5 specific sequential columns, i.e. all columns including and between for example column 1 and 5 of the Google sheet.

Moreover, I only need the data starting from the 5th row in the Google sheet.

I've tried my regular df = pd.DataFrame(wsheet.get_all_records()) and then drop columns and rows in pandas. However, I think due to the markup I use in the first 4 rows in the Google sheet, the resulting dataframe has some oddities – adjusting in pandas gives strange results.


Question

Given the markup, I suspect it's easier to just dump all data incl. and between column 1 and 5 in a dataframe, starting from row 5.

But how do I do this?

Upvotes: 3

Views: 3310

Answers (1)

Tanaike
Tanaike

Reputation: 201553

  • You want to retrieve the values from the columns "A" and "E" after the row 5 from the Google Spreadsheet.
  • You want to achieve this using gspread with python.
  • You have already been able to get and put values for Spreadsheet using Sheets API.

Modification points:

  • In this modification, at first, the values are retrieved with get_all_values() as a slice. And the retrieved values are processed and convert it to the dataframe.

Modified script:

When your script is modified, it becomes as follows. In this case, it supposes that wsheet can be used.

From:

df = pd.DataFrame(wsheet.get_all_records())

To:

v = [[e[0], e[4]] for e in wsheet.get_all_values()]
df = pd.DataFrame(v[4:], columns=v[0])
  • In this case, df is the values retrieved the columns "A" and "E" after the row 5.

Reference:

Added:

If you want to retrieve the values from the columns "A" to "E" after the row 5 from the Google Spreadsheet, how about the following modification?

From:

df = pd.DataFrame(wsheet.get_all_records())

To:

v = [e[0:5] for e in wsheet.get_all_values()]
df = pd.DataFrame(v[4:], columns=v[0])

Upvotes: 2

Related Questions