Reputation: 1971
Set-up
Via gspread I have access to a Google sheet containing data.
Normally, I use df = pd.DataFrame(wsheet.get_all_records())
to dump all data into a pandas dataframe.
Issue
I only need the data of 5 specific sequential columns, i.e. all columns including and between for example column 1 and 5 of the Google sheet.
Moreover, I only need the data starting from the 5th row in the Google sheet.
I've tried my regular df = pd.DataFrame(wsheet.get_all_records())
and then drop columns and rows in pandas. However, I think due to the markup I use in the first 4 rows in the Google sheet, the resulting dataframe has some oddities – adjusting in pandas gives strange results.
Question
Given the markup, I suspect it's easier to just dump all data incl. and between column 1 and 5 in a dataframe, starting from row 5.
But how do I do this?
Upvotes: 3
Views: 3310
Reputation: 201553
get_all_values()
as a slice. And the retrieved values are processed and convert it to the dataframe.When your script is modified, it becomes as follows. In this case, it supposes that wsheet
can be used.
df = pd.DataFrame(wsheet.get_all_records())
v = [[e[0], e[4]] for e in wsheet.get_all_values()]
df = pd.DataFrame(v[4:], columns=v[0])
df
is the values retrieved the columns "A" and "E" after the row 5.If you want to retrieve the values from the columns "A" to "E" after the row 5 from the Google Spreadsheet, how about the following modification?
df = pd.DataFrame(wsheet.get_all_records())
v = [e[0:5] for e in wsheet.get_all_values()]
df = pd.DataFrame(v[4:], columns=v[0])
Upvotes: 2