ankit7540
ankit7540

Reputation: 315

Getting public Google sheets as panda dataframe

I have been trying to obtain the Statewise sheet from this public googlesheet link as a python dataframe.

The URL of this sheet is different from the URL of other examples for achieving the goal of getting sheet as dataframe, seen on this website.

URL is this : https://docs.google.com/spreadsheets/d/e/2PACX-1vSc_2y5N0I67wDU38DjDh35IZSIS30rQf7_NYZhtYYGU1jJYT6_kDx4YpF-qw0LSlGsBYP8pqM_a1Pd/pubhtml#

One standard way maybe the following,

import pandas

googleSheetId = '<Google Sheets Id>'
worksheetName = '<Sheet Name>'
URL = 'https://docs.google.com/spreadsheets/d/{0}/gviz/tq?tqx=out:csv&sheet={1}'.format(
    googleSheetId,
    worksheetName
)

df = pandas.read_csv(URL)
print(df)

But in the present URL I do not see a structure used here. Can someone help to clarify. Thanks.

Upvotes: 0

Views: 1272

Answers (2)

Serge Ballesta
Serge Ballesta

Reputation: 149185

The Google spreadsheet is actually an html thing. So you should use read_html to load it into a list of pandas dataframes:

dfs = pd.read_html(url, encoding='utf8')

if lxml is available or, if you use BeautifulSoup4:

dfs = pd.read_html(url, flavor='bs4', encoding='utf8')

You will get a list of dataframes and for example dfs[0] is:

     0   1                                                  2                3
0    1  id                                             Banner  Number_Of_Times
1    2   1  Don't Hoard groceries and essentials. Please e...                2
2    3   2  Be compassionate! Help those in need like the ...                2
3    4   3  Be considerate : While buying essentials remem...                2
4    5   4  Going out to buy essentials? Social Distancing...                2
5    6   5  Plan ahead! Take a minute and check how much y...                2
6    7   6  Plan and calculate your essential needs for th...                2
7    8   7  Help out the elderly by bringing them their gr...                2
8    9   8  Help out your workers and domestic help by not...                2
9   10   9  Lockdown means LOCKDOWN! Avoid going out unles...                1
10  11  10           Panic mode : OFF! ❌ESSENTIALS ARE ON! ✔️                1
11  12  11  Do not panic! ❌ Your essential needs will be t...                1
12  13  12  Be a true Indian. Show compassion. Be consider...                1
13  14  13  If you have symptoms and suspect you have coro...                1
14  15  14  Stand Against FAKE News and WhatsApp Forwards!...                1
15  16  15  If you have any queries, Reach out to your dis...                1

Upvotes: 2

J-H
J-H

Reputation: 1869

You can use the following snippet:

from io import BytesIO
import requests

r = requests.get(URL)
data = r.content

df = pd.read_csv(BytesIO(data), index_col=0, error_bad_lines=False)

Upvotes: 0

Related Questions