Rbe

Reputation: 11

Parse Table data from a public google doc using Python

I have a URL to a public google doc which is published (It says published using Google Docs at the top). It has a URL in the form of https://docs.google.com/document/d/e/<Some long random string, I think the ID of the document>/pub

Please note that this is not a spreadsheet (Google sheet), but a doc. This doc contains some explanatory text at the beginning and then a table I need to read. How do I accomplish this using Python and only the URL? I don't have much knowledge of Google APIs, etc. I don't want the text at the beginning, but only the table data in some popular format like a Pandas dataframe, etc. The table data could also contain Unicode characters.

I tried following the steps in the Docs API quickstart guide (https://developers.google.com/docs/api/quickstart/python). After I followed the instructions, the sample code (copy-pasted as-is) worked, though it involved several steps: creating a new Google Cloud project, enabling the API, configuring the OAuth consent screen, and then authorizing credentials for a desktop application. However, when I replaced the example document ID (the string inside the quotes

DOCUMENT_ID = "195j9eDD3ccgjQRttHhJPymLJUCOUjs-jmwTrekvdjFE")

with the ID of the document I need to access, I got this error:

<HttpError 404 when requesting https://docs.googleapis.com/v1/documents/<MY_GIVEN_DOCUMENT_ID>?alt=json returned "Requested entity was not found.". Details: "Requested entity was not found.">

I just want a simple solution which uses only the published doc's URL, since the doc is already public. I don't want to go through some authentication steps. I need that even if I send the code to someone else, they can also run the same code and get the same results without any authentication issues. Please help me with this.

Upvotes: 1

Views: 1211

Answers (1)

Sam

Reputation: 33

I ran into this exact same problem. I'm going to guess you and I were probably doing the same application challenge!

Using requests, I was able to pull down the raw HTML of the published page, and with BeautifulSoup I turned it into a workable, parse-able object:

import requests
from bs4 import BeautifulSoup

def fetch_first_table(url):
    # Make the request
    html_response = requests.get(url=url)

    # Parse the HTML into a BeautifulSoup object
    soup = BeautifulSoup(html_response.text, 'html.parser')

    # Collect and return the first table (assuming the first table is what you want)
    return soup.find('table')

From there, you can parse the table more precisely to pull out the data you want: iterating over the table's tr and td tags is usually all it takes.
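Since the question asks for the table data as a pandas DataFrame, here is a minimal sketch of that last step. The HTML string below is a stand-in for the published doc's markup (the real page would come from requests as above), and it includes a Unicode cell since the question mentions Unicode characters:

```python
from bs4 import BeautifulSoup
import pandas as pd

# Stand-in HTML; in practice this would be the <table> found in the
# published doc's page, as returned by soup.find('table') above.
html = """
<table>
  <tr><td>x</td><td>y</td><td>char</td></tr>
  <tr><td>0</td><td>0</td><td>█</td></tr>
  <tr><td>1</td><td>0</td><td>░</td></tr>
</table>
"""

table = BeautifulSoup(html, "html.parser").find("table")

# Published Google Docs tables often put the header in a first row of
# plain <td> cells rather than <th>, so read every row the same way.
rows = [
    [cell.get_text(strip=True) for cell in tr.find_all(["td", "th"])]
    for tr in table.find_all("tr")
]

# Treat the first row as the header and the rest as data.
df = pd.DataFrame(rows[1:], columns=rows[0])
print(df)
```

Reading every row uniformly and splitting the header off afterwards keeps the code robust whether the doc's table uses th cells or not.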

I'm refraining from copy-pasting my exact solution because I know others will use this to fill out the same job application challenge, but this gets you everything you need as long as you have a Python foundation.

Upvotes: 0
