Vivian Mascarenhas

Reputation: 183

Scraping data from Tableau

I need to scrape data from this Tableau workbook into a CSV file: https://public.tableau.com/views/2020_04_06_COVID19_India/Dashboard_India_Cases?:embed=y&:showVizHome=no&:host_url=https%3A%2F%2Fpublic.tableau.com%2F&:embed_code_version=3&:tabs=no&:toolbar=yes&:animate_transition=yes&:display_static_image=no&:display_spinner=no&:display_overlay=yes&:display_count=yes&publish=yes&:loadOrderID=0

I have tried the following, but I am getting no output.

main.py

import requests
from bs4 import BeautifulSoup

r = requests.get("https://public.tableau.com/views/2020_04_06_COVID19_India/Dashboard_India_Cases?:embed=y&:showVizHome=no&:host_url=https%3A%2F%2Fpublic.tableau.com%2F&:embed_code_version=3&:tabs=no&:toolbar=yes&:animate_transition=yes&:display_static_image=no&:display_spinner=no&:display_overlay=yes&:display_count=yes&publish=yes&:loadOrderID=0")
soup = BeautifulSoup(r.content, "html.parser")

# Print the first cell of every row in every HTML table on the page
for table in soup.find_all("table"):
    for row in table.find_all("tr"):
        print(row.find("td"))

Upvotes: 1

Views: 2193

Answers (1)

Bertrand Martel

Reputation: 45443

I've made this Python Tableau scraper library, which lists the worksheets of a dashboard and exports the data of each worksheet into a pandas DataFrame. For example, the following gets the table you're looking for:

from tableauscraper import TableauScraper as TS

url = "https://public.tableau.com/views/2020_04_06_COVID19_India/Dashboard_India_Cases"

ts = TS()
ts.loads(url)
dashboard = ts.getDashboard()

for t in dashboard.worksheets:
    # show worksheet name
    print(f"WORKSHEET NAME : {t.name}")
    # show dataframe for this worksheet
    print(t.data)

run this code on repl.it
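Since the question asks for a CSV file, the export step could be sketched roughly as follows. This is only a minimal sketch: it relies on each worksheet's data being a pandas DataFrame (as described above), and the helper names safe_filename, save_worksheets_to_csv and export_dashboard are hypothetical, not part of the library.

```python
import re
from pathlib import Path


def safe_filename(name):
    """Turn a worksheet name into a filesystem-friendly file stem."""
    stem = re.sub(r"[^A-Za-z0-9._-]+", "_", name).strip("_")
    return stem or "worksheet"


def save_worksheets_to_csv(worksheets, out_dir="."):
    """Write each worksheet's data to <out_dir>/<name>.csv and return the paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for ws in worksheets:
        target = out_path / f"{safe_filename(ws.name)}.csv"
        # ws.data is assumed to be a pandas DataFrame, so to_csv does the export
        ws.data.to_csv(target, index=False)
        written.append(target)
    return written


def export_dashboard(url, out_dir="."):
    """Load a Tableau Public view and export all of its worksheets (hypothetical helper)."""
    # Imported lazily so the helpers above work without the library installed
    from tableauscraper import TableauScraper as TS

    ts = TS()
    ts.loads(url)
    return save_worksheets_to_csv(ts.getDashboard().worksheets, out_dir)
```

Calling export_dashboard(url, "csv_out") with the url from the snippet above should then leave one CSV file per worksheet in the csv_out folder. Note that two worksheets whose names map to the same file stem would overwrite each other in this sketch.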

Upvotes: 2
