PMarsh

Reputation: 13

Scraping a public tableau dashboard

This is a follow-up to "How to scrape a public tableau dashboard?" and concerns the impressive TableauScraper library. The library can select an item in a worksheet, but in my case it fails to recognize the requested value.

The tableau dashboard is here: https://tableau.ons.org.br/t/ONS_Publico/views/DemandaMxima/HistricoDemandaMxima?:embed=y&:display_count=y&:showAppBanner=true&:showVizHome=y

And my code is:

from tableauscraper import TableauScraper as TS

url = 'https://tableau.ons.org.br/t/ONS_Publico/views/DemandaMxima/HistricoDemandaMxima'
ts = TS()
ts.loads(url)
wb = ts.getWorkbook()

# Set units
wb.setParameter("Selecione DM Simp 4", "Demanda Máxima Instântanea (MW)")

# Set to daily resolution
wb.setParameter("Escala de Tempo DM Simp 4", "Dia")

# Set the start date
wb.setParameter("Início Primeiro Período DM Simp 4","01/01/2017")

# Set the end date
wb = wb.setParameter("Fim Primeiro Período DM Simp 4","12/31/2017")

# Retrieve daily worksheet
ws = wb.getWorksheet("Simples Demanda Máxima Semana Dia")

# Select subsystem
ws.select("ATRIB(Subsistema)", "Norte")

# (This is where I am warned: "tableauScraper - ERROR - 'Norte' is not in list")

# show data
print(ws.data)

# export data (raw string so the backslashes in the Windows path are not read as escapes)
ws.data.to_csv(r'C:\Temp\Data.csv')

Any help would be appreciated.

Upvotes: 1

Views: 709

Answers (1)

Bertrand Martel

Reputation: 45443

I've updated the library to make it work (I'm the author of the TableauScraper library). There were multiple issues with this use case:

  • the parameters/filters were not persisted between API calls; in this case, setParameter didn't return the initial filters
  • the parameter value for Fim Primeiro Período DM Simp 4 must use the DD/MM/YYYY format, i.e. 31/12/2017
  • a filter API call, not a select API call, must be performed. In this case it's:
wb = ws.setFilter("Subsistema", "N")

Norte is a label, not a filter value; you can get the possible values using ws.getFilters()

  • This dashboard uses storypoints and the filters are embedded in them. Parsing filters inside storypoints was not implemented until now
  • The filter API call specifies the storyboard and the storypointId, both of which had to be implemented for the API call to work (specific to storypoints)
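
The date-format point above can be illustrated with a small conversion helper. This is only a minimal sketch, assuming the input is a US-style MM/DD/YYYY string; to_dashboard_date is a hypothetical name and not part of TableauScraper.

```python
from datetime import datetime


def to_dashboard_date(us_date: str) -> str:
    """Convert a MM/DD/YYYY string to the DD/MM/YYYY form the parameter expects.

    Hypothetical helper for illustration; raises ValueError on malformed input.
    """
    parsed = datetime.strptime(us_date, "%m/%d/%Y")
    return parsed.strftime("%d/%m/%Y")


# The end date from the question, converted to the dashboard's format
end_date = to_dashboard_date("12/31/2017")
print(end_date)
```

The result could then be passed as the value in wb.setParameter("Fim Primeiro Período DM Simp 4", end_date).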

Also, the worksheet used to get/set the filter is Simples Demanda Máxima Ano (even though the filter is set to daily)
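
Because the filter expects a value such as N rather than the label Norte, it can help to check a requested value against the filter list before calling setFilter. The sketch below assumes that ws.getFilters() returns a list of dicts with "column" and "values" keys, so inspect the real output before relying on it. find_filter_values is a hypothetical helper, and the sample values other than "N" are made up for illustration.

```python
def find_filter_values(filters, column):
    """Return the accepted values for `column` from a getFilters()-style list.

    Assumes each entry is a dict with "column" and "values" keys (an assumption
    about the library's output shape, not something confirmed here).
    """
    for entry in filters:
        if entry.get("column") == column:
            return list(entry.get("values", []))
    raise KeyError(f"no filter named {column!r}")


# Hypothetical stand-in for ws.getFilters(); only "N" is confirmed above
sample_filters = [{"column": "Subsistema", "values": ["N", "NE", "S", "SE"]}]

allowed = find_filter_values(sample_filters, "Subsistema")
for requested in ("Norte", "N"):
    status = "ok" if requested in allowed else "not a filter value"
    print(f"{requested}: {status}")
```

Checking first gives a clearer message than the library's "is not in list" error when a label is passed by mistake.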

Using the latest release, the following works:

from tableauscraper import TableauScraper as TS

url = 'https://tableau.ons.org.br/t/ONS_Publico/views/DemandaMxima/HistricoDemandaMxima'
ts = TS()
ts.loads(url)

wb = ts.getWorkbook()

# Set units
wb.setParameter("Selecione DM Simp 4", "Demanda Máxima Instântanea (MW)")

# Set to daily resolution
wb.setParameter("Escala de Tempo DM Simp 4", "Dia")

# Set the start date
wb.setParameter("Início Primeiro Período DM Simp 4", "01/01/2017")

# Set the end date
wb = wb.setParameter("Fim Primeiro Período DM Simp 4", "31/12/2017")

# Retrieve daily worksheet
ws = wb.getWorksheet("Simples Demanda Máxima Semana Dia")

print(ws.data[['Data Escala de Tempo 1 DM Simp 4-value',
               'SOMA(Selecione Tipo de DM Simp 4)-value', 'ATRIB(Subsistema)-alias']])

ws = wb.getWorksheet("Simples Demanda Máxima Ano")
print(ws.getFilters())

# Select subsystem
wb = ws.setFilter("Subsistema", "N")
ws = wb.getWorksheet("Simples Demanda Máxima Semana Dia")

print(ws.data[['Data Escala de Tempo 1 DM Simp 4-value',
               'SOMA(Selecione Tipo de DM Simp 4)-value', 'ATRIB(Subsistema)-alias']])

repl.it: https://replit.com/@bertrandmartel/TableauONSDemandaMaxima

Upvotes: 1
