Rakesh Agarwal

Reputation: 31

Creation of Tableau Dashboard using Impala as Datasource

I have an Impala table that contains a large number of records (39,885,593), and I need to build a dashboard on top of it in Tableau.

I tried to achieve this in several ways:

1) Extract the data from the Impala table into a Tableau extract and then create the dashboard.
2) Use the data extract initially and then switch to a live connection.
3) Use a live connection from the start.

Approach 1: I was able to create the dashboard with a data extract, and performance is good. The problem with this approach is that the data is transactional and grows every day, so the extract will take up more and more space on Tableau Server.

Approach 2: With this approach I can design the dashboard efficiently. However, when I switch the connection from the extract to live and publish the dashboard, publishing takes a long time, and the dashboard is also slow to open in the browser via Tableau Server.

Approach 3: A live connection is very slow both while designing and while publishing the dashboard.

If anyone has come across this kind of requirement, could you please suggest an approach?

Thanks

Upvotes: 2

Views: 726

Answers (2)

Alex Blakemore

Reputation: 11921

Unless you need up-to-the-minute live access to millions of transaction records, I recommend working with extracts (possibly multiple extracts).

But reduce the size of your extracts to the minimum needed to support your visualization. You can add data source filters, hide unused fields, and roll up the data so the extract is aggregated to just the level of detail needed for your view.

For large data sets, don't try to make a single extract that is just a copy of your entire data set. Instead, make several smaller ones that each support just the information needed for one view (or a small set of related views). Think of an extract like a materialized view.

If a view only displays 100 marks, then strive to have only 100 records in the extract that it uses, even if those 100 records summarize info from 100 million in the underlying data source.
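To make the rollup idea concrete, here is a minimal sketch in plain Python of collapsing row-level transactions to one record per mark. The field names (`day`, `region`, `amount`) and the sample rows are made up for illustration. In practice Tableau or the database would perform this aggregation, not a Python script.

```python
from collections import defaultdict

def rollup(transactions):
    """Collapse row-level transactions to one row per (day, region)."""
    totals = defaultdict(lambda: {"amount": 0.0, "txn_count": 0})
    for txn in transactions:
        key = (txn["day"], txn["region"])
        totals[key]["amount"] += txn["amount"]
        totals[key]["txn_count"] += 1
    # One output record per distinct (day, region) pair, in sorted order.
    return [
        {"day": day, "region": region, **agg}
        for (day, region), agg in sorted(totals.items())
    ]

# Hypothetical sample data: four detail rows that map to three marks.
transactions = [
    {"day": "2016-01-01", "region": "north", "amount": 10.0},
    {"day": "2016-01-01", "region": "north", "amount": 5.0},
    {"day": "2016-01-01", "region": "south", "amount": 7.0},
    {"day": "2016-01-02", "region": "north", "amount": 3.0},
]

summary = rollup(transactions)
for row in summary:
    print(row)
```

The size of the result depends on the number of distinct (day, region) combinations rather than on the number of transactions, which is why an extract built at this level can stay small even as the source table grows.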

Then you can have a larger extract or even a live source for people to use when drilling down into a (filtered) detail view, and the first views of your dashboard can launch quickly.

This way interactivity, refreshes and publishing can be fast.

For this approach to work, you may need to get used to having multiple data sources in your workbook, even if they are based on the same database, and to using filter actions, parameters and calculated fields to filter and link across data sources.

Upvotes: 1

maxymoo

Reputation: 36555

You say that a live connection gives slow performance. Maybe you could try aggregating the data in Impala with a Custom SQL query before bringing it into Tableau?
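As a rough sketch of what such a query might look like, the Python snippet below runs a GROUP BY aggregation against an in-memory SQLite table. SQLite is only a stand-in so the example runs on its own, and the table and column names (`transactions`, `txn_date`, `region`, `amount`) are assumptions, not taken from the question. The statement in `AGG_SQL` uses only basic SQL, so something similar should be usable as Custom SQL against Impala, though that is untested here.

```python
import sqlite3

# Hypothetical aggregation query; this is the part that would be pasted
# into Tableau's Custom SQL dialog (adapted to the real table and columns).
AGG_SQL = """
SELECT txn_date,
       region,
       SUM(amount) AS total_amount,
       COUNT(*)    AS txn_count
FROM transactions
GROUP BY txn_date, region
"""

# SQLite stand-in for the Impala table, filled with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (txn_date TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("2016-01-01", "north", 10.0),
        ("2016-01-01", "north", 5.0),
        ("2016-01-01", "south", 7.0),
        ("2016-01-02", "north", 3.0),
    ],
)

# ORDER BY is appended here only to make the printed output deterministic.
rows = conn.execute(AGG_SQL + " ORDER BY txn_date, region").fetchall()
for row in rows:
    print(row)
```

With a query like this, Tableau would receive one row per date and region instead of every transaction, so a live connection has far less data to move.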

Upvotes: 0
