NxC

Reputation: 330

Tableau limited Data Extract slow connection

I am designing a visualization in Tableau. My data is in Hive/Hadoop, and the data set is huge.

While I am designing the visualization, queries run very slowly, because every change pulls data from Hadoop again.

A simple drag and drop typically takes 4 minutes, and a visualization may need tens of them, so I end up spending a lot of time waiting. I tried the Data Extract option, but the extract takes forever (38 minutes and it was still running).

Question: is there a way to extract only 1000 records, so I can build the viz against those 1000 records and then switch to a Live connection once the design is done? I looked through the Tableau community help, but so far no luck.

Upvotes: 0

Views: 713

Answers (2)

NxC

Reputation: 330

I copied all the data into Excel, connected Tableau to the Excel file, and got my dashboard done within a few minutes. Since the Excel file and Hive had exactly the same fields, I could replace the Excel connection with Hive and it just worked. Tableau complains about the calculated fields on some of the sheets, but I guess I can redo that part against Hive to get around it.

Upvotes: 1

Stephen ODonnell

Reputation: 4466

One option may be to turn off auto-update so it does not reload data each time you drag and drop:

https://onlinehelp.tableau.com/current/pro/desktop/en-us/queries_autoupdates.html

Another thing you could try is the following. In Hive, create a smaller version of the table with only a few thousand rows. Then create a view over that small table and point Tableau at the view. Design your viz against the view, and when you are done, recreate the view in Hive so it points at the real table. This may help, but if Hive kicks off a MapReduce job for each drag and drop, it is still going to be frustratingly slow.
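A rough sketch of that view swap, written as a small Python helper that only builds the HiveQL statements (it does not connect to Hive). The table and view names (`sales`, `sales_sample`, `sales_for_tableau`) and the helper functions themselves are made up for illustration, and the row limit is arbitrary, so adapt them to your own schema.

```python
# Hypothetical helpers: they only assemble HiveQL strings, nothing is executed.
# All table/view names used below are placeholders, not from the question.

def design_phase_statements(real_table, sample_table, view_name, n_rows=1000):
    """HiveQL to build a small sample table and create the view over it."""
    return [
        # Copy a limited number of rows into a small table.
        f"CREATE TABLE {sample_table} AS SELECT * FROM {real_table} LIMIT {int(n_rows)}",
        # Make sure no older version of the view is in the way.
        f"DROP VIEW IF EXISTS {view_name}",
        # Tableau connects to this view while the viz is being designed.
        f"CREATE VIEW {view_name} AS SELECT * FROM {sample_table}",
    ]


def go_live_statements(real_table, view_name):
    """HiveQL to recreate the same view over the full table afterwards."""
    return [
        f"DROP VIEW IF EXISTS {view_name}",
        f"CREATE VIEW {view_name} AS SELECT * FROM {real_table}",
    ]


if __name__ == "__main__":
    for stmt in design_phase_statements("sales", "sales_sample", "sales_for_tableau"):
        print(stmt + ";")
    for stmt in go_live_statements("sales", "sales_for_tableau"):
        print(stmt + ";")
```

Because Tableau only ever sees the view name, the workbook should not need to change when the view is recreated over the real table, as long as the columns stay the same.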

In my experience with Tableau, you want to get your dataset down to a size where you can use the extract option. Any interactive dashboard that has a live connection to Hive is going to be slow. However, if you can aggregate the dataset down to a manageable size, an extract can work very well. I don't work with Tableau any more, but in the past I had extracts that took 30 to 60 minutes to refresh and loaded low millions of rows, and they worked well.

Upvotes: 0
