Reputation:
According to Tableau, there is a way to optimize R querying by addressing the partitioning of data: http://kb.tableau.com/articles/knowledgebase/r-implementation-notes
The solution is not clear to me. Does anyone know of an example of this? I would love to see how it works.
Upvotes: 1
Views: 441
Reputation: 21641
The recommendation is to pass values to R as a vector (a whole column or row in Tableau) instead of one cell at a time, in order to reduce the number of Rserve calls. If your table calculation in Tableau is set to compute along cells, each cell becomes its own partition, so a calculation that should apply to a whole column ends up calling Rserve once for every cell.
Here's what is happening (from the official documentation):
If your table calculations are set to Cell, Tableau makes one call to Rserve per partition, and with Cell every cell is its own partition:
Cell
This option sets the addressing to the individual cells in the table. All fields become partitioning fields. This option is generally most useful when computing a percent of total calculation.
What you want instead is a single call for a whole row / column:
Optimizing R scripts
SCRIPT_ functions in Tableau are table calculations functions, so addressing and partitioning concepts apply. Tableau makes one call to Rserve per partition. Because connecting to Rserve involves some overhead, try to pass values as vectors rather than as individual values whenever possible. For example if you set addressing to Cell (that is, Set Calculate the difference along in the Table Calculation dialog box to Cell), Tableau will make a separate call per row to Rserve; depending on the size of the data, this can result in a very large number of individual Rserve calls. If you instead use a column that identifies each row that you would use in level of detail, you could "compute along" that column so that Tableau could pass those values in a single call.
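To make the difference concrete, here is a minimal Python sketch that only simulates the call counting. It is not Tableau or Rserve code: `FakeRserve`, `run_table_calc`, and the toy region/month/sales data are all made up for illustration, and the "R script" is just a doubling of the values. The one thing it takes from the documentation is the rule that there is one call per partition.

```python
from itertools import groupby


class FakeRserve:
    """Hypothetical stand-in for Rserve: counts round trips, doubles values."""

    def __init__(self):
        self.calls = 0

    def evaluate(self, values):
        self.calls += 1  # one round trip, however long the vector is
        return [v * 2 for v in values]


def run_table_calc(rows, partition_key, server):
    """Send one call per partition, mirroring the rule quoted above."""
    results = []
    ordered = sorted(rows, key=partition_key)
    for _, group in groupby(ordered, key=partition_key):
        values = [row["sales"] for row in group]
        results.extend(server.evaluate(values))
    return results


# Toy data (assumed): 2 regions x 3 months = 6 cells.
rows = [
    {"region": r, "month": m, "sales": 10 * m}
    for r in ("East", "West")
    for m in (1, 2, 3)
]

# Addressing set to Cell: every field partitions, so each cell is a partition.
cell_server = FakeRserve()
cell_results = run_table_calc(
    rows, lambda row: (row["region"], row["month"]), cell_server
)

# Compute along month: only region partitions, so each call carries a vector.
vector_server = FakeRserve()
vector_results = run_table_calc(rows, lambda row: row["region"], vector_server)

print(cell_server.calls, vector_server.calls)
```

Both runs produce the same results, but the Cell version makes six calls where the "compute along month" version makes two. On a real view with thousands of marks, that per-call overhead is what the knowledge base article is warning about.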
Upvotes: 2