Reputation: 421
I had a good discussion with one of my colleagues and he mentioned creating a temporary table degrades the performance in Azure Synapse because Synapse creates the temporary table first in the master node then distribute them to child node. Is it true? He recommended me to create create permanent table instead of temporary table.
Upvotes: 2
Views: 1955
Reputation: 11625
That’s not correct. Temp tables don’t necessarily funnel through the control node. Let’s say you are selecting from a table distributed on ProductKey and loading it into a #temp table distributed on ProductKey. The data will never leave each compute node since it’s a distribution compatible insert.
On the other hand, if you run a query that uses a ROW_NUMBER
function, for example, that would have to be calculated on the control node and then the data would be sent back to the compute nodes to be stored in the distributed temp table. But that only happens in the presence of some types of functions and some types of queries. It is not the norm. If you are worried about a particular query then add the word EXPLAIN
to the front of it and paste the explain plan XML into your question so we can help you interpret it.
If you load a #temp table with a SELECT INTO
statement you can’t specify the table geometry so it will be a round robin distributed columnstore. Usually this isn’t ideal since it takes extra time and memory to compress a columnstore and because round robin distribution isn’t ideal unless there is no good distribution key. Usually the next query which uses the round robin distributed temp table will just reshuffle it so it’s best to properly hash distribute a temp table initially. To do this do a CTAS statement as described here.
Upvotes: 4