shantanuo

Reputation: 32306

How to read the Redshift query plan?

I have this query that takes hours to complete. How do I optimize the query?

# explain select setup from user_logs_dlr_sept_oct2020 as a inner join beba_eme_receiver_dlr_2014_198_eme_sept2020 as b on a.messageid=b.batched

                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 XN Hash Join DS_BCAST_INNER  (cost=1940783.40..62105890177235.56 rows=195463968 width=393)
   Hash Cond: (("outer".messageid)::text = ("inner".batched)::text)
   ->  XN Seq Scan on user_logs_dlr_sept_oct2020 a  (cost=0.00..5198612.48 rows=519861248 width=423)
   ->  XN Hash  (cost=1552626.72..1552626.72 rows=155262672 width=30)
         ->  XN Seq Scan on beba_eme_receiver_dlr_2014_198_eme_sept2020 b  (cost=0.00..1552626.72 rows=155262672 width=30)
(5 rows)

"show create table user_logs" is the command that I use in MySQL. How do I check if the table contains distkey or sortkey in redshift?

Upvotes: 0

Views: 4366

Answers (1)

Bill Weiner

Reputation: 11032

I like to read these from the bottom up. Here's what I see:

  1. Scan of beba_eme_receiver_dlr_2014_198_eme_sept2020: reading the table from disk. 155M rows at an estimated 30 bytes per row (the "width" value). Redshift doesn't think this will take too long.
  2. Hash performed on this table's data to get ready for the join.
  3. Scan of user_logs_dlr_sept_oct2020: reading the table from disk. 519M rows at an estimated 423 bytes per row. Redshift predicts this will take a bit longer than the other scan, but not very long.
  4. Hash join of these tables on the condition (("outer".messageid)::text = ("inner".batched)::text). This is where things go sideways. See the cost value on that step? It grows to 62105890177235.56 - that's a big number.

Based on this plan I'm surprised the query takes only hours and not days, but this highlights an important point - this is just an analysis of the pre-execution plan, not what actually happened. On the console you can look at the query's "actual execution" statistics to see how long each step took, but I expect it will show the problem to be in the hash join step.
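If you prefer SQL to the console, the standard system views expose the same actuals. A sketch - the LIKE filter and the 123456 query ID are just illustrative placeholders:

-- Find the query ID of the slow run (illustrative filter).
select query, starttime, substring(querytxt, 1, 60)
from stl_query
where querytxt like '%user_logs_dlr_sept_oct2020%'
order by starttime desc
limit 5;

-- Per-step actuals: rows, bytes, and whether a step spilled to disk.
select query, seg, step, label, rows, bytes, is_diskbased
from svl_query_summary
where query = 123456   -- replace with the query ID from above
order by seg, step;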

So what is going on and how can it be improved? First off, you need to remember that Redshift is built on a cluster of compute nodes with network connections between them. Which data is on which node is determined by the distribution keys of the tables (I hope you didn't set these up as EVEN distribution). So when a join needs to be performed, rows that match on the join condition need to be "moved" to the same node to perform the join. In your plan you see "Hash Join DS_BCAST_INNER", which means all the data from the inner table is being broadcast over the network to every other node - not good. Ideally your tables would already be distributed so that no data movement is needed. You may have this set up correctly, but there is another issue blocking it.
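This also answers your "show create table" question: Redshift has no such command, but the system catalogs report each table's distribution and sort setup. A sketch using the SVV_TABLE_INFO and PG_TABLE_DEF views (note that PG_TABLE_DEF only lists tables on your search_path):

-- Distribution style and first sort key per table.
select "table", diststyle, sortkey1
from svv_table_info
where "table" in ('user_logs_dlr_sept_oct2020',
                  'beba_eme_receiver_dlr_2014_198_eme_sept2020');

-- Column-level view: which columns are the distkey / sortkey.
select tablename, "column", type, distkey, sortkey
from pg_table_def
where tablename = 'user_logs_dlr_sept_oct2020';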

The blocking issue is that both join columns are being cast to text. (Your query doesn't write these casts, so they are most likely implicit conversions between the two columns' declared types.) You see, normally Redshift would define a plan that sets up which node is responsible for which values of the join condition, and data would be "distributed" (not broadcast) to the responsible node. But with casts on both sides of the join condition, the planner has no idea how to set this up and needs to broadcast all the table data to all the nodes - a network-expensive operation.
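A sketch of how to chase this down, assuming the fix is to make the two columns' declared types match (for example, both varchar of the same length) so the plan's ::text casts disappear:

-- Check the declared types of the two join columns.
select tablename, "column", type
from pg_table_def
where (tablename = 'user_logs_dlr_sept_oct2020' and "column" = 'messageid')
   or (tablename = 'beba_eme_receiver_dlr_2014_198_eme_sept2020' and "column" = 'batched');

-- With matching declared types the same join can be planned
-- as a distributed (not broadcast) join:
select setup
from user_logs_dlr_sept_oct2020 as a
inner join beba_eme_receiver_dlr_2014_198_eme_sept2020 as b
    on a.messageid = b.batched;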

Things I would work on to improve this query:

  1. Remove the cast from the join condition (at least one side, and the larger table's side if possible). This may require looking at your table schemas.
  2. Re-evaluate the distribution of these tables and see if they can be distributed on matching keys (or use DISTSTYLE ALL for smaller tables) - see the sketch after this list.
  3. Re-evaluate the need to have this many rows and columns flowing into the join. Adding WHERE clauses to the table scans can greatly reduce the time and expense of a network-based join.
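On point 2, a sketch of what redistributing on the join columns could look like, assuming messageid/batched make sense as distribution keys for the rest of your workload (ALTER DISTKEY rewrites the whole table, so run it in a quiet window):

-- Distribute both tables on the join column so matching rows land
-- on the same node and the join needs no network movement.
alter table user_logs_dlr_sept_oct2020 alter distkey messageid;
alter table beba_eme_receiver_dlr_2014_198_eme_sept2020 alter distkey batched;

-- Alternative for a genuinely small table: copy it to every node.
-- alter table some_small_dimension alter diststyle all;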

And I would look at the actual execution statistics as you attempt improvements, as these may differ from the cost expectations of the query planner.

Upvotes: 3
