Zack ISSOIR

Reputation: 974

Log analytics query optimization

I run this query on log analytics

    Perf
    | where TimeGenerated >= ago(5m)
    | join kind = inner (
        Heartbeat
        | where TimeGenerated >= ago(5m)
        | summarize arg_max(TimeGenerated, *) by SourceComputerId
    ) on Computer
    | summarize arg_max(TimeGenerated, *) by SourceComputerId, CounterName
    | extend Indicator = strcat(ObjectName, '-', CounterName)
    | summarize dict = make_dictionary(
        pack(
            'WorkspaceId', TenantId,
            Indicator, CounterValue,
            'TimeGenerated', TimeGenerated,
            'Computer', Computer
        )
    ) by SourceComputerId
    | evaluate bag_unpack(dict)

But it's a little bit slow. Is there any way to optimize it? I want the fastest possible query that achieves the same results.

Upvotes: 0

Views: 571

Answers (1)

Yoni L.

Reputation: 25995

It's hard to say without knowing the size of the data (e.g. the record count) for each of the join legs and the cardinality of the SourceComputerId column.

I would recommend that you go over the query best practices document, which covers several techniques for optimization, and see if that helps.

Update: explicitly mentioning the best practices that may be helpful in your case (for you to verify):

  • When using the join operator, choose the table with fewer rows to be the first one (left-most).
  • When the left side is small (up to 100,000 records) and the right side is big, it is recommended to use hint.strategy=broadcast.
  • When both sides of the join are too big and the join key has high cardinality, it is recommended to use hint.strategy=shuffle.
  • When the group-by keys of the summarize operator have high cardinality (best practice: above 1 million), it is recommended to use hint.strategy=shuffle.
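Applied to the query in the question, those hints could look roughly like the sketch below. This is only a sketch: it assumes the deduplicated Heartbeat set (one row per machine) is the smaller side and stays under ~100,000 records, which you would need to verify against your actual data; if both sides are large and Computer has high cardinality, hint.strategy=shuffle would be the alternative. Also note that swapping the join sides changes which table's duplicated columns (e.g. TimeGenerated vs. TimeGenerated1) survive unsuffixed, so the downstream arg_max and pack may need their column references adjusted.

```kusto
// Sketch: smaller (deduplicated Heartbeat) side on the left, broadcast join.
// Assumes the left side is small; use hint.strategy=shuffle if both are big.
Heartbeat
| where TimeGenerated >= ago(5m)
| summarize arg_max(TimeGenerated, *) by SourceComputerId
| join kind=inner hint.strategy=broadcast (
      Perf
      | where TimeGenerated >= ago(5m)
  ) on Computer
// After the side swap, Perf's columns may arrive suffixed (e.g. TimeGenerated1);
// verify which column the dedup and the dictionary should reference.
| summarize arg_max(TimeGenerated, *) by SourceComputerId, CounterName
| extend Indicator = strcat(ObjectName, '-', CounterName)
| summarize dict = make_dictionary(
      pack(
          'WorkspaceId', TenantId,
          Indicator, CounterValue,
          'TimeGenerated', TimeGenerated,
          'Computer', Computer
      )
  ) by SourceComputerId
| evaluate bag_unpack(dict)
```

Whether this is actually faster depends on the row counts of each leg, so it is worth checking the query statistics before and after the change.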

Upvotes: 1
