DataWriter
DataWriter

Reputation: 1040

How can I optimize a SQL query without access to a showplan?

I'm writing complex ad-hoc queries to extract data from a client's SQL Server. Some of the queries take a half-hour or more to run and I need them to execute in 5 minutes or less. Unfortunately, I don't have permissions to view a showplan or on any of the sys.dm_ views to help me optimize them.

I do have access to all of the INFORMATION_SCHEMA and sys catalog views, so I know what indexes exist, and I can also use STATISTICS TIME and IO to help me measure the effectiveness of the changes.

Since I don't have the ability to compare showplans, how can I best use these tools to guide my intuitions to minimize the time-consuming trial and error? Or does anyone have any innovative solutions that have worked for them in the past.

EDIT: I only have db_DataReader permissions--I can't create new objects or indexes, but I can create temp tables and index them.

Upvotes: 4

Views: 3305

Answers (3)

kalaolani
kalaolani

Reputation: 322

Study the internals of SQL Server and practice the art of query plan prediction.

Practice by writing queries against data you know cold (cardinality, density, distribution, etc). Make sure the queries are complex enough that it's a challenge. Then see if you can picture the plan in your head. Does that compare to the actual. When you start to figure out when SQL will pick LOOP v HASH v MERGE join types and other such operations like sorts, tops, etc ... Then you are starting to get a feel for it.

Regarding Internals documentation, there's never been a book to top this one: https://www.amazon.com/Gurus-Guide-SQL-Server-Boxed/dp/0321287509/. Since Ken is no longer with us :(, I'm not sure what the best current book is. The way I learned how to do this by feel is studying internals.

You can download a free version of SQL Server (many versions) so that you can look at query plans: https://www.microsoft.com/en-us/download/details.aspx?id=42299. Use that for practice.

There are many well documented sample databases for practice: https://www.codeproject.com/Articles/20987/HowTo-Install-the-Northwind-and-Pubs-Sample-Databa https://blogs.msdn.microsoft.com/samlester/2012/08/23/finding-the-correct-version-of-the-adventureworks-sql-server-sample-database/

Before the days of graphical query plans, we used time and io stats and showplan.

https://learn.microsoft.com/en-us/sql/t-sql/statements/set-statistics-io-transact-sql https://learn.microsoft.com/en-us/sql/t-sql/statements/set-statistics-time-transact-sql https://learn.microsoft.com/en-us/sql/t-sql/statements/set-showplan-all-transact-sql

    SET STATISTICS IO ON
SET STATISTICS TIME ON
go
SET SHOWPLAN_ALL ON
go

SELECT o.*, od.*
    FROM [dbo].[Orders] o
        INNER JOIN [dbo].[Order Details] od
            ON o.OrderID = od.OrderID
    WHERE od.OrderID = 10248
go


    SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.
StmtText                  StmtId      NodeId      Parent      PhysicalOp                     LogicalOp                      Argument DefinedValues EstimateRows  EstimateIO    EstimateCPU   AvgRowSize  TotalSubtreeCost OutputList Warnings Type                                                             Parallel EstimateExecutions
------------------------- ----------- ----------- ----------- ------------------------------ ------------------------------ -------- ------------- ------------- ------------- ------------- ----------- ---------------- ---------- -------- ---------------------------------------------------------------- -------- ------------------
SET STATISTICS IO ON      1           1           0           NULL                           NULL                           1        NULL          NULL          NULL          NULL          NULL        NULL             NULL       NULL     SET STATS                                                        0        NULL
SET STATISTICS TIME ON    2           2           0           NULL                           NULL                           2        NULL          NULL          NULL          NULL          NULL        NULL             NULL       NULL     SET STATS                                                        0        NULL

(2 row(s) affected)

SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.
StmtText               StmtId      NodeId      Parent      PhysicalOp                     LogicalOp                      Argument DefinedValues EstimateRows  EstimateIO    EstimateCPU   AvgRowSize  TotalSubtreeCost OutputList Warnings Type                                                             Parallel EstimateExecutions
---------------------- ----------- ----------- ----------- ------------------------------ ------------------------------ -------- ------------- ------------- ------------- ------------- ----------- ---------------- ---------- -------- ---------------------------------------------------------------- -------- ------------------
SET SHOWPLAN_ALL ON    1           1           0           NULL                           NULL                           1        NULL          NULL          NULL          NULL          NULL        NULL             NULL       NULL     SET ON/OFF                                                       0        NULL

(1 row(s) affected)

SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.
StmtText                                                                                                                                              StmtId      NodeId      Parent      PhysicalOp                     LogicalOp                      Argument                                                                                                              DefinedValues                                                                                                                                                                                                                                          EstimateRows  EstimateIO    EstimateCPU   AvgRowSize  TotalSubtreeCost OutputList                                                                                                                                                                                                                                                       Warnings Type                                                             Parallel EstimateExecutions
----------------------------------------------------------------------------------------------------------------------------------------------------- ----------- ----------- ----------- ------------------------------ ------------------------------ --------------------------------------------------------------------------------------------------------------------- ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ ------------- ------------- ------------- ----------- ---------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -------- ---------------------------------------------------------------- -------- ------------------

SELECT o.*, od.*
    FROM [dbo].[Orders] o
        INNER JOIN [dbo].[Order Details] od
            ON o.OrderID = od.OrderID
    WHERE od.OrderID = 10248            1           1           0           NULL                           NULL                           1                                                                                                                     NULL                                                                                                                                                                                                                                                   3             NULL          NULL          NULL        0.00658094       NULL                                                                                                                                                                                                                                                             NULL     SELECT                                                           0        NULL
  |--Nested Loops(Inner Join)                                                                                                                         1           2           1           Nested Loops                   Inner Join                     NULL                                                                                                                  NULL                                                                                                                                                                                                                                                   3             0             1.254E-05     254         0.00658094       [o].[OrderID], [o].[CustomerID], [o].[EmployeeID], [o].[OrderDate], [o].[RequiredDate], [o].[ShippedDate], [o].[ShipVia], [o].[Freight], [o].[ShipName], [o].[ShipAddress], [o].[ShipCity], [o].[ShipRegion], [o].[ShipPostalCode], [o].[ShipCountry], [od].[Ord NULL     PLAN_ROW                                                         0        1
       |--Clustered Index Seek(OBJECT:([Northwind].[dbo].[Orders].[PK_Orders] AS [o]), SEEK:([o].[OrderID]=(10248)) ORDERED FORWARD)                  1           3           2           Clustered Index Seek           Clustered Index Seek           OBJECT:([Northwind].[dbo].[Orders].[PK_Orders] AS [o]), SEEK:([o].[OrderID]=(10248)) ORDERED FORWARD                  [o].[OrderID], [o].[CustomerID], [o].[EmployeeID], [o].[OrderDate], [o].[RequiredDate], [o].[ShippedDate], [o].[ShipVia], [o].[Freight], [o].[ShipName], [o].[ShipAddress], [o].[ShipCity], [o].[ShipRegion], [o].[ShipPostalCode], [o].[ShipCountry]  1             0.003125      0.0001581     231         0.0032831        [o].[OrderID], [o].[CustomerID], [o].[EmployeeID], [o].[OrderDate], [o].[RequiredDate], [o].[ShippedDate], [o].[ShipVia], [o].[Freight], [o].[ShipName], [o].[ShipAddress], [o].[ShipCity], [o].[ShipRegion], [o].[ShipPostalCode], [o].[ShipCountry]            NULL     PLAN_ROW                                                         0        1
       |--Clustered Index Seek(OBJECT:([Northwind].[dbo].[Order Details].[PK_Order_Details] AS [od]), SEEK:([od].[OrderID]=(10248)) ORDERED FORWARD)  1           4           2           Clustered Index Seek           Clustered Index Seek           OBJECT:([Northwind].[dbo].[Order Details].[PK_Order_Details] AS [od]), SEEK:([od].[OrderID]=(10248)) ORDERED FORWARD  [od].[OrderID], [od].[ProductID], [od].[UnitPrice], [od].[Quantity], [od].[Discount]                                                                                                                                                                   3             0.003125      0.0001603     29          0.0032853        [od].[OrderID], [od].[ProductID], [od].[UnitPrice], [od].[Quantity], [od].[Discount]                                                                                                                                                                             NULL     PLAN_ROW                                                         0        1

(4 row(s) affected)

Upvotes: 0

George Zhang
George Zhang

Reputation: 391

The practical way is: instead of running complex queries in client production databases, extract the necessary data from them. And run the rest of your query to your own instance or testing environment.

Upvotes: 0

Tab Alleman
Tab Alleman

Reputation: 31785

Here's the strategy I would explore. It's based on the rule of thumb that the overwhelming majority of query optimization is achieved by using covering indexes to avoid Table Scans and Hash Match joins.

Look at each of the tables you need to include in your query, and the columns that you need to use in either the JOIN ON clause, or the WHERE clause. If the table has an index that includes all those columns, then you can just go ahead and use that table in your query. You can even use a query hint to force the query to use that covering index, but it shouldn't be needed.

If table doesn't have such an index, then you need to populate a temp table with the minimum amount of data you can get using an existing index.

For a super-simple example, say there's a table with ten columns and a million rows that only has an Index on Column1. In your query, you need to JOIN to that table on Column2 and include Column3 (only) in your resultset.

But on top of that, maybe for your final resultset, you're only interested in data that has Column1 values between 1-100.

I would create a temp table that only has Column2 & Column3, with a Clustered Index on Column2, and populate it with an INSERT..SELECT that gets data from the original table using a WHERE clause filter on Column1 to get only the rows I need.

This way you are building indexes on the smallest possible table which hopefully will get you a noticeable performance gain over using queries that scan the original tables.

Upvotes: 1

Related Questions