Dreamer
Dreamer

Reputation: 7549

SQL Server can optimize away columns from subselect if not referred by parentselect?

I am running reports on Power BI that direct query from SQL server, there is one section from Microsoft here

Basically Power BI can auto generate query behind the scene, however what puzzles me is that the pattern use sub query and even if you select two columns from two different tables, the subquery select all the columns from each table.

enter image description here

My question is eventually how many columns of data will pull from SQL server? Two columns or all columns from SQL server? This can be major impact on performance and network throughput

In the doc there is a statement

Use of subselects in this manner hasn't been found to impact performance for the data sources so far supported for DirectQuery. Data sources like SQL Server optimize away the references to the other columns.

It sounds like SQL server will filter out the columns that are not in the main select, is that true?

Upvotes: 0

Views: 47

Answers (1)

Preben Huybrechts
Preben Huybrechts

Reputation: 6111

Regarding network I/O only the bytes are transferred from the main select. You can test this in SSMS by turning on client statistics, and look at bytes received from server.

Regarding filtering out columns in the execution plan: SQL server will try to do its best to only create outputs for the rows that are used in the query, in select, where, order by, join, ...

You can see this if you open the execution plan and look at the output lists of all the steps in the execution plan.

Here you can see what columns are being used in the output list: enter image description here

If the columns are not in the output list, they are filtered out by the engine.

Upvotes: 3

Related Questions