Performance of SELECT * in an Inline Table-Valued Function

Question

Today, the lead DBA at work said that I should not be using an ITVF to wrap a view completely, but from my rudimentary benchmarks I am left skeptical. It seems like SQL Server just sorts out those columns it actually needs (based on what is requested from the function) at query time. I say that because I am seeing very similar execution times between the two examples below.

`uf_GetCustomersByCity_A`

In this example, I create an ITVF which does a SELECT *, returning a filtered CustomerView.

CREATE FUNCTION [dbo].[uf_GetCustomersByCity_A] (@idCity INT)
RETURNS TABLE
AS RETURN

    SELECT CustView.*
      FROM [dbo].[CustomerView] CustView
     WHERE CustView.idCity = @idCity

GO

`uf_GetCustomersByCity_B`

CREATE FUNCTION [dbo].[uf_GetCustomersByCity_B] (@idCity INT)
RETURNS TABLE
AS RETURN

    SELECT CustView.idCustomer
         , CustView.cFullName
         , CustView.cCityName
         , CustView.fBalance
      FROM [dbo].[CustomerView] CustView
     WHERE CustView.idCity = @idCity

GO

My question is whether or not this is a valid observation, or simply a side effect of debugging for many many hours (assuming SQL Server optimizes with use). There was a lot of value in offering everything in the View that is needed instead of specifying each column specifically within the ITVF.

Amateur Benchmarks

So both work pretty well, with a yield of ~500k rows over 4-5 seconds (note: there are way complex clauses lending to the lengthy exec time, and these examples hardly illustrate the purpose here). The View has something like 70 or 80 columns, many of which are formatted or manipulated inline.

-- Around 500k rows in ~3-4 seconds:

SELECT idCustomer, cCityName
FROM [dbo].[uf_GetCustomersByCity_A](93)

-- Around 500k rows, again ~3-4 seconds:

SELECT idCustomer, cCityName
FROM [dbo].[uf_GetCustomersByCity_B](93)

Same performance on the dev box, but there's nobody else using it currently. Let's say that cFullName is a concatenation of cGivenName and cFamilyName, whereas cCityName is returned exactly as stored. Adding cCityName to the query has a tangibly lower impact than cFullName, leading me to believe that it's not delivery time to SSMS that I am noticing.

-- Around 500k rows, ~6 seconds:

SELECT idCustomer, cFullName
FROM [dbo].[uf_GetCustomersByCity_A](93)

-- Around 500k rows, ~6 seconds:

SELECT idCustomer, cFullName
FROM [dbo].[uf_GetCustomersByCity_B](93)

My thinking is that if the SELECT * mattered within the ITVF, then it would spend a bunch of time determining values for columns it doesn't use. From the quick benchmarks I worked out, I don't see much of a difference at all when I wrap a whole View via SELECT * rather than specifying the columns one at a time, restating the structure of the View in essence. Is my hunch valid here?

Gottfried Lesigang · Accepted Answer

The i in iTVF is for inlined - as you know. That means, that the engine will try to find the best execution plan as if the statement was written into the query directly.

From this point of view there should be no difference whether you use

SELECT * FROM YourView WHERE idCity=@idCity

or

SELECT * FROM YourITVF(@idCity)

The engine should be clever enough to handle only needed columns, but - in general - it is better to use a fix list of columns. (See the link in @a_horse_with_no_name's comment.)

Hint: When you wrap a view (as you want this) with SELECT * FROM ... you should keep in mind, that you have to recompile this iTVF if you alter your view.

The problem might be, that the engine has troubles to resolve deeply nested structures and might not find the best plan in the end (and even might not see, that an expensive computed column is not needed in the end result).

If your view was built on sub-views and these sub-views were built from sub-sub-views, other iTVFs and so on, this will lead to sub-optimal plans.

Some days ago I had to tune a slow view which turned out as a view with 9(!) call levels, covering views in views in views in ... and a lot of computed columns and so on. The engine was not able to look through this jungle anymore.

So in short:

Try not to nest to deeply.
An iTVF can lead to more readable code (less repetition, speaking names)
An iTVF can lead to better performance (as it is pre-compiled with a fix set of parameters, but be aware of parameter sniffing)
I would not use an iTVF just to pass in a simple filter variable...

Performance of SELECT * in an Inline Table-Valued Function

`uf_GetCustomersByCity_A`

`uf_GetCustomersByCity_B`

Amateur Benchmarks

Answers (1)

Related Questions