sqlsql-serversql-server-2005t-sqluser-defined-functions

Reputation: 5661

SQL Server 2005 Table-valued Function weird performance

I have a huge difference of time execution between a 1-minute query and the same one in a table-valued function.

But the most weired thing is that running the UDF with another (valid) company_id argument gives me a result in ~40 seconds and as soon as I change this company_id for 12 (valid again), it never stops. The execution plans of these two queries are absolutely not the same and of course, the long one is the most complicated. BUT the execution plan between the batch version and the UDF version are the same AND the batch version is fast...!

If I do the following query "by hand", the execution time is 1min36s with 306 rows:

 SELECT
  dbo.date_only(Call.date) AS date,
  count(DISTINCT customer_id) AS new_customers
 FROM
  Call
 LEFT OUTER JOIN
  dbo.company_new_customers(12, 2009, 2009) new_customers
   ON dbo.date_only(new_customers.date) = dbo.date_only(Call.date)
 WHERE
  company_id = 12
  AND year(Call.date) >= 2009
  AND year(Call.date) <= 2009
 GROUP BY
  dbo.date_only(Call.date)

I stored this exactly same query in a function and ran it like that :

SELECT * FROM company_new_customers_count(12, 2009, 2009)

13 minutes for now that it is running... And I am sure that it will never give me any result.

Yesterday, I had the exact same infinite-loop-like behaviour during more than 4h (so I stopped it).

Here is the definition of the function:

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION company_new_customers_count 
( 
 @company_id int, 
 @start_year int,
 @end_year int
)
RETURNS TABLE 
AS
RETURN 
(
 SELECT
  dbo.date_only(Call.date) AS date,
  count(DISTINCT customer_id) AS new_customers
 FROM
  Call
 LEFT OUTER JOIN
  dbo.company_new_customers(@company_id, @start_year, @end_year) new_customers
   ON dbo.date_only(new_customers.date) = dbo.date_only(Call.date)
 WHERE
  company_id = @company_id
  AND year(Call.date) >= @start_year
  AND year(Call.date) <= @end_year
 GROUP BY
  dbo.date_only(Call.date)
)
GO

I would be very happy to understand what is going on.

Thanks

Additional:

Definition of company_new_customers:

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Description: Create the list of new customers of @company_id
--          in the given period.
-- =============================================
CREATE FUNCTION company_new_customers 
(   
    @company_id int, 
    @start_year int,
    @end_year   int
)
RETURNS TABLE 
AS
RETURN 
(
    SELECT
        customer_id,
        date
    FROM
    (   -- select apparition dates of cutomers before @end_year
        SELECT
            min(date)       AS date,
            customer_id
        FROM
            Call
        JOIN
            Call_Customer ON Call_Customer.call_id = Call.call_id
        WHERE
            company_id = @company_id
            AND year(date) <= @end_year
        GROUP BY
            customer_id
    ) new_customers
    WHERE
        year(date) >= @start_year -- select apparition dates of cutomers after @start_year
)
GO

Definition of date_only:

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author:      Julio Guerra
-- Create date: 14/10/2010
-- Description: Return only the date part of a datetime value
-- Example:         date_only('2010-10-25 13:00:12') returns 2010-10-25
-- =============================================
CREATE FUNCTION date_only
(
    @datetime datetime
)
RETURNS datetime
AS
BEGIN
    RETURN dateadd(dd, 0, datediff(dd, 0, @datetime))
END
GO

Execution Plan of SELECT * FROM company_new_customers_count(8, 2009, 2009) alt text

Execution Plan of SELECT * FROM company_new_customers_count(12, 2009, 2009) alt text

Upvotes: 2

Answers (4)

Dave Markle

Reputation: 97821

There are a couple of parts of the answer here. For the first part, I'm going to attempt answer the question (you didn't ask) about why none of the queries are particularly fast. This has relevance to the question you actually did ask, so bear with me.

Your date criteria are generally not SARGable as you've specified them -- for example in your company_new_customers function. That means that the server can't use its statistics to figure out how selective your criteria are. That means that your query plans are going to be very, very sensitive to how selective your customer_id criterion is, no matter how many date rows you have per customer.

Using a range query with an index on date and call_id in your call table should vastly improve performance in all cases and reduce the sensitivity of your query to the selectivity of the customer_id. Assuming that the date is on your call table, I'd rewrite your inner UDF something like this and tweak the input parameters to use dates instead. Doing so will make your UDF more versatile, too:

CREATE FUNCTION company_new_customers 
(   
    @company_id INT, 
    @start_date DATETIME,
    @end_date   DATETIME
)
RETURNS TABLE 
AS
RETURN 
(    
        SELECT
            MIN(c.[date]) AS [date],
            c.customer_id
        FROM dbo.[Call] c
        JOIN dbo.[Call_Customer] cc
            ON cc.call_id = c.call_id
        WHERE c.company_id = @company_id
        AND   c.[date]    <= @end_date
        AND   NOT EXISTS (
                SELECT *
                FROM  dbo.[Call] c1
                WHERE c1.customer_id = c.customer_id
                AND   c1.[date] <= @start_date              
        )    
        GROUP BY
            c.customer_id          
)
GO

Same goes for your other view. By using year() and your date_only() functions, you make any statistics or indexes you have on your dates all but useless (though the optimizer can use them to limit the amount of data scanned, but that's a bigger discussion).

Now then -- why does your UDF take forever? Because it calls another UDF and you use date_only() as a join argument, it pretty much can't "know" anything about what to expect in the UDF subquery, so it has chosen to loop join. It's likely choosing that plan because it's the appropriate one for some values of customer_id. It's likely that you have run a query against one of these selective customer_ids shortly after creating the UDF and the plan for that query has been cached -- even though it's not appropriate for other values of customer_id.

Why does a stored proc not take forever? Because the first time you run it, the stored proc generates a plan based on the first criteria you give it. Perhaps the first time you ran the SP, you used the non-selective customer ID and the stored proc has chosen to hash join. Same thing with the ad-hoc query. The optimizer is "noticing" that you have passed it a non-selective customer_id and is choosing to create a hash join for you.

Either way, unless you get the date-SARGability issue under control, you're going to find that all of your queries like this are going to be very sensitive to your customer_id input, and depending on your usage patterns, they could blow up in your face in terms of performance -- UDFs or not.

Hope this helps!

Upvotes: 1

Pent Ploompuu

Reputation: 5414

From these query plans it looks like you could benefit from an index like this (if I inferred your DB schema right):

CREATE INDEX IX_call_company_date ON call (company_id, date)

In general this seems to be a standard query optimization problem and the table-valued functions aren't making a difference here actually.

Upvotes: 2

Sir Wobin

Reputation: 1140

I've seen this with SQL Server 2005. When we used a table value function for our particular query we reliably got awful performance. Took the exact same text of the query, parameters and all, put them into a stored proc and reliably got a marvelous query plan. Calling the function with the same parameters as the stored proc produced different behaviour (we started both from a cold cache). Very disappointing!

Sadly we didn't have the time to diagnose this strange behaviour any more deeply and moved the project on avoiding table value functions on 2005.

This may indicate a bug in SQL Server 2005.

Upvotes: 0

Remus Rusanu

Reputation: 294427

The short plan uses HashJoin and clustered index scans on PK_CALL. The long plan uses NestedLoops and repeated seeks in UK_Pair_.... Most likely the cardinality estimates for '12, 2009, 2009' rule out HashJoin due to insufficient system memory, so you end up with a worse plan (despite the seeks instead of scans). Probably company_id 12 has way more customers than company_id 8.

To give a solution is impossible w/o exact information on all factors involved (the exact schema used, including every index, and the exact statistics and cardinality of every table involved). A simple avenue to pursue would be to use a plan guide, see Designing and Implementing Plan Guides.

Upvotes: 2

SQL Server 2005 Table-valued Function weird performance

Answers (4)

Related Questions