Tejas
Tejas

Reputation: 21

SQL processor runs out of internal resources due to a huge number of partitions created

I have a read query (not a stored procedure) within my Perl script for choosing the latest TestHistoryID value for a Test against the TestID. A Test (or simply a piece of code) runs multiple times as a part of regression process. While the TestID remains same for a test, the test receives a new TestHistoryID after each run. The pass/fail status, time-took-for-completing-run etc is stored against TestHistoryID (TestHistoryID is an integer that increases monotonously).

Since there are multiple TestHistoryID against each TestID (DB keeps history of last 50 runs and deletes the oldest entries if it goes beyond 50), I want to get the resultset containing latest TestHistoryIDs against each TestID.

I used the following partition query to obtain the row with the highest TestHistoryID (followed this question on stackoverflow to create the query - How to query SQL Table and remove duplicate rows from a result set)

SELECT TestID, TestHistoryID, TestLabel
FROM 
(
    SELECT 
        TestID,
        TestHistoryID,
        TestLabel,
    row_number() over(partition by TestID order by TestHistoryID DESC) rn
    FROM TestHistoryView
    WHERE TestID IN (@test_ids)
) content where rn = 1

@test_ids contains huge number array e.g. 504954, 504955, 504956, 504957, 504958, 504959, 504960, 504961, 504962 .... (40k numbers)

Snap of Table columns here that contains duplicates, I need to remove the duplicates and get the latest value of TestHistoryID for each TestID (https://i.sstatic.net/nJ59C.png)

The query works for up to approximately 10k TestIDs, if I try with 15k, I get the following error:

The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information.

I need help creating an efficient query that doesn't eat up all the resources for 40k parameters passed and still returns the intended results. Thanks a lot in advance.

Upvotes: 0

Views: 220

Answers (1)

David Browne - Microsoft
David Browne - Microsoft

Reputation: 89141

You can't actually pass a list of values to a single parameter. Some tools, like SQL Server Reporting Services, will rewrite your query in this scenario to add multiple literal values or parameters.

In any case using an IN list embeds the data in the query itself, and will eventually cause this issue. So while it may appear to you that you're not embedding 15,000 separate values in the SQL query, you actually are.

To avoid it you need to pass the values separately from the query. You can do this using a Table-Valued Parameter, JSON, XML, or by loading a temp table ahead-of-time. Which to use depends on your client platform. I'm not sure what your options are in Perl.

For XML the batch would look something like this:

declare @ids varchar(max) = '<IDs><id>1</id><id>2</id><id>3</id><id>4</id><id>5</id><id>6</id><id>7</id><id>8</id></IDs>'

declare @doc xml = cast(@ids as xml)

SELECT TestID, TestHistoryID, TestLabel
FROM 
(
    SELECT 
        TestID,
        TestHistoryID,
        TestLabel,
    row_number() over(partition by TestID order by TestHistoryID DESC) rn
    FROM TestHistoryView
    WHERE TestID IN (select n.value('.','int') Id from @doc.nodes('/IDs/id') d(n))
) content 
where rn = 1

Although you would leave out the first line and bind @ids as a varchar(max) parameter. Even pasting the XML in the TSQL query as a literal should avoid your current issue, as parsing a long string literal to an XML document doen't create the same query processing resource issue as embedding a vary large number of string literals in the query.

Upvotes: 3

Related Questions