Steven Sproat
Steven Sproat

Reputation: 4578

SQL Server R2 2008 Full Text Search causing 'wait timeouts' randomly

we have an ASP MVC site deployed to production which is a pretty big/popular site, and it gets quite a bit of traffic.

We're using Full Text Search to drive out search facility.

Very randomly, I see a bunch of errors in our ELMAH log, "The wait operation timed out". When this error happens, it seems to happen to many of our customers are the same time - e.g. we get 4 or 5 instances of this error logged all within the same timestamp. It never seem to happen in isolation.

Basically, on our catalogue pages, there's 5-6 queries we execute (e.g. 'get all sub-categories', 'get all manufacturers', 'get min/max price range', 'get actual products') and I can see from looking at the stack track that it'll be one of these methods causing the exception.

However, the common thing is that they take the user's filter options and applies them to each query to show relevant information.

So it seems this only fails (very infrequently) when a search term is present. I've not seen any other timeout issues when a customer views a catalogue page without a search term, for example.

Could anyone point me in a direction of where potential deadlocks could occur? Our full text index is on a View, which is pulling data from our product table. This does get updated many times a day with stock information etc, but the columns that the view pulls in don't change their data that frequently. Because it's so random it's hard to actually pinpoint where it's happening - and when I've navigated to a URL which I can see had previously caused an exception, the page loads really quick (as if the search results have been cached to some degree).

Bit more info: 54,612 rows in the view (2 columns - product ID / search text) SearchText column which is full text indexed has largest row with 95 characters, so it's not a massive amount of data. For the most part the catalog pages take about 600ms to render, with these queries taking about 20-30ms usually. However, sometimes this balloons up massively - I've seen traces where it takes about 6s to execute each query, other times where one of the 6 will timeout and just very random spikes. I've never really experienced this slowdown myself but I don't browse the site too often, but we have new relic transactions (and error logs) showing frequent slowdowns & timeouts.

We're querying using

@Search = '("some*" AND "search*" AND "terms*")'

SELECT TOP ? * FROM (SELECT p.Score, [columns], [Rank] as SearchRank, ROW_NUMBER() OVER 
( ORDER BY p.Score * [Rank] DESC, p.Id DESC) AS row_number 
FROM Product p 
INNER JOIN Manufacturer AS m ON p.ManufacturerId = m.Id 
INNER JOIN ProductCategory pcm ON p.Id = pcm.ProductId 
INNER JOIN CONTAINSTABLE(vw_SearchProducts, FullSearch, @Search) AS FTS ON p.Id = FTS.[KEY] 
WHERE p.Deleted = ? AND pcm.CategoryId = @CategoryId) p WHERE p.row_number > ? ORDER BY p.Score * SearchRank DESC, p.Id DESC 

and our view looks like such, with ProductID as the FTS key:

SELECT p.Id AS ProductId, Colour + ' ' + Size ' ' +  m.Name + ' ' + Categories AS FullSearch 
FROM Product 
INNER JOIN Manufacturer m ON m.Id = p.ManufacturerId
WHERE Deleted = 0 

Our product table contains some de-normalised data for performace purposes. Our 6 main queries for the catalogue page are all similar to above, but just pulling different bits of data for products which match the filtering/search criteria.

We're actually upgrading to SQL Server 2016 soon - unsure if that's going to make things any better.

Cheers

Upvotes: 1

Views: 806

Answers (1)

andrews
andrews

Reputation: 2173

Could you start SQL Server Profiler and leave it running for several hours or a day, then when you see the timeout in the logs again, locate queries logged in SQL Profiler around that time and check the Duration/Reads/Writes columns values.

Maybe this way you will be able to identify the EXACT stored procedure / statement with specific parameter list which has caused the timeout. If so, you'll be able to debug it further using SSMS and execution plan tab.

Another idea of course is that FTS catalog may be rebuilding too often. This also can be trapped but you need to catch the exact timeout moment and then check FTS catalog status if it's being rebuilt.

Things should improve with regard to FTS and in particular FTS index update speed when you upgrade to SQL 2012-2016. MS claims up to x10 times speed up of such concurrent FTS index updates, see this answer: Are there any Sql Server Full-Text Search (FTS) performance improvements since version 2008 R2?.

55K records of data is really not that much. We run SQL Server FTS against millions of records and do not get timeouts. Most results are under 10 sec but this delay is caused not by FTS. The actual FTS block runs very fast, it's those additional JOINs and filters which we need to apply to the initial FTS result cause extra delay.

Another idea is your VIEW maybe poorly designed which doesn't get execution plan cached by SQL Server or joins to itself etc. To confirm this we need to see the actual VIEW code, especially the JOIN/WHERE part.

Upvotes: 1

Related Questions