Reputation: 321
I consider myself rather proficient with T-SQL and I'm usually able to optimize a query pretty good without loosing readability. In short: I like my SQL short, descriptive, declarative and elegant.
While the following code works, i have two problems with it:
So given the following SP, does anyone see any obvious way to convert this to a plain select using recursive CTEs? I've tried, failed and thought I'd see what the stack overflow community might be able to come up with.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROC [dbo].[usp_GetLastReferers]
(
@Limit int = NULL
)
AS
BEGIN
SET NOCOUNT ON
CREATE TABLE #Referer
(
ID int,
Url nvarchar(500),
Referer nvarchar(500)
)
DECLARE @ID int
DECLARE @Url nvarchar(500)
DECLARE @Referer nvarchar(500)
DECLARE @Count int
SET @Count = 0
DECLARE LogCursor CURSOR FORWARD_ONLY READ_ONLY FOR
SELECT ID, Url, Referer FROM Log WHERE Referer <> '' ORDER BY ID DESC
OPEN LogCursor
FETCH NEXT FROM LogCursor INTO @ID, @Url, @Referer
WHILE @@FETCH_STATUS = 0 AND (@Count < @Limit OR @Limit IS NULL)
BEGIN
DECLARE @Hits int
SELECT @Hits = COUNT(*)
FROM #Referer
WHERE Referer = @Referer
DECLARE @IsLocal bit
SELECT @IsLocal = dbo.IsLocalSite(@Referer)
IF (@Hits = 0 OR @Hits IS NULL) AND @IsLocal = 0
BEGIN
INSERT INTO #Referer(ID,Url,Referer) VALUES (@ID,@Url,@Referer)
SET @Count = @Count + 1
END
FETCH NEXT FROM LogCursor INTO @ID, @Url, @Referer
END
CLOSE LogCursor
DEALLOCATE LogCursor
SELECT *
FROM #Referer
DROP TABLE #Referer
SET NOCOUNT OFF
END
Since it may not be totally obvious, what I'm trying to do here is akin tothe following quasi SQL
SELECT DISTINCT TOP(@Limit) ID, Url, Referer
FROM Log
ORDER BY ID DESC
Basically to get the last unique refers (not unique rows), which often contain duplicates, and in descending order. This is definitely where it gets tricky.
The data is pretty simple HTTP logs. ID field is just a unique row-identifier, Url is the full url requesten, and Referer is the HTTP referer for that request. None of the values can be null, but referer can be empty (ie ''). IsSiteLocal is just a simple filtering function to exclude referers originating from my own sites.
If anyone wants some sample-data to full around, I can upload a small DB-backup so you have something to fool around with.
Sample-data can be found here: http://svada.kjonigsen.net/files/IISLogsDBBackup.zip
Upvotes: 1
Views: 275
Reputation: 1292
Why to convert it to recursive CTE ? There is no reason it can't run as a plain select.
I downloaded the test database, which was missing your dbo.IsLocalSite function, so for my test I created my own same named function and assumed it always returned 0.
This code when run produced the exact same results as the stored procedure given above:
SELECT TOP (@Limit) ID, Url, Referer
FROM (
SELECT ID, Url, Referer, RANK() OVER (PARTITION BY Referer ORDER BY ID DESC) _RANK_
FROM LOG
WHERE dbo.IsLocalSite(Referer) = 0
AND Referer != ''
) TT
WHERE _RANK_ = 1
ORDER BY ID DESC;
Upvotes: 1
Reputation: 9607
you want the max(ID) for each Url,Referer as long as dbo.IsLocalSite(@Referer) = 0? Can you just group by Url,Referer to get the max(ID) and apply your function in the WHERE clause?
Upvotes: 0
Reputation: 6015
Try this:
;with Referers as (
SELECT
row_number() over (order by id desc) rn
,ID, Url, Referer FROM Log
WHERE dbo.IsLocalSite(Referer) = 0
)
select * from Referers
where rn <= @limit
Upvotes: 0