Reputation: 48
So, OK, I couldn't think of a great way of expressing this in a title, but the scenario is this:
You're building a report. The report is on an .aspx page with a C# code-behind which accesses T-SQL databases.
The table you're getting the data from is pretty darn large (millions of rows). For two columns (Group and SuperGroup - see below) you need to do a lookup on another table, and this lookup table just happens to be a good few tens of thousands of rows too (not to mention you actually have to join two tables to build the lookup table properly - see #partGroups below).
Now bearing in mind the page running this will time out after 2 minutes...
Here's a couple of assumptions that have to be made along the way:
This is what I've basically got so far:
select rtrim(ltrim(Replace(Replace(csp.PartNo,' ',''), '-',''))) as PartNumber,
csp.[Description], csp.GroupCode, coalesce(cspg.[Group], 'Unknown') as [Group], coalesce(cspg.SuperGroup, 'Unknown') as SuperGroup
into #partGroups
from CoreStockParts as csp
left join CoreStockPartsGroups as cspg on csp.GroupCode = cspg.Code
select p.ID,
rtrim(Replace(Replace(p.Text01,' ',''), '-','')) as PartNumber1,
right(p.Text01, len(p.Text01)-1) as PartNumber2,
p.Numeric01 as CostAmount, p.Numeric02 as SaleAmount, p.Numeric03 as ExtendedCostAmount,
p.Numeric04 as ExtendedSaleAmount, p.Numeric05 as Quantity, p.Date01 as InvoiceDate
INTO #coreParts
FROM Assets as p
WHERE p.Category = 'PART'
and len(p.Text01) > 0
select ID, PartNumber1, PartNumber2, [Description], CostAmount, SaleAmount, ExtendedCostAmount,
ExtendedSaleAmount, Quantity, InvoiceDate, [Group], SuperGroup
from #coreParts as cp
inner join #partGroups as pg on cp.PartNumber1 = pg.PartNumber
union
select ID, PartNumber1, PartNumber2, [Description], CostAmount, SaleAmount, ExtendedCostAmount,
ExtendedSaleAmount, Quantity, InvoiceDate, [Group], SuperGroup
from #coreParts as cp
inner join #partGroups as pg on cp.PartNumber2 = pg.PartNumber
This is currently finishing in about 1 minute and 45 seconds under a medium server load. There are still restrictions that need adding, which include but are not limited to filtering based on Group, SuperGroup, and a date range based on InvoiceDate. On top of this, once I finally HAVE this data I then need to start performing aggregate functions across it to produce graphs of sales quantities/values etc. for various Groups/SuperGroups.
Now I'm THINKING if I can keep it to this speed... that will do, though it's hardly ideal. If I can speed it up, then great! Anything more than 15 seconds longer, however, and we hit a wall.
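For context, the aggregation step I have in mind would be along these lines (names, filters and date range purely illustrative):

```sql
-- Hypothetical sketch: monthly sales per SuperGroup, assuming the UNION
-- result has first been written into a #report temp table.
SELECT SuperGroup,
       DATEADD(month, DATEDIFF(month, 0, InvoiceDate), 0) AS InvoiceMonth,
       SUM(Quantity)   AS TotalQuantity,
       SUM(SaleAmount) AS TotalSales
FROM #report
WHERE InvoiceDate >= '2013-01-01' AND InvoiceDate < '2014-01-01'
GROUP BY SuperGroup, DATEADD(month, DATEDIFF(month, 0, InvoiceDate), 0)
ORDER BY SuperGroup, InvoiceMonth;
```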
So I guess the crux of this question is really several questions in one:
EDIT: Ok some updates on this!
Firstly, it seems I was wrong in my assessment of what would be allowed: we've got authorization to add a snapshot table, which can do all the work of pulling together the data we need overnight, ready for the report code to run against as and when the following day.
Special thanks to Blindy and user17594 for your input regarding indexing and bits that would prevent use of indexes. (bits, that's technical language ya know 8D).
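For anyone following along, here's a minimal sketch of what that overnight snapshot could look like, reusing the temp tables from the question; the table and index names are invented, and the rebuild would be scheduled via something like a SQL Agent job:

```sql
-- Hypothetical nightly rebuild; dbo.PartSalesSnapshot is an assumed name.
IF OBJECT_ID('dbo.PartSalesSnapshot') IS NOT NULL
    DROP TABLE dbo.PartSalesSnapshot;

-- Same two-branch UNION as the report, but persisted once per night
-- instead of being rebuilt on every page request.
SELECT ID, PartNumber1, PartNumber2, [Description], CostAmount, SaleAmount,
       ExtendedCostAmount, ExtendedSaleAmount, Quantity, InvoiceDate,
       [Group], SuperGroup
INTO dbo.PartSalesSnapshot
FROM #coreParts AS cp
INNER JOIN #partGroups AS pg ON cp.PartNumber1 = pg.PartNumber
UNION
SELECT ID, PartNumber1, PartNumber2, [Description], CostAmount, SaleAmount,
       ExtendedCostAmount, ExtendedSaleAmount, Quantity, InvoiceDate,
       [Group], SuperGroup
FROM #coreParts AS cp
INNER JOIN #partGroups AS pg ON cp.PartNumber2 = pg.PartNumber;

-- Support the report's date-range filtering.
CREATE CLUSTERED INDEX IX_PartSalesSnapshot_InvoiceDate
    ON dbo.PartSalesSnapshot (InvoiceDate);
```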
Upvotes: 0
Views: 171
Reputation: 69769
My approach to this would be to add computed columns to your tables:
ALTER TABLE CoreStockParts
ADD PartNumber AS RTRIM(LTRIM(REPLACE(REPLACE(PartNo,' ',''), '-','')));
ALTER TABLE Assets
ADD PartNumber1 AS RTRIM(REPLACE(REPLACE(Text01,' ',''), '-','')),
    PartNumber2 AS CASE WHEN LEN(Text01) > 0 THEN RIGHT(Text01, LEN(Text01) - 1) END; -- guard against empty strings
Now that you have these computed columns you can index them:
CREATE NONCLUSTERED INDEX IX_CoreStockParts_PartNumber ON CoreStockParts (PartNumber)
--INCLUDE ([Description], GroupCode)
Not sure of your exact table structure, but including non-key columns in the index may assist performance; uncomment the second line if you think they will.
CREATE NONCLUSTERED INDEX IX_Assets_PartNumber1__part ON Assets (PartNumber1)
--WHERE Category = 'PART' AND Text01 != ''
CREATE NONCLUSTERED INDEX IX_Assets_PartNumber2__part ON Assets (PartNumber2)
--WHERE Category = 'PART' AND Text01 != ''
Depending on how you will use the table elsewhere, these indexes may benefit from being filtered indexes; again, uncomment if necessary (note that a filtered index predicate must be a simple comparison, which is why the filter uses Text01 != '' rather than LEN(Text01) > 0). You may also want further indexes on CoreStockPartsGroups.Code and CoreStockParts.GroupCode.
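Those further indexes might look something like this (index names are just suggestions):

```sql
-- Cover the lookup columns so the join to CoreStockPartsGroups
-- can be satisfied entirely from the index.
CREATE NONCLUSTERED INDEX IX_CoreStockPartsGroups_Code
    ON CoreStockPartsGroups (Code)
    INCLUDE ([Group], SuperGroup);

CREATE NONCLUSTERED INDEX IX_CoreStockParts_GroupCode
    ON CoreStockParts (GroupCode);
```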
Finally, I would not use temporary tables; it is usually better to avoid them, as you lose the benefit of existing indexes on the source tables:
WITH PartGroups AS
( SELECT csp.PartNumber,
csp.[Description],
csp.GroupCode,
ISNULL(cspg.[Group], 'Unknown') AS [Group],
ISNULL(cspg.SuperGroup, 'Unknown') AS SuperGroup
FROM CoreStockParts AS csp
LEFT JOIN CoreStockPartsGroups AS cspg
ON csp.GroupCode = cspg.Code
)
SELECT p.ID,
p.PartNumber1,
p.PartNumber2,
pg.[Description],
CostAmount = p.Numeric01,
SaleAmount = p.Numeric02,
ExtendedCostAmount = p.Numeric03,
ExtendedSaleAmount = p.Numeric04,
Quantity = p.Numeric05,
InvoiceDate = p.Date01,
pg.[Group],
pg.SuperGroup
FROM Assets as p
INNER JOIN PartGroups AS pg
ON p.PartNumber1 = pg.PartNumber
WHERE p.Category = 'PART'
AND LEN(p.Text01) > 0
UNION
SELECT p.ID,
p.PartNumber1,
p.PartNumber2,
pg.[Description],
CostAmount = p.Numeric01,
SaleAmount = p.Numeric02,
ExtendedCostAmount = p.Numeric03,
ExtendedSaleAmount = p.Numeric04,
Quantity = p.Numeric05,
InvoiceDate = p.Date01,
pg.[Group],
pg.SuperGroup
FROM Assets as p
INNER JOIN PartGroups AS pg
ON p.PartNumber2 = pg.PartNumber
WHERE p.Category = 'PART'
AND LEN(p.Text01) > 0;
Running this in SSMS with "Include Actual Execution Plan" enabled may suggest further improvements, such as additional indexes.
Upvotes: 0
Reputation: 51711
Another idea is to create a Reporting Data service.
This runs over the existing data, populating a separate database with a structure optimized for reporting. Your reports then run over this, which should be orders of magnitude faster, given that all the heavy lifting was done by the reporting service.
One drawback is that the new data will be "stale", depending on how quickly changes propagate into the reporting database. Lots of businesses won't notice or care that data is 60 or 90 seconds out of date, but it obviously depends on your own use case.
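As a sketch only: assuming the live rows carry a LastUpdated column and the service remembers when it last ran (all names here are assumptions, not anything from the question), an incremental MERGE can keep the reporting copy near-live without full rebuilds:

```sql
-- Hypothetical incremental refresh; databases, tables and columns are assumed.
DECLARE @LastRunTime datetime = '2013-06-01T00:00:00'; -- read from a job-state table in practice

MERGE Reporting.dbo.PartSales AS tgt
USING (
    SELECT ID, Numeric02 AS SaleAmount, Date01 AS InvoiceDate
    FROM Live.dbo.Assets
    WHERE Category = 'PART' AND LastUpdated > @LastRunTime
) AS src
    ON tgt.ID = src.ID
WHEN MATCHED THEN
    UPDATE SET SaleAmount = src.SaleAmount, InvoiceDate = src.InvoiceDate
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ID, SaleAmount, InvoiceDate)
    VALUES (src.ID, src.SaleAmount, src.InvoiceDate);
```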
Upvotes: 1
Reputation: 452
You may wish to try a few things to see which, if any, will work. Always note that with any change you need to test, test, and test some more to see if it is improving performance on your systems.
A predicate like
and len(p.Text01) > 0
will prevent the use of an index seek. Try AND p.Text01 IS NOT NULL AND p.Text01 != '' instead.
You can also index the temporary tables:
CREATE UNIQUE CLUSTERED INDEX IX_partGroups_tmp ON #partGroups (PartNumber)
as well as
CREATE NONCLUSTERED INDEX IX_coreParts_tmp ON #coreParts (PartNumber1, PartNumber2)
Try to keep your data processing within the database, as the data platform is optimized for this. Filter as much as you can and return only what you need to your client.
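To illustrate the first point, the #coreParts population could be rewritten so the optimizer can still consider a seek; the rows returned should be the same, since both LEN and T-SQL string equality ignore trailing spaces:

```sql
-- Sargable rewrite of the original filter (illustrative temp-table name).
SELECT p.ID, p.Text01
INTO #corePartsFiltered
FROM Assets AS p
WHERE p.Category = 'PART'
  AND p.Text01 IS NOT NULL
  AND p.Text01 != '';  -- sargable, unlike LEN(p.Text01) > 0
```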
Upvotes: 2
Reputation: 15151
Instead of doing your reports in an asp page, use a background service:
Create a service which waits for queries (through pipes, sockets, WCF, whatever you like...)
When you need a report, from your asp page ask the service whether that report already exists; if not, tell the service to create it.
Show the user a "Loading" message and poll the service via Ajax to see whether the report is ready.
When the service has "munched" all the data, present your reports/charts through your asp page.
It's always a good idea to separate long-running processes from web pages, both to avoid timeouts and to avoid leaving the user with a page which looks hung (imagine a very nervous user who starts pressing F5 because the page took 2 minutes to render... your server will crash under tons of reports being generated).
Upvotes: 3