Reputation: 153
I have created a table in SQL Server 2008 R2 as follows:
CREATE TABLE [dbo].[7And11SidedDiceGame]
(
[Dice11Sides] [INT] NULL,
[Dice7Sides] [INT] NULL,
[WhoWon] [INT] NULL
)
I added the following index:
CREATE NONCLUSTERED INDEX [idxWhoWon]
ON [dbo].[7And11SidedDiceGame] ([WhoWon] ASC)
I then created a WHILE loop to Insert 25 million RANDomly generated rows to tally the results for statistical analysis.
Once I optimized the Insert function (Using BEGIN TRAN and COMMIT TRAN before and after the loop) the While loop ran decent. However, analyzing the data takes a long time. For example: using the following statement takes about 4 minutes to perform:
DECLARE @TotalRows real
SELECT @TotalRows = COUNT(*)
FROM [test].[dbo].[7And11SidedDiceGame]
PRINT REPLACE(CONVERT(VARCHAR, CAST(@TotalRows AS money), 1),'.00','')
SELECT
WhoWon, COUNT(WhoWon) AS Total,
((COUNT(WhoWon) * 100) / @TotalRows) AS PercentWinner
FROM
[test].[dbo].[7And11SidedDiceGame]
GROUP BY
WhoWon
My question is how can I better index the table to speed up retrieval of the data? Or do I need to approach the pulling of the data in a different manner?
Upvotes: 2
Views: 1524
Reputation: 1269773
You may be able to speed this by using window functions:
SELECT WhoWon, count(*) AS Total,
count(*) * 100.0 / sum(count(*)) over () as PercentWinner
FROM [test].[dbo].[7And11SidedDiceGame]
GROUP BY WhoWon;
This does not provide the separate print
statement.
For performance, try an index on (WhoWon)
.
Upvotes: 2
Reputation: 32695
I don't think you can do much here.
The query has to read all 25M rows from the index to count them. Though, 25M rows is not that much and I'd expect it to take less than 4 minutes on a modern hardware. It is only 100MB of data to read (OK, in practice it is more, say, 200MB, still it should not take 4 minutes to read 200MB off the disk).
Is the server under a heavy load? Are there a lot of inserts into this table?
You could make a small improvement by defining WhoWon
column as NOT NULL
in the table. Do you really have NULL values in it?
And then use COUNT(*)
instead of count(WhoWon)
in the query.
If this query runs often, but the data in the table doesn't change too often, you can create an indexed view that would essentially materialise/cache/pre-calculate these Counts, so the query that would run off such view would be much faster.
Upvotes: 4