Reputation: 1021
I'm using SQL Server 2008 R2 on my development machine (not a server box).
I have a table with 12.5 million records. It has 126 columns, half of which are int, and most columns in most rows are NULL. I've also tested with an EAV design, which seems 3-4 times faster at returning the same records (but that means pivoting the data to make it presentable in a table).
I have a website that paginates the data. When the user tries to go to the last page of records (last 25 records), the resulting query is something like this:
select * from (
    select
        A.Id, part_id as PartObjectId,
        Year_formatted 'year', Make_formatted 'Make',
        Model_formatted 'Model',
        row_number() over (order by A.Id) as RowNum
    from vehicles A
) as innerQuery
where innerQuery.RowNum between 775176 and 775200
... but this takes nearly 3 minutes to run, which seems excessive. Is there a better way to structure this query? On the browser front-end I'm using jqGrid to display the data. The user can navigate to the next, previous, first, or last page, and can also filter and order the data (for example: show all records whose Make is "Bugatti").
vehicles.Id is int and is the primary key (clustered ASC). part_id is int, Make and Model are varchar(100) and typically only contain 20 - 30 characters.
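For reference, a minimal DDL sketch of just the columns involved, inferred from the description above (the Year_formatted type is an assumption; the real table has roughly 120 more mostly-NULL columns, about half of them int):

create table vehicles (
    Id int not null primary key clustered,   -- clustered ASC
    part_id int null,
    Year_formatted varchar(100) null,        -- assumption: type not stated
    Make_formatted varchar(100) null,        -- typically only 20-30 characters used
    Model_formatted varchar(100) null
    -- ... remaining ~120 columns omitted
);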
Table vehicles is updated ~100 times per day in individual transactions, and 20 - 30 users use the webpage to view, search, and edit/add vehicles 8 hours/day. It gets read from and updated a lot.
Would it be wise to shard the vehicles table into multiple tables, each containing, say, 3 million records? Would that have much impact on performance?
I see lots of videos and websites talking about people having tables with 100+ million rows that are read from and updated often without issue.
Note that the performance issues I observe are on my own development computer. The database has 16 GB of dedicated RAM. I'm not using an SSD, or even SCSI for that matter, so I know better hardware would help, but 3 minutes to retrieve the last 25 records still seems a bit excessive, no?
Though I'm running these tests on SQL Server 2008 R2, I could also use 2012 if there is much to be gained from doing so.
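For what it's worth, SQL Server 2012 adds OFFSET/FETCH, which expresses the same page without the ROW_NUMBER() wrapper. A minimal sketch against the query above (note the engine still has to count past the 775,175 preceding rows, so this is largely a syntactic convenience):

select A.Id, part_id as PartObjectId,
    Year_formatted 'year', Make_formatted 'Make',
    Model_formatted 'Model'
from vehicles A
order by A.Id
offset 775175 rows fetch next 25 rows only;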
Upvotes: 0
Views: 410
Reputation: 146603
Yes, there is a better way, even on older releases of MS SQL Server, but it is involved. First, this process should be done in a stored procedure. The stored procedure should take, as two of its input parameters, the requested page (@page) and the page size in records per page (@pgSiz).
In the stored procedure, create a table variable and put into it a sorted list of the integer primary keys for all the records, with a rowNo column that is itself an indexed integer primary key for the table variable:
DECLARE @PKs TABLE
    (rowNo int IDENTITY(1,1) NOT NULL PRIMARY KEY,
     vehicleId int NOT NULL)

INSERT @PKs (vehicleId)
SELECT vehicleId
FROM Vehicles
ORDER BY -- [the sort criteria you want pages ordered by goes here]
         -- [try to use only columns that are covered by an index]
Then, based on which page and page size (@page, @pgSiz) the user requested, the stored procedure selects the actual data for that page by joining to this table variable:
SELECT [the data columns you want]
FROM @PKs p
JOIN Vehicles v ON v.vehicleId = p.vehicleId
WHERE p.rowNo BETWEEN @page * @pgSiz + 1 AND (@page + 1) * @pgSiz
ORDER BY p.rowNo  -- if you want the page of records sorted on the server
This assumes @page is 0-based. The stored procedure will also need some input-argument validation to ensure that the @page and @pgSiz values are reasonable (i.e., they do not page past the end of the records).
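Putting those pieces together, a minimal sketch of the whole procedure, assuming a 0-based @page (the procedure name dbo.GetVehiclePage and the exact validation rules are illustrative, not prescriptive):

CREATE PROCEDURE dbo.GetVehiclePage   -- hypothetical name
    @page  int,   -- 0-based page number
    @pgSiz int    -- records per page
AS
BEGIN
    SET NOCOUNT ON;

    -- Validate the inputs so the query never pages past the end of the data.
    IF @page < 0 OR @pgSiz < 1
    BEGIN
        RAISERROR('@page must be >= 0 and @pgSiz must be >= 1.', 16, 1);
        RETURN;
    END;

    -- Step 1: build the sorted key list, numbered in page order.
    DECLARE @PKs TABLE
        (rowNo int IDENTITY(1,1) NOT NULL PRIMARY KEY,
         vehicleId int NOT NULL);

    INSERT @PKs (vehicleId)
    SELECT vehicleId
    FROM Vehicles
    ORDER BY vehicleId;  -- substitute the sort criteria you want pages in

    -- Step 2: fetch only the requested page by joining back to the base table.
    SELECT v.*            -- the data columns you want
    FROM @PKs p
    JOIN Vehicles v ON v.vehicleId = p.vehicleId
    WHERE p.rowNo BETWEEN @page * @pgSiz + 1 AND (@page + 1) * @pgSiz
    ORDER BY p.rowNo;
END;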
Upvotes: 1