Reputation: 75
I'm currently facing a performance issue on my website. The situation can be summarized as follows:
Since statistics are not as important as the core system, and I see SQL Server struggling a lot, I thought it may be good to move these statistics tables somewhere else.
The main question is: What is the best way to handle Stats where updates are predominant? The idea is also to keep only one server.
I had a look at what could be done to improve the situation:
Any relevant ideas for the current situation will be appreciated.
Thank you
Here is more info about one of the statistics tables I have:
CREATE TABLE [dbo].[UserStat](
[Id] [int] IDENTITY(1,1) NOT NULL,
[UserId] [int] NOT NULL,
[Hits] [int] NOT NULL,
[Points] [decimal](38, 6) NOT NULL,
[Date] [datetime] NOT NULL,
[LastHit] [datetime] NOT NULL,
[Ip] [varchar](256) NULL,
)
I update the stat like this:
UPDATE [UserStat] SET Hits = Hits + 1, Points = Points + @Points, LastHit = @Last WHERE UserId = @Id AND [Ip] = @Ip AND [Date] = @Date
If the row for the current user and date does not exist, I create it:
INSERT INTO [UserStat] ([UserId],[Hits],[Points],[Date],[LastHit],[Ip]) VALUES (@UserId,@Hits,@Points,@Date,@LastHit,@Ip)
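A common way to collapse these two statements into a single round trip (a sketch, assuming the same parameters as the statements above) is to attempt the UPDATE first and only fall back to an INSERT when no row was touched:

```sql
-- Try the update first; @@ROWCOUNT tells us whether any row matched.
UPDATE [UserStat]
SET Hits = Hits + 1, Points = Points + @Points, LastHit = @LastHit
WHERE UserId = @UserId AND [Ip] = @Ip AND [Date] = @Date;

-- No row for this user/ip/date yet: create it with an initial hit count.
IF @@ROWCOUNT = 0
    INSERT INTO [UserStat] ([UserId],[Hits],[Points],[Date],[LastHit],[Ip])
    VALUES (@UserId, 1, @Points, @Date, @LastHit, @Ip);
```

This halves the statements executed on the common path (an existing row), since most of your traffic is updates.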
There are two indices. One is for getting and aggregating the stats of a user:
CREATE INDEX [Select_UpdateUserStatNavigation] ON [dbo].[UserStat](
[UserId] ASC,[Date] ASC) INCLUDE ([Id],[Hits],[Points], [LastHit],[Ip])
The server is a VPS. Log and data files are on the same disk. There is no foreign key involved on the table.
Here are all the SELECT queries I found that query this table:
SELECT Points, Hits, [Date] FROM [UserStat] WHERE UserId = @UId AND [Date] >= @date
SELECT Ip FROM [UserStat] WHERE UserId = @UId AND LastHit >= DATEADD(MINUTE,-15,getdate())
SELECT COUNT(Id) FROM [UserStat] WHERE [LastHit] >= DATEADD(MINUTE,-15,getdate())
But I'm not really worried about the SELECTs; it's the number of UPDATEs that concerns me ^^.
Upvotes: 4
Views: 674
Reputation: 1830
RavenDB is extremely easy to get up and running for this scenario. You will get quick writes and potentially quick reads. You also get ACID, or as close as you're going to get. RavenDB is easy to wire up in MVC. Since you have Mongo experience, the concept of a document should not be foreign to you. Use the RavenDB C# client libraries in your MVC app and within a few hours you should be able to make major progress. Just be sure you understand the limitations: by default, queries may be a few milliseconds behind updates, and like many NoSql or CQRS solutions, if you wipe out the cache for the queries, it may take a few minutes to a few hours to fully rebuild.
Upvotes: 0
Reputation: 16167
This is really a very simple and good example of when a NoSql database is appropriate. NoSql was created for "web-scale" applications such as this, where the pace and volume of data simply outstrip the ability of SQL databases to keep up (a big weakness of relational DBMSs).
In fact, regular SQL is not a good option for your scenario. There are several reasons for this, including:
Now, let's look at your particular use-case:
Lots of update transactions, with the occasional insert. Most NoSql database platforms use a Set operation, which updates or inserts as necessary, so there is no need to run two statements every time.
Single primary key. NoSql databases are key-value stores, where your key (in this case, UserId) points to a single record in the database.
Simple stats and indexing. Several NoSql databases offer built-in indexing capability, and some even allow you to do map-reduce on your data to get the detailed stats out. Others do data aggregation automatically and you can write special queries to get the data you need. In this case, your "Stat Id" field is useless and can be dropped (yay, less storage taken!).
Fast and Scalable. This is something you just aren't going to touch with a SQL database. NoSQL was designed for this.
Given the above, your scenario is a textbook example of when to apply a NoSql solution. I can recommend Couchbase, which is an extremely fast in-memory database with disk-based storage (this fulfills your cache and data storage requirement in one shot). You might also consider Elasticsearch for your statistics store as it can perform some very good data aggregations out of the box. Whichever NoSql solution you pick, you will get flexible scalability and ease of maintenance. I daresay your days of being a full-time DBA will be over.
Upvotes: 2
Reputation: 633
Can you confirm that ID is your primary key? If so, then that will be fine as it is a monotonically increasing value and good for Inserts. I think your other index (for the purposes of the updates) should be
CREATE INDEX [Select_UpdateUserStatNavigation] ON [dbo].[UserStat](
[UserId] ASC, [Ip] ASC, [Date] ASC)
Make sure the columns are ordered from the most selective to the least selective in the index. This should speed up the updates in the sense that rows can be located quicker. We can look at indexes for the SELECT later.
By VPS, do you mean it is a virtual server? I would take a look at your IO stats to check that IO is not a bottleneck. How much memory is allocated to SQL? That could be another issue. Insufficient memory could result in paging to disk - the slowest part of your IO subsystem.
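One way to check whether IO is the bottleneck (a sketch using the standard SQL Server DMV; what counts as "high" stall time is workload-dependent) is to look at the cumulative stall time per database file:

```sql
-- Cumulative IO stalls per database file since the last restart.
-- High stall totals relative to the operation counts suggest the
-- disk is the bottleneck (e.g. log and data contending on one spindle).
SELECT DB_NAME(vfs.database_id) AS database_name,
       vfs.file_id,
       vfs.num_of_reads,  vfs.io_stall_read_ms,
       vfs.num_of_writes, vfs.io_stall_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
ORDER BY vfs.io_stall_read_ms + vfs.io_stall_write_ms DESC;
```

If the log file of your stats database tops this list, the write-heavy UPDATE workload is the likely cause.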
I would take a look at splitting your log and data disks onto separate disks if possible. Having them on the same disk can cause disk contention - again on the slowest part of your IO subsystem.
Can you post the select queries that are used? I can then give suggested indexes if required.
In addition, you may want to replace your separate insert and update procs with a MERGE such as below.
MERGE UserStat AS TargetTable
USING (SELECT @UserId UserId, @Hits Hits, @Points Points, @Date [Date], @LastHit LastHit, @Ip Ip) AS SourceData
ON SourceData.UserId = TargetTable.UserId
AND SourceData.Ip = TargetTable.Ip
AND SourceData.[Date] = TargetTable.[Date]
WHEN MATCHED THEN UPDATE SET Hits = TargetTable.Hits + 1, Points = TargetTable.Points + SourceData.Points, LastHit = SourceData.LastHit
WHEN NOT MATCHED THEN INSERT (UserId, Hits, Points, [Date], LastHit, Ip)
VALUES (SourceData.UserId, SourceData.Hits, SourceData.Points, SourceData.[Date], SourceData.LastHit, SourceData.Ip);
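One caveat worth adding (my note, not part of the original answer): under concurrent sessions a plain MERGE upsert can still race - two sessions can both fail to match the same key and both attempt the insert. A common mitigation is a HOLDLOCK (SERIALIZABLE) hint on the target, sketched here on the same statement:

```sql
-- HOLDLOCK holds the key-range lock between the match check and the
-- insert/update, so two concurrent sessions cannot both insert the
-- same (UserId, Ip, Date) key.
MERGE UserStat WITH (HOLDLOCK) AS TargetTable
USING (SELECT @UserId UserId, @Points Points, @Date [Date], @LastHit LastHit, @Ip Ip) AS SourceData
ON SourceData.UserId = TargetTable.UserId
   AND SourceData.Ip = TargetTable.Ip
   AND SourceData.[Date] = TargetTable.[Date]
WHEN MATCHED THEN
    UPDATE SET Hits = TargetTable.Hits + 1,
               Points = TargetTable.Points + SourceData.Points,
               LastHit = SourceData.LastHit
WHEN NOT MATCHED THEN
    INSERT (UserId, Hits, Points, [Date], LastHit, Ip)
    VALUES (SourceData.UserId, 1, SourceData.Points, SourceData.[Date], SourceData.LastHit, SourceData.Ip);
```

Without the hint, the race surfaces as intermittent primary-key or unique-constraint violations under load.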
Upvotes: 1
Reputation: 6568
Before getting into the main problem, some changes should happen first:
You should migrate from Express edition to Enterprise edition, or at least Standard edition (Express vs others).
As you have tons of updates on your data, you should disable your indexes (if there are any).
Try to resize your table's columns so each record can be stored in fewer pages; this will help your update process speed up. For example, if you have a table with 20 columns and you always update the same 5 known columns, separate those 5 columns from the other 15. This may let you arrange your data in fewer pages, and when you have fewer pages, you can find your records faster. Of course this is not based on normal forms, but it can help your performance.
Take a look at your available memory and CPU; these two are cornerstones of performance.
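The column-splitting suggestion above could look something like this for the UserStat table from the question (a hypothetical sketch; the table names and the choice of which columns are "hot" are illustrative, not prescribed):

```sql
-- Hot table: only the columns touched by every update, so rows are
-- narrow and more of them fit per 8 KB page.
CREATE TABLE [dbo].[UserStatHot](
    [Id]      int            NOT NULL PRIMARY KEY,
    [Hits]    int            NOT NULL,
    [Points]  decimal(38, 6) NOT NULL,
    [LastHit] datetime       NOT NULL
);

-- Cold table: the rarely-updated columns, joined back by [Id] only
-- when the reporting queries need them.
CREATE TABLE [dbo].[UserStatCold](
    [Id]     int          NOT NULL PRIMARY KEY,
    [UserId] int          NOT NULL,
    [Date]   datetime     NOT NULL,
    [Ip]     varchar(256) NULL
);
```

The trade-off is that lookups filtering on UserId/Ip/Date now need a join (or a duplicate of those columns), so this only pays off when the update path dominates, as it does here.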
And about your main problem: I need to know some more about your stats, the table, and its usage. Do you mean SQL Server statistics, or something else?
Upvotes: -1