Reputation: 524
I have an application that displays posts, and for each post the users can say if they like the post or not. For each post I should display how many users likes it and how many dont like it. Suppose I have these tables:
CREATE TABLE [dbo].[Post](
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[Quotation] [text] NOT NULL,
CONSTRAINT [PK_Post] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
CREATE TABLE [dbo].[UserPostAction]( --Every action of the user (like or dislike) is recorded to this table
[PostId] [bigint] NOT NULL,
[UserId] [bigint] NOT NULL,
[ActionValue] [int] NOT NULL, --Like / Dislike
CONSTRAINT [PK_UserPostAction] PRIMARY KEY CLUSTERED
(
[PostId] ASC,
[UserId] ASC,
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
What is the best way, performance wise, to have a select statement that returns a list of posts which contains the like/dislike statistics already in:
A) Add 2 more columns to Post table: TotalLike and TotalDislike, and when inserting a new record to UserPostAction I will update these columns? This way, when selecting posts from Post table I will have the statistics already calculated.
B) Add a third table: PostStat ([PostId], [TotalLike], [TotalDislike]) and update the table in a batch process. The select statement will use Inner Join on Post.Id = PostStat.PostId to retrieve the statistics.
C) Any other way you can think of.
Hopefully, I will have a DB with millions of posts, and the Post table will be queried very often by a lot of users. Keep in mind that the statistics is updated very frequently, because a lot of users will like or dislike a lot of posts. It means (maybe) performing a lot of updates and table locking, which will interfere with the select statement that returns posts to be displayed.
Any idea is welcome.
Upvotes: 1
Views: 145
Reputation: 4653
I faced a similar situation and I choose the strategy A:
Knowing that writes don't block reads (in fact they will just read the previous value until the transaction is completed) I doubt you'll have locking issues putting the columns on the main table.
Upvotes: 1
Reputation: 36126
I will have a DB with millions of posts
don't think this is enough reason to maintain duplicate data on your DB.
You already have all the info you need on your current tables. Its a matter of building the right indexes and you can easily count how many users liked\unliked the post.
If you are really unconformable with this approach, I suggest implementing option A with triggers on the UserPostAction table to maintain your 2 new columns.
Upvotes: 1