Reputation: 1629
I need to develop a voting system for posts. Every user can like a post by clicking on thumb icon and love a post by clicking on hearth icon, in the future maybe I have a star system as alternatives.
I'm looking for better solutions to implement this system, for now, I've thought this two way:
tbl like
id
post_id
user_id
value // value even 1
tbl love
id
post_id
user_id
value // value even 1
tbl star
id
post_id
user_id
value // value between 1 and 5
tbl rate
id
post_id
user_id
type // 'like' or 'love' or 'star'
value // value between 1 and 5
Which is the better solution?
And other thing, when I query multiple posts isn't good SELECT and sum (or avg) every vote, maybe is good to save in posts table the sum (or avg) of current like/love/star to speed up the query, and then update the posts on every new like/love/star?
The last thing, to let the user see if he has already voted some posts, it's a good solution a query like this?
SELECT post_id FROM rate WHERE user_id = <MYUSERID> AND post_id IN (<ARRAYOFPOSTS>)
and then compare the retrieved post_id with the post_id of the main query?
Thank you!
Upvotes: 3
Views: 416
Reputation: 13110
From your original question
…in the future maybe I have a star system as alternatives (sic)
Why are you attempting to solve a problem you don't have yet? Presumably you don't want to run both a reaction system (like|love) and a rating system (star) side by side
I'd also question the value of what a love reaction gives you, and the user, over and above a like. I know facebook does it, but they have a huge amount of data and another 4 reactions (sad|angry|wow|haha)
Honestly, I'd forget about everything except the likes and just start with (from @PaulSpiegel's answer)
user_post_like (
user_id (FK, PK)
post_id (FK, PK)
)
Then move on!
You can always come back and add to this system, change it, migrate the data if the system requires it
Don't give yourself decision fatigue trying to build a super generic system that solves everything
Upvotes: 2
Reputation: 105
My suggestion is kind of method 2.
You can combine rate_type and rate_value in a table. Create all posibilities in for rate, love and star (1 to 5) This is an advantage for future cause, it has less data in tbl rate when record count increased.
I don' t offer 'count based' tables. That kind of tables has lock posibilities in high traffic.
Disadvantage for this combined rate_type style is, it will be a little harder to manage in code.
My proposal;
tbl_rate_type
id tinyint
name nvarchar(5) // if you want
tbl_rate_type_kind
id tinyint,
rate_type_id tinyint,
rate_value tinyint,
name nvarchar(5), // if you want
image_id smallint // Advantage for viewing specific images simply
tbl rate
id bigint
post_id bigint
user_id int
rate_type_kind_id tinyint // numeric value is better for indexing.
Upvotes: 1
Reputation: 407
As Eric mentioned, having multiple tables is not usually a very good idea performance wise.
For each table MySQL is required to maintain internal data structures, a data dictionary, file descriptions and so on...
Go with solution 2.
This would be the best solution in terms of both performance and scalability of your application for the future. Instead of having to support a completely new table when expanding votes, you simply need to allow your application to show a new type of vote in the frontend.
When fetching statistics for the given article, you could as you say, save this every time a user votes to not have to fetch the full count each time, however the performance gain won't really be noticeable at all, instead make some good indexes to speed things up once it's needed.
If you wish to show the user if he has voted for a given post I would recommend simply checking the rate
table you describe in solution 2. If you wish to limit amount of queries from your application you could as you say fetch all posts on the page, then query on those. However the performance gain here is also minimal I would say.
Some good indexes would solve most of your performance concerns for the foreseeable future. Usually the only time you'll need to split the data is when the whole database can't handle all the traffic. This article from the Airbnb team gives an idea of when that might be the case.
A small tip would be to not overthink the application. Usually it's better to start out small and do things the easy way. This way you won't introduce more issues than necessary. Once the application then grows, the issues will come naturally and you'll solve them as they come.
Upvotes: 4
Reputation: 1629
I'm looking for a solutions that match all the suggestions that I've received. I'll try to develop one single column with leverage of the composite index. This is the solutions that I want to try:
Table: rating
----------
user_id : bigint
post_id : bigint
rate_type : varchar (like/love/star)
rate : tinyint (0-5) // default 1 (for like/love)
----------
key: INDEX(user_id, post_id)
no PRIMARY // I don't care about duplicate key that may exist (same post like/love)
In this test table I've added 1000 rows and tried to query:
SELECT * FROM rating WHERE user_id = 21; //EXPLAIN KEY:rate ROWS:4
SELECT * FROM rating WHERE user_id = 21 AND rate_type = 'like'; //EXPLAIN KEY:rate ROWS:4
SELECT * FROM rating WHERE user_id = 21 AND rate_type = 'like'; //EXPLAIN KEY:rate ROWS:4
SELECT * FROM rating WHERE user_id = 21 AND post_id = 179 AND rate_type = 'like'; //EXPLAIN KEY:rate ROWS:1
What do you think? Is this a good solutions? Use a simple INDEX and not a PRIMARY may have consequences? I can't use PRIMARY because may happen that I have this three rows:
user_id: 21 - post_id: 173 - rate_type: 'like'
user_id: 21 - post_id: 173 - rate_type: 'love'
user_id: 21 - post_id: 173 - rate_type: 'star'
Upvotes: 0
Reputation: 142373
Option 3:
user_id ... NOT NULL,
post_id ... NOT NULL,
like TINYINT NULL, -- NULL means no action on LIKEing
love TINYINT NULL,
rate TINYINT NULL,
PRIMARY KEY(user_id, post_id),
INDEX(post_id, user_id)
Or combine like
and love
in some way.
But the real test of how to design the schema comes when you write the queries to set/change/query/summarize the values.
Upvotes: 1
Reputation: 31812
Your second approach is a "simple" implementation of the Entity–attribute–value model. But there is nothing really simple about EAV in practice. Usually EAV is used for user defined attributes. This is not the case here. And I don't see any other good reason in your post. "I don't want to create a new table or column every time I implement a new feature" is a bad one.
There is also no general rule, if one table is better than multiple tables. The pure number of tables says nothing about the quality of a database design. You need to analyse the requirements, the entities and relations. And that brings me to the next point..
Your first two tables are not even entities. "User likes a post" and "User loves a post" are pure relations. The value
column makes no sense, since it can only contain 1
, and thus holds no information. So your tables should actually look like:
user_post_likes (
user_id (FK, PK)
post_id (FK, PK)
)
user_post_loves (
user_id (FK, PK)
post_id (FK, PK)
)
user_post_ratings (
user_id (FK, PK)
post_id (FK, PK)
rating [1-5]
)
The first two tables are clearly different from the third one. Merging them into one table would be a strange idea. And the first two tables should neither be merged, if they are actually two different relations, even when they have the same signature. (It would be similar to merging two functions sum(x,y)
and diff(x,y)
into calc(operator, x, y)
.)
The next point depends on the requirements. I doubt that you ever want a post to be liked and loved at the same time from the same user. If likes and loves are exclusive, then it should be one entity. The table could be something like
user_post_flags (
user_id (FK, PK)
post_id (FK, PK)
flag [like|love]
)
(Sorry - finding good identifiers is a hard task ;-))
You can code 'like' and 'love' with 1
and 2
(or whatever you want). And while it now looks like the ratings table, they are still two different things, and should have separate tables.
Regarding performance: This highly depends on the task/requirement. But I never heard of EAV model being chosen for performance reasons. (Except of myself, but that doesn't count :-).)
Upvotes: 2