Reputation: 25
I was wondering if it is better to add a table column storing the number of X entries associated with the specific ID on a different table or calculate each time the number of entries with a query.
So for example I've got the following tables
Users Table
ID | name | status
------------------
1 | John | 1
2 | Jack | 2
3 | Mary | 1
Posts Table
ID | user_id | text | status
------------------------------
1 | 1 | blabla | 1
2 | 1 | blabla | 1
3 | 2 | blabla | 2
4 | 1 | blabla | 1
5 | 3 | blabla | 1
Is it better to add an extra column on the users table with the number of posts so it will be faster (? and or better?) to show a list of 100 users and number of posts without any extra queries, or else I have to check how many entries with status 1 are associated with user_id X x 100 users for example
Upvotes: 2
Views: 70
Reputation: 191
Yes, selecting a single row based on a User ID will be faster than aggregating rows from the Posts table. At worst the aggregation will have a time complexity of O(n), i.e. the whole table is scanned.
Your proposed solution, a trigger, would definitely speed up this select operation. But there are notable drawbacks. The trigger will essentially add another update operation to any insert/update on the posts table. So UPDATE or INSERT operations on the Posts table would be slowed down.
Honestly, these kind of performance questions only start to matter with large datasets. I suspect that whatever you are working on is not so big that performance is an issue here. If I am correct, I advise you take the simplest approach (don't have an extra column to track the post count). If performance ever becomes an issue you can always add optimisations like this later.
Upvotes: 1