Kotzilla
Kotzilla

Reputation: 1413

SQL : keep count in row or select count from db

example : i have 2 tables
- Categories
- Posts

is it a good way to keep post number in categories like this

categories

 id |  title   | posts
----+----------+--------
 1  | golf     |  50
----+----------+-------
 2  | soccer   |  90
----+----------+-------

posts

 id |  title   | category_id
----+----------+--------------
 1  | news 1   |  1
----+----------+--------------
 2  | news 2   |  2
----+----------+--------------
 3  | news 3   |  1
----+----------+--------------

or i use select count() in queries like this

SELECT c.id,
       c.title,
       count(p.id)
FROM `categories` c
INNER JOIN `posts` p ON c.id=p.category_id
GROUP BY c.id

but the problem is when i keep count in categories table when post change category i have to update posts field in categories table too. it's no problem in small project but for big project what is the good way to deal with count because i concern about database performance

thanks for all answers

Upvotes: 3

Views: 2472

Answers (5)

Moutasem Shahin
Moutasem Shahin

Reputation: 242

I Guess you have to use JOIN between the two tables and GROUP BY for the posts

Upvotes: 1

pbhd
pbhd

Reputation: 4467

This, in general depends on your use-case.

From the pure view, of course, you should not introduce redundancy, so your proposed query would be the way to go. However, you might hit some performance problems.

A second approach would be to have a trigger on the posts-table which maintains the posts-counter in categories, but this might also impact performance if there are lots of inserts/deletes in the posts-table.

Another approach would be to have some dirty-flag, which if set causes an update to the categories-table.

So how to proceed? First try out the pure and clean thing, if that hits performance issues, analyze your usage-profile and act depending on that.

Upvotes: 1

Paul Alan Taylor
Paul Alan Taylor

Reputation: 10680

As you rightly point out, you know that you'll introduce a maintenance issue if you store the count separately.

Ideally, you should be determining counts dynamically. With the proper indexes in place this shouldn't be a huge ask for most systems.

However, there are cases where a pre-calculated count makes a lot of sense. Consider a system that refreshes its data on a chronological cycle. Apart from the refresh activity, nothing else inserts information into the system. This sort of system would be perfect for pre-calculated counts.

Looking at your issue specifically, you don't seem to have that option. This looks to be something rather "bloggy", and as a consequence, is something where the counts could change all the time.

If it were me, I'd start out the dynamic count route, horse-trading down to pre-calculated if and when dynamic becomes a problem.

Upvotes: 0

Bhavik Shah
Bhavik Shah

Reputation: 2291

Dr. Dan's comment is correct. It is indeed a good idea to store the count of posts in categories, but remember that if you do that "You must also have triggers to increment and decrement the count when new post is inserted or existing post is deleted" to maintain the integrity.

Upvotes: 1

Brian Hooper
Brian Hooper

Reputation: 22054

My personal preference would be not to keep duplicated data in any table, until it has been proven necessary. If you are averse to writing JOIN queries, you could define a view that contains the query and you can then forget about it.

I have found in the past that proper indexes usually mean there isn't too much of a performance problem with this.

If you find it necessary to keep a count summary your categories table (for performance or other reasons), consider creating INSERT, UPDATE and DELETE triggers on your posts table so that updates can be done by the database rather than relying on the application programmers to remember what has to be done.

Upvotes: 2

Related Questions