applechief
applechief

Reputation: 6895

Mysql count records grouped by day in multiple tables

My database has news articles and blog posts. The primary key for both is an ItemID that is unique across both tables.

The articles are in a table that has the following fields

The blogposts table has the following fields

both tables have extra fields unique to them.

I have a third table that holds meta information about articles and posts.

The items table has the following fields

every blogpost and article has a record in the items table and a record in its respective table.

What I am trying to do is build a query that will count the number of items posted per day. I can do it for one table using a count grouped by date_posted but how to combine articles and posts count in one query?

Upvotes: 1

Views: 2597

Answers (4)

user359040
user359040

Reputation:

Similar to Dems, but slightly simpler:

select date_posted, count(*)
from (select date_posted from article union all
      select date_posted from blogposts) v
group by date_posted

Upvotes: 3

MatBailie
MatBailie

Reputation: 86735

You can do it two ways.
1. Join everything together and then aggregate (See Tom H's answer).
2. Aggregate each table, UNION them, and aggregate again.

Option 1 may seem shorter, but will mean that you may not benefit from INDEXes on the root tables (As they have to be re-ordered for the JOIN). So I'll show option 2, which is the direction you were headed any way.

SELECT
  date_posted,
  SUM(daily_count) AS daily_count
FROM
  (
   SELECT date_posted, COUNT(*) AS daily_count FROM article   GROUP BY date_posted
   UNION ALL
   SELECT date_posted, COUNT(*) AS daily_count FROM blogposts GROUP BY date_posted
  )
  AS combined
GROUP BY
  date_posted

This should be fastest, provided that you have an index on each table where date_posted is the first field in the index. Other-wise the tables will still need to be re-ordered for the aggregation.

Upvotes: 3

Tom H
Tom H

Reputation: 47464

I would have used a different table design for this, with types and subtypes. Your Items table has a single column primary key and your Blog_Posts and Articles tables' primary keys are the same ID with a foreign key to the Items table. That would make something like this pretty easy to do and also helps to ensure data integrity.

With your existing design, your best bet is probably something like this:

SELECT
    I.item_id,
    I.source_id,
    COALESCE(A.date_posted, B.date_posted) AS date_posted,
    COUNT(*) AS date_count
FROM
    Items I
LEFT OUTER JOIN Articles A ON
    A.item_id = I.item_id AND
    I.source_id = 'A'  -- Or whatever the Articles ID is
LEFT OUTER JOIN Blog_Posts B ON
    B.item_id = I.item_id AND
    I.source_id = 'B'  -- Or whatever the Blog_Posts ID is
GROUP BY
    I.item_id,
    I.source_id,
    COALESCE(A.date_posted, B.date_posted)

You could also try using a UNION:

SELECT
    SQ.item_id,
    SQ.source_id,
    SQ.date_posted,
    COUNT(*) AS date_count
FROM
    (
        SELECT I1.item_id, I1.source_id, A.date_posted
        FROM Items I1
        INNER JOIN Articles A ON A.item_id = I1.item_id
        WHERE I1.source_id = 'A'
        UNION ALL
        SELECT I2.item_id, I2.source_id, B.date_posted
        FROM Items I2
        INNER JOIN Articles B ON B.item_id = I2.item_id
        WHERE I2.source_id = 'B'
    )

Upvotes: 1

Dan
Dan

Reputation: 11069

select item_id, date_posted from blogposts where /* some conditions */
union all select item_id, date_posted from articles where /* some conditions */

You'll probably need to put that into a subquery, and if you so desire, join it with other tables, when running the group by. But the main point is that union is the operator you use to combine like data from different tables. union all tells the database that you don't need it to combine duplicate records, since you know that the two tables will never share an item_id, so it's a little faster (probably).

Upvotes: 0

Related Questions