Jason
Jason

Reputation: 17956

Multiple counts on different tables in same query

We have a tool that allows users to create their own groups. Within these groups, users can write posts. What I am trying to determine is the relationship between size of the group and total number of posts in that group.

I can do SQL statements to get a list of group names and the number of users in that group (Query 1) and a list of group names and the number of posts (Query 2) but I would like for both to be in the same query.

Query 1

select count(pg.personID) as GroupSize, g.GroupName
from Group g inner join PersonGroup pg g.GroupID = pg.GroupID
where LastViewed between @startDate and @enddate and
    g.Type = 0
group by g.GroupID, g.GroupName
order by GroupSize

Query 2

select count(gp.PostID) as TotalPosts, g.GroupName
from Group g inner join GroupPost gp on g.GroupID = gp.GroupID
    inner join Post p on gp.PostID = p.PostID
where g.Type = 0 and
    gp.Created between @startDate and @enddate
group by g.GroupID, g.GroupName
order by TotalPosts

**Note: A person can post the same "post" to multiple groups

I believe from this data I could build a Histogram (# of groups with 10-20 users, 21-30 users, etc..) and incorporate average number of posts for groups in those different bins.

Upvotes: 0

Views: 1939

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269445

Paul's solution assumes that the two sets of groups (by posts and by users) is the same. This may not be true, so either a full outer join or union all is needed.

My preference is the following:

with groups as 
(
   select *
   from Group g
   where g.Type = 0 
     and g.LastViewed between @startDate and @enddate
)
select GroupId, GroupName, SUM(GroupSize) as GroupSize, SUM(TotalPosts) as TotalPosts)
from 
(
  (select groups.GroupId, groups.GroupName, 1 as GroupSize, 0 as TotalPosts
   from groups 
   join PersonGroup pg 
     on pg.GroupId = groups.groupId
   ) 
   union all
   (select groups.GroupId, groups.GroupName, 0 as GroupSize, 1 as TotalPosts
    from groups 
    join GroupPost gp
      on groups.GroupId = gp.GroupId 
    join Post p
      on gp.PostId = p.PostId
    )
)
group by GroupId, GroupName

The "with" clause defines the set of groups that you are using. This places the definition in one place, making it obvious that the two subqueries have the same filtering. The two subqueries simply have flags indicating each of the two variables, which are then aggregated at the higher level. Sometimes it is more efficient to also do the aggregation inside the subqueries, particularly when there are indexes.

Upvotes: 0

Paul Grimshaw
Paul Grimshaw

Reputation: 21014

A simple solution would be to use those queries as Sub queries, and combine them:

SELECT 
    grps.GroupName,
    grps.GroupSize,
    psts.TotalPosts
FROM (
    select count(pg.personID) as GroupSize, g.GroupName, g.GroupID
    from Group g inner join PersonGroup pg g.GroupID = pg.GroupID
    where LastViewed between @startDate and @enddate and
        g.Type = 0
    group by g.GroupID, g.GroupName
    order by GroupSize) grps
JOIN (
    select count(gp.PostID) as TotalPosts, g.GroupName, g.groupID
    from Group g inner join GroupPost gp on g.GroupID = gp.GroupID
        inner join Post p on gp.PostID = p.PostID
    where g.Type = 0 and
        gp.Created between @startDate and @enddate
    group by g.GroupID, g.GroupName
    order by TotalPosts) psts
ON psts.GroupID = grps.GroupID

Upvotes: 2

Related Questions