Reputation: 655
I have two tables named author
and commit_metrics
. Both of them have an id
field. Author has author_name
and author_email
. Commit_metrics has author_id
and author_date
.
I am trying to write a query that will get the number of commits that each author had in a given week, even if that number is 0. Here's what I have so far:
SELECT a.id, a.author_name, a.author_email, c.week_num, COUNT(c.id)
FROM author AS a
CROSS JOIN generate_series(1, 610) AS s(n)
LEFT JOIN (SELECT c.id,
c.author_id,
c.author_date,
WEEK_NUMBER(c.author_date) AS week_num
FROM commit_metrics c) AS c ON s.n = c.week_num AND a.id = c.author_id
WHERE c.week_num IS NOT NULL
GROUP BY a.id, a.author_name, a.author_email, c.week_num
ORDER BY c.week_num DESC, a.author_name;
WEEK_NUMBER
is a function I wrote for this query:
CREATE OR REPLACE FUNCTION WEEK_NUMBER(date TIMESTAMP) RETURNS INTEGER AS
$$
SELECT TRUNC(DATE_PART('day', date - '2008-01-01') / 7)::INTEGER;
$$ LANGUAGE SQL;
Currently, the query works like a charm with one major caveat. It doesn't properly calculate 0 when the author made no commits in a given week. I'm not sure why it doesn't. When I do the query with just the FROM
and CROSS JOIN
, it properly prints the many thousand combined authors/weeks. However, when I add the LEFT JOIN
, it loses any week where the author did not make a commit.
Any help would be greatly appreciated. I'm open to doing away with the generate_series
call if it's unnecessary.
Also, I found this post, but I don't think it's helpful for my case.
Upvotes: 1
Views: 4503
Reputation: 1270411
Remove the filtering condition. Also a subquery is not needed and you want to select s.n
instead of c.week_num
:
SELECT a.id, a.author_name, a.author_email, s.n as week_num, COUNT(c.id)
FROM author a CROSS JOIN
generate_series(1, 610) AS s(n) LEFT JOIN
commit_metrics c
ON s.n = WEEK_NUMBER(c.author_date) AND a.id = c.author_id
GROUP BY a.id, a.author_name, a.author_email, c.week_num
ORDER BY c.week_num DESC, a.author_name;
Upvotes: 0
Reputation: 281
Although you are using a left join, "WHERE c.week_num IS NOT NULL" filters out all of the cases where there is no post. Try this:
SELECT a.id, a.author_name, a.author_email, s.n as week_num, COUNT(c.id) as post_count
FROM author AS a
CROSS JOIN generate_series(1, 610) AS s(n)
LEFT JOIN (SELECT c.id,
c.author_id,
c.author_date,
WEEK_NUMBER(c.author_date) AS week_num
FROM commit_metrics c) AS c ON s.n = c.week_num AND a.id = c.author_id
GROUP BY a.id, a.author_name, a.author_email, s.n
ORDER BY s.n DESC, a.author_name;
Upvotes: 3
Reputation: 116
Your WHERE
clause is excluding the records on commit_metrics
that are null, which is the case when the author has no commits during the week selected. You should just remove this from the WHERE
clause to get your desired output.
If you need the WHERE
clause to eliminate some of the CROSS JOIN
records based on your data, you will need that CROSS JOIN
and WHERE
to be in a sub-select that you LEFT JOIN
to, or create some more complicated logic in the current WHERE
clause.
Upvotes: 1