Reputation: 27
I’m having trouble with counting zero values via an SQL Query. Though similar questions have been asked on the site, I can’t seem to get it working with my query.
I have a database including certain strings, such as '{TICKER|IBM}' to which I will refer as ticker-strings. The target is to count the amount of ticker-strings per day.
My database table 'tweets' includes the rows 'tweet_id', 'created at' (dd/mm/yyyy hh/mm/ss) and 'processed text'. The ticker-strings, such as '{TICKER|IBM}', are within the 'processed text' row.
Thus for counting, I made the following SQL query.
SELECT COUNT( DISTINCT `tweet_id` ) , DATE( `created_at` )
FROM `tweets`
WHERE `processed_text` LIKE '%{TICKER|IBM}%'
GROUP BY DATE( `tweets`.`created_at` )
This query returns the amount of distinct tweet ids per date, but neglects the dates on which the count equals zero. I understand that this can be blamed on not using a 'LEFT JOIN' in my SQL query, but I have still been unable to create a working SQL query for this problem (due to my limited SQL knowledge).
UPDATE thanks to a commenter: I will create a table with the dates (without hh/mm/ss in it). Could someone explain to me how I should link to this table if it is called 'dates'?
Upvotes: 1
Views: 134
Reputation: 18940
I made a suggestion to create an ALMANAC table in response to a similar question, a few years ago.
Structuring Databases for Financial Statements
An ALMANAC table can be enormously useful in a reporting database, even if it isn't a data mart or a data warehouse.
The best way to generate an ALMANAC table is to write a program, using a regular programming language, or perhaps a procedural extension to SQL. If you do this, there are some attributes like "Company holiday" (yes/no) or "Fiscal Quarter" (FY2014Q2) that you might find worth while to add, depending on your reporting needs. Once you have an ALMANAC, Bamar's response using LEFT JOIN should help you.
Upvotes: 0
Reputation: 780869
You need to left join your date table with the subquery that returns the counts:
SELECT d.date, IFNULL(t.count, 0) AS tweet_count
FROM all_dates AS d
LEFT JOIN (
SELECT COUNT(DISTINCT tweet_id) AS count, DATE(created_at) AS date
FROM tweets
WHERE processed_text LIKE '%{TICKER|IBM}%'
GROUP BY date) AS t
ON d.date = t.date
Also, if tweet_id
is a unique column, you don't need COUNT(DISTINCT tweet_id)
, just use COUNT(*)
.
Upvotes: 1