Reputation: 914
I'm running some crontabs
which trigger R-Scripts
where I load Google Analytics Data
for a specific time interval. Usually its the interval:
Today - 1 to Today - 14 days
which corresponds to the following statement:
subset(mydata, date >= Sys.Date()-14 & date <= Sys.Date()-1)
I would like to add some MySQL-Query
to that R-Script
in order to get some data, which uses the same time interval. My tables have the following form:
`pictures` `music` `likes`
id date_of_upload id pictures_id id pictures_id
1 2012-01-16 50 1283 287 12
2 2012-02-17 25 736 2366 39
... ... ... ... ... ...
6000 2016-01-23
My query has the following form where I would like to meet the upper time interval:
SELECT
COUNT(p.id) AS pictures,
COUNT(m.id) AS songs,
COUNT(l.id) AS likes,
CAST(p.date_of_upload AS DATE) AS Posted
FROM pictures p
LEFT JOIN
music m ON p.id = m.pictures_id
LEFT JOIN
likes l ON p.id = l.pictures_id
WHERE p.date_of_upload > DATE_ADD(CURRENT_DATE(), INTERVAL - 14 DAY)
But that doesn't seem to be the right implementation for the time interval.
The required output may look as following:
posted songs likes picture
2016-01-23 20 30 3
2016-01-22 10 8 1
2016-01-21
...
2016-01-07
Upvotes: 0
Views: 431
Reputation: 1269503
I think the simplest solution is to use COUNT(DISTINCT)
:
SELECT COUNT(DISTINCT p.id) AS pictures,
COUNT(DISTINCT m.id) AS songs,
COUNT(DISTINCT l.id) AS likes,
CAST(p.date_of_upload AS DATE) AS Posted
FROM pictures p LEFT JOIN
music m
ON p.id = m.pictures_id LEFT JOIN
likes l
ON p.id = l.pictures_id
WHERE p.date_of_upload > DATE_ADD(CURRENT_DATE(), INTERVAL - 14 DAY)
The problem is probably that you are getting Cartesian products between the two tables -- a separate row for each combination of pictures, music, and likes.
COUNT(DISTINCT)
is the easiest way, but if you have large values, then it is inefficient.
Upvotes: 1