Reputation: 4423
Doing a simple query I'm finding two very different results, and I'm not sure how I'm coming to this conclusion. Below is the example, please let me know if its something I'm just over looking..
SELECT b.fkid as t,
b.timestamp,
count(b.fkid) as hits,
count(distinct(b.fkid)) as num,
DATE_FORMAT( b.timestamp, '%Y-%m-%d' ) AS val1
FROM a, b
WHERE a.id = b.fkid
group by val1
...result:
2 2009-09-25 08:33:42 **27** 3 2009-09-25
...compared to:
SELECT b.fkid as t,
b.timestamp,
count(b.fkid) as hits,
count(distinct(b.fkid)) as num,
DATE_FORMAT( b.timestamp, '%Y-%m-%d' ) AS val1
FROM a, b
WHERE a.id = b.fkid
group by t
..result:
2 2009-09-25 08:33:42 **39** 1 2009-09-25
3 2009-09-25 08:36:59 **6** 1 2009-09-25
10 2009-09-25 22:40:14 **4** 1 2009-09-25
I don't understand how 39+6+4 = 27? I am expecting the first value to be 49 not 27. Also tried:
SELECT b.fkid as t,
b.timestamp,
count(b.fkid) as hits,
count((b.fkid)) as num,
DATE_FORMAT( b.timestamp, '%Y-%m-%d' ) AS val1
FROM a, b
WHERE a.id = b.fkid
group by val1
...which produces:
2 2009-09-25 08:33:42 27 27 2009-09-25
Following recommendation from below I tried eliminating the irrelevant data and made the query:
SELECT count(b.fkid) as hits,
count(distinct(b.fkid)) as num,
DATE_FORMAT( b.timestamp, '%Y-%m-%d' ) AS val1
FROM a, b
WHERE a.id = b.fkid
group by val1
...this produced:
27 3 2009-09-25
I've tried to simplify this down to:
SELECT count(b.fkid) as hits,
count(distinct(b.fkid)) as num
FROM a, b
WHERE a.id = b.fkid
group DATE_FORMAT( b.timestamp, '%Y-%m-%d' )
...this produced:
27 3
and:
SELECT count(b.fkid) as hits,
count(distinct(b.fkid)) as num
FROM a, b
WHERE a.id = b.fkid
group b.fkid
...this produced:
39 1
6 1
4 1
Upvotes: 0
Views: 130
Reputation: 531
By formatting the date for val1, you are taking records with different timestamps (but the same calendar date) and grouping them. AKA the timestamp for "2009-12-01 01:00:00" and "2009-12-01 02:00:00" formatted with the DATE_FORMAT you specify both return "2009-12-01". BUT by not grouping on that date format in the second query, the records are staying separate.
Upvotes: 0
Reputation: 912
In the first query your grouping by val1 which is the same for all the records, so you only see the first value of non aggregate functions.
In the second query you are seeing 3 results -- the number of different fkid's
hope this helps.
Upvotes: 0
Reputation: 229108
In SQL you should group by all the fields that are not aggregates, not just val1.
MySQL lets you get away with not doing so(most other databases will throw an error) but it can leave you with unpredictable behavior like this , especially if val1 does not disinctly identify the aggregates you are producing.
Upvotes: 3
Reputation: 7946
You're counting distinct fkid values. Can there be duplicates of that value? That could change your numbers, I think.
Upvotes: 0