Reputation: 832
In doing some database work I ran across a strange problem. I'm trying to use the built-in average function in Access for a certain population. When I run the query I get a different answer than when I look up the individual persons and then average the values. Anyone notice anything in the SQL below that would lend itself to giving two different answers?
using avg function:
SELECT DISTINCT tblKentuckyCounties.Appalachian, tblKentuckyCounties.Metro, Avg(Visits.admitNumUlcers) AS AvgOfadmitNumUlcers
FROM (tblKentuckyCounties INNER JOIN Person ON tblKentuckyCounties.ID = Person.County) INNER JOIN Visits ON Person.PersonID = Visits.PersonID
GROUP BY tblKentuckyCounties.Appalachian, tblKentuckyCounties.Metro
HAVING (((tblKentuckyCounties.Appalachian)=No) AND ((tblKentuckyCounties.Metro)=No))
;
looking up records individually:
SELECT DISTINCT tblKentuckyCounties.Appalachian, tblKentuckyCounties.Metro, Visits.admitNumUlcers, Person.PersonID
FROM (tblKentuckyCounties INNER JOIN Person ON tblKentuckyCounties.ID = Person.County) INNER JOIN Visits ON Person.PersonID = Visits.PersonID
GROUP BY tblKentuckyCounties.Appalachian, tblKentuckyCounties.Metro, Visits.admitNumUlcers, Person.PersonID
HAVING (((tblKentuckyCounties.Appalachian)=No) AND ((tblKentuckyCounties.Metro)=No));
Upvotes: 0
Views: 765
Reputation: 13066
Without knowing the exact layout of your database it's a little difficult to be sure, but I have a sneaking suspicion that you may be getting duplicate Person
records. What happens if a user has more than one visit
? Then, because one of the queries requests unique PersonId
records and the other does not, you're getting different record counts.
You're going to have to decide if you want the average over all visits, or only over one visit, or what (this is not stated explicitly in your question).
Your query as written feels a little... interesting? Try this instead:
SELECT a.PersonId, AVG(b.admitNumUlcers)
FROM (Person as a
INNER JOIN Visits as b
ON b.PersonId = a.PersonId)
INNER JOIN tblKentuckyCounties as c
ON (c.ID = a.County
AND c.Appalachian = No
AND c.Metro = No)
GROUP BY a.PersonId
Here's what's going on -
This query will get you the average number of ulcers per person, over all of their visits (where the person lives in a county that is not in the appalachians or in a metro area).
It assumes that Appalachian
and Metro
are boolean values (true/false
, or apparently here, yes/no
...) - If this is not the case, things may need to be modified.
Upvotes: 0
Reputation: 2205
While what @mwolfe02 pointed out to makes sense, in your case, the problem is different. There are two issues with your FIRST query:
So, if you use the following query (instead of the first one) you will get matching results:
SELECT tblKentuckyCounties.Appalachian, tblKentuckyCounties.Metro, Visits.admitNumUlcers, Person.PersonID
FROM (tblKentuckyCounties INNER JOIN Person ON tblKentuckyCounties.ID = Person.County) INNER JOIN Visits ON Person.PersonID = Visits.PersonID
WHERE (((tblKentuckyCounties.Appalachian)="No") AND ((tblKentuckyCounties.Metro)="No"))
Upvotes: 1
Reputation: 97111
One possible explanation for the difference you're seeing is this GROUP BY in the second query:
GROUP BY
tblKentuckyCounties.Appalachian,
tblKentuckyCounties.Metro,
Visits.admitNumUlcers,
Person.PersonID
Any rows with duplicates of those 4 fields would be rolled up into a single group. Check by changing the field list for the second query:
SELECT
tblKentuckyCounties.Appalachian,
tblKentuckyCounties.Metro,
Visits.admitNumUlcers,
Person.PersonID,
Count(*) AS rows_in_group
If rows_in_group is > 1, you have at least part of the answer as to why the averages don't agree.
Upvotes: 3
Reputation: 24227
I can't be sure because I don't know exactly what you are doing to calculate your average when you are "looking up records individually," but I believe this answer from @Dale Halliwell might apply:
Upvotes: 2