dzilla
dzilla

Reputation: 832

Average function in Access 2007 returning strange results

In doing some database work I ran across a strange problem. I'm trying to use the built-in average function in Access for a certain population. When I run the query I get a different answer than when I look up the individual persons and then average the values. Anyone notice anything in the SQL below that would lend itself to giving two different answers?

using avg function:

SELECT DISTINCT tblKentuckyCounties.Appalachian, tblKentuckyCounties.Metro, Avg(Visits.admitNumUlcers) AS AvgOfadmitNumUlcers
FROM (tblKentuckyCounties INNER JOIN Person ON tblKentuckyCounties.ID = Person.County) INNER JOIN Visits ON Person.PersonID = Visits.PersonID
GROUP BY tblKentuckyCounties.Appalachian, tblKentuckyCounties.Metro
HAVING (((tblKentuckyCounties.Appalachian)=No) AND ((tblKentuckyCounties.Metro)=No))

;

looking up records individually:

SELECT DISTINCT tblKentuckyCounties.Appalachian, tblKentuckyCounties.Metro, Visits.admitNumUlcers, Person.PersonID
FROM (tblKentuckyCounties INNER JOIN Person ON tblKentuckyCounties.ID = Person.County) INNER JOIN Visits ON Person.PersonID = Visits.PersonID
GROUP BY tblKentuckyCounties.Appalachian, tblKentuckyCounties.Metro, Visits.admitNumUlcers, Person.PersonID
HAVING (((tblKentuckyCounties.Appalachian)=No) AND ((tblKentuckyCounties.Metro)=No));

Upvotes: 0

Views: 765

Answers (4)

Clockwork-Muse
Clockwork-Muse

Reputation: 13066

Without knowing the exact layout of your database it's a little difficult to be sure, but I have a sneaking suspicion that you may be getting duplicate Person records. What happens if a user has more than one visit? Then, because one of the queries requests unique PersonId records and the other does not, you're getting different record counts.
You're going to have to decide if you want the average over all visits, or only over one visit, or what (this is not stated explicitly in your question).
Your query as written feels a little... interesting? Try this instead:

SELECT a.PersonId, AVG(b.admitNumUlcers)
FROM (Person as a
      INNER JOIN Visits as b
      ON b.PersonId = a.PersonId)
      INNER JOIN tblKentuckyCounties as c
      ON (c.ID = a.County
          AND c.Appalachian = No
          AND c.Metro = No) 
GROUP BY a.PersonId

Here's what's going on -
This query will get you the average number of ulcers per person, over all of their visits (where the person lives in a county that is not in the appalachians or in a metro area). It assumes that Appalachian and Metro are boolean values (true/false, or apparently here, yes/no...) - If this is not the case, things may need to be modified.

Upvotes: 0

Igor Turman
Igor Turman

Reputation: 2205

While what @mwolfe02 pointed out to makes sense, in your case, the problem is different. There are two issues with your FIRST query:

  1. DISTINCT
  2. GROUP BY

So, if you use the following query (instead of the first one) you will get matching results:

SELECT tblKentuckyCounties.Appalachian, tblKentuckyCounties.Metro, Visits.admitNumUlcers, Person.PersonID
FROM (tblKentuckyCounties INNER JOIN Person ON tblKentuckyCounties.ID = Person.County) INNER JOIN Visits ON Person.PersonID = Visits.PersonID
WHERE (((tblKentuckyCounties.Appalachian)="No") AND ((tblKentuckyCounties.Metro)="No"))

Upvotes: 1

HansUp
HansUp

Reputation: 97111

One possible explanation for the difference you're seeing is this GROUP BY in the second query:

GROUP BY
    tblKentuckyCounties.Appalachian,
    tblKentuckyCounties.Metro,
    Visits.admitNumUlcers,
    Person.PersonID

Any rows with duplicates of those 4 fields would be rolled up into a single group. Check by changing the field list for the second query:

SELECT 
    tblKentuckyCounties.Appalachian,
    tblKentuckyCounties.Metro,
    Visits.admitNumUlcers,
    Person.PersonID,
    Count(*) AS rows_in_group

If rows_in_group is > 1, you have at least part of the answer as to why the averages don't agree.

Upvotes: 3

mwolfe02
mwolfe02

Reputation: 24227

I can't be sure because I don't know exactly what you are doing to calculate your average when you are "looking up records individually," but I believe this answer from @Dale Halliwell might apply:

An average of average values will not return the same result as a single average over all values, unless all the groups averaged have the same number of items.

Upvotes: 2

Related Questions