George
George

Reputation: 69

SUM is decreasing the amount of records

I'm trying to select from a union of two queries The union of the two queries gives me let's say 90,000 Records but when I select the sum of them it gives me let's say 89,800 records. So, the sum is decreasing the amount of records.

Select Site,building,..., 
SUM(PhysicalCount) as PhysicalResult
From(
    Select Site,building,...,
    Count(TagID) as PhysicalCount 
    FROM Physical
    Where <SomeCondition1>
    Union
    Select Site,building,...,
    Count(TagID) as PhysicalCount 
    FROM Physical
    Where <SomeCondition2>
) T1
Group by Site,building,...,

Upvotes: 1

Views: 196

Answers (3)

Gordon Linoff
Gordon Linoff

Reputation: 1270411

First, your query does not make sense, because you have a count in the subqueries with no corresponding group by. This should return an error.

If you want the sum for the group, but not to aggregate the group, then use window functions instead:

Select Site,building,..., 
       SUM(PhysicalCount) over (partition by site, building) as PhysicalResult
From(
    Select Site,building,...,
    Count(TagID) as PhysicalCount 
    FROM Physical
    Where <SomeCondition1>
    group by site, building
    Union
    Select Site,building,...,
    Count(TagID) as PhysicalCount 
    FROM Physical
    Where <SomeCondition2>
    group by site, building
    ) T1

Although the union removes duplicates, clearly there are combinations of site and building that have different counts.

Perhaps your intention is actually to do this:

    Select Site, building, Count(TagID) as PhysicalCount 
    FROM Physical
    Where (<SomeCondition1>) or (<someCondition2)
    group by site, building;

Upvotes: 2

whytheq
whytheq

Reputation: 35577

That is what you'd expect.

Construct this simple example:

CREATE TABLE #X 
(
[Id] VARCHAR(100),
[Amount] INT
)
INSERT INTO #X VALUES
('michaeljackson',10),
('jim',20),
('jill',20),
('j',30)

CREATE TABLE #Y 
(
[Id] VARCHAR(100),
[Amount] INT
)
INSERT INTO #Y VALUES
('michaeljackson',100),
('jim',200),
('jill',200),
('j',300)

Now try running this:

SELECT * FROM #X
UNION
SELECT * FROM #Y

Now run the following, which is a model of your script:

SELECT  x.Id,
    SUM(x.Amount) AS "Aggregate"
FROM 
    (
    SELECT * FROM #X
    UNION
    SELECT * FROM #Y
    ) x
GROUP BY x.Id

UNION will not aggregate amounts in the scripts that are being UNIONed together - it simply append one record set onto the other and deletes duplicates. If you want to append one to the other and preserve duplicates then use UNION ALL

Upvotes: 2

ljh
ljh

Reputation: 2594

Yes, you should see the difference as expected.
Give you an example, you will understand why after group by, the total row count is lesser than your row count you get in inner union query. For example, this is your inner query output:

site1, building1, ...., 15
site1, building1, ....., 16

You use UNION in your inner query, but these 2 rows are different, because count is different.
However, then you using GROUP BY in outer query, group by site, building, .....
Then these 2 rows will become one, and count will be 31

Upvotes: 2

Related Questions