Reputation: 355
I am querying a table (my table has NULL values) using this sql code:
SELECT x.f1,Count(x.f1) FROM
(SELECT p1 As F1 FROM table
UNION ALL
SELECT p2 As F1 FROM table
UNION ALL
SELECT p3 As F1 FROM table) x
GROUP BY x.f1
This code achieves the question this user asked: SQL, count in multiple columns then group by
However, when I manually test how many counts a certain item in my table gets using this statement on all columns:
WHERE col1 or col2 or col3, etc = 'entry name'
, I get a different number of counts of that entry as with the union query. The union query either overshoots the amount of the manual query or equals the manual query (what I want). For example, for a certain entry, the manual query will return 2 and the union query will return 4.
I realize this question is a bit vague because I cannot disclose my table info nor my exact query, but I want to know whether I'm missing something important. Thanks!
PS. I am using MS SQL server 2012
EDIT: EXAMPLE (taken from previous user's post), for clarification:
Source data table:
P1 P2 P3
-----------
a b
a a a
b c
a b b
b a
I want it to show something like this:
Desired query output:
Total
-------------
a | 6
b | 5
c | 1
Using the UNION query code above for MY table, I will get 12 as the total count for 'a.' Using my manual query:
SELECT *
FROM TABLE
WHERE P1 = 'a' OR P2 = 'a' OR P3 ='a'
, I will get 6 for the total count.
My question is, shouldn't both queries return the same value?
Upvotes: 0
Views: 2674
Reputation: 4331
The UNION
way runs three times through the table looking for matches, and will count the same row twice if it has the matching value in more than one column. The OR
way runs through the table just once and won't double-count anything.
Remember, UNION
doesn't combine your conditions; it combines the actual rows output from each component query. It's not at all the same thing as using OR
.
Upvotes: 0
Reputation: 1269873
First there really is no reason to think that the two queries would return similar results. The one with the where
clause is counting rows, regardless of the number of matches. The one with union all
could count each row up to three times, depending on the values in p1
, p2
, and p3
.
To count rows with the union all
, you need an id to identify each row. Then you can use count(distinct id)
in the outer query.
Upvotes: 3
Reputation: 116110
col1 or col2 or col3, etc = 'entry name'
is not the right syntax. It just checks if col1
and col2
have any value. Only the last one in the list is checked against the actual value 'entry name'
.
Upvotes: 1