ashishjain
ashishjain

Reputation: 1289

count(*) from 2 tables with same columns

I have a weird problem at hand and I am sure I am very bad at SQL.

The problem is as follows:

I have 2 tables table1 and table2. Both the tables have same set of columns namely ID number, X number, Y number. Here X and Y will have value 0 or 1.

Now say for example ID ranges from 1-100 in table1 and ranges from 91-200 in table2: in table1, X has values for all the 100 rows say 1, and in the same table Y has values for only 90 say 1. The next 10 values of Y that is 91-100 are in table2.

Now, calling a count(*) for various queries say X=1 and Y=1, X=1 and Y=0 etc I do not get the correct values because some set of Y values are present in table2. I was looking at left Join but somehow I am unable to figure out if this is the correct approach.

Table1
-------
Id X Y
1  1 1 
2  1 1 
3  1 1 
4  1 0 
5  1 0



Table2
-------
Id X Y
4  0 1
5  0 1
6  0 0
7  0 0
8  0 0
9  1 1

So If I say X=1 and Y=1 I should get 5 as the count(*).

Hi Justin, Let me explain you the actual scenario. Consider 3 processes p1,p2 and p3 which have a cache where P1, P2 and P3 are nothing but X,Y and Z columns. The content of this cache is nothing but the ID's. IF p1 dumps and Id 1 I would say X=1 for ID=1 etc. Each of these processes dumps cache in a group say g1 and g2. I have created a table with respect to g1 and g2. So g1 represent table1 and g2 represent table2. Each of p1, p2 and p3 have a limit of say dumping 100 IDs. It is possible that P1 has dumped ID's 1-100 in g1(table1) where p2 has dumped just 90 in g1(table1) and rest 10 in g2(table2), similarly p3 would have say dumped 95 in g1(table1) and rest 5 in g2(table2). However each of p1, p2 and p3 have dumped 100 ids but in different groups. Now if I want to get a count() in a ideal case when all the P1, P2 and P3 have dumped cache in g1 I would say get me max(id) from g1 where P1=1 and similarly min(id) from g1 where P1=1. Than I would write a query saying "Select count() from g1 where X=1 and Y=1 and Z=1 where ID between min(id) from g1 and max(id) from g1. In an ideal case it would have returned 100.But in current case it returns 90 which is not correct. So to resolve this issue I will also have to consider the ID's which are present in g2(table2) also.

I hope this answers your question.

Thanks mav

Upvotes: 1

Views: 4935

Answers (2)

Vincenzo Maggio
Vincenzo Maggio

Reputation: 3869

SELECT COUNT(*)
FROM
(SELECT ID, MAX(X) X, MAX(Y) Y
FROM
(SELECT *
FROM TABLE1
UNION ALL
SELECT *
FROM TABLE2)
GROUP BY ID)
WHERE X = 1 AND Y = 1

Or if you wanna use an advanced group by clause

SELECT COUNT(*)
FROM
(SELECT *
FROM TABLE1
UNION ALL
SELECT *
FROM TABLE2)
HAVING MAX(X) = 1 AND MAX(Y) = 1
GROUP BY ID

Upvotes: 0

Justin Cave
Justin Cave

Reputation: 231781

It sounds like you want to UNION or UNION ALL the two tables together before applying the predicates. Something like

SELECT COUNT(*)
  FROM (SELECT id, x, y
          FROM table1
        UNION ALL
        SELECT id, x, y
          FROM table2)
 WHERE x = 1
   AND y = 1;

UNION ALL will return every row from both tables. UNION will eliminate duplicate rows.

If this is not what you want, it would be very helpful to walk through an example where you create a few rows of sample data in each table and show us exactly the result you want and how you obtained that result.

Upvotes: 4

Related Questions