Reputation: 997
This seems like a pretty straight forward nested query to me - but clearly there is something that I am missing entirely.
SETTING:
I have two tables ccod
and psc
.
psc
has col company_number
, which I want to be matched on
ccod.company_number_1
or ccod.company_number_2
.
given that the tables are pretty big, I use some prior knowledge to reduce the number of company numbers to be matched:
company numbers cannot be nulls
company numbers must be 8 char long
psc.company_number
is clean and contains only 8-char long strings.
ccod.company_number_1
or ccod.company_number_2
contains a lot of junk.
ccod.company_number_1
or ccod.company_number_2
can both be null, both contain a string or either contain one.
I don't care about duplicates for now.
Plus
UNION ALL
to create 1 single column as result of the joining the two columns of ccod
- aliasing it to stacked_company_numbers
.so:
SELECT *
INTO ccod_psc
FROM psc
WHERE (SELECT DISTINCT psc.company_number)
in
(SELECT DISTINCT stacked_company_numbers
FROM (SELECT company_number_1
FROM ccod
WHERE company_number_1 is not null
UNION ALL
SELECT company_number_2
FROM ccod
WHERE company_number_2 is not null) AS stacked_company_numbers
WHERE char_length(stacked_company_numbers::text) = 8);
[42883] ERROR: operator does not exist: text = record
Hint: No operator matches the given name and argument types. You might need to add explicit type casts.
Position: 86
Now the words of this error are pretty self-explanatory, but I have no idea how to fix it, and have tried a lot of other answers related to error 42883, but unsuccessfully.
Thanks for any bit of help.
-- EDIT:
I also found a way to make my query work:
SELECT *
INTO ccod_psc
FROM psc
WHERE (SELECT DISTINCT psc.company_number)
in
(SELECT company_number_1
FROM ccod
WHERE company_number_1 is not null
UNION -- this takes cares of the duplicates.
SELECT company_number_2
FROM ccod
WHERE company_number_2 is not null) AS stacked_company_numbers
WHERE char_length(stacked_company_numbers::text) = 8)
Upvotes: 1
Views: 215
Reputation: 1269443
I think two left join
s is more efficient:
SELECT p.*
INTO ccod_psc c
FROM psc p LEFT JOIN
ccod c1
ON p.company_number = c1.company_number_1 LEFT JOIN
ccod c2
ON p.company_number = c2.company_number_2 AND
c1.company_number_1 IS NULL
WHERE c1.company_number_1 IS NOT NULL OR
c2.company_number_2 IS NOT NULL
The advantage of this approach is that this can use indexes on ccod(company_number_1)
and ccod(company_number_2)
.
EDIT:
If there can be duplicates in ccod
, then exists
might have better performance:
SELECT p.*
INTO ccod_psc c
FROM psc p
WHERE EXISTS (SELECT 1
FROM ccod c1
WHERE p.company_number = c1.company_number_1
) OR
EXISTS (SELECT 1
FROM ccod c2
WHERE p.company_number = c2.company_number_2
) ;
Upvotes: 1
Reputation: 222392
I do suspect that exists
does what you want:
select p.*
from psc p
where
char_length(p.company_numbers::text) = 8
and exists (
select 1
from ccod c
where p.company_number in (c.company_number_1, c.company_number_2)
)
The query pulls out records from psc
whose company_number
are 8 characters long and can be found in either company_number_1
or company_number_2
in table ccod
.
Upvotes: 1