sincoder
sincoder

Reputation: 13

Extracting substrings from a column and check whether that substring matched with other substring

I am working with some SQL queries and on one problem I got stuck. In my problem I have a table 'world' that contains a field 'name', question is to find out the name of two countries whose first three initials matched like Australia -> AUS and Austria -> AUS without repeating the countries.

SELECT a.name,SUBSTRING(a.name FROM 1 FOR 3) as cif,
b.name,SUBSTRING(b.name FROM 1 FOR 3) as cis
FROM world a,world b 
where a.name!=b.name 
group by a.name,b.name 
having cif=cis

Result:

name            cif name            cis
Australia       Aus Austria         Aus
Austria         Aus Australia       Aus
Bahamas         Bah Bahrain         Bah
Bahrain         Bah Bahamas         Bah
Belarus         Bel Belgium         Bel
Belarus         Bel Belize          Bel
Belgium         Bel Belarus         Bel
Belgium         Bel Belize          Bel
Belize          Bel Belarus         Bel
Belize          Bel Belgium         Bel
Burkina Faso    Bur Burma           Bur

In above result you will see that the some of the countries is repeating like Australia is matched with Austria in fist row should not matched with Austria with Austalia in second row and result should be like

Result:

name            cif name            cis
Australia       Aus Austria         Aus

Bahamas         Bah Bahrain         Bah

Belarus         Bel Belgium         Bel
Belarus         Bel Belize          Bel

Belgium         Bel Belize          Bel


Burkina Faso    Bur Burma           Bur

Upvotes: 1

Views: 43

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1271003

Add an inequality:

SELECT a.name, SUBSTRING(a.name FROM 1 FOR 3) as cif, b.name, SUBSTRING(b.name FROM 1 FOR 3) as cis
FROM bbc a JOIN
     bbc b 
     ON a.name < b.name 
GROUP BY a.name, b.name 
HAVING cif = cis;

It is not clear to me that GROUP BY is needed. Filtering before the aggregation is usually recommended. I would simplify this to:

SELECT a.name, b.name, LEFT(b.name, 3) as first_3
FROM bbc a JOIN
     bbc b 
     ON a.name < b.name
WHERE LEFT(a.name, 3) = LEFT(b.name, 3) ;

If there are duplicates, then use SELECT DISTINCT or GROUP BY, but that is only needed if there are duplicates in the original data.

Upvotes: 2

Related Questions