Helios093
Helios093

Reputation: 1

MySQL joining two tables with only a partial match

I'm doing some volunteer work with Nepali refugees in my community and I am trying to organize their addresses. I have 656 Nepali last names in one table and about 608,000 addresses in another table. I have never used MySQL before and have only learned a little bit online to make these tables. Not real names from table. My tables:

AddressTable: 4 Columns

Owner_Name      Owner_Address   Owner_CityState   Owner_Zip
------------------------------------------------------
Smith, John   | ************* | *************** | *****
adhikari, Prem| ************* | *************** | *****
Baker, Mary   | ************* | *************** | *****

NamesTable: 1 Column

Last_Name
-----------
Smith
adhikari
Baker

I only want the addresses for people who have Nepali last names, so I want to select all the columns from my AddressTable that match with the last names from my NamesTable by joining the tables from the Last_Name column in the NamesTable with the Owner_Name column in the AddressTable. Since the Owner_Name column has both last name and the first name I've been having trouble doing this.

Upvotes: 0

Views: 387

Answers (2)

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521419

I recommend using REGEXP here:

SELECT at.*
FROM AddressTable at
INNER JOIN NamesTable nt
    ON at.Owner_Name REGEXP CONCAT('^', nt.Last_Name, ',');

Demo

As mentioned in previous above comments, a given last name by itself may not be unique. We can modify the above query to also check the first name, assuming that the names table contains that as well:

SELECT at.*
FROM AddressTable at
INNER JOIN NamesTable nt
    ON at.Owner_Name REGEXP CONCAT('^', nt.Last_Name, ',') AND
       at.Owner_Name REGEXP CONCAT(' ', nt.First_Name, '$');

But even this might still have problems, because sometimes people have first or last name consisting of two (or more) words. Also, such a thing as a middle name is possible.

For a better solution, you might want to break up the first, middle, and last names into separate columns before bringing your data into the database.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269953

Before I answer, let me just say that this is not going to work in all likelihood. Name matching like this is fraught with problems, unless you know that the data is canonically structured.

You can do this in several ways. The idea is that you need functions in the on clause. For instance:

select . . . 
from addresstable a join
     namestable n
     on n.last_name = substring_index(owner_name, ',', 1);

This assumes that the last name is in the owner_name before the first comma.

Upvotes: 1

Related Questions