Joining to perform a comparison match with possible swapped data

Question

I'm trying to do a comparison match in snowflake between the first_name, middle_name, and last_name from a target table (700K rows) and from source table (1.5M rows). The values in the first_name, middle_name, and last_name can be partially, or fully swapped in the source table, that is the value in the first_name can be swapped with the value in the last_name or any other way. For a positive probable identification at least two matches need to be correct. I then do further processing to guarantee the accuracy of the match.

Example:

case 1: first name (target) = first name (source) | middle name (target) = middle name. (MATCH)

case 2: first name (target) = last name (source) | middle name (target) = middle name (source) | last name (target) = first name (source) -> first name swapped with last lame (MATCH)

Example (with table):

Target Table
id   | first_na_t    |   middle_name_t   |     last_name_t
1    |   ffa         |       mma         |         lla    
2    |   ffa         |       mmb         |         lla
3    |   ffb         |       ffa         |         llb
4    |   ffc         |       mmd         |         lla
...

Source Table
id      | first_name_s  |   middle_name_s |     last_name_s
1       |   lla         |       mmb         |         ffa    
5       |   ffa         |       mmb         |         lla
3       |   ffb         |       ffa         |         llb
4       |   mmd         |       ffc         |         lla
...

From the example tables the attempt that I used was to compare the first row in the target table with all the other rows in the source table using a CROSS JOIN with a CASE afterwards but this is proving to be inefficient and slow.

Is there any way to JOIN this cleverly where you would compare the rows in the target table against the rows in the source table and come out with at least 2 match even though any of the name columns could be swapped. Also I can not use the id column.

Joining to perform a comparison match with possible swapped data

Answers (1)

Related Questions