Andrew
Andrew

Reputation: 7768

How to search partial/masked strings?

I am storing social security numbers in the database, but instead of storing whole numbers, I only store only 5 digits sequence. So, if SSN# is 123-12-1234, my database would store it #23121### or ####21234 or anything else, as long as it has a 5 digits in the row.

Therefore, when user enters whole SSN, I want the database to locate all matches.

So, I can do this :

SELECT * FROM user WHERE ssn like 123121234

But the query above would not work, since I have some masked characters in the SSN field (#23121###). Is there a good way of doing this?

Maybe a good way would be to use

SELECT * FROM user WHERE REPLACE (ssn, '#', '') like 123121234

Although there could be an issue - the query might return non-relevant matches since 5 numbers that I store in the DB could be anywhere in a sequence.

Any idea how to do a better search?

Upvotes: 1

Views: 129

Answers (2)

silkfire
silkfire

Reputation: 25935

I think you can do something like this:

  1. Extract all possible 5-char combinations out of the queried SSN.
  2. Make an IN() query on those numbers. I'm not sure though how many results you would get from this.

    $n = 123121234;
    $sequences = array();
    
    for($i = 0; $i + 5 <= strlen($n); $i++) {
        $sequences[] = substr($n, $i, 5);
    }
    
    var_dump($sequences);
    

Tell me if you need those hash sign surrounding the strings.

Upvotes: 1

Joachim Isaksson
Joachim Isaksson

Reputation: 180897

If the numbers are always in a sequential block, you can generate a very efficient query by just generating the 5 variations of the ssn that could be stored in the DB and search for all of them with an exact match. This query can also use indexes to speed things up.

SELECT * 
FROM user 
WHERE ssn IN ('12312####', 
              '#23121###', 
              '##31212##', 
              '###12123#', 
              '####21234');

Upvotes: 2

Related Questions