user9945420
user9945420

Reputation:

Search for multiple phone number formats in database

I got a database with a user table. This table contains a column phonenumber. The problem is that its fields use multiple number patterns.

The current patterns I found:

When searching for a user in the database, is there a way to search for all number patterns?

SELECT id FROM user WHERE phone = '0123456789'

I use Oracle and MS SQL

Upvotes: 0

Views: 1019

Answers (3)

Thorsten Kettner
Thorsten Kettner

Reputation: 95052

You have a database containing phone numbers. These are sometimes in international format, but often in some national format, probably German, where two leading zeros introduce a country code, while a single leading zero would introduce an area code instead (assuming the home country Germany then). Moreover, a phone number can contain symbols for readability, namely '-', '/', and ' '.

So

  • +49 12/3456-7 means +491234567 of course
  • 00441234567 means +441234567
  • 04412345 means +494412345

I suggest you convert all numbers into international format in these steps:

  1. replace a leading + with a leading 00, thus making only digits important
  2. remove every character that is not a digit
  3. replace a leading 00 with a leading +
  4. replace a leading 0 with a leading +49

Use Oracle's REGEXP_REPLACE for this:

select 
  regexp_replace( 
    regexp_replace(
      regexp_replace(
        regexp_replace(trim(phone),
          '^\+', '00'), -- leading '+' -> leading '00'
        '[^[:digit:]]', ''), -- remove all non-digits
      '^00' , '+'), -- leading '00' -> leading '+'
    '^0', '+49') -- leading '0' -> leading '+49'
  as international_phone
from mytable;

You can do this in the WHERE clause of course:

SELECT id FROM user WHERE regexp_replace(...) = '+49123456789'

or even

SELECT id FROM user WHERE regexp_replace(...phone...) = regexp_replace(...'0123456789'...)

And you may write a PL/SQL function for this for convenience and use it so:

SELECT id FROM user WHERE international_phone(phone) = international_phone('0123456789')

This is for Oracle. There may be something alike for SQL Server.

Upvotes: 1

user5683823
user5683823

Reputation:

Assuming your question means this:

"Is it possible to remove all the non-digit characters from the stored phone number, before making the comparison in the WHERE clause?"

a possible solution looks like this:

... 
where translate(phone, '0123456789' || phone, '0123456789') = <input value here>

TRANSLATE will translate every digit to itself, and all other characters in phone to nothing (they will simply be deleted from the string). This is exactly what you want.

If you find that the query is slow, you may want to create a (function-based) index on translate(phone, '0123456789' || phone, '0123456789').

EDIT: I missed the part where you said you are using both Oracle and SQL Server. I did a quick search and found that SQL Server does not have a function similar to Oracle's TRANSLATE. I will leave it to SQL Server experts to help you with that part; I don't know SQL Server.

Upvotes: 3

APC
APC

Reputation: 146269

In Oracle you could do it like this. Strip out the non-numeric characters with translate() to get the phone number. You need to handle the leading zero or international dialling code:

select username from your_table
where translate(phone, '1234567890+/ -', '1234567890') in ('064039750', '4964039750')

You may need to tweak this if you don't know what the international dialling code is.

Obviously the actual problem is one of data quality: the application should enforce a strict format on phone numbers. One bout of data cleansing on write saves a whole bunch of grief on read.

Upvotes: 2

Related Questions