Reputation:
I got a database with a user table. This table contains a column phonenumber
. The problem is that its fields use multiple number patterns.
The current patterns I found:
When searching for a user in the database, is there a way to search for all number patterns?
SELECT id FROM user WHERE phone = '0123456789'
I use Oracle and MS SQL
Upvotes: 0
Views: 1019
Reputation: 95052
You have a database containing phone numbers. These are sometimes in international format, but often in some national format, probably German, where two leading zeros introduce a country code, while a single leading zero would introduce an area code instead (assuming the home country Germany then). Moreover, a phone number can contain symbols for readability, namely '-'
, '/'
, and ' '
.
So
I suggest you convert all numbers into international format in these steps:
+
with a leading 00
, thus making only digits important00
with a leading +
0
with a leading +49
Use Oracle's REGEXP_REPLACE
for this:
select
regexp_replace(
regexp_replace(
regexp_replace(
regexp_replace(trim(phone),
'^\+', '00'), -- leading '+' -> leading '00'
'[^[:digit:]]', ''), -- remove all non-digits
'^00' , '+'), -- leading '00' -> leading '+'
'^0', '+49') -- leading '0' -> leading '+49'
as international_phone
from mytable;
You can do this in the WHERE
clause of course:
SELECT id FROM user WHERE regexp_replace(...) = '+49123456789'
or even
SELECT id FROM user WHERE regexp_replace(...phone...) = regexp_replace(...'0123456789'...)
And you may write a PL/SQL function for this for convenience and use it so:
SELECT id FROM user WHERE international_phone(phone) = international_phone('0123456789')
This is for Oracle. There may be something alike for SQL Server.
Upvotes: 1
Reputation:
Assuming your question means this:
"Is it possible to remove all the non-digit characters from the stored phone number, before making the comparison in the WHERE
clause?"
a possible solution looks like this:
...
where translate(phone, '0123456789' || phone, '0123456789') = <input value here>
TRANSLATE
will translate every digit to itself, and all other characters in phone
to nothing (they will simply be deleted from the string). This is exactly what you want.
If you find that the query is slow, you may want to create a (function-based) index on translate(phone, '0123456789' || phone, '0123456789')
.
EDIT: I missed the part where you said you are using both Oracle and SQL Server. I did a quick search and found that SQL Server does not have a function similar to Oracle's TRANSLATE
. I will leave it to SQL Server experts to help you with that part; I don't know SQL Server.
Upvotes: 3
Reputation: 146269
In Oracle you could do it like this. Strip out the non-numeric characters with translate()
to get the phone number. You need to handle the leading zero or international dialling code:
select username from your_table
where translate(phone, '1234567890+/ -', '1234567890') in ('064039750', '4964039750')
You may need to tweak this if you don't know what the international dialling code is.
Obviously the actual problem is one of data quality: the application should enforce a strict format on phone numbers. One bout of data cleansing on write saves a whole bunch of grief on read.
Upvotes: 2