Reputation: 53
I am trying to reformat around 1000 phone numbers in a SQL Server database to US format (###) ###-####
Currently the phone numbers are formatted in all sorts of ways ranging from ##########, ###-###-####, one is ###)-###-####. There is also one with only six digits.
As a first step I've been attempting to isolate the numbers in all of these rows but its just returning the same as they were already.
select SUBSTRING(phone, PATINDEX('%[0-9]%', phone), LEN(phone)) from people
How could I best go about writing a query which would format them all as (###) ###-####
?
expected output:
(555) 222-3333
(555) 444-3030
(555) 092-0920
(555) 444-4444
Upvotes: 1
Views: 5070
Reputation: 3363
If this reformatting something that is going to be used repeatedly then a creating a UDF as suggested by @GSerg would be the way to go.
If this is just a one time clean-up you could give this a try.
First replace all of the numbers with empty strings with a series of nested REPLACE() functions.
DECLARE @PhoneNumbers TABLE (
Number varchar (20))
INSERT INTO @PhoneNumbers VALUES ('(888-239/1239')
INSERT INTO @PhoneNumbers VALUES ('222.1234')
SELECT
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(Number, '0', '')
, '1', '')
, '2', '')
, '3', '')
, '4', '')
, '5', '')
, '6', '')
, '7', '')
, '8', '')
, '9', '')
FROM @PhoneNumbers
Then take those result non-numeric characters and put them each in their own nested REPLACE() function and format the result. You will have to deal with each length individually. If you have only 7 digits and you want to format it to have 10 digits what do you want those extra 3 digits to be. This will handle the 10 digit phone numbers.
SELECT FORMAT(x.NumbersOnly, '(###) ###-####')
FROM
(
SELECT
CONVERT(BIGINT,
REPLACE(
REPLACE(
REPLACE(
REPLACE(Number, '(', '')
, '-', '')
, '/', '')
, '.', '')
) AS NumbersOnly
FROM @PhoneNumbers
) x
WHERE LEN(x.NumbersOnly) = 10
Here is the dbfiddle.
Upvotes: 0
Reputation: 33581
Since one suggestion was made already and the suggestion there to isolate numbers in a string uses a while loop I need to post an alternative to that which doesn't use any looping. Instead it utilizes a tally or numbers table. There are lots of solutions for those. I like to use a view which is lightning fast and has zero reads.
Here is my version of a tally table.
create View [dbo].[cteTally] as
WITH
E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
)
select N from cteTally
Next we need a table valued function to remove the characters that are not numbers using our tally table. This is also super fast because we are using our tally table instead of looping.
create function GetOnlyNumbers
(
@SearchVal varchar(8000)
) returns table as return
with MyValues as
(
select substring(@SearchVal, N, 1) as number
, t.N
from cteTally t
where N <= len(@SearchVal)
and substring(@SearchVal, N, 1) like '[0-9]'
)
select distinct NumValue = STUFF((select number + ''
from MyValues mv2
order by mv2.N
for xml path('')), 1, 0, '')
from MyValues mv
Now that we have all the legwork done we can focus on the task at hand. Since you didn't provide any sample data I just made up some stuff. I am not really sure if this is representative of your data or not but this works on the sample data I created.
if OBJECT_ID('tempdb..#Something') is not null
drop table #Something
create table #Something(SomeVal varchar(100))
insert #Something values
('Maybe you have other stuff in here. 5552223333 additional characters can cause grief')
, ('321-654-9878')
, ('123)-333-4444')
, ('1234567')
select replace(format(try_convert(bigint, n.NumValue), '(###) ###-####'), '() ', '')
, n.NumValue
from #Something s
cross apply dbo.GetOnlyNumbers(s.SomeVal) n
The output for the formatted data looks like this:
(555) 222-3333
(321) 654-9878
(123) 333-4444
123-4567
Upvotes: 3