MrDJW
MrDJW

Reputation: 13

Join two tables on Telephone Numbers where the numbers are in differing formats

I have an interesting / frustrating problem for which I already have a solution - however it's slow and I'm not convinced is the best way of solving it.

I have two tables I wish to join. The first I'll refer to as [MobileNumber] and the Second [MobileInventory]. The table [MobileNumber] contains numbers in the format 7123456789. The table [MobileInventory] contains numbers that may be in any one of the following three formats - 7123456789, 07123456789, +447123456789. I need to join the two and the only common field is the mobile number.

The two issues are, within [MobileInventory] I can't guarantee the format (export of external system) and the numbers in both tables are from multiple countries with different length country codes (+1, +44, +852) and within [MobileNumber] I have no way of telling which country a number is from. This makes it very hard just to remove the first x characters from [MobileInventory] or do a replace on any known country code (risky!) and have done with.

My solution (working):

SELECT *
FROM [MobileNumber]
LEFT OUTER JOIN [MobileInventory]
ON CHARINDEX(REVERSE([MobileNumber].[Number]), REVERSE([MobileInventory]),1) = 1

With both numbers reversed its easy to see if you get the position of [MobileNumber].[Number] in [MobileInventory].[Mobile] and if the result is 1 - it's a match.

9876543217

98765432170

987654321744+

98765432174400

This works, and I get a very good match result - however, its SO SLOW! I have c.50k numbers in [MobileNumber] and c.20k in [MobileInventory].

Next thing to try is (as I'm using the result in the above in a MERGE in a Stored Procedure) to take a GUID from [MobileInvetory] and store in [MobileNumber] for later view joins, so I could achieve this using a cursor in a cursor to make the comparison but I'm not convinced that will be any quicker.

I have checked similar questions and the problems solved seem to be with separators, or consistent country codes which don't apply to my issue.

The above is automated via a scheduled task, so I could just kick it off earlier in the day, but that doesn't seem right somehow!

Upvotes: 1

Views: 811

Answers (1)

Luis Cazares
Luis Cazares

Reputation: 3585

There are several ways you can do this. Using just code here are 2 alternatives.

SELECT *
FROM [MobileNumber]
LEFT OUTER JOIN [MobileInventory]
ON [MobileNumber].[Number] = RIGHT([MobileInventory], LEN([MobileNumber].[Number]));

SELECT *
FROM [MobileNumber]
LEFT OUTER JOIN [MobileInventory]
ON [MobileInventory] LIKE '%' + [MobileNumber].[Number];

The first one expects all numbers to be 10 digits long, but I´m not sure if some countries could have shorter numbers. takes the length of the Mobile Number.

You could also create a persisted computed column on MobileInventory to index the column and create a more efficient join.

Upvotes: 1

Related Questions