Reputation: 13
I have an interesting / frustrating problem for which I already have a solution - however it's slow and I'm not convinced is the best way of solving it.
I have two tables I wish to join. The first I'll refer to as [MobileNumber] and the Second [MobileInventory]. The table [MobileNumber] contains numbers in the format 7123456789. The table [MobileInventory] contains numbers that may be in any one of the following three formats - 7123456789, 07123456789, +447123456789. I need to join the two and the only common field is the mobile number.
The two issues are, within [MobileInventory] I can't guarantee the format (export of external system) and the numbers in both tables are from multiple countries with different length country codes (+1, +44, +852) and within [MobileNumber] I have no way of telling which country a number is from. This makes it very hard just to remove the first x characters from [MobileInventory] or do a replace on any known country code (risky!) and have done with.
My solution (working):
SELECT *
FROM [MobileNumber]
LEFT OUTER JOIN [MobileInventory]
ON CHARINDEX(REVERSE([MobileNumber].[Number]), REVERSE([MobileInventory]),1) = 1
With both numbers reversed its easy to see if you get the position of [MobileNumber].[Number] in [MobileInventory].[Mobile] and if the result is 1 - it's a match.
9876543217
98765432170
987654321744+
98765432174400
This works, and I get a very good match result - however, its SO SLOW! I have c.50k numbers in [MobileNumber] and c.20k in [MobileInventory].
Next thing to try is (as I'm using the result in the above in a MERGE in a Stored Procedure) to take a GUID from [MobileInvetory] and store in [MobileNumber] for later view joins, so I could achieve this using a cursor in a cursor to make the comparison but I'm not convinced that will be any quicker.
I have checked similar questions and the problems solved seem to be with separators, or consistent country codes which don't apply to my issue.
The above is automated via a scheduled task, so I could just kick it off earlier in the day, but that doesn't seem right somehow!
Upvotes: 1
Views: 811
Reputation: 3585
There are several ways you can do this. Using just code here are 2 alternatives.
SELECT *
FROM [MobileNumber]
LEFT OUTER JOIN [MobileInventory]
ON [MobileNumber].[Number] = RIGHT([MobileInventory], LEN([MobileNumber].[Number]));
SELECT *
FROM [MobileNumber]
LEFT OUTER JOIN [MobileInventory]
ON [MobileInventory] LIKE '%' + [MobileNumber].[Number];
The first one expects all numbers to be 10 digits long, but I´m not sure if some countries could have shorter numbers. takes the length of the Mobile Number.
You could also create a persisted computed column on MobileInventory to index the column and create a more efficient join.
Upvotes: 1