Reputation: 543
I have a list of phone number prefixes defined for large number of zones (in query defined by gvcode and cgi). I need to efficiently find a longest prefix that matches given number PHONE_NR.
I use inverted LIKE clause on field digits (which contains prefixes in form +48%, +49%, +1%, +1232% and so on).
Therefore I can't use normal index on that field.
I managed to get substantial improvement by using IOT on gvcode and cgi field (which are part (first two cols) of primary key). I also looked at some oracle text indexes but can't find one that will match longer input with shorter prefix in the table.
Is there any other way to perform such search that is faster than this approach.
Here is the query which gives a list of all matched prefixes (I sort it afterwards on digits length).
select t.gvcode, t.digits
from NUMBERS t
where
t.gvcode=ZONE_SET_CODE
and t.cgi=cgi_f
and ( PHONE_NR like t.digits)
order by length(digits) desc
Upvotes: 8
Views: 2186
Reputation: 1
Create Or Replace Function DBA_UTIL.COMMON_PREFIX (P_STRING1 In Varchar2, P_STRING2 In Varchar2)
Return Varchar2
As L_STRING Varchar2 (100); Begin -- ================================================================================== -- ------- Find any common prefix between 2 strings -- ==================================================================================
With
FIND_COMMON_PREFIX
As
( Select Max (Rownum) MAX_PREFIX
From Dual
Where Substr (P_STRING1, 1, Rownum) = Substr (P_STRING2, 1, Rownum)
Connect By Level <= Least (Length (P_STRING1), Length (P_STRING2)))
Select Substr (P_STRING1, 1, MAX_PREFIX)
Into L_STRING
From FIND_COMMON_PREFIX;
Return L_STRING;
Exception When Others Then Return 'BOGUS'; End; /
Upvotes: 0
Reputation: 2787
I've hit the same problem, I found this solution to be useful (credits to L. Schneider on https://community.oracle.com/thread/351988):
create table a (a varchar2(100));
create index a_1 on a(a);
begin
delete a;
insert into a values ('00431');
insert into a values ('004312');
insert into a values ('0043123');
insert into a values ('00431234');
insert into a values ('004312345');
end;
/
select max(a)
from a
where '004311' like a||'%'
;
http://sqlfiddle.com/#!4/abc975/1
Upvotes: 1
Reputation: 21
Okay, just writing because I had the same issue. If you know the range of the prefix lengths you have, you can do something similar to the following. The following example assumes prefix lengths 2-6
select t.num, coalesce(p6.PREFIX, p5.PREFIX, p4.PREFIX, p3.PREFIX, p2.PREFIX) PREFIX
from NUMBERS t
LEFT OUTER JOIN PREFIXES p2 ON substr(t.num,1,2)=p2.PREFIX
LEFT OUTER JOIN PREFIXES p3 ON substr(t.num,1,3)=p3.PREFIX
LEFT OUTER JOIN PREFIXES p4 ON substr(t.num,1,4)=p4.PREFIX
LEFT OUTER JOIN PREFIXES p5 ON substr(t.num,1,5)=p5.PREFIX
LEFT OUTER JOIN PREFIXES p6 ON substr(t.num,1,6)=p6.PREFIX
Equal joins are as good as you can get.
I believe that it runs way better than any other possible solution here, hope it helps anyone who stumbles on the same issue
Sqlfiddle link modified from sailaway's answer whose script still gives all matches instead of only the longest one
Upvotes: 1
Reputation: 36
In addition to the index on "digits" you can create the index on rpad(substr(digits,1,length(digits)-1), 10, '9')
. "10" is the maximum length that you want to support. You will add an additional condition to the where clause: rpad(substr(digits,1,length(digits)-1), 10, '9') >= PHONE_NR
Your SQL would be:
select t.gvcode, t.digits
from NUMBERS t
where
t.gvcode=ZONE_SET_CODE
and t.cgi=cgi_f
and PHONE_NR like t.digits
and substr(digits, 1, length(digits)-1) <= PHONE_NR
and rpad(substr(digits,1,length(digits)-1), 10, '9') >= PHONE_NR
order by length(digits) desc
Here is an example in sqlfiddle
Upvotes: 2
Reputation: 4511
I might sound stupid, but when I ran into such problem I went in most non-space-efficient brute force way:
Lets say:
L=length of longest prefix to match (without obvious +, of course)
Add L
additional fields naming them, for example, P1, P2,...,PL
Update those fields with
UPDATE NUMBERS set P1=SUBSTR(PHONE_NR,1,1), P2=SUBSTR(PHONE_NR,1,2), ..., PL=SUBSTR(PHONE_NR,1,L)
(in future you can do this in INSERT OR UPDATE
trigger too)
Now you have L fields to create index on and compare to anything you like.
Upvotes: 1
Reputation: 16905
I'm not sure that this will really help, but I think it's worth a try.
Create a Function Based Index on substr(digits, 1, length(digits)-1)
(this is just to index digits without the '%')
Then in your query you can add another condition:
AND substr(digits, 1, length(digits)-1) <= PHONE_NR
The idea is that with lexical comparison you can "cut out" all the digits which are after PHONE_NR
Upvotes: 0