user1344919
user1344919

Reputation: 13

how do I speed up substr searches in my query

I am using this as a sub select (in a simple query joining 2 other tables) and as you can imagine it takes a while to run. going on 6 hours+ so far. is this the only way to do this? I know that doing another JOIN instead of a sub select might help a bit. but the main bottleneck is all these OR's and sub-strings.

SELECT ex_array
FROM   service_x 
WHERE  
   ( substr(ex_array,1,2) = 'FW' OR substr(ex_array,3,2) = 'FW' OR substr(ex_array,5,2) = 'FW' OR substr(ex_array,7,2) = 'FW' OR substr(ex_array,9,2) = 'FW' OR substr(ex_array,11,2) = 'FW' ) 
OR ( substr(ex_array,1,2) = 'IL' OR substr(ex_array,3,2) = 'IL' OR substr(ex_array,5,2) = 'IL' OR substr(ex_array,7,2) = 'IL' OR substr(ex_array,9,2) = 'IL' OR substr(ex_array,11,2) = 'IL' )  
OR ( substr(ex_array,1,2) = 'IN' OR substr(ex_array,3,2) = 'IN' OR substr(ex_array,5,2) = 'IN' OR substr(ex_array,7,2) = 'IN' OR substr(ex_array,9,2) = 'IN' OR substr(ex_array,11,2) = 'IN' )  
OR ( substr(ex_array,1,2) = 'IK' OR substr(ex_array,3,2) = 'IK' OR substr(ex_array,5,2) = 'IK' OR substr(ex_array,7,2) = 'IK' OR substr(ex_array,9,2) = 'IK' OR substr(ex_array,11,2) = 'IK' )  
OR ( substr(ex_array,1,2) = 'IH' OR substr(ex_array,3,2) = 'IH' OR substr(ex_array,5,2) = 'IH' OR substr(ex_array,7,2) = 'IH' OR substr(ex_array,9,2) = 'IH' OR substr(ex_array,11,2) = 'IH' )  
OR ( substr(ex_array,1,2) = 'KP' OR substr(ex_array,3,2) = 'KP' OR substr(ex_array,5,2) = 'KP' OR substr(ex_array,7,2) = 'KP' OR substr(ex_array,9,2) = 'KP' OR substr(ex_array,11,2) = 'KP' )  
)

Upvotes: 0

Views: 827

Answers (4)

mathewbutler
mathewbutler

Reputation: 1039

Given the information available I'd suggest, given the selectivity of the query that you need:

1) An index on ex_array (preferably just ex_array, or with ex_array on the leading edge of the index.

2) to modify the query. If you append additional filter criteria:

... and (Ex_array like '%FW%' OR ex_array like '%IL%' or ex_aray like .... Etc covering each of the six cases. With the index in place this should enable an index range scan to reduce the the 12.2 million rows to only those that contain one of the six strings you are interested in. Only these rows will then have your substr logic applied, leaving the expected 175K.

In principle this should improve efficiency. It does depend greatly on the data distribution though (it's possible for example that all 12 million rows contain one of the six strings, just not in the places specified in the original query. in this case my suggested changes might be less efficient than the original query). There may be other options but I'd need more information, including explain plans to be more precise.

Hope this helps - good luck!!

Upvotes: 0

A.B.Cade
A.B.Cade

Reputation: 16905

Maybe with regexp_like:

REGEXP_LIKE(ex_array, '^(.{2}){0,5}(FW|IL|IN|IK|IH|KP).*$')

(the regular expression can probably be written better ...)

You can also add a function based index as dcp suggested (though you'll need only one):

create index fbIndex on service_x (REGEXP_INSTR(ex_array, '^(.{2}){0,5}(FW|IL|IN|IK|IH|KP).*$'));

and the change the query to use:

REGEXP_INSTR(ex_array, '^(.{2}){0,5}(FW|IL|IN|IK|IH|KP).*$') = 1

Upvotes: 1

Randy
Randy

Reputation: 16677

how about this structure...

INSTR( ex_array, 'FW' ) IN (1,3,5,7,9,11)

then at least you'll only be doing the string parsing once...

Upvotes: 1

dcp
dcp

Reputation: 55444

One thing you could try would be using a function based index. Specifically, create function based indexes on substr(ex_array,3,2) and substr(ex_array,9,2), etc.

It might be a lot of indexes though, so you'll have to weigh how much it helps by running some tests. But it's an idea to start with.

I assume you are on 8i or later.

Upvotes: 3

Related Questions