Reputation: 535
I have two tables tab1
and tab2
.tab1
has 108000 rows and tab2
has 1200000 rows.
Here is sample data
tab1
+-----------------------------------------------------+
| Low | high | Region_id |
+-----------------------------------------------------+
|5544220000000000 | 5544225599999999 | 1 |
|5544225500000000 | 5544229999999999 | 2 |
|5511111100000000 | 5511111199999999 | 3 |
+-----------------------------------------------------+
tab2
+------------------+
| pan |
+-------------------
|5544221111111111 |
|5544225524511244 |
|5511111111254577 |
+------------------+
So I run a query like this
select t2.pan, t1.region_id from tab2 t2
join tab1 t1 on t2.pan between t1.low and t1.high;
What I'm trying to do is finding in which range does tab2.pan
exist and retrieving it's region_id
: Ranges are unique, Meanning that low and high pairs are distinct.
I tried adding indexes, running in parallel but the query is running very slow(about 3 hours).
Can anyone suggest something to fasten the query, it can be adding some kind of indexes, or changing data structure or anything else.
I'm running the query against Oracle 11gR2.
UPDATE
From the comments i tested several things
Adding index like (high, low) and adding index (pan) and (high, low, region), Both ways there goes index full scan, i also tried index on(low,high) and index on pan, this way goes index range scan on tab1 and index full scan on tab2, but anyways it seems extremely slow.
Upvotes: 2
Views: 105
Reputation: 1269543
If you have no overlaps and each value in tab1
matches exactly one row in tab2
, then I think the best approach is a correlated subquery with the right indexes:
select t.*, t2.region_id
from (select t1.*,
(select max(t2.low)
from tab2 t2
where t2.low <= t.pan
) as t2low
from tab1 t1
) t join
tab2 t2
on t.t2low = t2.low;
The index that you want is tab2(low, region)
. This index should be used very efficiently for the subquery to get the closest low
value. The join should then be quite fast as well.
Does this help your performance?
EDIT:
I should note that in the above query, you can test for the high
value in the outer join. This should be fine if the the low
values are unique, because the join on low
will be really fast. So:
select t.*,
(case when t.pan <= t2.high then t2.region_id end) as region_id
from (select t1.*,
(select max(t2.low)
from tab2 t2
where t2.low <= t.pan
) as t2low
from tab1 t1
) t join
tab2 t2
on t.t2low = t2.low;
Upvotes: 3