Reputation: 23135
I have a table which is basically a tree structure with a column parent_id
and id
.
parent_id
is null
for root nodes.
There is also a self referential foreign key, so that every parent_id
has a corresponding id
.
This table is mainly read-only with mostly infrequent batch updates.
One of the most common queries from the application which accesses this table is select ... where parent_id = X
. I thought this might be faster if this table was index organised on parent_id
.
However, I'm not sure how to index organise this table if parent_id
can be null
. I'd rather not fudge things so that parent_id=0
is some special id, as I'd have to add dummy values to the table to ensure the foreign key constraints are satisfied, and it also changes the application logic.
Is there any way to index organise a table by possible null
value columns?
Upvotes: 2
Views: 174
Reputation: 23135
Solution from question asker:
I found I could get the same benefits from index organisation just by adding the queried columns to the end of the parent_id
index, i.e. instead of:
create index foo_idx on foo_tab(parent_id);
I do:
create index foo_idx on foo_tab(parent_id, col1, col2, col3);
Where col1
, col2
, col3
etc are frequently accessed columns.
I've only done this with indexes which are used to return multiple rows which benefit from the ordering and hence disk locality provided by the index, instead of having to jump around the table. Indexes which are generally used to return single rows I've left to reference the table, as there is only one row to read anyway so locality matters much less.
Like I mentioned, this is a mainly read table, and also space is not a huge concern, so I don't think the overhead to writes caused by these indexes is a big concern.
(I realise this won't index null
parent_id
s, but instead I've made another index on decode(parent_id, null, 1, null)
which indexes nulls and only nulls).
Upvotes: 1
Reputation: 61705
I would try adding the index on the single column parent_id.
If all of the columns in your index are non-null, then this row does not appear in your index.
So for the parent_id = X
you cite above, this should use the index. However, if you're doing parent_id is null
, then it won't use the index, and you'll be getting the same performance as you have now. This sounds like behaviour that would suit you.
I have used this in the past to improve the performance of queries. It works particulalry well if the number of items in the index is small compared to the number of rows in the database. We had about 3% of our rows in this particular index, and it flew :-)
But, as always, you need to try it and measure the difference in performance. Your mileage may vary.
Upvotes: 0