Sean Nguyen

Reputation: 13138

How to index a column for an IS NOT NULL query in Oracle?

I have a table with a lot of columns, and my query looks like this:

select * from (
    select my_table_id from my_table
    where start_time_local >= ? and type_pk <> ? and rule_pk = ?
      and name is not null
    order by start_time_gmt desc
) where rownum <= ?

If I create an index on

(start_time_local, type_pk, rule_pk, name) 

it will be inefficient because name is a varchar(1024). Is there a better way to index on something like:

 (start_time_local, type_pk, rule_pk, isNotNull(name))

Thanks,

Upvotes: 1

Views: 5867

Answers (2)

Branko Dimitrijevic

Reputation: 52117

If a field will be searched only for NOT NULL and never for an actual value, you can use a function-based index to save space in the index (and potentially increase performance through better cache utilization). For example:

CREATE TABLE THE_TABLE (
    ID INT PRIMARY KEY,
    THE_FIELD VARCHAR2(20)
);

CREATE INDEX THE_TABLE_IE1 ON THE_TABLE(NVL2(THE_FIELD, 'Y', 'N')) COMPRESS;

(There will be many repeated 'Y' and 'N' values in the index, so it may be worth compressing it with COMPRESS, as shown above.)

And then select like this:

SELECT * FROM THE_TABLE WHERE NVL2(THE_FIELD, 'Y', 'N') = 'Y'; -- Equivalent to THE_FIELD IS NOT NULL
SELECT * FROM THE_TABLE WHERE NVL2(THE_FIELD, 'Y', 'N') = 'N'; -- Equivalent to THE_FIELD IS NULL

If you don't need to search for NULLs, you can probably squeeze out some more space efficiency like this:

CREATE INDEX THE_TABLE_IE1 ON THE_TABLE(NVL2(THE_FIELD, 'Y', NULL)) COMPRESS;
SELECT ID FROM THE_TABLE WHERE NVL2(THE_FIELD, 'Y', NULL) = 'Y'; -- Equivalent to THE_FIELD IS NOT NULL

Oracle does not store an entry in a B-tree index when every indexed expression for the row is NULL, so NVL2(THE_FIELD, 'Y', NULL) will completely eliminate THE_FIELD IS NULL rows from the index.
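
Applied to the table from the question, the same idea might look like the sketch below. The index name and the bind-variable names are hypothetical, and the column order is an assumption rather than a tuned choice. One caveat: in a multi-column index a row is only left out when all indexed expressions are NULL, so the NULL trick saves nothing here and the sketch uses 'Y'/'N' instead. The query should use the same expression as the index so the optimizer can match it.

```sql
-- Hypothetical index name; column order is an assumption, not a tuned choice.
CREATE INDEX my_table_ie1 ON my_table (
    rule_pk,
    start_time_local,
    type_pk,
    NVL2(name, 'Y', 'N')   -- one character per entry instead of up to 1024
);

-- Bind names are placeholders for the "?" markers in the question.
SELECT *
FROM (
    SELECT my_table_id
    FROM my_table
    WHERE rule_pk = :rule_pk
      AND start_time_local >= :start_time_local
      AND type_pk <> :type_pk
      AND NVL2(name, 'Y', 'N') = 'Y'   -- stands in for name IS NOT NULL
    ORDER BY start_time_gmt DESC
)
WHERE ROWNUM <= :max_rows;
```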

Upvotes: 3

Ben

Reputation: 52863

Generally, the optimal index for a query covers everything in the WHERE clause in order of decreasing selectivity, then everything in the ORDER BY in order of decreasing selectivity, then everything you are selecting that you have not yet indexed. This means that the query only ever reads the index, not the table behind it.

This method can, however, be completely ridiculous. It's up to you to decide how far you want to go.

Selectivity here means the number of distinct values in a column as a percentage of the total number of rows. Generally, the more distinct values a column has, the quicker a given value can be found in the index. I say generally because there are exceptions to every rule.
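
As a rough way to compare columns, you could count distinct values against total rows. This is only a sketch using the column names from the question. The result is a ratio between 0 and 1, and it ignores NULLs and skew in the data.

```sql
-- Ratio of distinct values to rows for each candidate column (closer to 1 = more selective).
-- NULLIF guards against division by zero on an empty table.
SELECT COUNT(*)                                                  AS total_rows,
       ROUND(COUNT(DISTINCT rule_pk)          / NULLIF(COUNT(*), 0), 4) AS rule_pk_ratio,
       ROUND(COUNT(DISTINCT start_time_local) / NULLIF(COUNT(*), 0), 4) AS start_time_local_ratio,
       ROUND(COUNT(DISTINCT type_pk)          / NULLIF(COUNT(*), 0), 4) AS type_pk_ratio
FROM my_table;
```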

If for instance rule_pk is the primary key of your table then it would probably be enough to index on this column. This'll mean you do a unique index scan followed by table access by rowid.

Continuing to assume rule_pk is the primary key, start_time_local is almost unique, and the other columns are equally selective, the optimal index would be something like: (rule_pk, start_time_local, type_pk, name, start_time_gmt, my_table_id). This is fairly ridiculous though.
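
For illustration only, that "everything" index would be created roughly as follows. The index name is made up, and because name is a wide varchar column the entries could get large, which is part of why this is rarely worth it.

```sql
-- Hypothetical covering index: every column the query filters on, sorts by, or selects.
CREATE INDEX my_table_cover_ix ON my_table (
    rule_pk,
    start_time_local,
    type_pk,
    name,
    start_time_gmt,
    my_table_id
);
```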

I would suggest reading this part of the documentation on how to read explain plans and to use them regularly.
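
A minimal way to get a plan, assuming the query from the question (the bind names are placeholders), is EXPLAIN PLAN followed by DBMS_XPLAN.DISPLAY:

```sql
-- Store the plan for the statement in the plan table without running it.
EXPLAIN PLAN FOR
SELECT *
FROM (
    SELECT my_table_id
    FROM my_table
    WHERE start_time_local >= :start_time_local
      AND type_pk <> :type_pk
      AND rule_pk = :rule_pk
      AND name IS NOT NULL
    ORDER BY start_time_gmt DESC
)
WHERE ROWNUM <= :max_rows;

-- Show the most recently explained plan.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Look at which index (if any) appears in the plan and whether there is a TABLE ACCESS step after it.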

Also don't forget to gather statistics after creating an index, as this can make a big difference:

BEGIN
    dbms_stats.gather_table_stats( 'SCHEMA_NAME'
                                 , 'TABLE_NAME'
                                 , cascade => TRUE
                                 , method_opt => 'FOR ALL INDEXED COLUMNS'
                                 );
END;
/

should be sufficient.
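
To check that the statistics were actually gathered, you could look at the data dictionary afterwards. A small sketch, assuming the table is MY_TABLE and lives in your own schema:

```sql
-- LAST_ANALYZED shows when statistics were last gathered for each index.
-- Function-based indexes show up with INDEX_TYPE 'FUNCTION-BASED NORMAL'.
SELECT index_name, index_type, num_rows, distinct_keys, last_analyzed
FROM user_indexes
WHERE table_name = 'MY_TABLE';
```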

Upvotes: 2
