While this idea looks good because you produce a much smaller index, there is the problem of how do you get to use the index?
You will have to edit the SQL in your application such that the predicate matches the index expression. Alternatively, you just index the column and use a histogram to help the optimiser use it for the rare values.
There is a greater cost in index maintenance, but the index accesses are almost as efficient as the function based index. The question is whether it is worth changing the code for each statement where you want to use the function based index in order to squeeze out every possible consistent read? Or is that CTD?
Here's a simple test, there are some comments in-line
set autotrace off drop table dmk; create table dmk (a number,t varchar2(200));
insert into dmk select 0 , RPAD(owner||'.'||object_name||'.'||subobject_name||':'||object_type,200,'.') from dba_objects where rownum <= 1;
insert into dmk select 1 , RPAD(owner||'.'||object_name||'.'||subobject_name||':'||object_type,200,'.') from dba_objects where rownum <= 30000;
--function based index create index dmk_idx1 on dmk(case a when 0 then 0 end); analyze table dmk compute statistics; analyze table dmk compute statistics for columns a;
select num_rows from user_tables where table_name = 'DMK'; NUM_ROWS -- ---- -- 7644 select num_rows from user_indexes where table_name = 'DMK'; NUM_ROWS -- ---- -- 1 --so I only have the non null values in my index
set autotrace on
select count(t) from dmk where a = 1; COUNT(T) -- ---- -- 7643 1 row selected.
--if the predicate matches the function the index is used.
set autotrace off
--normal index drop index dmk_idx1; create index dmk_idx1 on dmk(a); analyze table dmk compute statistics; analyze table dmk compute statistics for columns a; --now lets try with a normal index, and a histogram
select num_rows from user_tables where table_name = 'DMK'; select num_rows from user_indexes where table_name = 'DMK';
set autotrace on select count(t) from dmk where a = 0; COUNT(T) -- ---- -- 1
1 row selected.
Execution Plan -- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=202) 1 0 SORT (AGGREGATE) 2 1 TABLE ACCESS (BY INDEX ROWID) OF 'DMK' (Cost=2 Card=1 Bytes=202) 3 2 INDEX (RANGE SCAN) OF 'DMK_IDX1' (NON-UNIQUE) (Cost=1 Card=1)
--the index scan has the same cost, but requires an additional consistent read because the index has an extra level because it contains entries all the rows. But you this way you don't have to modify the code.
select count(t) from dmk where a = 1; set autotrace off