Reputation: 441
I have a query which looks like follow:
select
CASE
WHEN col BETWEEN 0 AND 20 THEN 0
WHEN col BETWEEN 20 AND 50 THEN 20
WHEN col BETWEEN 50 AND 100 THEN 40
WHEN col BETWEEN 100 AND 200 THEN 75
WHEN col BETWEEN 200 AND 10000 THEN 86
ELSE 0 END AS t_col
from TAB1;
I wanted to get a better performance and rewrote it to:
select case
WHEN col < 20 then 0
WHEN col < 50 then 20
WHEN col < 100 then 40
WHEN col < 200 then 75
WHEN col < 1000 then 86
END AS t_col
from TAB1;
I tought the second query might be faster because not an interval would be created but only the value of column would be compared to one number. The Explain Plan gives me identical results for both queries. i would like to know which one of them do a better performance?
Upvotes: 0
Views: 156
Reputation: 16001
They are effectively the same. Any tiny difference in CPU cycles will be overshadowed by I/O, network overhead etc required to perform the query.
I suppose the between
version lets you put the most frequently occurring values first, in theory improving efficiency if a large proportion of the values fall within one or two ranges. Then again, the <
version only requires one comparison per case instead of two.
I still award the <
version more style points :)
I tested both versions within a PL/SQL cursor loop to eliminate network issues, and ran the test multiple times. The results were within half a second of each other over 10 million iterations, which is about the same as the variation over multiple runs (because there is always something else running on any server or laptop).
create table t1 (c1, c2, c3) pctfree 0 nologging
as
select round(dbms_random.value(1,100))
, cast(dbms_random.string('X',40) as varchar2(40))
, cast(dbms_random.string('X',40) as varchar2(40))
from xmltable('1 to 1000000');
declare
type test_rectype is record(num number);
totalTime1 simple_integer := 0;
totalTime2 simple_integer := 0;
time1Percentage simple_integer := 0;
time2Percentage simple_integer := 0;
function testCursor
( cursorNum in integer )
return sys_refcursor
is
testCursor sys_refcursor;
begin
if cursorNum = 1 then
open testCursor for
select case
when c1 < 20 then 0
when c1 < 50 then 20
when c1 < 100 then 40
when c1 < 200 then 75
when c1 < 1000 then 86
end as t_c1
from t1;
elsif cursorNum = 2 then
open testCursor for
select case
when c1 < 20 then 0
when c1 < 50 then 20
when c1 < 100 then 40
when c1 < 200 then 75
when c1 < 1000 then 86
end as t_c1
from t1;
end if;
return testCursor;
end testCursor;
-- Fetch all rows from a cursor and return time in hundredths of a second:
procedure time_cursor
( inCursor in sys_refcursor
, outTime in out nocopy simple_integer )
is
startTime simple_integer := dbms_utility.get_time;
begin
-- 21c new iterator syntax
for r test_rectype in values of inCursor loop
null; -- Could also compare rowcounts here
end loop;
outTime := dbms_utility.get_time - startTime;
close inCursor;
end time_cursor;
-- Report timing difference:
procedure print_comparison
( time1 simple_integer
, time2 simple_integer )
is
begin
time1Percentage := 100 * time1 / (time1 + time2);
time2Percentage := 100 * time2 / (time2 + time2);
dbms_output.put_line('Between: '||to_char(time1/100,'900d00')|| rpad(' |',time1Percentage,'|'));
dbms_output.put_line('LessThan: '||to_char(time2/100,'900d00')|| rpad(' |',time2Percentage,'|'));
end print_comparison;
procedure compare_cursors
( runningTime1 in out nocopy simple_integer
, runningTime2 in out nocopy simple_integer )
is
testCursor1 sys_refcursor := testCursor(1);
testCursor2 sys_refcursor := testCursor(2);
time1 simple_integer := 0;
time2 simple_integer := 0;
time1Percentage simple_integer := 0;
time2Percentage simple_integer := 0;
begin
time_cursor(testCursor1, time1);
time_cursor(testCursor2, time2);
print_comparison(time1, time2);
-- Update running totals:
runningTime1 := nvl(runningTime1,0) + nvl(time1,0);
runningTime2 := nvl(runningTime2,0) + nvl(time2,0);
dbms_output.new_line;
end compare_cursors;
begin
for i in 1..10 loop
compare_cursors(totalTime1, totalTime2);
end loop;
dbms_output.put_line('Total:'||chr(10));
print_comparison(totalTime1, totalTime2);
end;
Between: 02.18 ||||||||||||||||||||||||||||||||||||||||||||||||
LessThan: 02.25 |||||||||||||||||||||||||||||||||||||||||||||||||
Between: 02.03 |||||||||||||||||||||||||||||||||||||||||||||||||
LessThan: 02.03 |||||||||||||||||||||||||||||||||||||||||||||||||
Between: 02.13 |||||||||||||||||||||||||||||||||||||||||||||||||
LessThan: 02.13 |||||||||||||||||||||||||||||||||||||||||||||||||
Between: 02.13 ||||||||||||||||||||||||||||||||||||||||||||||||||
LessThan: 02.07 |||||||||||||||||||||||||||||||||||||||||||||||||
Between: 02.16 ||||||||||||||||||||||||||||||||||||||||||||||||
LessThan: 02.27 |||||||||||||||||||||||||||||||||||||||||||||||||
Between: 02.28 |||||||||||||||||||||||||||||||||||||||||||||||||
LessThan: 02.28 |||||||||||||||||||||||||||||||||||||||||||||||||
Between: 02.17 ||||||||||||||||||||||||||||||||||||||||||||||||
LessThan: 02.27 |||||||||||||||||||||||||||||||||||||||||||||||||
Between: 02.24 ||||||||||||||||||||||||||||||||||||||||||||||||
LessThan: 02.30 |||||||||||||||||||||||||||||||||||||||||||||||||
Between: 02.24 |||||||||||||||||||||||||||||||||||||||||||||||||
LessThan: 02.27 |||||||||||||||||||||||||||||||||||||||||||||||||
Between: 02.25 ||||||||||||||||||||||||||||||||||||||||||||||||||
LessThan: 02.18 |||||||||||||||||||||||||||||||||||||||||||||||||
Total:
Between: 21.81 |||||||||||||||||||||||||||||||||||||||||||||||||
LessThan: 22.05 |||||||||||||||||||||||||||||||||||||||||||||||||
Upvotes: 3
Reputation: 8518
Well, let me do a PoC and see what happens
SQL> create table t1 ( c1 number, c2 varchar2(40) , c3 varchar2(40) ) ;
Table created.
SQL> declare
2 begin
3 for i in 1 .. 1000000
4 loop
5 insert into t1 values ( round(dbms_random.value(1,100)) , dbms_random.string('X',40) , dbms_random.string('X',40) );
6 end loop;
7 end;
8 /
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_table_stats('TESTUSER','T1');
PL/SQL procedure successfully completed.
SQL> select count(*) from t1 ;
COUNT(*)
----------
1000000
Scenario
SQL> set autotrace traceonly
SQL> select
CASE
WHEN c1 BETWEEN 0 AND 20 THEN 0
WHEN c1 BETWEEN 20 AND 50 THEN 20
WHEN c1 BETWEEN 50 AND 100 THEN 40
WHEN c1 BETWEEN 100 AND 200 THEN 75
WHEN c1 BETWEEN 200 AND 10000 THEN 86
ELSE 0 END AS t_c1
from t1; 2 3 4 5 6 7 8 9
1000000 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000K| 2929K| 2322 (1)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T1 | 1000K| 2929K| 2322 (1)| 00:00:01 |
--------------------------------------------------------------------------
SQL> select case
WHEN c1 < 20 then 0
WHEN c1 < 50 then 20
WHEN c1 < 100 then 40
WHEN c1 < 200 then 75
WHEN c1 < 1000 then 86
END AS t_c1
from t1; 2 3 4 5 6 7 8
1000000 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000K| 2929K| 2322 (1)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T1 | 1000K| 2929K| 2322 (1)| 00:00:01 |
--------------------------------------------------------------------------
They behave identical, right? Not exactly, although the plan looks the same, let's check the statistics. For that I will flush buffer cache and shared pool after each test.
SQL> set autotrace traceonly timing on
SQL> select
CASE
WHEN c1 BETWEEN 0 AND 20 THEN 0
WHEN c1 BETWEEN 20 AND 50 THEN 20
WHEN c1 BETWEEN 50 AND 100 THEN 40
WHEN c1 BETWEEN 100 AND 200 THEN 75
WHEN c1 BETWEEN 200 AND 10000 THEN 86
ELSE 0 END AS t_c1
from t1; 2 3 4 5 6 7 8 9
1000000 rows selected.
Elapsed: 00:00:02.92
Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000K| 2929K| 2322 (1)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T1 | 1000K| 2929K| 2322 (1)| 00:00:01 |
--------------------------------------------------------------------------
Statistics
----------------------------------------------------------
12 recursive calls
0 db block gets
72870 consistent gets
6180 physical reads
0 redo size
19435128 bytes sent via SQL*Net to client
733901 bytes received via SQL*Net from client
66668 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
1000000 rows processed
SQL> alter system flush shared_pool ;
System altered.
Elapsed: 00:00:00.08
SQL> alter system flush buffer_cache ;
System altered.
Elapsed: 00:00:00.04
SQL> select t1.* , case
WHEN c1 < 20 then 0
WHEN c1 < 50 then 20
WHEN c1 < 100 then 40
WHEN c1 < 200 then 75
WHEN c1 < 1000 then 86
END AS t_c1
from t1; 2 3 4 5 6 7 8
1000000 rows selected.
Elapsed: 00:00:03.49
Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000K| 81M| 2323 (1)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T1 | 1000K| 81M| 2323 (1)| 00:00:01 |
--------------------------------------------------------------------------
Statistics
----------------------------------------------------------
18 recursive calls
0 db block gets
72878 consistent gets
6180 physical reads
0 redo size
101747627 bytes sent via SQL*Net to client
733834 bytes received via SQL*Net from client
66668 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
1000000 rows processed
The second one is slower than the first one, probably because of the recursive calls
done in the second one ( 18 ) against the first one ( 12 ). If you start putting more fields, and the data is huge, I am quite sure you will get better performance in the query with between
than in the other one.
But obviously, it is just a feeling, you must test yourself in your own database. I am quite sure that the final query contains other fields, where conditions, etc... therefore this answer only covers what you put in the original question, and not what will happen in a real case scenario with many other fields, where conditions, indexes, etc...
Upvotes: 1