Reputation: 55524
I have a PL/SQL procedure that does a lot of SUBSTR
s on a VARCHAR2
parameter. I would like to remove the length limit, so I tried to change it to CLOB
.
Works fine, but performance suffers, so I did some tests (based on these tests from 2005).
UPDATE: I can reproduce this on several different instances with different Oracle versions and different hardware, dbms_lob.substr
is always noticeable slower than substr(CLOB)
, and a lot slower than SUBSTR(VARCHAR2)
.
Bob's results and the tests in the link above tell a different story.
Can anyone explain this, or at least reproduce either Bob's or my results? Thanks!
Test results:
+000000000 00:00:00.004000000 (VARCHAR2)
+000000000 00:00:00.298000000 (CLOB SUBSTR)
+000000000 00:00:00.356000000 (DBMS_LOB.SUBSTR)
Test code:
DECLARE
l_text VARCHAR2(30) := 'This is a test record';
l_clob CLOB := l_text;
l_substr VARCHAR2(30);
t TIMESTAMP;
BEGIN
t := SYSTIMESTAMP;
FOR i IN 1..100000 LOOP
l_substr := SUBSTR(l_text,1,14);
END LOOP;
dbms_output.put_line( SYSTIMESTAMP - t || ' (VARCHAR2)');
t := SYSTIMESTAMP;
FOR i IN 1..100000 LOOP
l_substr := SUBSTR(l_clob,1,14);
END LOOP;
dbms_output.put_line( SYSTIMESTAMP - t || ' (CLOB SUBSTR)');
t := SYSTIMESTAMP;
FOR i IN 1..100000 LOOP
l_substr := DBMS_LOB.SUBSTR(l_clob,14,1);
END LOOP;
dbms_output.put_line( SYSTIMESTAMP - t || ' (DBMS_LOB.SUBSTR)');
END;
Upvotes: 25
Views: 23326
Reputation: 191275
I know this is very old, but may still be relevant to people on older systems. This looks like a data type conversion problem. Based on something I noticed looking at the effect @bernhard.weingartner saw, the data type of the offset and amount arguments seems to make a huge difference.
This is run on 11.2.0.3 on Linux (OEL 5.6), and increased to a million iterations just to make the differences even more obvious:
DECLARE
l_text VARCHAR2(30) := 'This is a test record';
l_clob CLOB := l_text;
l_substr VARCHAR2(30);
t TIMESTAMP;
BEGIN
t := SYSTIMESTAMP;
FOR i IN 1..1000000 LOOP
l_substr := SUBSTR(l_text,1,14);
END LOOP;
dbms_output.put_line( SYSTIMESTAMP - t || ' (VARCHAR2)');
t := SYSTIMESTAMP;
FOR i IN 1..1000000 LOOP
l_substr := SUBSTR(l_clob,1,14);
END LOOP;
dbms_output.put_line( SYSTIMESTAMP - t || ' (CLOB SUBSTR)');
t := SYSTIMESTAMP;
FOR i IN 1..1000000 LOOP
l_substr := DBMS_LOB.SUBSTR(l_clob,14,1);
END LOOP;
dbms_output.put_line( SYSTIMESTAMP - t || ' (DBMS_LOB.SUBSTR with 14,1)');
t := SYSTIMESTAMP;
FOR i IN 1..1000000 LOOP
l_substr := DBMS_LOB.SUBSTR(l_clob,14.0,1.0);
END LOOP;
dbms_output.put_line( SYSTIMESTAMP - t || ' (DBMS_LOB.SUBSTR with 14.0,1.0)');
t := SYSTIMESTAMP;
FOR i IN 1..1000000 LOOP
l_substr := DBMS_LOB.SUBSTR(l_clob,cast(14 as number), cast(1 as number));
END LOOP;
dbms_output.put_line( SYSTIMESTAMP - t || ' (DBMS_LOB.SUBSTR with casts)');
END;
/
+000000000 00:00:00.043019000 (VARCHAR2)
+000000000 00:00:03.671621000 (CLOB SUBSTR)
+000000000 00:00:05.704337000 (DBMS_LOB.SUBSTR with 14,1)
+000000000 00:00:00.040097000 (DBMS_LOB.SUBSTR with 14.0,1.0)
+000000000 00:00:00.040907000 (DBMS_LOB.SUBSTR with casts)
The 11gR2 docs show the formal parameters as type INTEGER, but actually passing an integer (or pls_integer, or binary_double) is slow, while explicitly passing a number is quick.
From your original question and Bob's results this looks like something that changed between 11.1 and 11.2. I don't have a 12c instance to test on so don't know if it's changed again. Whether it's due to a change in dbms_lob
or a wider change to how PL/SQL handles numeric values by default isn't clear. I haven't found anything on MOS that looks relevant.
Upvotes: 2
Reputation: 17643
I see that on 11gR1 tests ran smooth for DBMS_LOB.substr, but for 11gR2 the function is slow.
Below my test on Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
on AIX6.
+000000000 00:00:00.009440000 (VARCHAR2)
+000000000 00:00:00.749113000 (CLOB SUBSTR)
+000000000 00:00:01.177685000 (DBMS_LOB.SUBSTR)
Upvotes: 2
Reputation: 341
Ran your script three times on the following system:
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
Here are the results:
+000000000 00:00:00.007787000 (VARCHAR2)
+000000000 00:00:03.093258000 (CLOB SUBSTR)
+000000000 00:00:00.340017000 (DBMS_LOB.SUBSTR)
+000000000 00:00:00.019460000 (VARCHAR2)
+000000000 00:00:03.302425000 (CLOB SUBSTR)
+000000000 00:00:00.336915000 (DBMS_LOB.SUBSTR)
+000000000 00:00:00.007773000 (VARCHAR2)
+000000000 00:00:03.210619000 (CLOB SUBSTR)
+000000000 00:00:00.336689000 (DBMS_LOB.SUBSTR)
Upvotes: 2
Reputation: 50017
(Lies, damn lies, and benchmarks...)
I re-ran your test 10 times, expanding the string so it was a full 30 characters long, and got the following averaged results:
+000000000 00:00:00.011694200 (VARCHAR2)
+000000000 00:00:00.901000600 (CLOB SUBSTR)
+000000000 00:00:00.013169200 (DBMS_LOB.SUBSTR)
I then changed the substring range to 5,14 (14,5 for DBMS_LOB.SUBSTR) and got:
+000000000 00:00:00.011731000 (VARCHAR2)
+000000000 00:00:01.010840000 (CLOB SUBSTR)
+000000000 00:00:00.011427000 (DBMS_LOB.SUBSTR)
I then changed the range to 17,14 (14,17 for DBMS_LOB.SUBSTR) and got
+000000000 00:00:00.013578900 (VARCHAR2)
+000000000 00:00:00.964527400 (CLOB SUBSTR)
+000000000 00:00:00.011416800 (DBMS_LOB.SUBSTR)
Finally, I changed the range to 25,14 (14,25 for DBMS_LOB.SUBSTR) and got
+000000000 00:00:00.011210200 (VARCHAR2)
+000000000 00:00:00.916439800 (CLOB SUBSTR)
+000000000 00:00:00.013781300 (DBMS_LOB.SUBSTR)
My conclusion is that when working against CLOB's it's best to use DBMS_LOB.SUBSTR as it appears to have effectively no performance penalty compared to using SUBSTR against a "normal" VARCHAR2. SUBSTR against a CLOB seems to suffer from a significant performance penalty. For the record - OS = HP/UX (Unix variant), Oracle version=11.1, processor=HP Itanium 2-plex. YMMV.
Share and enjoy.
And because if it's worth doing it's worth over-doing, here's some more results with the strings expanded to 32767 characters. Substring ranges given with each set of results:
1, 25000
+000000000 00:00:00.198466400 (VARCHAR2)
+000000000 00:00:02.870958700 (CLOB SUBSTR)
+000000000 00:00:00.174490100 (DBMS_LOB.SUBSTR)
1000, 25000
+000000000 00:00:00.253447900 (VARCHAR2)
+000000000 00:00:02.491790500 (CLOB SUBSTR)
+000000000 00:00:00.193560100 (DBMS_LOB.SUBSTR)
10000, 25000
+000000000 00:00:00.217812000 (VARCHAR2)
+000000000 00:00:02.268794800 (CLOB SUBSTR)
+000000000 00:00:00.222200200 (DBMS_LOB.SUBSTR)
Same day, same conclusion.
Cthulhu fhtagn.
(Once more unto the breach, dear friends, once more...)
Re-ran the benchmarks, changing the size of the CLOB to 3276700, and taking the substring from the middle starting at 2475000 for length 25000 I get:
+000000000 00:00:00.176883200 (VARCHAR2)
+000000000 00:00:02.069482600 (CLOB SUBSTR)
+000000000 00:00:00.175341500 (DBMS_LOB.SUBSTR)
(Note that changes only affect the last two tests).
AND...same results, different day.
YMMV.
Upvotes: 17