Oracle stored procedure, returning ref cursor vs associative arrays

Question

Our DBA requires us to return all tabular data from stored procedures in a set of associative arrays rather than using a ref cursor which is what I see in most examples on the web. He says this is because it is much faster for Oracle to do things this way, but it seems counter intuitive to me because the data needs to be looped over twice, once in the stored procedure and then again in the application when it is processes. Also, values often need to be casted from their native types to varchar so they can be stored in the array and then casted back on the application side. Using this method also makes it difficult to use orm tools because they seem to want ref cursors in most cases.

An example of a stored procedure is the following:

PROCEDURE sample_procedure (
                                p_One       OUT varchar_array_type,
                                p_Two       OUT varchar_array_type,
                                p_Three     OUT varchar_array_type,
                                p_Four      OUT varchar_array_type
                            )
IS
p_title_procedure_name        VARCHAR2(100) := 'sample_procedure';
v_start_time DATE :=SYSDATE;    

CURSOR cur
  IS
    SELECT e.one, e.two, e.three, e.four FROM package.table 
    WHERE filter='something';

    v_counter PLS_INTEGER := 0;
BEGIN

    FOR rec IN cur LOOP
        BEGIN
            v_counter := v_counter + 1;
            p_One(v_counter) := rec.one;
            p_Two(v_counter) := rec.two;
            p_Three(v_counter) := rec.three;
            p_Four(v_counter) := rec.four;
        END;
    END LOOP;
END;

The cursor is used to populate one array for each column returned. I have tried to find information supporting his claim that this is method faster but have been unable to do so. Can anyone fill me in on why he might want us (the .net developers) to write stored procedures in this way?

Justin Cave · Accepted Answer

The DBA's request does not make sense.

What the DBA is almost certainly thinking is that he wants to minimize the number of SQL to PL/SQL engine context shifts that go on when you're fetching data from a cursor. But the solution that is being suggested is poorly targetted at this particular problem and introduces other much more serious performance problems in most systems.

In Oracle, a SQL to PL/SQL context shift occurs when the PL/SQL VM asks the SQL VM for more data, the SQL VM responds by executing the statement further to get the data which it then packages up and hands back to the PL/SQL VM. If the PL/SQL engine is asking for rows one at a time and you're fetching a lot of rows, it is possible that these context shifts can be a significant fraction of your overall runtime. To combat that problem, Oracle introduced the concept of bulk operations back at least in the 8i days. This allowed the PL/SQL VM to request multiple rows at a time from the SQL VM. If the PL/SQL VM requests 100 rows at a time, you've eliminated 99% of the context shifts and your code potentially runs much faster.

Once bulk operations were introduced, there was a lot of code that could be refactored in order to be more efficient by explicitly using BULK COLLECT operations rather than fetching row-by-row and then using FORALL loops to process the data in those collections. By the 10.2 days, however, Oracle had integrated bulk operations into implicit FOR loops so an implicit FOR loop now automatically bulk collects in batches of 100 rather than fetching row-by-row.

In your case, however, since you're returning the data to a client application, the use of bulk operations is much less significant. Any decent client-side API is going to have functionality that lets the client specify how many rows need to be fetched from the cursor in each network round-trip and those fetch requests are going to go directly to the SQL VM, not through the PL/SQL VM, so there are no SQL to PL/SQL context shifts to worry about. Your application has to worry about fetching an appropriate number of rows in each round-trip-- enough that the application doesn't become too chatty and bottleneck on the network but not so many that you have to wait too long for the results to be returned or to store too much data in memory.

Returning PL/SQL collections rather than a REF CURSOR to a client application isn't going to reduce the number of context shifts that take place. But it is going to have a bunch of other downsides not the least of which is memory usage. A PL/SQL collection has to be stored entirely in the process global area (PGA) (assuming dedicated server connections) on the database server. This is a chunk of memory that has to be allocated from the server's RAM. That means that the server is going to have to allocate memory in which to fetch every last row that every client requests. That, in turn, is going to dramatically limit the scalability of your application and, depending on the database configuration, may steal RAM away from other parts of the Oracle database that would be very useful in improving application performance. And if you run out of PGA space, your sessions will start to get memory related errors. Even in purely PL/SQL based applications, you would never want to fetch all the data into collections, you'd always want to fetch it in smaller batches, in order to minimize the amount of PGA you're using.

In addition, fetching all the data into memory is going to make the application feel much slower. Almost any framework is going to allow you to fetch data as you need it so, for example, if you have a report that you are displaying in pages of 25 rows each, your application would only need to fetch the first 25 rows before painting the first screen. And it would never have to fetch the next 25 rows unless the user happened to request the next page of results. If you're fetching the data into arrays like your DBA proposes, however, you're going to have to fetch all the rows before your application can start displaying the first row even if the user never wants to see more than the first handful of rows. That's going to mean a lot more I/O on the database server to fetch all the rows, more PGA on the server, more RAM on the application server to buffer the result, and longer waits for the network.

Oracle stored procedure, returning ref cursor vs associative arrays

Answers (2)

Related Questions