TMNuclear
TMNuclear

Reputation: 1175

PL/SQL Error: A loop that contains DML statements should be refactored to use BULK COLLECT and FORALL

I've searched across the whole internet for some examples, but I still can't get my head around why I can not use DML statement inside this cursor. I'm kind of missing the theory behind it, but I won't deny an example of how to do write this correctly would make my life lots easier as well. Here is the query I'm working on (Note: I removed exits when not found results, close if cursor already open and things like that just to focus on the main point here):

DECLARE
    // lots of vars

    // the cursor below gets all datasources connected to Node XXYZ123
    CURSOR DataSourceCheck
    IS
        SELECT NODENAAM, NAAM, URL, DBNODE1, DBNODE2, DBUSERNAAM, DBNAAM
        FROM SCHEMA.TABLENAME
        WHERE NODENAAM = 'XXYZ123';

    // this cursor will execute row-by-row based on the result set of above cursor
    CURSOR CheckIfOnlyDataSource
    IS
        SELECT NODENAAM, NAAM, URL, DBNODE1, DBNODE2, DBUSERNAAM, DBNAAM
        FROM SCHEMA.TABLENAME
        WHERE DBUSERNAAM = var_dbusernaam AND (DBNode1 = var_dbnode1 OR DBNode2 = var_dbnode2);

BEGIN  
    OPEN DataSourceCheck;   
    LOOP
        FETCH DataSourceCheck into var_nodenaam, var_naam, var_URL, var_dbnode1, var_dbnode2, var_dbusernaam, var_dbnaam;

        var_rowcount:= 0;
        OPEN CheckIfOnlyDataSource;     
            LOOP
                FETCH CheckIfOnlyDataSource into var_nodenaam2, var_naam2, var_URL2, var_dbnode12, var_dbnode22, var_dbusernaam2, var_dbnaam2;
                var_rowcount:= var_rowcount + 1;
            END LOOP;  

            // only save result in a temp table when var_rowcount is 1 and not higher.
            IF var_rowcount = 1
                THEN
                    INSERT INTO global_temp_table
                    (t_dbusernaam, t_nodenaam, t_dbnode1, t_dbnode2, t_distinctcount)
                    VALUES
                    (var_dbusernaam2, var_nodenaam2, var_dbnode12, var_dbnode22, var_rowcount)
            END IF;
        CLOSE CheckIfOnlyDataSource;
    END LOOP;    
END;

The point of failure is this part, with the message that DML should be reconfigured into FORALL or BULK INTO statements:

IF var_rowcount = 1
    THEN
        INSERT INTO global_temp_table
        (t_dbusernaam, t_nodenaam, t_dbnode1, t_dbnode2, t_distinctcount)
        VALUES
        (var_dbusernaam2, var_nodenaam2, var_dbnode12, var_dbnode22, var_rowcount)
END IF;

I don't understand why DML is not working in a row-by-row approach? The output is clearly stored inside the variables var_dbusernaam2, var_nodenaam2, var_dbnode12 and var_dbnode22, hence I can do a dbms_output.put_line to show them. But if it is stored into the variable already, then why can't I just store it simply into a table (this isn't billions of bulk data, not even 1000 records!).

Is there no simple workaround? I gave BULK COLLECT and FORALL a try, but I need a lot more time to invest to understand it and get the query right - the cursor in a cursor definately won't make it any easier.

Upvotes: 1

Views: 10512

Answers (2)

Boneist
Boneist

Reputation: 23578

In addition to the suggestion in Mottor's answer, the reason why Toad is flagging up your code is because row-by-row processing is slow. You've got a lot of context switching going on between the PL/SQL and SQL engines.

Think of it like building new wall near your house - if the bricks are delivered to the bottom of the drive, do you:

  1. Go to the pile of bricks
  2. Pick up a single brick
  3. Go back to your wall
  4. Add the brick onto the wall
  5. Go back to step 1 and repeat until the wall is complete

(This is the equivalent of row-by-row processing)

Or:

  1. Take your wheelbarrow down to the pile of bricks
  2. Load your wheelbarrow with as many bricks as will fit and/or you can carry
  3. Take the wheelbarrow back over to the wall
  4. Add each brick into the wall
  5. Go back to step 1 and repeat until the wall is complete.

(This is the equivalent of bulk processing.)

Of course, if you're canny, you could avoid all the walking and carrying required in the above scenarios by getting the bricks delivered right next to the wall in the first place. (This is the equivalent of set-based processing).

Turning your procedure into a set-based approach (incorporating Mottor's answer) would make it simply:

declare
    -- lots of vars
begin
  insert into global_temp_table (t_dbusernaam,
                                 t_nodenaam,
                                 t_dbnode1,
                                 t_dbnode2,
                                 t_distinctcount)
  select dbusernaam,
         nodenaam,
         dbnode1,
         dbnode2,
         cnt
  from   (select nodenaam,
                 naam,
                 url,
                 dbnode1,
                 dbnode2,
                 dbusernaam,
                 dbnaam,
                 count(*) over (partition by dbnode1, dbnode2, dbusernaam) cnt
          from   schema.tablename
          where  nodenaam = 'XXYZ123')
  where  cnt = 1;
end;
/

This has the advantage of being more compact than your original code, making it easier to read, understand and therefore debug. Plus you can run the select statement on its own outside of the procedure - much easier to see what it's doing that way.

It will also be faster than your original approach of looping through two cursors (which, by the way, was reinventing the nested loop join - something that the database is optimised to do in pure SQL... and may not be the fastest way of doing the join anyway, if you had been stuck with keeping the join!).

I'd also be interested to know why you need to insert the rows into the GLOBAL_TEMP_TABLE (which I suspect is a GTT - global temporary table - rather than a normal heap table) - can you not do the subsequent processing in a single SQL statement, using the above select statement rather than inserting the data into the GTT?

Upvotes: 7

Mottor
Mottor

Reputation: 1948

This is not an error, but the TOAD suggestion with number Rule 4809.

P.S. If the table is the same in the both query, you can use

..., COUNT(*) OVER (PARTITION BY DBNODE1, DBNODE2, DBUSERNAAM) c 

in the first query to get the number of rows per DBNODE1, DBNODE2, DBUSERNAAM and not to need the second one.

Upvotes: 2

Related Questions