user2518751
user2518751

Reputation: 735

ERROR: Out of memory

I'm getting an Out Of Memory error when running a program with an input of ~5M obs. and 153 variables.

When running the same program on ~40,000 obs. - program executes and completes great. When I try to increment number of input obs. to more than 100K - I get an error : Out of memory.

I tried to follow the below tips and uncheck few boxes:

1.Tools ► Options ► Results General ► deselect all Result Formats.
2.Tools ► Options ► Results General ► deselect Open generated data/results automatically.
3.Tools ► Options ► Results General ► deselect Link handcoded ODS results.
4.Tools ► Options ► SAS Programs ► deselect Automatically direct results back to SAS Enterprise Guide.

Still - Not working.

I use Enterprise Guide 5.1 but eventually will insert the working code into a User Written Code transformation on Data Integration Studio 4.6.

Does anyone have a clue how to get this program work?

edited

This is the piece of code causes the error messaage:

 data _null_;
        set all_include_Stornos ;
        IF TREATMENT_IND = 1 AND DDS_ROW_IND NE 1 THEN DO;
        CALL EXECUTE ('%STORNO_TKUFA ('||POLICY_RK||');');
        CALL EXECUTE ('%UPDATE (STORNO_TKUFA_CUMULATE);');
        END;
        IF TREATMENT_IND in (4) AND DDS_ROW_IND NE 1  THEN DO;

            CALL EXECUTE ('%HAKPAA ('||POLICY_RK||','||POLICY_VERSION||');');
            call execute ('%UPDATE(HAKPAA_CUMULATE);');


        END;

        IF TREATMENT_IND  = 5  AND DDS_ROW_IND NE 1  and count_bitul_no <2  or (count_bitul_no >1 and max_bitul_ver = policy_verSion ) THEN DO;
            CALL EXECUTE ('%BITUL ('||POLICY_RK||','||POLICY_VERSION||');');
            CALL EXECUTE ('%UPDATE(BITUL_CUMULATE);');

        END;

        IF TREATMENT_IND = 6 AND DDS_ROW_IND NE 1  THEN DO;
        CALL EXECUTE ('%LAST_STATE ('||POLICY_RK||','||POLICY_VERSION||');'); 
        CALL EXECUTE ('%UPDATE (LAST_STATE_calc);');
        END;

        IF DDS_ROW_IND NE 1 and ((PREV_TREATMENT_IND = 4 AND TREATMENT_IND NOT IN (1,2,5)) or treatment_ind = 3) THEN DO;
            CALL EXECUTE ('%HAFSHARA ('||POLICY_RK||','||POLICY_VERSION||');');
            CALL EXECUTE ('%UPDATE (HAFSHARA_CUMULATE);');
        END;

            IF TREATMENT_IND = 2 AND POLICY_VERSION - 1 = max_bitul_ver AND DDS_ROW_IND NE 1  THEN DO;
        CALL EXECUTE ('%STORNO_BITUL ('||POLICY_RK||','||POLICY_VERSION||');');
        CALL EXECUTE ('%UPDATE (STORNO_BITUL_CUMULATE);');
        END;
RUN; 

and these are 2 macros (shortened) executed when treatment_ind = 5:

%macro BITUL (pol_rk , pol_ver );


    proc sql;
    create table macro_BITUL
    as select * from all_include_Stornos
    where policy_rk = &pol_rk
    and treatment_ind_5 = &pol_ver 
    order by policy_rk, policy_version;
    quit;



    data BITUL_calc;
    set macro_BITUL;
    BY POLICY_RK;

    IF LAST.policy_rk THEN ACT_DAILY_AMT_END_DT = POLICY_VERSION_END_DT;
            ELSE ACT_DAILY_AMT_END_DT = NEXT_POLICY_VERSION_START_DT; 

    VERSION_EXPOSURE_DAYS_NO = ACT_DAILY_AMT_END_DT - ACT_DAILY_AMT_START_DT + 1;

    BITUL_DURATION =  (POLICY_EXPIRATION_DT_5 - POLICY_VERSION_START_DT + 1) / (POLICY_VERSION_END_DT_5 - POLICY_VERSION_START_DT + 1);

    GAINED_NET_PREMIUM_V_AMT  = NET_PREMIUM_V_AMT *BITUL_DURATION;
    .
    .
    .
    GAINED_NET_COMMISION_B_IB_V_AMT = PRODUCR_B_NET_COMM_IB_V_AMT * BITUL_DURATION; 
    run;

    /**/


    PROC SQL;
    CREATE TABLE TOTAL_GAINED AS 
    SELECT POLICY_RK,
    SUM(GAINED_NET_PREMIUM_V_AMT) AS TOT_GAINED_NET_PREMIUM_V_AMT,
    SUM(GAINED_NET_FEES_V_AMT) AS TOT_GAINED_NET_FEES_V_AMT,
    .
    .
    .
    SUM(GAINED_NET_COMMISION_B_IB_V_AMT) AS TOT_GAINED_NET_COMMN_B_IB_V_AMT 

    FROM BITUL_calc
    GROUP BY POLICY_RK;
    QUIT;

    PROC SQL;
    CREATE TABLE BITUL_calc_AND_TOTALS AS
    SELECT A.* , 
    TOT_GAINED_NET_PREMIUM_V_AMT,
    .
    .
    JOIN TOTAL_GAINED AS B
    ON
    (A.POLICY_rK = B.POLICY_RK
    )order  by policy_rk, policy_version;
    QUIT;




    DATA bitul_CALCULATED;
    SET BITUL_calc_AND_TOTALS;
    IF TOT_GAINED_NET_PREMIUM_V_AMT = 0 THEN CALCULATED_NET_PREMIUM_V_AMT = 0; 
        ELSE CALCULATED_NET_PREMIUM_V_AMT = NET_PREMIUM_AMT_5 * GAINED_NET_PREMIUM_V_AMT / TOT_GAINED_NET_PREMIUM_V_AMT;
    .
    .
        ELSE CALC_NET_COMMISION_B_IB_V_AMT = PRODUCR_B_NET_COMM_IB_AMT_5 * GAINED_NET_COMMISION_B_IB_V_AMT / TOT_GAINED_NET_COMMN_B_IB_V_AMT;


    .
    .
    .
    .
    .


    run;

    DATA BITUL_CUMULATE;
    SET bitul_CALCULATED;
.
.
.
.
.
.
run;    

%mend ;




%MACRO UPDATE (TABLE_NAME);
        PROC SQL;
            DELETE FROM all_include_Stornos
            WHERE CATS(POLICY_RK,POLICY_VERSION) IN ( SELECT CATS(POLICY_RK,POLICY_VERSION) FROM     &TABLE_NAME);
            INSERT INTO all_include_Stornos
select *    FROM &TABLE_NAME
            ;QUIT;

/*          PROC SORT DATA=all_include_Stornos OUT=all_include_Stornos; BY POLICY_RK     POLICY_VERSION;RUN;*/

%MEND;

Upvotes: 0

Views: 4571

Answers (4)

Tom
Tom

Reputation: 51566

Either add %nrstr() into your CALL EXECUTE statements.

CALL EXECUTE ('%nrstr(%STORNO_TKUFA) ('||POLICY_RK||');');
CALL EXECUTE ('%nrstr(%UPDATE) (STORNO_TKUFA_CUMULATE);');

Or perhaps better change your data _null_ step to generate a file with the code and %include the file.

filename code temp;
data _null_;
  set all_include_Stornos ;
  IF TREATMENT_IND = 1 AND DDS_ROW_IND NE 1 THEN DO;
    put '%STORNO_TKUFA (' POLICY_RK ');';
    put '%UPDATE (STORNO_TKUFA_CUMULATE);';
...
RUN;
%include code ; 

Upvotes: 0

momo1644
momo1644

Reputation: 1804

First you need to check how much disk space is assigned to your sas session and sorting. The options for these are called MEMSIZE and SORTSIZE

Run this code and you will see the values in the log in bytes:

proc options group=memory option=(memsize sortsize) ; run;

I recommend to add this line as the first line in your code to assign MAX space to sort size: options sortsize=MAX; and see if this solved the issue.

Otherwise you can change the MEMSIZE to MAX , follow steps in this post.

Upvotes: 0

John Chrysostom
John Chrysostom

Reputation: 3963

Based on this: "I know on which data step it throws me out. Thing is, it's a data null calls different macros according to if statements."

Take a look at how Call Execute works. (http://www2.sas.com/proceedings/sugi22/CODERS/PAPER70.PDF will be helpful)

Essentially, Call Execute will parse the macro code immediately, but it queues up the resulting SAS steps until after the current data step finishes. In other words, you are potentially building up millions upon millions of lines of SAS code in memory that SAS is just storing up to be executed when that data _null_; step finishes. Eventually, this gets so large that SAS just craps out.

How to deal with this? Simply write a macro to break the data _null_; step into chunks. You can use proc sql to select the number of records in all_include_Stornos into a macro variable. Then use a macro loop to run the data step multiple times, using firstobs and obs to hit only a portion of the data each time. That way, you can keep the buffer that's built up by call execute down to a reasonable size. That should prevent you from running out of memory.

Upvotes: 1

mvherweg
mvherweg

Reputation: 1283

Based on the edit and your comment, i suspect that the 'order by' and/or 'proc sort' parts just cannot handle the volume on the system it is running.

50M is a lot of records, but still manageable by most. But if your data grows to 500M records, it might be too much.

I recall a client I had a while ago where the dataset was 120M rows and (iirc) +-300 variables and we did get similar issues.
Luckily, there are some ways to handle larger amounts of data in a sort.

1. Use the tagsort option
You can add tagsort as an option to the proc sort statement. In that case, SAS only sorts using the columns that are in the by statement and attaches the remaining part of the records afterwards. It is less resource-intensive, but of course, it is slower.

2. Use indexes for by statements
If you need the data to be in a particular order for a by statement later on in a proc/data step, you can create an index on these variables instead of sorting the data. My experience is that this will allow SAS to cope with larger datasets as well. Of course, it is slower, it consumes more space (for the index) and you need to take into account the existence (and destruction) of indexes.

3. Split the dataset, sort the parts and stitch everything together afterwards
A final method (which is probably preferrable to option 2 in many cases) is to:

  1. Split the dataset into smaller datasets.
  2. Sort each smaller dataset.
  3. Bring everything back together again by using a simple data step. (put all the smaller datasets in the set statement and use the by statement you sorted on)

This list method (and tagsort) are explained here: http://www.nesug.org/Proceedings/nesug12/cc/cc36.pdf

Note that all these methods require you to get rid of any order by and follow up those proc sql pieces with the appropriate sort procedure.

Upvotes: 0

Related Questions