Creating table with cumulative values

I have table like first table on the picture. enter image description here

It's information about banks deals on the FX market on daily basis (buy minus sell). I would like to calculate cumulative results like on the second table. The number of banks and their names, also as date are not fixed. I'm new in SAS and tried to find solutions, but didn't find anything useful. I will be glad for any help.

Upvotes: 0

Views: 1377

Answers (1)

Stu Sztukowski
Stu Sztukowski

Reputation: 12909

When data such as this is in a wide format, it can be more difficult to process in SAS compared to a long format. Long data formats have numerous benefits in the form of by-group processing, indexing, filtering, etc. Many SAS procedures are designed around this concept.

For more information on the examples below, check out SAS's example on the Program Data Vector and by-group processing. Mastering these concepts will help you with data step programming.

Here are two ways you can solve it:

1. Use a sum statement and by-group processing.

In this example, we will:

  • Convert the data from wide to long in order to convert the bank name to a character variable
  • Perform a cumulative sum on each bank
  • Convert back to long again

By converting the bank name into a character variable, we can use by-group processing on it.

/* Convert from wide to long */
proc transpose data=raw 
               out=raw_transposed 
               name=bank
               ;
    by date;
run;

proc sort data=raw_transposed;
    by bank date;
run;

/* Use by-group processing to get cumulative values by month for each bank */
data cumulative_long;
    set raw_transposed;
    by bank date;

    /* Reset the cumulative sum for each bank */
    if(first.bank) then call missing(cumulative);

    cumulative+COL1;
run;

proc sort data=raw_transposed;
    by date bank;
run;

/* Convert from long to wide */
proc transpose data=raw_transposed
               out=want(drop=_NAME_)
               ;
    by date;
    id bank;
    var COL1;
run;

The sum statement can be used as a shortcut of the following code:

data cumulative_long;
    set raw_transposed;
    by bank date;
    retain cumulative;

    if(first.bank) then cumulative = 0;

    cumulative = cumulative + COL1;
run;

cumulative does not exist in the dataset: we are creating it here. This value will become missing whenever SAS moves on to read a new row. We want SAS to carry the last value forward. retain tells SAS to carry its last value forward until we change it.

2. Use macro variables and dictionary tables

A second option would be to read all of the bank names from a dictionary table to prevent transposing. We will:

  • Read the names of the banks from the special table dictionary.columns into a macro variable using PROC SQL
  • Use arrays to perform cumulative sums

This assumes the bank naming scheme is always prefixed with "Bank." If does not follow a regular pattern, you can exclude all other variables from the initial SQL query.

proc sql noprint;
    select name
         , cats(name, '_cume')
    into :banks separated by ' '
       , :banks_cume separated by ' '
    from dictionary.columns
    where     memname = 'RAW'
          AND libname = 'WORK'
          AND upcase(name) LIKE 'BANK%'
    ;
quit;

data want;
    set raw;

    array banks[*]      &banks.;
    array banks_cume[*] &banks_cume.;

    do i = 1 to dim(banks);
        banks_cume[i]+banks[i];
    end;

    drop i;
run;

Upvotes: 2

Related Questions