user2146441
user2146441

Reputation: 228

SAS total of 'by' category

I have the following sample data:

data have;
   input username $ amount betdate : datetime.;
   dateOnly = datepart(betdate) ;
   format betdate DATETIME.;
   format dateOnly ddmmyy8.;
   datalines; 
player1 90 12NOV2008:12:04:01
player1 -100 04NOV2008:09:03:44
player2 120 07NOV2008:14:03:33
player1 -50 05NOV2008:09:00:00
player1 -30 05NOV2008:09:05:00
player1 20 05NOV2008:09:00:05
player2 10 09NOV2008:10:05:10
player2 -35 15NOV2008:15:05:33
run;
 PROC PRINT data=have; RUN;

proc sort data=have;
   by username betdate;
run;

data want;
   set have;
   by username dateOnly betdate;   
   retain calendarTime eventTime cumulativeDailyProfit profitableFlag totalDailyProfit;
   if first.username then calendarTime = 0;
   if first.dateOnly then calendarTime + 1;
   if first.username then eventTime = 0;
   if first.betdate then eventTime + 1;   

   if first.username then cumulativeDailyProfit= 0;
   if first.dateOnly then cumulativeDailyProfit= 0;
   if first.betdate then cumulativeDailyProfit+ amount;

   if first.dateOnly then totalDailyProfit = 0;
   if first.betdate then totalDailyProfit + amount;
 PROC PRINT data=want; RUN;

The final column in the output 'cumulativeDailyProfit' is exactly what I want: an incrementing value which adds the value of the 'amount' field. However, I don't want the same to happen for the field 'totalDailyProfit' becuase I want this to show the profit at the end of the day i.e. the last value for cumulativeDailyProfit for each customer.

For example, the eight columns above would ideally show the following: -100, -60, -60 ,-60, 90, 120, 10, -35. I will then set the 'profitableFlag' boolean if this value is greater than 0 for the rows relating to that day and that customer.

Is this something that can actually be done in a data step? I want to be able to run the following query (with the right flag in the case when clause) to get the average, average on winning days and average on losing days.

proc sql;
    select calendarTime,
    mean(amount) as meanStake,
    mean(case when profitableFlag  = 1 then amount else . End) as meanLosingDayStake,
    mean(case when profitableFlag  = 1 then amount else . End) as meanWinningDayStake
    from want
    group by 1;     
quit;

Upvotes: 0

Views: 128

Answers (1)

isJustMe
isJustMe

Reputation: 5470

Try this query :

proc sql;
    select calendarTime,
    avg(amount) as meanStake,
    avg(case when profitableFlag  = 1 
            then amount else 0 End) as meanLosingDayStake,
    avg(case when profitableFlag  = 1
            then amount else 0 End) as meanWinningDayStake
    from want
    group by calendarTime;     
quit;

Upvotes: 1

Related Questions