Reputation: 228
I have the following query which returns a customer's username, the year and half-year, a count of bets for the first semi-year, a count of bets for the second semi-year and a profit calc.
proc sql;
create table avg
as
select
username as username,
year(datepart(betdate))*10 + floor( (qtr(datepart(betdate))+1)/2) as yearsemiyear,
count(bet_id) as betcount,
sum( case when floor( (qtr(datepart(betdate))+1)/2) = 1 then 1 else 0 end ) as firstHalfBetcount,
sum( case when floor( (qtr(datepart(betdate))+1)/2) = 2 then 1 else 0 end ) as secondHalfBetcount,
round(mean((winnings - stake) / stake) * 100, 0.01) as averageMargin
from &dsn
WHERE datepart(BETDATE) > '31DEC2010'd AND datepart(BETDATE) <'01JAN2012'd
AND bet_type = 'SGL'
group by 1,2
/* HAVING sum( case when floor( (qtr(datepart(betdate))+1)/2) = 1 then 1 else 0 end ) >= 4 */
order by username desc, yearsemiyear asc;
quit;
How can I restrict this to customers who had at least 4 bets in the first half of the sample? If I uncomment the HAVING clause I have there, it only seems to return rows for the first half, and no second half rows are returned. I tried putting the restriction into the WHERE clause, but apparently you can't put a summary function in that CLAUSE.
How should I do this?
Thanks
Upvotes: 1
Views: 5975
Reputation: 1
I have the issue, too. After detailed compare the right code and mine, I found that the problem is group by, which is only character variables. If we use numerical ones, then the error comes
Upvotes: 0
Reputation: 63424
You can reference the calculated columns in your HAVING clause. See for example:
proc sql;
select country, sum(case when region='EAST' then Actual else 0 end) as east_act,
sum(case when region='WEST' then Actual else 0 end) as west_act
from sashelp.prdsale
group by 1
having east_act > 120000;
quit;
Specific to your problem, you should change your HAVING clause to
HAVING firstHalfBetcount ge 4
I'm looking at this again, though, and I think your problem is that you're grouping by half year - so you wouldn't get records from the second half, as you'd never have a record that has both first half and second half data (if your data is as I think it is). You either need to not group by half year, and do this at the overall year level, or not separate first half/second half in columns.
If your goal is to get rows for both, but only include rows for people with 4 first half bets, you may need to create this table and then join it to a query of people with at least 4 first half bets, or use a where (select... ) subquery to filter down to those people; you can't do it with just grouping and having.
Upvotes: 2