cappuccino
cappuccino

Reputation: 3

Basic SQL (compile in SAS) - retrieve top 10 entries by frequency

I'm new to SQL and trying to run this piece of code to do a count on the frequency of each entry, but I'm getting an error (and can't figure out why - no real debugging in this environment, just "ERROR 22-322: Syntax error, expecting one of the following: ;, ',', USING. & ERROR 202-322: The option or parameter is not recognized and will be ignored.").

I'm compiling in SAS Enterprise Guide 6.1, by the way. Any help would be much appreciated!

proc sql;
create view work.temp as 
select model_name, count(*) as frequency
from mytable_name
where model_type like '%Smartphones%'
and model_name not like '%Apple iPhone 5%'
group by model_name
order by count(*) desc 
limit 10;   
quit; 

Upvotes: 0

Views: 5196

Answers (2)

Pasha
Pasha

Reputation: 11

You can use the OUTOBS= option of PROC SQL to limit observations.

proc sql outobs= 10;
    create view work.temp as 
    select model_name, count(*) as frequency
    from mytable_name
    where model_type like '%Smartphones%'
    and model_name not like '%Apple iPhone 5%'
    group by model_name
    order by count(*) desc;   
quit; 

If you are trying to learn SQL in SAS using some sample code then I would recommend going through these examples from SAS.

Upvotes: 1

Joe
Joe

Reputation: 63424

limit 10 is not valid code in SAS proc sql. In general, there isn't a good way to limit observations on output in SAS PROC SQL; (obs=10) is not valid as a dataset option for some reason.

You also can't order by count(*); you can only order by the created column.

So you could do this:

proc sql;
  create view work.temp as 
    select model, count(*) as frequency
      from sashelp.cars
      group by model
      order by 2  desc 
;
quit; 

2 there refers to the second column (so, the frequency column).

If you want to limit to a smaller number of observations, you can use (obs=#) where you actually use the view, in some cases (but not in others - in general, I don't think it's valid in PROC SQL in most places).

In general this particular query is probably easier in base SAS rather than SQL, especially if you want to limit its output.

proc freq data=sashelp.cars order=freq noprint;
  where origin='Asia';
  tables make/out=makelist(where=(count>10) keep=make count);
run;

That will run a bit faster than SQL (assuming you don't gain anything from the view, anwyay; if you're only outputting a few rows view won't help you much). Of course here I'm limiting to a particular count not a particular # of obs, but it's much the same.

To accomplish the title's question, you could then simply select the top 10 entries from that dataset (assuming you asked PROC FREQ to return the dataset sorted by frequency as I do above):

data topten;
  set makelist;
  if _n_ le 10;  *first 10 iterations;
run;

Or use proc rank or any number of options.

Upvotes: 1

Related Questions