oivemaria

Reputation: 463

Ordering boxplots using SGplot in SAS

I'm generating vertical boxplots in SAS with the SGPLOT procedure; the code follows. I cannot control the order of the boxes. I'd like to order them by ascending mean of the variable being measured, "cholesterol", but cannot make that happen. I'd also like to change the displayed names of the levels of the category variable "weight_status", and perhaps plot the boxes sorted alphabetically.

proc sgplot data=sashelp.heart;
title "Cholesterol Distribution by Weight Class";
vbox cholesterol / category=weight_status GROUPORDER= descending;
run;
title;

Could someone assist with this?

Upvotes: 2

Views: 2639

Answers (1)

Dirk Horsten

Reputation: 3845

Correcting your error

GROUPORDER= only takes effect together with GROUP=, not with CATEGORY=, so this works:

title "Cholesterol Distribution by Weight Status";
proc sgplot data=sashelp.heart;
    where Weight_Status NE '';
    vbox cholesterol / 
        GROUP=weight_status  
        GROUPORDER= descending;
run;
title;

I had to add the where clause because missing categories are suppressed but missing groups are not.
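To see how many observations that WHERE clause drops, a frequency table that includes missing values can help. This is only a quick check, assuming the unmodified sashelp.heart dataset:

```sas
/* Count observations per weight class; the MISSING option
   lists the blank weight_status level as its own row. */
proc freq data=sashelp.heart;
    tables weight_status / missing;
run;
```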

Sorting by the mean of the variable being measured

To do this, you must add that mean to the dataset, for instance this way:

proc sql;
    create view temp_heart as
    select individu.weight_status, cholesterol, group_mean
    from   sashelp.heart individu inner join
        (   select weight_status, mean(cholesterol) as group_mean
            from   sashelp.heart
            group by weight_status ) collection 
        on individu.weight_status EQ collection.weight_status;
quit;
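Before plotting, it may be worth confirming that the view carries one mean per weight class. A minimal check, assuming the temp_heart view was created as above:

```sas
/* List each weight class once with its mean cholesterol,
   lowest mean first; this is the order the boxes should follow. */
proc sql;
    select distinct weight_status, group_mean
    from   temp_heart
    where  weight_status NE ''
    order by group_mean;
quit;
```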

I have two ways for you to display this.

Use mean cholesterol as category and weight_status as group

Numeric categories are sorted in numerical order, so that does the job. To also show your weight classes, I use them as the group, which puts them in the legend:

proc sgplot data=temp_heart;
    where Weight_Status NE '';
    vbox cholesterol / 
        Category=group_mean
        GROUP=weight_status  ;
run;

Now this is quick and dirty, I agree, so I have another option

Use mean cholesterol as category, but put a format on it

Use PROC FORMAT to create a format that maps each average to its weight class, then apply that format to the category variable:

proc sql;
    create table temp_means as 
    select 'mean2status' as fmtname
         , mean(cholesterol) - 1E-8 as start
         , mean(cholesterol) + 1E-8 as end
         , weight_status as label
    from   sashelp.heart
    group by weight_status;
quit;

proc format cntlin=temp_means (where=(label NE '')) cntlout=temp_check;
run;

proc sgplot data=temp_heart (where=(weight_status NE ''));
    format group_mean mean2status.;
    vbox cholesterol / 
        category=group_mean;
run;

Unfortunately, I had to use an interval 2E-8 wide around each average before it worked; I could not get the format to match the averages exactly.
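A different route, not one of the two options above, might avoid the format and its interval altogether: sort the rows by group mean and ask the axis to keep data order. This is only a sketch. It assumes the temp_heart view from earlier, and heart_sorted is a hypothetical dataset name chosen for illustration.

```sas
/* Copy the view into a table sorted by ascending group mean.
   heart_sorted is a hypothetical name. */
proc sort data=temp_heart out=heart_sorted;
    where weight_status NE '';
    by group_mean;
run;

proc sgplot data=heart_sorted;
    vbox cholesterol / category=weight_status;
    /* DISCRETEORDER=DATA keeps categories in the order
       they first occur in the sorted data. */
    xaxis discreteorder=data;
run;
```

Because the category is weight_status itself, the axis shows the class names directly and no tolerance around the means is needed.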

Upvotes: 2
