PROC TABULATE Rearrange header subgroups independent of the data

Question

This is a follow up to a previous question which considered only a single variable with two subgroups. In that question, the solution was to order the subgroups by sorting the data.

That approach, however, breaks down when there are several variables. It becomes a whack-a-mole of sorting the variables into a sequence which produces the desired result. Each subsequent BY variable subgroup depends on the previous levels. If not all subgroups are present in a higher level group, then not all subgroups can be arranged into the desired order.

Either PROC TABULATE is not the appropriate tool for this task (i.e. obtaining percents of various groups across several variables) or there is a technique that allows for subgroups to be arranged independently of the data.

Do I go digging for the needle in the documentation haystack or do I reinvent the wheel? Any insights you could give me would be appreciated.

Example:

To give an illustration, say that I want to create a table with the subgroups of each variable arranged in (Y, N) order. Notice how var4 is not output in the correct order. By the time the other variables have been sorted, there aren't enough var4 values present to be sorted into the desired order.

data example;
    input group $ var1 $ var2 $ var3 $ var4 $;
    datalines;
    1 Y Y N Y
    1 N Y N N
    2 Y N Y N
    2 Y Y Y N
    3 N N N Y
    3 N Y Y N
    ;
run;

proc sort data = example out = sorted;
    by  descending var1
        descending var2
        descending var3
        descending var4
        ;
run;

title 'Percent';
proc tabulate data = sorted order = data;
    class group var1 var2 var3 var4;
    table group='Group', 
            all  = 'Total'*pctn='' 
            var1 = 'Variable 1'*pctn=''
            var2 = 'Variable 2'*pctn=''
            var3 = 'Variable 3'*pctn=''
            var4 = 'Variable 4'*pctn='';
run;

It may be possible to devise a combination of BY variables in the PROC SORT which give a (Y, N) subgrouping order, but it would involve a bunch of fiddling which is not robust against changes in data. If the table needs to be updated monthly, then each month you would have to fiddle with the sorting.

PROC TABULATE Rearrange header subgroups independent of the data

Answers (1)

Related Questions