Shingston
Shingston

Reputation: 217

Building a table that counts the frequency that a value appears within the same by group as another value of the same variable

I have build a table where the headers each column except the first column is a unique value, and every cell in the first column is a unique value - just the headers but vertical.

values value1 value2 value3
value1
value2
value3

I also have a dataset group. Every unique group can have multiple unique values.

group value
group1 value1
group1 value2
group1 value3
group2 value2
group2 value3
group2 value4
group2 value5

I want to see the amount of times every unique value appears alongside any other unique value that is not itself within the same group by populating my first table.

Desired output, given the above example group dataset:

values value1 value2 value3
value1 0 1 1
value2 1 0 2
value3 1 2 0

I have difficulties finding answers as I am not sure what the type of output I want is called. How would I go about achieving my desired results?

Thank you for your help.

Upvotes: 1

Views: 68

Answers (1)

Stu Sztukowski
Stu Sztukowski

Reputation: 12849

Create a cross-join that counts the number of times each value matches with a different value:

proc sql noprint;
    create table count_values as
        select t1.value as value_t1
             , t2.value as value_t2
             , count(*) as total
        from have as t1, have as t2
        where     t1.group = t2.group 
              AND t1.value NE t2.value
        group by t1.value, t2.value
    ;
quit;

Which gives you this:

value_t1    value_t2    total
value1      value2      1
value1      value3      1
value2      value1      1
value2      value3      2
value3      value1      1
value3      value2      2

Now we can do a few different things. You can use proc tabulate and generate your table:

options missing='0';

proc tabulate data   = count_values
              out    = totals 
              format = 8.0;

    class value_t1 value_t2;
    var total;

    /* ' ' eliminates row headings */
    table value_t1 = ' ', 
          value_t2 = ' '*total = ' '*sum=' '
    ;
run;

enter image description here

Or, you can turn it into a dataset:

proc transpose data = count_values 
               out  = count_values_tpose(drop=_NAME_);
    by value_t1;
    id value_t2;
    var total;
run;

data want;

    /* Set var order */
    length value_t1 $8.
           value1-value3 8.
    ;

    set count_values_tpose;
    array value[*] value1-value3;

    /* Set missing to 0 */
    do i = 1 to dim(value);
        if(value[i] = .) then value[i] = 0;
    end;

    rename value_t1 = values;

    drop i;
run;

Output:

values  value1  value2  value3
value1  0       1       1
value2  1       0       2
value3  1       2       0

Upvotes: 2

Related Questions