Robbe Motmans
Robbe Motmans

Reputation: 135

combine two columns into two other columns?

I have a dataset that looks like this:

weight    |      malf
3.18456          0
4.2433           1
3.8543           0
4.0123           1
4.15             1

I need to change this to:

Type    |      respons
weight         3.18456
malf           0
weight         4.2433
malf           1
weight         3.8543
malf           0
weight         4.0123
malf           1
weight         4.15
malf           1

So my dataset will be double as large.
If been trying a while now but i don't find a way. Is there a way to do this?

Upvotes: 0

Views: 88

Answers (3)

Matteo Felici
Matteo Felici

Reputation: 1107

This solution is valid even with non numerical variable.

    - It first stores all the variable names in a dataset with a PROC CONTENTS.
    - Then it creates a macro variable for each column.
    - Finally it creates the output dataset: for every input row it returns N output rows, where N is the number of columns.
%macro transpose_like;

/* With this step we save all the column names in the columns_dataset */

proc contents data=start_dataset out=columns_dataset(keep=name) noprint;
run;



/* With the symputx routine we save dinamically the column names to N  macro variables */

data _null_;
    set columns_dataset end=eof;
    call symputx("Var"||strip(_N_),strip(name));
    if eof then
        call symputx("Var_Number",strip(_N_));
run;



/* Finally we write the output_dataset: for every row of start_dataset we output N rows */

data output_dataset(keep=var value);
    length var $ 20 value $20;
    set start_dataset;
    %do i=1 %to &Var_Number.;
        var = "&&Var&i.";
        value = strip(&&Var&i.);
        output;
    %end;
run;

%mend transpose_like;

/* Call the macro */
%transpose_like

Upvotes: 0

user5660356
user5660356

Reputation: 41

This is the code to specifically work with your inputs. First data step recreates your data. Second outputs what you need. I assume that by "double as large" you mean twice as many observations.

/*recreate input file*/
data aa;
     weight  =3.18456; malf=         0;
output;
 weight  =4.2433  ;       malf=  1;
output;
 weight  =3.8543  ;        malf= 0;
output;
 weight  =4.0123  ;       malf=  1;
output;
 weight  =4.15   ;         malf= 1;
  output;
 run;

/*output file*/
data bb;
    set aa;
    type="weight"; respons=weight;
    output;
     type="malf"; respons=malf; 
    output;
run;

Upvotes: 0

Stu Sztukowski
Stu Sztukowski

Reputation: 12909

Here's a generic solution that should work for you. It places all of your numeric variables into an array. For each observation, it loops through all of your numeric variables in order and does the following order of operations:

  1. Reads variable i's name stored in the array, and spits it out to the variable Type.

  2. Reads variable i's value and spits it out to the variable Response.

  3. Outputs, then goes back through and does the same thing with numeric variable i+1. Rinse and repeat for every observation.

    data want;
    length Type $10.;
    
        set have;
        array numvars[*] _NUMERIC_;
    
        do i = 1 to dim(numvars);
            Type = vname(numvars[i]);
            Response = numvars[i];
            output;
        end;
    
        keep Type Response;
    run;
    

Upvotes: 2

Related Questions