Mats Lind
Mats Lind

Reputation: 934

SAS: Changing labels by reading from table

I have a table with non-informative variable names and labels. But I have another table with variables whose names match the lables of the first table and whose lables are informative. How do I change the lables in the first table to the good ones in the second table?

Upvotes: 0

Views: 863

Answers (2)

DomPazz
DomPazz

Reputation: 12465

To do this without rewriting the data set (useful if it is large or you have constraints or indexes) you will need to use PROC DATASETS. Unfortunately it's not straightforward.

Start with some badly named and labeled data:

data have;
attrib var1 format=best. label="Blah"
       var2 format=$6.   label="Super Blah";

var1 = 1;
var2 = "123456";
run;

And here are what it actually means

data real_labels;
format old new $200.;
old="Blah";
new="Number of People";
output;
old="Super Blah";
new="Postal Code";
output;
run;

Let's get the existing variable names and labels:

proc contents data=have out=vars(keep=name label) noprint;
run;

Join the new labels to the variable names. Inner join so that we don't overwrite labels for variables that don't exist in the real_labels data.

proc sql noprint;
create table label_updates as
select a.name,
       b.new
    from vars as a
      inner join
         real_labels as b
      on a.label=b.old;

quit;

Now build the PROC DATASETS based on the above data.

data _null_;
set label_updates end=last;
if _n_ = 1 then do;
    call execute("proc datasets lib=work nolist;");
    call execute("modify have;");
end;

call execute("attrib " || strip(name) || " label=" || QUOTE(STRIP(NEW)) || ";");

if last then do;
    call execute("run; quit;");
end;
run;

Upvotes: 1

Joe
Joe

Reputation: 63424

If the tables are truly identical in terms of variables, and you don't mind the other attributes of the datasets matching also (Format, Informat, Length, variable order), then you can do this easily:

data want;
  if 0 then set useful_label_dataset;
  set actual_dataset;
run;

This will use the useful label dataset to define the columns in the PDV and thus the output dataset, which means the labels, formats, lengths, order, etc. will all come from that dataset. if 0 means it won't actually be executed, but that's okay; in SAS the compiler determines those attributes even if the code isn't executed.

This is a good solution for a table that is not so large that it will be inefficient to rewrite the table.

Upvotes: 1

Related Questions