Reputation: 21
I have a SAS dataset with around 250 variables, and I want to assign labels from a two-column ("Variable" and "Label") excel file to each variable. Obviously, entering in each label by hand is impractical...
The Excel file is something like this (Some of the labels are very long, >100 characters):
Variable | Label |
---|---|
Var1 | Label1 |
Var2 | Label2 |
Var3 | Label3 |
And the SAS dataset has the same variables as the excel file.
After importing the excel file and converting it into a temporary SAS dataset, I tried creating a macro to apply each label to each variable dynamically. Some of the variables are numeric and some are character, so I tried dealing with them separately:
data dataset_labeled;
set dataset1;
/* Dynamically assign labels using metadata */
if _n_ = 1 then do;
declare hash lbl();
lbl.definekey("Variable");
lbl.definedata("Label");
lbl.definedone();
/* Standardize Variable names in metadata to uppercase */
do until (eof);
set mims3_spect_labels end=eof;
Variable = upcase(Variable);
lbl.add();
end;
end;
/* Separate arrays for numeric and character variables */
array num_vars _numeric_; /* Array for all numeric variables */
array char_vars _character_; /* Array for all character variables */
/* Apply labels to numeric variables */
do i = 1 to dim(num_vars);
if lbl.find(key: upcase(vname(num_vars[i]))) ne 0 then Label = "No Label";
call label(num_vars[i], Label);
end;
/* Apply labels to character variables */
do i = 1 to dim(char_vars);
if lbl.find(key: upcase(vname(char_vars[i]))) ne 0 then Label = "No Label";
call label(char_vars[i], Label);
end;
run;
proc contents data=dataset_labeled;
run;
But when I run the proc contents, none of the labels show up. How can I fix this?
Thanks in advance!
Upvotes: 2
Views: 45
Reputation: 27498
Your metadata stored in Excel might not align exactly with your data set. You should find out which columns exist before generating statements that will alter the label metadata of the data set columns.
...
... create mydata ...
...
proc import dbms=xlsx file='mylabels.xlsx' out=labels ;
sheet=labels ;
run ;
proc contents noprint data=mydata out=mydatacolumns(keep=name) ;
run ;
proc sql ;
create table statements as
select trim(C.name)||'='||quote(trim(L.label)) as statement
from mydatacolumns C
join labels L on upcase(C.name)=upcase(L.variable)
;
data _null_ ;
call symputx('syslast', 'mydata') ;
libmem = symget('syslast') ;
call execute (catx(' ', 'proc datasets nowarn noprint lib=',scan(libmem,1,'.'), ';')) ;
call execute (catx(' ', 'modify', scan(libmem,2,'.'), ';')) ;
do while (not laststatement) ;
set statements end=laststatement ;
call execute ('label ' || trim(statement) || ' ;') ;
end ;
call execute ('quit ;') ;
stop ;
run ;
Upvotes: 0
Reputation: 51566
The CALL LABEL() routine returns the value of the label that is currently attached to the variable. To attach a label you have to execute the LABEL statement. The simplest way is to just write the LABEL statement to a text file and then use %INCLUDE to run the generated code at the right place.
So if you have read in the label information into a dataset named mims3_spect_labels with variables named VARIABLE and LABEL then you can create the LABEL statement with a data step like this:
filename code temp;
data _null_;
set mims3_spect_labels end=eof;
file code;
if _n_=1 then put 'label';
put variable '=' label :$quote. ;
if eof then put ';' ;
run;
Now you can use that statement in a DATA step or a PROC MODIFY step to attach the labels to the variables. So say your intent is make a new dataset named dataset_labeled with the data from dataset1 with the labels attached then you could just run this data step.
data dataset_labeled;
set dataset1;
%include code / source2;
run;
Note that the TYPE of the variable does not matter for a LABEL statement.
If your question is that you have more variable/label pairs than exist in your dataset then first make a subset with just the variables you need. So get the list of variables in your dataset.
proc contents data=dataset1 noprint out=contents; run;
Then combine it with the list of variables in your metadata.
proc sql;
create table labels as select a.*
from mims3_spect_labels a
where upcase(variable) in
(select upcase(name) from contents)
;
quit;
And use that LABELS dataset to generate the LABEL statement instead.
Upvotes: 1