user3806985
user3806985

Reputation: 3

PROC SQL/base SAS-Removing missing observations/no responses from the survey data-set

I have a sas data-set with 5000 rows and 150 variables from a survey of 5000 respondents, but I need to remove the entire row/respondent where the column has missing observation for any of the 150 variables. So basically, I just need those respondents who have completed the answers for all 150 variables.

I am using proc sql or base sas, but I am not able to come up with a simpler way to do this. I have used conditional queries, but some columns are numeric and some are character type and I further need to do analysis on the numeric columns, so transposing doesnt seem to be an alternative.. Any help would be much appreciated?

Thanks

Upvotes: 0

Views: 5265

Answers (3)

Wes McClintick
Wes McClintick

Reputation: 507

Taking Gordon Linoff's idea about Excel a step further with just SAS...

ods output SQL_Results=appliance;
proc sql number;
select * from sashelp.applianc;
quit;


data appliance_2;
  set appliance;
  if cmiss(of _all_) = 0;
run;


proc sql; create table que as select * from dictionary.columns where libname = "WORK" and memname = "APPLIANCE"; quit;


proc sql ; 
select name, "IS NOT NULL AND"
from dictionary.columns where libname = "WORK" and memname = "APPLIANCE"; 
quit;

*copy / paste / clean-up ;
proc sql; 
create table appliance_3 as
select * from appliance 
where
Row IS NOT NULL AND 
units_1 IS NOT NULL AND 
units_2 IS NOT NULL AND 
units_3 IS NOT NULL AND 
units_4 IS NOT NULL AND 
units_5 IS NOT NULL AND 
units_6 IS NOT NULL AND 
units_7 IS NOT NULL AND 
units_8 IS NOT NULL AND 
units_9 IS NOT NULL AND 
units_10 IS NOT NULL AND 
units_11 IS NOT NULL AND 
units_12 IS NOT NULL AND 
units_13 IS NOT NULL AND 
units_14 IS NOT NULL AND 
units_15 IS NOT NULL AND 
units_16 IS NOT NULL AND 
units_17 IS NOT NULL AND 
units_18 IS NOT NULL AND 
units_19 IS NOT NULL AND 
units_20 IS NOT NULL AND 
units_21 IS NOT NULL AND 
units_22 IS NOT NULL AND 
units_23 IS NOT NULL AND 
units_24 IS NOT NULL AND 
cycle IS NOT NULL 
;quit;

Upvotes: 0

Quentin
Quentin

Reputation: 6378

With data step it is just:

data want;
  set have;
  if cmiss(of _all_) = 0;
run;

Will handle character and numeric variables.

Upvotes: 3

Gordon Linoff
Gordon Linoff

Reputation: 1270773

SAS procs tend to ignore missing values by removing the entire row from the data being analyzed. So, this might be less of a problem than you think. That is, if you are doing a forward selection logistic regression, add in a bunch of variables, then only rows with no missing values for those columns will be processed.

If you want to create a new data set where the columns have no missing values, you can do something like this:

proc sql;
    create table t_nomissing
        select t.* 
        from t
        where col1 is not null and col2 is not null and col3 is not null and . . .
              col150 is not null;

If you have a list of the column names, I would recommend creating the where clause in a tool such as Excel where you can use formulas and copy them down.

Upvotes: 0

Related Questions