JodeCharger100

Reputation: 1059

Drop variables with all missing values

I have 5000 variables and 91,534 observations in my dataset.

I want to drop all variables that have all their values missing, so that the first dataset below becomes the second:

X1     X2    X3
1      2      .
.      3      .
3      .      .
.      5      .

X1     X2
1      2  
.      3   
3      . 
.      5  

I tried using the dropmiss community-contributed command, but it does not seem to be working for me even after reading the help file. For example:

dropmiss 
command dropmiss is unrecognized
r(199);

missings dropvars
force option required with changed dataset

Instead, as suggested in one of the solutions, I tried the following:

ssc install nmissing
nmissing, min(91534)  
drop `r(varlist)'

This alternative community-contributed command seems to work for me.

However, I wanted to know if there is a more elegant solution, or a way to use dropmiss.

Upvotes: 4

Views: 25599

Answers (2)

Nick Cox

Reputation: 37183

In an up-to-date Stata, either search dropmiss or search nmissing will tell you that both commands have been superseded by missings, published in the Stata Journal.

The following dialogue may illuminate your question:

. sysuse auto , clear
(1978 Automobile Data)

. generate empty = .
(74 missing values generated)

. missings dropvars
force option required with changed dataset
r(4);

. missings dropvars, force

Checking missings in make price mpg rep78 headroom trunk weight length turn
    displacement gear_ratio foreign empty:
74 observations with missing values

note: empty dropped

missings dropvars, once installed, will drop all variables that are entirely missing. You need the force option if the dataset in memory has changed since it was last saved.
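
As a minimal sketch of the full sequence, assuming missings is installed from SSC and that you want a report of missing counts before dropping anything (the report step is optional and added here only for illustration):

```stata
* Install or update missings (assumes internet access and that SSC is reachable)
ssc install missings, replace

* Optional: show the number of missing values in each variable that has any
missings report

* Drop every variable that is entirely missing; force is required
* when the data in memory have changed since they were last saved
missings dropvars, force
```

Saving the dataset first would avoid the need for force, but that is a choice about your workflow rather than a requirement of the command.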

Upvotes: 5

user8682794


You can simply loop over all variables in your dataset and use the capture and assert commands to test which ones have all their values missing.

The advantage of this approach is that you can do this with only built-in Stata commands:

clear

input X1 X2 X3
1 2 .
. 3 .
3 . .
. 5 .
end

list
     +--------------+
     | X1   X2   X3 |
     |--------------|
  1. |  1    2    . |
  2. |  .    3    . |
  3. |  3    .    . |
  4. |  .    5    . |
     +--------------+

* assert succeeds (_rc is 0) only when every value of the variable is missing
foreach var of varlist _all {
    capture assert missing(`var')
    if !_rc {
        drop `var'
    }
}

list
     +---------+
     | X1   X2 |
     |---------|
  1. |  1    2 |
  2. |  .    3 |
  3. |  3    . |
  4. |  .    5 |
     +---------+
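
A possible variation, sketched here with a hypothetical local macro named todrop, collects the names of the all-missing variables first and drops them in one step. This lets you inspect the list before anything is removed, which may be reassuring with several thousand variables:

```stata
* todrop is an illustrative macro name, not part of any Stata command
local todrop
foreach var of varlist _all {
    * Count observations with a nonmissing value for this variable
    quietly count if !missing(`var')
    if r(N) == 0 {
        local todrop `todrop' `var'
    }
}

* Show what would be dropped, then drop only if the list is nonempty
display "All-missing variables: `todrop'"
if "`todrop'" != "" {
    drop `todrop'
}
```

Like the loop above, this uses only built-in commands and works for both numeric and string variables, because missing() treats an empty string as missing.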

Upvotes: 5
