Drop variables with all missing values

JodeCharger100 · Nov 28, 2018

I have 5000 variables and 91,534 observations in my dataset.

I want to drop all variables whose values are all missing. For example, starting from:

X1     X2     X3
1      2      .
.      3      .
3      .      .
.      5      .

I want to end up with:

X1     X2
1      2
.      3
3      .
.      5

I tried the community-contributed dropmiss command, but it does not work for me, even after reading the help file. missings dropvars fails as well:

dropmiss 
command dropmiss is unrecognized
r(199);

missings dropvars
force option required with changed dataset

Instead, as suggested in one of the solutions, I tried the following:

ssc install nmissing
nmissing, min(91534)  
drop `r(varlist)'

This alternative community-contributed command seems to work for me.

However, I wanted to know if there is a more elegant solution, or a way to use dropmiss.

Answer

Nick Cox · Nov 28, 2018

In an up-to-date Stata, either search dropmiss or search nmissing will tell you that both commands are superseded by missings from the Stata Journal.
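As a rough sketch of that lookup, assuming Stata can reach the internet, search lists the package together with links for installing it, and which reports whether the command is already on your system:

```stata
* Look up the current package; the results include links for installing it
search missings

* Check whether the command is already installed
* (a return code of 0 means Stata found it)
capture which missings
display _rc
```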

The following dialogue may illuminate your question:

. sysuse auto , clear
(1978 Automobile Data)

. generate empty = .
(74 missing values generated)

. missings dropvars
force option required with changed dataset
r(4);

. missings dropvars, force

Checking missings in make price mpg rep78 headroom trunk weight length turn
    displacement gear_ratio foreign empty:
74 observations with missing values

note: empty dropped

missings dropvars, once installed, will drop all variables that are entirely missing. You need the force option if the dataset in memory has changed since it was last saved.
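For comparison, the same result can be sketched with built-in commands only. This loop is an illustration, not part of missings. It assumes the dataset has at least one observation: with zero observations every variable would count as entirely missing and be dropped.

```stata
* Set up the same example as in the dialogue above
sysuse auto, clear
generate empty = .

* Drop each variable whose number of missing values equals the number of observations
foreach v of varlist _all {
    quietly count if missing(`v')
    if r(N) == _N {
        drop `v'
    }
}
```

Because missing() treats an empty string as missing, the loop handles string variables as well as numeric ones. Unlike missings dropvars, it does not check whether the dataset has unsaved changes, so save a copy first if that matters.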