I have 5000 variables and 91,534 observations in my dataset.
I want to drop all variables that have all their values missing. For example, given this data:
X1 X2 X3
1 2 .
. 3 .
3 . .
. 5 .
I want to end up with:
X1 X2
1 2
. 3
3 .
. 5
I tried using the dropmiss community-contributed command, but it does not seem to be working for me even after reading the help file. For example:
dropmiss
command dropmiss is unrecognized
r(199);
missings dropvars
force option required with changed dataset
Instead, as suggested in one of the solutions, I tried the following:
ssc install nmissing
nmissing, min(91534)
drop `r(varlist)'
This alternative community-contributed command seems to work for me.
However, I wanted to know if there is a more elegant solution, or a way to use dropmiss.
In an up-to-date Stata, either search dropmiss or search nmissing will tell you that both commands are superseded by missings from the Stata Journal.
The following dialogue may illuminate your question:
. sysuse auto , clear
(1978 Automobile Data)
. generate empty = .
(74 missing values generated)
. missings dropvars
force option required with changed dataset
r(4);
. missings dropvars, force
Checking missings in make price mpg rep78 headroom trunk weight length turn
displacement gear_ratio foreign empty:
74 observations with missing values
note: empty dropped
missings dropvars, once installed, will drop all variables that are entirely missing, except that you need the force option if the dataset in memory has changed since it was last saved.
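If you would rather not depend on any community-contributed command, the same result can be had with a short loop in official Stata. This is only a minimal sketch, not what missings does internally: it reuses the auto dataset and the empty variable from the dialogue above purely for illustration, and it treats a variable as droppable when it has no non-missing values at all (for string variables, that means every value is the empty string).

```stata
* Illustrative setup, mirroring the dialogue above
sysuse auto, clear
generate empty = .

* Drop every variable that has no non-missing values.
* The varlist is expanded once, before the loop starts,
* so dropping variables inside the loop is safe.
foreach v of varlist _all {
    quietly count if !missing(`v')
    if r(N) == 0 {
        drop `v'
    }
}

describe, short
```

On your own data you would skip the two setup lines and run only the loop. With 5000 variables it makes one pass over the data per variable, so it may be slower than missings dropvars, but it needs no installation and no force option.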