Using Stata's keep command on multiple blocks of variables

Question

I just started working on a massive dataset with 5 million observations and lots and lots of variables. To process this faster, I want to select only some variables of interest and drop the rest.

With keep, I can select a single block of variables very simply:

keep varx1-varx5

However, the variables I want are not in order in the dataset:

varx1 varx2 varx3 varz1 varz2 vary1 vary2 vary3

I don't want the varz variables, only the blocks with varx and vary.

I'm not very good at loops, but I tried this:

foreach varname of varlist varx1-varx3 vary1-vary3  {
keep `varname'
}

This doesn't work, because it keeps only varx1, then tries to keep the others, and errors out because they have just been dropped.

How can I tell keep to select multiple blocks of variables?

GPierre · Accepted Answer

You don't need a loop. keep accepts a full varlist, so several blocks can go in one command. To keep only the blocks with varx and vary without listing every variable, use wildcards:

keep varx* vary*

In a Stata varlist, * is a wildcard that matches zero or more characters, so varx* matches every variable whose name begins with varx.
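For anyone who wants to try this on a small example first, here is a minimal sketch. It builds a toy dataset with the variable order from the question (the values and the 5 observations are made up for illustration) and shows three ways that should give the same result: one keep with several ranges, a drop of the unwanted block, and the questioner's loop adjusted so it only collects names and calls keep once. The macro name tokeep is my own choice.

```stata
* Toy data with the variable order from the question (values are arbitrary)
clear
set obs 5
foreach v in varx1 varx2 varx3 varz1 varz2 vary1 vary2 vary3 {
    generate `v' = _n
}

* Option 1: a single keep with several ranges
* (a range like varx1-varx3 follows the order of variables in the dataset)
preserve
keep varx1-varx3 vary1-vary3
describe, simple
restore

* Option 2: drop the unwanted block instead of keeping the wanted ones
preserve
drop varz*
describe, simple
restore

* Option 3: collect the names in a local macro inside the loop, then keep once
local tokeep
foreach varname of varlist varx1-varx3 vary1-vary3 {
    local tokeep `tokeep' `varname'
}
keep `tokeep'
describe, simple
```

Since the goal is speed on a large file, it may also help to read only the needed variables in the first place, along the lines of use varx* vary* using mydata, where mydata is a placeholder for your actual filename.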
