broti

Reputation: 1382

Translate Stata "gen" into R

I am reproducing some Stata code in R and struggling with the following command:

gen new_var=((var1_a==1 & var1_b==0) | (var2_a==1 & var2_b==0))

I am generally familiar with the gen syntax, but in this case I do not understand how values are assigned based on the boolean condition.

What would the above be in R?

Upvotes: 0

Views: 169

Answers (1)

langtang

Reputation: 24722

In Stata, the gen command above works because your in-memory dataset (similar to a single R data frame) contains variables named var1_a, var1_b, var2_a, and var2_b. If these exist as vectors in your R environment, then our colleague Nick Cox is exactly correct: all that is needed is the statement without the leading gen, although in R we would typically write it like this:

new_var <- (var1_a==1 & var1_b==0) | (var2_a==1 & var2_b==0)

However, suppose you have a data frame object, say df, that contains columns with these names, and the objective is to add another column to df that reflects your logical conditions (like adding a new variable ("column") to the dataset in Stata using generate / gen). In this case, the above approach will not work, because the columns var1_a, var1_b, etc. will not be found in the global environment.

Instead, to add a new column called new_var to the dataframe called df, we would write something like this:

df["new_var"] <- (df$var1_a==1 & df$var1_b==0) | (df$var2_a==1 & df$var2_b==0)
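To see how values get assigned, here is a minimal sketch on a made-up data frame. The toy values and the extra column names new_var_with and new_var_num are assumptions for illustration only. The result in R is a logical column (TRUE / FALSE), whereas Stata stores the result of a logical expression as 1 / 0, so as.integer() is one way to get the same numeric coding if you want it.

```r
# Hypothetical toy data; the column names follow the question.
df <- data.frame(
  var1_a = c(1, 1, 0, 0),
  var1_b = c(0, 1, 0, 1),
  var2_a = c(0, 1, 0, 0),
  var2_b = c(1, 0, 0, 0)
)

# TRUE where either pair satisfies (a == 1 and b == 0), FALSE otherwise.
df$new_var <- (df$var1_a == 1 & df$var1_b == 0) |
  (df$var2_a == 1 & df$var2_b == 0)

# Same condition, using with() so the column names resolve inside df.
df$new_var_with <- with(df, (var1_a == 1 & var1_b == 0) |
                              (var2_a == 1 & var2_b == 0))

# Optional: convert TRUE/FALSE to 1/0 to mirror Stata's numeric result.
df$new_var_num <- as.integer(df$new_var)

print(df)
```

One thing to check on real data: if any of the columns contain NA, the comparison in R can return NA rather than TRUE or FALSE, so the new column may need explicit handling of missing values.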

Upvotes: 3
