Reputation: 1382
I am reproducing some Stata code in R and struggling with the following command:
gen new_var=((var1_a==1 & var1_b==0) | (var2_a==1 & var2_b==0))
I am generally familiar with the gen syntax, but in this case I do not understand how values are assigned based on the boolean condition.
What would the above be in R?
Upvotes: 0
Views: 169
Reputation: 24722
In Stata, the above gen command works because your in-memory dataset (similar to a single R data frame) contains variables named var1_a, var1_b, var2_a, and var2_b. If these exist as vectors in your R environment, then our colleague Nick Cox is exactly correct: all that is needed is the statement without the leading gen (although in R we would typically write it like this):
new_var <- (var1_a==1 & var1_b==0) | (var2_a==1 & var2_b==0)
However, suppose you have a data frame object, say df, that contains columns with these names, and the objective is to add another column to df that reflects your logical conditions (like adding a new variable ("column") to the dataset in Stata using generate / gen). In this case, the above approach will not work, as the columns var1_a, var1_b, etc. will not be found in the global environment.
Instead, to add a new column called new_var to the data frame called df, we would write something like this:
df["new_var"] <- (df$var1_a==1 & df$var1_b==0) | (df$var2_a==1 & df$var2_b==0)
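On how the values are assigned: Stata's gen stores the result of a logical expression as 1 (true) or 0 (false), whereas the R expression gives TRUE/FALSE. If you want the column to match Stata's 0/1 coding, a minimal sketch is below. The data values and the intermediate name is_match are made up purely for illustration; only the column names come from the question.

```r
# Hypothetical example data; only the column names follow the question.
df <- data.frame(
  var1_a = c(1, 1, 0, 0),
  var1_b = c(0, 1, 0, 1),
  var2_a = c(0, 1, 0, 0),
  var2_b = c(1, 0, 0, 0)
)

# Logical vector: TRUE where either pair of conditions holds.
# (is_match is an illustrative name, not part of the original code.)
is_match <- (df$var1_a == 1 & df$var1_b == 0) |
  (df$var2_a == 1 & df$var2_b == 0)

# Convert TRUE/FALSE to 1/0 to mirror what Stata's gen would store.
df$new_var <- as.integer(is_match)

print(df)
```

One thing to check when comparing against the Stata output is missing values: in R a comparison such as NA == 1 returns NA, while in Stata a missing value simply fails the equality test and yields 0, so rows with missing data may differ unless you handle NA explicitly.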
Upvotes: 3