Reputation: 169
How could I convert this Stata command to R?
I have a database composed of individuals (each person is a row), but I also need some family-level variables for my analysis. Specifically, I want to compute the total income earned by each family. Each member of a family is an individual in the database, and although I don't have the individuals' identifiers, I have a variable that identifies the family. Since I also know each individual's earnings in 2014, in Stata I use this command to create the variable:
egen family_inc = total(annual_inc), by (id_family)
where
family_inc is the total income of a family
annual_inc is the total income earned by the individual
id_family is the identification of this family in the data
So the command says to Stata:
(1) For each member of the id_family;
(2) Find all the members of that family;
(3) Sum the income earned during 2014;
(4) Assign this value to a new variable family_inc.
Could I use group_by() for this? I am very n00b at R and can't spare the time to take a course right now because of deadlines! course(df_damn, mother = FALSE, explicit = 3, !is.numeric("loads of"))
Upvotes: 2
Views: 938
Reputation: 2374
The following Stata code
webuse iris
egen total_petal_width = total(petwid), by(iris)
is equivalent to the R code
library(dplyr)
iris %>%
  group_by(Species) %>%
  mutate(
    # new_var_name = function of other vars
    total_petal_width = sum(Petal.Width, na.rm = TRUE)
  )
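Applied to the variables in the question, the same pattern might look like the sketch below. The data frame `df` and its values are invented for illustration; only the column names `id_family`, `annual_inc`, and `family_inc` come from the question. `na.rm = TRUE` is used here because Stata's `egen total()` treats missing values as 0 by default.

```r
library(dplyr)

# Hypothetical toy data: two families, five individuals (values are made up)
df <- data.frame(
  id_family  = c(1, 1, 1, 2, 2),
  annual_inc = c(1000, 2500, NA, 4000, 500)
)

df_out <- df %>%
  group_by(id_family) %>%
  # Sum incomes within each family and repeat the total on every member's row
  mutate(family_inc = sum(annual_inc, na.rm = TRUE)) %>%
  # Drop the grouping so later operations are not applied per family by accident
  ungroup()

# Every row of family 1 now carries 3500; every row of family 2 carries 4500
df_out$family_inc
```

Because `mutate()` is used, the result still has one row per individual, just like the Stata data after `egen`.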
If the answer is helpful and solves the question, please mark it as solved :)
Upvotes: 3
Reputation: 79246
Stata:
egen family_inc = total(annual_inc), by (id_family)
My interpretation:
Generate family_inc equal to the overall sum of annual_inc within each level of id_family
R code:
library(dplyr)
df %>%
  group_by(id_family) %>%
  summarize(family_inc = sum(annual_inc))
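One difference worth keeping in mind: `summarize()` returns one row per family, whereas Stata's `egen` keeps one row per individual. If the family total is needed next to each person, one option is to join the summary back onto the original data. The sketch below uses invented data, and the names `family_totals` and `df_with_total` are hypothetical.

```r
library(dplyr)

# Hypothetical toy data (values are made up)
df <- data.frame(
  id_family  = c(1, 1, 2, 2, 2),
  annual_inc = c(1000, 2500, 4000, 500, 300)
)

# One row per family: the collapsed, family-level table
family_totals <- df %>%
  group_by(id_family) %>%
  summarize(family_inc = sum(annual_inc))

# Attach the family total to every individual row, matching on id_family
df_with_total <- left_join(df, family_totals, by = "id_family")

nrow(family_totals)   # 2 rows, one per family
nrow(df_with_total)   # 5 rows, one per individual
```

Replacing `summarize()` with `mutate()` would give the individual-level result in a single step, without the join.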
Upvotes: 2