Baran Aksoy

Reputation: 27

How to create a new dataset based on multiple conditions in R?

I have a dataset called carcom that looks like this

carcom <- data.frame(household = c(173, 256, 256, 319, 319, 319, 422, 422, 422, 422), individuals= c(1, 1, 2, 1, 2, 3, 1, 2, 3, 4))

Here individuals is coded as "1" for the father, "2" for the mother, and "3" and "4" for children. I would like to get two new columns. The first should give the number of children in each household, if any. The second should give the total weight of the household, where each individual is assigned a weight: 1 for the father, 0.5 for the mother and 0.3 for each child. My new dataset should look like this

newcarcom <- data.frame(household = c(173, 256, 319, 422), child = c(0, 0, 1, 2), weight = c(1, 1.5, 1.8, 2.1))

I have been trying to find a solution for days. I would appreciate it if someone could help me. Thanks

Upvotes: 1

Views: 708

Answers (2)

akrun

Reputation: 887991

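An option with data.table (recode comes from dplyr, so that package has to be loaded as well)

library(data.table)
library(dplyr)  # provides recode()
setDT(carcom)[, .(child = sum(individuals %in% 3:4), 
        weight = sum(recode(individuals,`1` = 1, `2` = 0.5, .default = 0.3))), household]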
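
If loading dplyr only for recode is not wanted, the same grouping can be sketched with data.table alone. This is a minimal sketch, assuming a data.table version that provides fcase() (1.12.8 or later); the name dt is introduced here for illustration and is not part of the original answer.

```r
library(data.table)

carcom <- data.frame(
  household   = c(173, 256, 256, 319, 319, 319, 422, 422, 422, 422),
  individuals = c(1, 1, 2, 1, 2, 3, 1, 2, 3, 4)
)

# Work on a copy (dt is a hypothetical name) so carcom itself stays a plain data.frame
dt <- as.data.table(carcom)

newcarcom <- dt[, .(
  # codes 3 and 4 are children
  child  = sum(individuals %in% 3:4),
  # father = 1, mother = 0.5, anything else (the children) = 0.3
  weight = sum(fcase(individuals == 1, 1,
                     individuals == 2, 0.5,
                     default = 0.3))
), by = household]

print(newcarcom)
```

Using as.data.table() instead of setDT() leaves the original carcom unchanged, which may be preferable when the data frame is reused later.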

Upvotes: 0

Ronak Shah

Reputation: 389355

We can count the number of individuals with value 3 or 4 in each household. To calculate the weight, we map the values 1:4 to their corresponding weights using recode and then take the sum.

library(dplyr)

newcarcom <- carcom %>%
  group_by(household) %>%
  summarise(child = sum(individuals %in% 3:4), 
            weight = sum(recode(individuals,`1` = 1, `2` = 0.5, .default = 0.3)))

#  household child weight
#      <dbl> <int>  <dbl>
#1       173     0    1  
#2       256     0    1.5
#3       319     1    1.8
#4       422     2    2.1

Base R version suggested by @markus

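# x^-1 maps 1 -> 1 and 2 -> 0.5; the smaller values (from 3 and 4) are replaced by 0.3
# the resulting columns are named individuals.child and individuals.weight
newcarcom <- do.call(data.frame, aggregate(individuals ~ household, carcom, function(x) 
      c(child = sum(x %in% 3:4), weight = sum(replace(y <- x^-1, y < 0.5, 0.3))))) 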
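
A more explicit base R variant is sketched below. It is not part of the original answer: it assumes the individual codes are always the integers 1 to 4, and the names w and tmp are introduced here only for illustration. The weights live in a lookup vector indexed by the code, so no arithmetic trick is needed.

```r
carcom <- data.frame(
  household   = c(173, 256, 256, 319, 319, 319, 422, 422, 422, 422),
  individuals = c(1, 1, 2, 1, 2, 3, 1, 2, 3, 4)
)

# Hypothetical lookup vector: position i holds the weight for individual code i
w <- c(1, 0.5, 0.3, 0.3)

# Per-row helper columns: child flag (0/1) and the weight of each individual
tmp <- data.frame(
  household = carcom$household,
  child     = as.integer(carcom$individuals %in% 3:4),
  weight    = w[carcom$individuals]
)

# Sum both helper columns within each household
newcarcom <- aggregate(cbind(child, weight) ~ household, data = tmp, FUN = sum)

print(newcarcom)
```

Here the result has the plain column names household, child and weight, so no renaming is needed afterwards.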

Upvotes: 2
