Reputation: 43

Combining datasets using dplyr

I have a dataset with 1000 rows and 4 columns, containing the following variables:

Cohabiting at year-9? (C9)
Cohabiting at year-15? (C15)
Married at year-9? (M9)
Married at year-15? (M15)

Each of these takes the value 1 (yes), 0 (no), or NA (non-response). I want to create a new column with the following conditions:

If M9 = M15 = 1, then name the value "Married" and assign it 1
If C9 = C15 = 1, then name the value "Cohabiting" and assign it 2
If C9 = C15 = M9 = M15 = 0, then name the value "Non-intact" and assign it 3
If (C9 = 1 and M15 = 0) or (C9 = 0 and M15 = 1), then name the value "Delete" and assign it 4
If any of the columns contains NA, then keep it as NA

So basically, I want to make a new column with the above values.

Thanks for the detailed solution. What is "NA_character"? I have NA in my dataset as well. And what if I want to add another value (4) for those that have (C9 = 1 and M15 = 0) or (C9 = 0 and M15 = 1) in addition to the above?

Upvotes: 0

Answers (2)

akrun

Reputation: 887691

We may use if_any/if_all inside case_when to build the label, then match to turn the label into its numeric code:

library(dplyr)

df1 %>%
  mutate(
    result = case_when(
      # any NA in the four columns: keep the result missing
      if_any(c(M9, M15, C9, C15), is.na) ~ NA_character_,
      if_all(c(M9, M15), `==`, 1) ~ 'Married',
      if_all(c(C9, C15), `==`, 1) ~ 'Cohabiting',
      if_all(c(M9, M15, C9, C15), `==`, 0) ~ 'Non-intact'
    ),
    # numeric code: position of the label in this vector (1, 2, 3)
    new = match(result, c("Married", "Cohabiting", "Non-intact"))
  )

-output

# A tibble: 6 × 6
     C9   C15    M9   M15 result       new
  <dbl> <dbl> <dbl> <dbl> <chr>      <int>
1     1     1     0     1 Cohabiting     2
2     0    NA     1     0 <NA>          NA
3     1     1     0     1 Cohabiting     2
4     1     1     0     1 Cohabiting     2
5     0     1     1     1 Married        1
6     0     0     0     0 Non-intact     3

data

df1 <- structure(list(C9 = c(1, 0, 1, 1, 0, 0), C15 = c(1, NA, 1, 1, 
1, 0), M9 = c(0, 1, 0, 0, 1, 0), M15 = c(1, 0, 1, 1, 1, 0)), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))
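
On the follow-up about NA_character_ and a fourth value: NA_character_ is the missing value of type character. It is used here so that every right-hand side of case_when has the same type (as far as I know, recent dplyr versions also accept a plain NA). A minimal sketch of adding the "Delete" category could look like the following. The data frame df2 is made up for illustration, and the conditions are written as lambdas (~ .x == 1) instead of passing `==` and 1 separately. Because case_when takes the first condition that matches, a row that already qualifies as "Married" or "Cohabiting" is not relabelled "Delete"; that ordering is an assumption about what is wanted.

```r
library(dplyr)

# Hypothetical data (not the asker's): rows 5 and 6 hit the "Delete" rule
df2 <- data.frame(
  C9  = c(1, 0, 0, 0, 1, 0),
  C15 = c(1, NA, 1, 0, 0, 0),
  M9  = c(0, 1, 1, 0, 0, 0),
  M15 = c(1, 0, 1, 0, 0, 1)
)

# Label order defines the numeric code returned by match()
labels <- c("Married", "Cohabiting", "Non-intact", "Delete")

out <- df2 %>%
  mutate(
    result = case_when(
      # any NA in the four columns: keep the result missing
      if_any(c(M9, M15, C9, C15), is.na) ~ NA_character_,
      if_all(c(M9, M15), ~ .x == 1) ~ "Married",
      if_all(c(C9, C15), ~ .x == 1) ~ "Cohabiting",
      if_all(c(M9, M15, C9, C15), ~ .x == 0) ~ "Non-intact",
      # checked last, so it only applies to rows not labelled above
      (C9 == 1 & M15 == 0) | (C9 == 0 & M15 == 1) ~ "Delete"
    ),
    new = match(result, labels)
  )

print(out)
```

Rows that match none of the conditions (for example C9 = 1, C15 = 0, M9 = 0, M15 = 1) still come out as NA, since case_when returns NA when nothing matches.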

Upvotes: 2

tspano

Reputation: 701

You can use case_when to map all your conditions to values in a single column. Something like this:

library(dplyr)

# `data` is the data frame holding the C9, C15, M9 and M15 columns
data %>% 
  mutate(result = case_when(M9 == 1 & M15 == 1 ~ "Married",
                            C9 == 1 & C15 == 1 ~ "Cohabiting",
                            M9 + M15 + C9 + C15 == 0 ~ "Non-intact"
  ))
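
One thing to watch with this version: case_when treats an NA condition as not matched, so a row with an NA in one column can still be labelled from the other columns. If any NA should give NA, as the question asks, an explicit check can go first. The sketch below compares the two on a few made-up rows (the values in data are hypothetical, chosen only to show the difference).

```r
library(dplyr)

# Hypothetical rows: row 1 has NA in C9, row 3 has NA in M15
data <- data.frame(
  C9  = c(NA, 0, 1),
  C15 = c(1, 0, 1),
  M9  = c(1, 0, 0),
  M15 = c(1, 0, NA)
)

res <- data %>%
  mutate(
    # conditions as in the answer above, without an NA check
    unguarded = case_when(
      M9 == 1 & M15 == 1 ~ "Married",
      C9 == 1 & C15 == 1 ~ "Cohabiting",
      M9 + M15 + C9 + C15 == 0 ~ "Non-intact"
    ),
    # same conditions, but any NA short-circuits to a missing result
    guarded = case_when(
      is.na(M9) | is.na(M15) | is.na(C9) | is.na(C15) ~ NA_character_,
      M9 == 1 & M15 == 1 ~ "Married",
      C9 == 1 & C15 == 1 ~ "Cohabiting",
      M9 + M15 + C9 + C15 == 0 ~ "Non-intact"
    )
  )

print(res)
```

In the unguarded column rows 1 and 3 get "Married" and "Cohabiting" despite the NA; in the guarded column both are NA.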

Upvotes: 0
