tall_table

Reputation: 311

Using data from one column to assign NAs to others

Goal: Set columns to NA based on the value of another column. For example, I have a data set with five columns: an ID column (caseID), three binary variables (var1, var2, var3), and a binary flag (nocasedata), where 1 means TRUE (no data) and 0 means FALSE (data). How can I use "nocasedata" to assign NA to the three other variables for every row where it is TRUE, leave the remaining rows unchanged, and then remove the "nocasedata" column? (tidyverse tools preferred, but not necessary.)

Reproducible example:

df <- data.frame(caseID = c(1:5),
                 var1 = c(1, 0, 0, 1, 1),
                 var2 = c(0, 0, 1, 1, 0),
                 var3 = c(0, 0, 0, 1, 1),
                 nocasedata = c(0, 1, 0, 0, 0))

df

desired_df <- data.frame(caseID = c(1:5),
                 var1 = c(1, NA, 0, 1, 1),
                 var2 = c(0, NA, 1, 1, 0),
                 var3 = c(0, NA, 0, 1, 1))

desired_df

Upvotes: 1

Views: 44

Answers (1)

ardaar

Reputation: 1274

Here is a reprex of the solution using tidyverse tools, as requested.


library(tidyverse)

#> -- Attaching packages ---------------------------------------------------- tidyverse 1.2.1 --
#> v ggplot2 2.2.1     v purrr   0.2.4
#> v tibble  1.3.4     v dplyr   0.7.4
#> v tidyr   0.7.2     v stringr 1.2.0
#> v readr   1.1.1     v forcats 0.2.0
#> -- Conflicts ------------------------------------------------------- tidyverse_conflicts() --
#> x dplyr::filter() masks stats::filter()
#> x dplyr::lag()    masks stats::lag()

df <- data.frame(caseID = c(1:5),
                 var1 = c(1, 0, 0, 1, 1),
                 var2 = c(0, 0, 1, 1, 0),
                 var3 = c(0, 0, 0, 1, 1),
                 nocasedata = c(0, 1, 0, 0, 0))

df

#>   caseID var1 var2 var3 nocasedata
#> 1      1    1    0    0          0
#> 2      2    0    0    0          1
#> 3      3    0    1    0          0
#> 4      4    1    1    1          0
#> 5      5    1    0    1          0

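# For var1 through var3, replace the value with NA wherever nocasedata is 1,
# then drop the nocasedata column.
desired_df <- df %>%
  mutate_at(.vars = vars(var1:var3),
            .funs = funs(ifelse(nocasedata == 1, NA, .))) %>%
  select(-nocasedata)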

desired_df

#>   caseID var1 var2 var3
#> 1      1    1    0    0
#> 2      2   NA   NA   NA
#> 3      3    0    1    0
#> 4      4    1    1    1
#> 5      5    1    0    1
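
The code above was run with dplyr 0.7.4. `funs()` was deprecated in dplyr 0.8.0, and `across()` was introduced in dplyr 1.0.0, so on a newer installation the same step could probably be written along these lines. This is a sketch, assuming dplyr 1.0.0 or later is installed; the logic is unchanged, only the column-wise helper differs.

```r
library(dplyr)

df <- data.frame(caseID = c(1:5),
                 var1 = c(1, 0, 0, 1, 1),
                 var2 = c(0, 0, 1, 1, 0),
                 var3 = c(0, 0, 0, 1, 1),
                 nocasedata = c(0, 1, 0, 0, 0))

# Assumes dplyr >= 1.0.0, where across() is available.
# For var1 through var3, replace the value with NA wherever nocasedata is 1,
# then drop the nocasedata column.
desired_df <- df %>%
  mutate(across(var1:var3, ~ ifelse(nocasedata == 1, NA, .x))) %>%
  select(-nocasedata)

desired_df
```

Inside the lambda, `.x` stands for the current column, and `nocasedata` can still be referenced by name because it is a column of the same data frame.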

Upvotes: 1
