Reputation: 25
I have the following data:
data <- tibble(
product_01 = c("AB", "AC", NA),
product_02 = c("AD", NA, "AB"),
AB = NA,
AC = NA,
AD = NA)
Now, I want to get the following tibble:
> data
# A tibble: 3 x 5
product_01 product_02 AB AC AD
<chr> <chr> <lgl> <lgl> <lgl>
1 AB AD TRUE FALSE TRUE
2 AC NA FALSE TRUE FALSE
3 NA AB TRUE FALSE FALSE
That is, for each row, check whether any value in the columns starting with product matches the name of each of the remaining columns, and put TRUE in that column if it does and FALSE otherwise. Does anyone have an idea how to proceed here? I tried some code (mostly with lapply) but got no results. Thanks in advance.
Upvotes: 1
Views: 40
Reputation: 26218
A dplyr approach that works. I am not that comfortable using loops, which is why it is a bit long:
library(tidyverse)
data2 <- data %>% mutate(row_id = row_number()) %>%
select(-AB, -AC, -AD) %>%
pivot_longer(cols = c(product_01, product_02), names_to ='name', values_to = "val") %>%
filter(!is.na(val)) %>%
mutate(val2 = TRUE) %>%
pivot_wider(id_cols = row_id, names_from = val,
values_from = val2, values_fill =FALSE)
data <- data %>% mutate(row_id = row_number()) %>%
select(-AB, -AC, -AD) %>%
left_join(data2, by = "row_id") %>%
select(-row_id)
# A tibble: 3 x 5
product_01 product_02 AB AD AC
<chr> <chr> <lgl> <lgl> <lgl>
1 AB AD TRUE TRUE FALSE
2 AC NA FALSE FALSE TRUE
3 NA AB TRUE FALSE FALSE
Upvotes: 1
Reputation: 388837
You can use apply:
cols <- names(data)[-(1:2)]
data[cols] <- t(apply(data[1:2], 1, function(x) cols %in% x))
# product_01 product_02 AB AC AD
# <chr> <chr> <lgl> <lgl> <lgl>
#1 AB AD TRUE FALSE TRUE
#2 AC NA FALSE TRUE FALSE
#3 NA AB TRUE FALSE FALSE
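To make the apply call above easier to follow, here is a minimal base R sketch of what happens for a single row and why the transpose is needed. The object names products, row1 and flags are only illustrative and are not part of the answer's code.

```r
# Illustrative copy of the product columns from the question
products <- data.frame(
  product_01 = c("AB", "AC", NA),
  product_02 = c("AD", NA, "AB"),
  stringsAsFactors = FALSE
)
cols <- c("AB", "AC", "AD")

# For one row, `cols %in% x` asks whether each target name occurs in that row
row1 <- unlist(products[1, ])
cols %in% row1
# [1]  TRUE FALSE  TRUE

# apply() over rows returns one column per input row, so transpose the result
flags <- t(apply(products, 1, function(x) cols %in% x))
colnames(flags) <- cols
```

Because %in% never returns NA, missing product values simply do not match any column name.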
In the general case, I think the columns filled with NA values would not have been created beforehand.
data <- tibble(
product_01 = c("AB", "AC", NA),
product_02 = c("AD", NA, "AB"))
We can then use dplyr and tidyr in the following way:
library(dplyr)
library(tidyr)
data1 <- data %>% mutate(row = row_number())
data1 %>%
pivot_longer(cols = -row,
values_drop_na = TRUE) %>%
mutate(val = TRUE) %>%
select(-name) %>%
pivot_wider(names_from = value, values_from = val, values_fill = FALSE) %>%
left_join(data1, by = 'row')
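The pipeline above returns the columns in the order row, AB, AD, AC, product_01, product_02. If the layout from the question is wanted, one possible follow-up is to compute the flag column names first and reorder at the end. This is only a sketch: the names values, flag_cols and result are introduced here for illustration.

```r
library(dplyr)
library(tidyr)

data <- tibble(
  product_01 = c("AB", "AC", NA),
  product_02 = c("AD", NA, "AB"))

# Distinct non-missing product codes, sorted; these become the flag columns
values <- unlist(data, use.names = FALSE)
flag_cols <- sort(unique(values[!is.na(values)]))

data1 <- data %>% mutate(row = row_number())

result <- data1 %>%
  pivot_longer(cols = -row, values_drop_na = TRUE) %>%
  mutate(val = TRUE) %>%
  select(-name) %>%
  pivot_wider(names_from = value, values_from = val, values_fill = FALSE) %>%
  left_join(data1, by = "row") %>%
  # product columns first, then the flags in sorted order; drops the helper `row`
  select(starts_with("product"), all_of(flag_cols))
```

Note that a row whose product columns are all NA never reaches the wide table, so it would be missing from the result of this join.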
Upvotes: 2