danish

Reputation: 25

Check whether element of table is the same as column name

I have the following data:

library(tibble)

data <- tibble(
  product_01 = c("AB", "AC", NA),
  product_02 = c("AD", NA, "AB"),
  AB = NA,
  AC = NA,
  AD = NA)

Now, I want to get the following tibble:

> data
# A tibble: 3 x 5
  product_01 product_02 AB    AC    AD   
  <chr>      <chr>      <lgl> <lgl> <lgl>
1 AB         AD         TRUE  FALSE TRUE   
2 AC         NA         FALSE TRUE  FALSE   
3 NA         AB         TRUE  FALSE FALSE

That is, for each row, check whether any value in the columns starting with product matches the name of each of the remaining columns, and put TRUE in that column if it does and FALSE otherwise.

Does anyone have an idea how to proceed here? I tried some code (mostly with lapply) but without success. Thanks in advance.

Upvotes: 1

Views: 40

Answers (2)

AnilGoyal

Reputation: 26218

Here is a dplyr/tidyr attempt that works. I am not that comfortable using loops, which is why it is a bit long.

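library(tidyverse)

# Reshape the product columns to long form, then spread each observed
# product code into its own logical column (one row per original row).
data2 <- data %>%
  mutate(row_id = row_number()) %>%
  select(-AB, -AC, -AD) %>%
  pivot_longer(cols = c(product_01, product_02), names_to = "name", values_to = "val") %>%
  filter(!is.na(val)) %>%
  mutate(val2 = TRUE) %>%
  pivot_wider(id_cols = row_id, names_from = val,
              values_from = val2, values_fill = FALSE)

# Swap the empty placeholder columns for the computed flags.
# Plain assignment avoids needing library(magrittr) for %<>%.
data <- data %>%
  mutate(row_id = row_number()) %>%
  select(-AB, -AC, -AD) %>%
  left_join(data2, by = "row_id") %>%
  select(-row_id)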

# A tibble: 3 x 5
  product_01 product_02 AB    AD    AC   
  <chr>      <chr>      <lgl> <lgl> <lgl>
1 AB         AD         TRUE  TRUE  FALSE
2 AC         NA         FALSE FALSE TRUE 
3 NA         AB         TRUE  FALSE FALSE

Upvotes: 1

Ronak Shah

Reputation: 388837

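You can use apply:

# Names of the flag columns (everything after the two product columns)
cols <- names(data)[-(1:2)]
# For each row, test which flag-column names occur among its product values
data[cols] <- t(apply(data[1:2], 1, function(x) cols %in% x))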

# product_01 product_02    AB    AC    AD   
#  <chr>      <chr>      <lgl> <lgl> <lgl>
#1 AB         AD         TRUE  FALSE TRUE 
#2 AC         NA         FALSE TRUE  FALSE
#3 NA         AB         TRUE  FALSE FALSE
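
To see why the t() is needed, here is a minimal base-R sketch of the same idea on a plain character matrix. The names products, row_hits and flags are only illustrative and not part of the code above. apply() calls the function once per row and stores each result as a column, so the result has to be transposed to line up with the original rows.

```r
# Flag-column names and the product values as a plain character matrix
cols <- c("AB", "AC", "AD")
products <- cbind(product_01 = c("AB", "AC", NA),
                  product_02 = c("AD", NA, "AB"))

# What the function returns for the first row alone
cols %in% products[1, ]   # TRUE FALSE TRUE

# apply() over rows: each row's result becomes one COLUMN of row_hits
row_hits <- apply(products, 1, function(x) cols %in% x)

# Transpose so that each input row is a row again, then label the columns
flags <- t(row_hits)
colnames(flags) <- cols
flags
```

NA values in the product columns simply never match a column name, so they need no special handling here.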

In the general case, I think the columns would not already exist filled with NA values.

data <- tibble(
  product_01 = c("AB", "AC", NA),
  product_02 = c("AD", NA, "AB"))

We can then use dplyr and tidyr as follows:

library(dplyr)
library(tidyr)

data1 <- data %>% mutate(row = row_number())

data1 %>%
  pivot_longer(cols = -row,
               values_drop_na = TRUE) %>%
  mutate(val = TRUE) %>%
  select(-name) %>%
  pivot_wider(names_from = value, values_from = val, values_fill = FALSE) %>%
  left_join(data1, by = 'row')
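
The pipeline above returns the helper row column first and the product columns last. If the layout from the question is wanted, one possible finishing step (a sketch, not the only way) is a final select() that puts the product columns first and drops row. The name result is just illustrative.

```r
library(dplyr)
library(tidyr)

data <- tibble(
  product_01 = c("AB", "AC", NA),
  product_02 = c("AD", NA, "AB"))

# Helper row id so the long/wide reshaping can be joined back
data1 <- data %>% mutate(row = row_number())

result <- data1 %>%
  pivot_longer(cols = -row, values_drop_na = TRUE) %>%
  mutate(val = TRUE) %>%
  select(-name) %>%
  pivot_wider(names_from = value, values_from = val, values_fill = FALSE) %>%
  left_join(data1, by = "row") %>%
  # Product columns first, then the flag columns, without the helper id
  select(starts_with("product"), everything(), -row)

result
```

Note that the flag columns appear in order of first occurrence (AB, AD, AC), and that a row whose product values are all NA would be dropped, because the join starts from the reshaped table.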

Upvotes: 2
