Luther_Proton
Luther_Proton

Reputation: 358

R - Create new columns based on different conditions of one column

seeking some expertise/guidance on creating new columns in a data frame based on different conditions of one column.

My data frame is pretty huge, but I've subsetted it for easier reference. It's a feedback from animal lovers on their cost.

Feedback Cost pig
cat, dog, rabbit 90 100
cat & rabbit 60 100
cat and fish 50 100
dog/pig/rabbit 250 100
cat, pig 200 100
fish, rabbit, dog 150 100
dog, cat and pig 260 100
df = structure(list(Feedback = c("cat, dog, rabbit", "cat & rabbit ", 
                                 "cat and fish", "dog/pig/rabbit", "cat, pig", "fish, rabbit, dog", 
                                 "dog, cat and pig"), Cost = c(90L, 80L, 50L, 300L, 200L, 120L, 
                                 260L), pig = c(100L, 100L, 100L, 100L, 100L, 100L, 100L)), 
                                 class = "data.frame", row.names = c(NA, -7L))

Does anyone know how I can create new columns based on the animals in the Feedback columns dplyr or tidyverse functions?

Currently I'm using ifelse() and it's not feasible should the dataframe gets larger.

The conditions are that

  1. Cost of pig is fixed at 100.
  2. Cost of remaining animals can be equally divided. However should pig exist, I have to subtract 100 from the total cost before dividing equally.
  3. Needing to account for the special characters in Feedback column to separate the animals.

The ideal output should be as follows:

Feedback Cost pig cat dog rabbit fish pig_new
cat, dog, rabbit 90 100 30 30 30 0 0
cat & rabbit 60 100 30 0 30 0 0
cat and fish 50 100 25 0 0 25 0
dog/pig/rabbit 250 100 0 75 75 0 100
cat, pig 200 100 100 100 0 0 0
fish, rabbit, dog 150 100 0 50 50 50 0
dog, cat and pig 260 100 80 80 0 0 100

Thanks in advance!

Upvotes: 1

Views: 122

Answers (1)

akrun
akrun

Reputation: 887048

Based on the logic, we could do

library(dplyr)# version >= 1.1.0
library(stringr)
library(fastDummies)
dummy_cols(df, "Feedback", split = "\\s*[,&/]\\s*|\\s*and\\s*") %>% 
  rename_with(~ str_replace(str_remove(.x, "Feedback_"), "pig", "new_pig"), 
     starts_with("Feedback_")) %>%
  mutate(cnt = rowSums(pick(cat:dog)),
   across(cat:dog, ~ case_when(.x== 1 & new_pig == 0 ~ Cost/cnt, .x == 1 & 
    new_pig == 1 ~ (Cost - pig)/cnt, TRUE ~ 0)), 
   new_pig = pig * new_pig, cnt = NULL)

Upvotes: 0

Related Questions