slinel

Add new/sort-of-duplicate rows conditional on certain column values

Imagine the following dataset structure.

DF <- expand.grid(car=c("BMW","TESLA", "Mercedes"), 
                  id=c("id1","id2","id3"),
                  id2=c("idA","idB"),
                  color_blue=c(0,1),
                  color_red=c(0,1),
                  color_black=c(0,1,2),
                  color_white=c(0,1),
                  tech_radio=c(0,1),
                  comf_heat=c(0,1),
                  stringsAsFactors=TRUE)

expand.grid gives a dataset with every combination of the supplied values, which serves my purpose here. Combinations such as color_blue=1 and color_red=1 in the same row are possible, and I want to split those rows up when they occur.

I want to go from here:

car  id   id2  color_blue  color_red  color_black  color_white  tech_radio  comf_heat
BMW  id1  idA  1           1          1            0            1           2

to here:

car        id   id2  color_blue  color_red  color_black  color_white  tech_radio  comf_heat
BMW_blue   id1  idA  1           0          0            0            1           2
BMW_red    id1  idA  0           1          0            0            1           2
BMW_black  id1  idA  0           0          1            0            1           2

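In effect, two things should happen:

  1. Add sort-of-duplicate rows, one for each of certain similarly named variables whose value is > 0 (the variables are selected by name, not by column range, as the range might change).
  2. Rename the value of the "car" variable by appending the relevant part of the name of the one variable that is kept in that row.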

I know there may be a lot of pipe-based solutions using dplyr or the tidyverse around. As I do not use those packages, I am unfamiliar with them and would have a harder time applying them to my data. But in the end, any solution will be progress.


Answers (1)

slinel

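This works, using dplyr and tidyr. Note that it returns the data in long format: one row per car and color, with the combined name in a new column code_together rather than in car.

library(dplyr)
library(tidyr)

DF_test <- DF %>% 
  # one row per (original row, color column) pair
  pivot_longer(cols = starts_with('color'), names_to = 'color') %>% 
  # keep every color that is set; color_black can also be 2
  filter(value > 0) %>% 
  mutate(color = gsub(color, pattern = 'color_', replacement = ''), 
         code_together = paste(car, color, sep = '_')) %>% 
  select(-c(color, car))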
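
Since the question asks for something that does not rely on the tidyverse, here is a minimal base R sketch of the row-splitting idea. Treat it as an illustration under assumptions rather than a tested drop-in: the small `DF` below is made-up toy data standing in for the full grid, and `color_cols`, `hits`, `kept`, and `DF_split` are names chosen here. It keeps the wide layout from the question, leaves the original value in the one color column that is kept, and sets the other color columns to 0.

```r
# Toy data (an assumption for illustration, not the full expand.grid result)
DF <- data.frame(car         = c("BMW", "TESLA"),
                 id          = c("id1", "id2"),
                 id2         = c("idA", "idB"),
                 color_blue  = c(1, 0),
                 color_red   = c(1, 0),
                 color_black = c(1, 2),
                 color_white = c(0, 0),
                 tech_radio  = c(1, 0),
                 comf_heat   = c(1, 1),
                 stringsAsFactors = FALSE)

# Select the color columns by name prefix, not by position
color_cols <- grep("^color_", names(DF), value = TRUE)

# (row, col) positions of every color value > 0, ordered by original row
hits <- which(as.matrix(DF[color_cols]) > 0, arr.ind = TRUE)
hits <- hits[order(hits[, "row"], hits[, "col"]), , drop = FALSE]

# Repeat each original row once per color that is set
DF_split <- DF[hits[, "row"], , drop = FALSE]
rownames(DF_split) <- NULL

# Name of the color column kept in each output row
kept <- color_cols[hits[, "col"]]

# Zero out every color column except the kept one
for (col in color_cols) {
  DF_split[[col]] <- ifelse(kept == col, DF_split[[col]], 0)
}

# Append the color name (without the "color_" prefix) to car
DF_split$car <- paste(DF_split$car, sub("^color_", "", kept), sep = "_")

print(DF_split)
```

Because the color columns are found with `grep()` on the name prefix, adding or reordering color columns should not require changes to the code. Rows in which no color is set are dropped by this sketch, which may or may not be what is wanted.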
