A_beginner

Reputation: 69

Add a new column based on grouping attributes

I want to add a new column in R that collapses my subgroups into broader groups.

Here is my example:

id = c(1,2,2,3,4,4,4,5,5,5,6,6,6)
subgroup = c("lightred","lightblue","darkblue","lightred","darkred","darkred","lightblue","darkgreen","darkgreen","lightgreen","darkred","darkblue","lightgreen")
data = data.frame(id, subgroup)  # no cbind(): it would coerce id to character

> data
   id   subgroup
1   1   lightred
2   2  lightblue
3   2   darkblue
4   3   lightred
5   4    darkred
6   4    darkred
7   4  lightblue
8   5  darkgreen
9   5  darkgreen
10  5 lightgreen
11  6    darkred
12  6   darkblue
13  6 lightgreen

Now I want to add a new column "colour" that assigns each subgroup to one of three groups, "red", "green" or "blue", regardless of whether it is light- or dark-coloured.

The result should look like this:

   id   subgroup colour
1   1   lightred    red
2   2  lightblue   blue
3   2   darkblue   blue
4   3   lightred    red
5   4    darkred    red
6   4    darkred    red
7   4  lightblue   blue
8   5  darkgreen  green
9   5  darkgreen  green
10  5 lightgreen  green
11  6    darkred    red
12  6   darkblue   blue
13  6 lightgreen  green

Upvotes: 2

Views: 76

Answers (3)

MHammer

Reputation: 1314

While this method isn't as slick as the others, it's quite flexible. I tweaked the OP's sample data to show how you can combine multiple groups that don't follow the light/dark pattern.

Edit: Updated the post to answer the OP's question in the comments.

id = c(1,2,2,3,4,4,4,5,5,5,6,6,6)
subgroup = c("lightred","lightblue","cyan","lightred","water","darkred","lightblue","darkgreen","darkgreen","lightgreen","darkred","darkblue","lightgreen")
data = data.frame(cbind(id,subgroup))


library(dplyr)
data <- data %>% 
  dplyr::mutate(
    colour = dplyr::case_when(
      grepl("red"  , subgroup, fixed = TRUE) ~ "red",
      grepl("(blue)|(cyan)|(water)", subgroup, perl = TRUE) ~ "blue",
      grepl("green", subgroup, fixed = TRUE) ~ "green",
      TRUE ~ "else"
    )
  )
data
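
If dplyr isn't available, the same first-match-wins mapping can be sketched in base R. This is only an illustration under assumptions: the `patterns` lookup vector and the loop are hypothetical names introduced here, not part of the answer above, and the "else" fallback mirrors the `TRUE ~ "else"` branch.

```r
# Sample data from this answer (includes names without a light/dark prefix)
id <- c(1, 2, 2, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6)
subgroup <- c("lightred", "lightblue", "cyan", "lightred", "water", "darkred",
              "lightblue", "darkgreen", "darkgreen", "lightgreen", "darkred",
              "darkblue", "lightgreen")
data <- data.frame(id, subgroup)

# Hypothetical lookup: group name -> regular expression (illustrative only)
patterns <- c(red = "red", blue = "blue|cyan|water", green = "green")

# Start with the fallback value, then fill in each group where its pattern
# matches and no earlier group has already claimed the row
data$colour <- "else"
for (grp in names(patterns)) {
  hit <- grepl(patterns[[grp]], data$subgroup) & data$colour == "else"
  data$colour[hit] <- grp
}
data$colour
```

Adding another group then only means adding one entry to `patterns`; as with case_when(), the order of the entries decides which group wins if a name matches more than one pattern.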

Upvotes: 0

BENY

Reputation: 323276

Using str_extract() from stringr:

stringr::str_extract(data$subgroup,"red|green|blue")
 [1] "red"   "blue"  "blue"  "red"   "red"   "red"   "blue"  "green" "green" "green" "red"   "blue"  "green"

Assign the result to a new column:

data$color <- stringr::str_extract(data$subgroup, "red|green|blue")
data
   id   subgroup color
1   1   lightred   red
2   2  lightblue  blue
3   2   darkblue  blue
4   3   lightred   red
5   4    darkred   red
6   4    darkred   red
7   4  lightblue  blue
8   5  darkgreen green
9   5  darkgreen green
10  5 lightgreen green
11  6    darkred   red
12  6   darkblue  blue
13  6 lightgreen green
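
Note that str_extract() returns NA for any value that matches none of the alternatives. For comparison, here is a rough base R sketch of the same extraction using regexpr() and regmatches(); the short `subgroup` vector (with "cyan" added as a non-matching value) is made up for illustration.

```r
# Illustrative input: "cyan" contains none of red/green/blue
subgroup <- c("lightred", "darkblue", "lightgreen", "cyan")

# regexpr() locates the first match in each string (-1 means no match)
m <- regexpr("red|green|blue", subgroup)

# regmatches() returns only the matched pieces, so pre-fill with NA and
# assign into the positions that actually matched
colour <- rep(NA_character_, length(subgroup))
colour[m != -1] <- regmatches(subgroup, m)
colour
```

The last element stays NA here, which makes unexpected subgroup names easy to spot with is.na().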

Upvotes: 0

Tim Biegeleisen

Reputation: 521399

I think sub() should work here. It strips a leading "light" or "dark" prefix and keeps whatever remains:

data$colour <- sub("^(?:light|dark)", "", data$subgroup)
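
As a quick sketch of how this behaves, the snippet below applies the same pattern to a small made-up vector. Two assumptions: "cyan" is added only to show a value without a prefix, and `perl = TRUE` is set here just so the `(?:...)` group is handled explicitly by the PCRE engine.

```r
# Illustrative input: four prefixed names plus one without a prefix
subgroup <- c("lightred", "lightblue", "darkblue", "darkgreen", "cyan")

# Remove a leading "light" or "dark"; strings without the prefix are unchanged
colour <- sub("^(?:light|dark)", "", subgroup, perl = TRUE)
colour
```

Because sub() only removes the prefix, a value such as "cyan" passes through as its own colour instead of becoming NA, so this approach fits best when every subgroup follows the light/dark naming scheme.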

Upvotes: 2
