Reputation: 493
I have the following data and I was to make a new column using mutate which details when colour = 'g' then take the level on the g row minus the level figure on the 'r' row.
Then likewise with type. Where type = 1 then take the corresponding level minus the level on the type 2 row.
library(dplyr)
d <- tibble(
date = c("2018", "2018", "2018", "2019", "2019", "2019", "2020", "2020", "2020", "2020"),
colour = c("none","g", "r", "none","g", "r", "none", "none", "none", "none"),
type = c("type1", "none", "none", "type2", "none", "none", "none", "none", "none", "none"),
level= c(78, 99, 45, 67, 87, 78, 89, 87, 67, 76))
Just to be clear this is what I want the data to look like.
So the data should look like this:
d2 <- tibble(
date = c("2018", "2018", "2018", "2019", "2019", "2019", "2020", "2020", "2020", "2020"),
colour = c("none","g", "r", "none","g", "r", "none", "none", "none", "none"),
type = c("type1", "none", "none", "type2", "none", "none", "none", "none", "none", "none"),
level= c(78, 99, 45, 67, 87, 78, 89, 87, 67, 76),
color_gap = c("NULL", 44, "NULL", "NULL", 9, "NULL", "NULL", "NULL", "NULL", "NULL"),
type_gap = c(11, "NULL", "NULL", "NULL", "NULL", "NULL", "NULL", "NULL", "NULL", "NULL"))
I started to use mutate and case when and got to the below. However, I'm stuck on the final calculation part. How do I say I want to take the color g level - the color r level?
d %>%
mutate(color_gap = case_when(color == "g" ~ level)%>%
mutate(type_gap = case_when(type== "type1" ~ level)%>%
) -> d2
Anyone know how to complete this?
Thanks
Upvotes: 1
Views: 117
Reputation: 28685
This subtracts the first r level from the first g level, second r level from second g level, etc. Same for type1 and type2. This has no checks at all. It doesn't check whether there is a matching r for each g, whether they are in the expected order, whether they are in the same date-group, etc. It assumes the data is already perfectly formatted as expected, so be careful using this on real data.
d %>%
mutate(color_gap = replace(rep(NA, n()), colour == 'g',
level[colour == 'g'] - level[colour == 'r']),
type_gap = replace(rep(NA, n()), type == 'type1',
level[type == 'type1'] - level[type == 'type2']))
# # A tibble: 10 x 6
# date colour type level color_gap type_gap
# <chr> <chr> <chr> <dbl> <dbl> <dbl>
# 1 2018 none type1 78 NA 11
# 2 2018 g none 99 54 NA
# 3 2018 r none 45 NA NA
# 4 2019 none type2 67 NA NA
# 5 2019 g none 87 9 NA
# 6 2019 r none 78 NA NA
# 7 2020 none none 89 NA NA
# 8 2020 none none 87 NA NA
# 9 2020 none none 67 NA NA
# 10 2020 none none 76 NA NA
Upvotes: 3
Reputation: 6020
you could do this with group_by
and mutate.
I assumed that there is only 1 row per date
that would satisfy each condition.
d %>%
mutate(color_gap = case_when(colour == "g" ~ level)) %>%
mutate(type_gap = case_when(type== "type1" ~ level)) %>%
group_by(date) %>%
mutate(diff = max(color_gap,na.rm=T)-max(type_gap, na.rm=T))
Upvotes: 0