Reputation: 51
I have a data set that looks like this:
Value <- c(1, 3, 4, 5, 0, 210,
           2, 0.5, 7, 0, 201, 300,
           3, 0, 500, 6, 2, 1,
           8, 0, 200, 137, 0.76, 2.3)
Ingredient <- as.factor(c("A", "B", "C", "D", "E", "E",
                          "E", "F", "G", "H", "H", "H",
                          "A", "B", "B", "C", "D", "E",
                          "E", "F", "F", "F", "G", "H"))
Condition <- as.factor(rep(c(rep(1, 6), rep(2, 6)), 2))
df <- data.frame(Condition, Ingredient, Value)
I want to create a column that indicates whether a value in the Ingredient column
repeats the value in the previous row within the same block of Condition. So from this:
> df
Condition Ingredient Value
1 1 A 1.00
2 1 B 3.00
3 1 C 4.00
4 1 D 5.00
5 1 E 0.00
6 1 E 210.00
7 2 E 2.00
8 2 F 0.50
9 2 G 7.00
10 2 H 0.00
11 2 H 201.00
12 2 H 300.00
13 1 A 3.00
14 1 B 0.00
15 1 B 500.00
16 1 C 6.00
17 1 D 2.00
18 1 E 1.00
19 2 E 8.00
20 2 F 0.00
21 2 F 200.00
22 2 F 137.00
23 2 G 0.76
24 2 H 2.30
I would like to get this output:
Condition Ingredient Value Consecutive
1 1 A 1.00 FALSE
2 1 B 3.00 FALSE
3 1 C 4.00 FALSE
4 1 D 5.00 FALSE
5 1 E 0.00 FALSE
6 1 E 210.00 TRUE
7 2 E 2.00 FALSE
8 2 F 0.50 FALSE
9 2 G 7.00 FALSE
10 2 H 0.00 FALSE
11 2 H 201.00 TRUE
12 2 H 300.00 TRUE
13 1 A 3.00 FALSE
14 1 B 0.00 FALSE
15 1 B 500.00 TRUE
16 1 C 6.00 FALSE
17 1 D 2.00 FALSE
18 1 E 1.00 FALSE
19 2 E 8.00 FALSE
20 2 F 0.00 FALSE
21 2 F 200.00 TRUE
22 2 F 137.00 TRUE
23 2 G 0.76 FALSE
24 2 H 2.30 FALSE
Note the transition from row 6 to row 7: the two rows share the same letter (E), but row 7 should be FALSE, because the repeated "E" does not fall within the same condition block.
Thanks for your help!
Upvotes: 0
Views: 35
Reputation: 389235
Using data.table:
library(data.table)
setDT(df)[, Consecutive := Ingredient == shift(Ingredient, fill = last(Ingredient)), by = Condition]
df
# Condition Ingredient Value Consecutive
# 1: 1 A 1.00 FALSE
# 2: 1 B 3.00 FALSE
# 3: 1 C 4.00 FALSE
# 4: 1 D 5.00 FALSE
# 5: 1 E 0.00 FALSE
# 6: 1 E 210.00 TRUE
# 7: 2 E 2.00 FALSE
# 8: 2 F 0.50 FALSE
# 9: 2 G 7.00 FALSE
#10: 2 H 0.00 FALSE
#11: 2 H 201.00 TRUE
#12: 2 H 300.00 TRUE
#13: 1 A 3.00 FALSE
#14: 1 B 0.00 FALSE
#15: 1 B 500.00 TRUE
#16: 1 C 6.00 FALSE
#17: 1 D 2.00 FALSE
#18: 1 E 1.00 FALSE
#19: 2 E 8.00 FALSE
#20: 2 F 0.00 FALSE
#21: 2 F 200.00 TRUE
#22: 2 F 137.00 TRUE
#23: 2 G 0.76 FALSE
#24: 2 H 2.30 FALSE
# Condition Ingredient Value Consecutive
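One caveat, offered as a sketch rather than a definitive fix: grouping by Condition pools every row with the same Condition value, so the first row of a later block is compared with the last row of an earlier block of that condition. That happens not to matter for the data in the question, but it could on other data. Grouping by contiguous runs with rleid() avoids it. The toy table dt below is made up for illustration and is not the asker's data.

```r
library(data.table)

# Hypothetical toy data: Condition 1 appears in two separate blocks.
dt <- data.table(
  Condition  = c(1, 1, 2, 2, 1, 1),
  Ingredient = c("A", "E", "E", "E", "E", "B")
)

# rleid() gives each contiguous run of Condition its own group id,
# so every block is compared only against its own rows.
dt[, Consecutive := Ingredient == shift(Ingredient, fill = NA), by = rleid(Condition)]

# The first row of each run has no predecessor, so the comparison is NA;
# treat that as FALSE.
dt[is.na(Consecutive), Consecutive := FALSE]

dt$Consecutive
# FALSE FALSE FALSE  TRUE FALSE FALSE
```

With by = Condition instead, row 5 (E, second block of condition 1) would be compared with row 2 (E, first block of condition 1) and come out TRUE.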
Upvotes: 1
Reputation: 740
Something like this?
library(dplyr)

df %>%
  group_by(Condition) %>%
  # lag() is NA for the first row of each group, which falls through to FALSE
  mutate(Consecutive = case_when(Ingredient == dplyr::lag(Ingredient) ~ TRUE,
                                 TRUE ~ FALSE)) %>%
  ungroup()
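If the same Condition value can come back later in the table, as it does in the question, group_by(Condition) treats those separate blocks as one group. A grouping-free alternative in base R is to compare each row with the row directly above it on both columns. This is only a minimal sketch: the helper same_as_previous() and the data frame toy are names invented here for illustration.

```r
# Hypothetical toy data: Condition 1 appears in two separate blocks.
toy <- data.frame(
  Condition  = factor(c(1, 1, 2, 2, 1, 1)),
  Ingredient = factor(c("A", "E", "E", "E", "E", "B"))
)

# TRUE where an element equals the element directly before it.
# The first element has no predecessor and is always FALSE.
same_as_previous <- function(x) {
  x <- as.character(x)
  c(FALSE, x[-1] == x[-length(x)])
}

# A row counts as consecutive only if both the ingredient and the
# condition match the previous row.
toy$Consecutive <- same_as_previous(toy$Ingredient) &
  same_as_previous(toy$Condition)

toy$Consecutive
# FALSE FALSE FALSE  TRUE FALSE FALSE
```

Applied to the df from the question, the same two lines should mark rows 6, 11, 12, 15, 21 and 22 as TRUE, matching the requested output.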
Upvotes: 2