Lucas

Reputation: 51

Creating a column in a data frame that indicates whether a value in another column is consecutive

I have a data set that looks like this:

Value <- c(1, 3, 4, 5, 0, 210,
          2, 0.5, 7, 0, 201, 300,
          3, 0, 500, 6, 2, 1,
          8, 0, 200, 137, 0.76, 2.3)

Ingredient <- as.factor(c("A", "B", "C", "D", "E", "E",
        "E" ,"F", "G", "H", "H", "H",
        "A", "B", "B", "C", "D", "E",
        "E", "F", "F", "F", "G", "H"))

Condition <- as.factor(rep(c(rep(1,6), rep(2, 6)),2))


df <- data.frame(Condition, Ingredient, Value)

I want to create a column that indicates when a value in the column Ingredient repeats the value of the previous row within the same condition, so starting from this:

> df
   Condition Ingredient  Value
1          1          A   1.00
2          1          B   3.00
3          1          C   4.00
4          1          D   5.00
5          1          E   0.00
6          1          E 210.00
7          2          E   2.00
8          2          F   0.50
9          2          G   7.00
10         2          H   0.00
11         2          H 201.00
12         2          H 300.00
13         1          A   3.00
14         1          B   0.00
15         1          B 500.00
16         1          C   6.00
17         1          D   2.00
18         1          E   1.00
19         2          E   8.00
20         2          F   0.00
21         2          F 200.00
22         2          F 137.00
23         2          G   0.76
24         2          H   2.30

I would like to get this output:

   Condition Ingredient  Value Consecutive
1          1          A   1.00       FALSE
2          1          B   3.00       FALSE
3          1          C   4.00       FALSE
4          1          D   5.00       FALSE
5          1          E   0.00       FALSE
6          1          E 210.00        TRUE
7          2          E   2.00       FALSE
8          2          F   0.50       FALSE
9          2          G   7.00       FALSE
10         2          H   0.00       FALSE
11         2          H 201.00        TRUE
12         2          H 300.00        TRUE
13         1          A   3.00       FALSE
14         1          B   0.00       FALSE
15         1          B 500.00        TRUE
16         1          C   6.00       FALSE
17         1          D   2.00       FALSE
18         1          E   1.00       FALSE
19         2          E   8.00       FALSE
20         2          F   0.00       FALSE
21         2          F 200.00        TRUE
22         2          F 137.00        TRUE
23         2          G   0.76       FALSE
24         2          H   2.30       FALSE

Please note the transition from row 6 to row 7: there are two consecutive letters (E), but row 7 should be FALSE, because this consecutive "E" does not appear within the same condition.

Thanks for your help!

Upvotes: 0

Views: 35

Answers (2)

Ronak Shah

Reputation: 389235

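Using data.table:

library(data.table)
# setDT() converts df to a data.table by reference.
# Within each Condition, compare every Ingredient with the previous one;
# fill = last(Ingredient) supplies the comparison value for the group's first row.
setDT(df)[, Consecutive := Ingredient == shift(Ingredient, fill = last(Ingredient)), by = Condition]
df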

#    Condition Ingredient  Value Consecutive
# 1:         1          A   1.00       FALSE
# 2:         1          B   3.00       FALSE
# 3:         1          C   4.00       FALSE
# 4:         1          D   5.00       FALSE
# 5:         1          E   0.00       FALSE
# 6:         1          E 210.00        TRUE
# 7:         2          E   2.00       FALSE
# 8:         2          F   0.50       FALSE
# 9:         2          G   7.00       FALSE
#10:         2          H   0.00       FALSE
#11:         2          H 201.00        TRUE
#12:         2          H 300.00        TRUE
#13:         1          A   3.00       FALSE
#14:         1          B   0.00       FALSE
#15:         1          B 500.00        TRUE
#16:         1          C   6.00       FALSE
#17:         1          D   2.00       FALSE
#18:         1          E   1.00       FALSE
#19:         2          E   8.00       FALSE
#20:         2          F   0.00       FALSE
#21:         2          F 200.00        TRUE
#22:         2          F 137.00        TRUE
#23:         2          G   0.76       FALSE
#24:         2          H   2.30       FALSE
#    Condition Ingredient  Value Consecutive
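
One caveat, offered as a hedged sketch rather than a correction: grouping by Condition treats all rows of a condition as one group, so row 13 is compared with row 6, and the first row is compared with the group's last value. If "consecutive" is meant strictly as adjacent rows within one contiguous block of a condition (an assumption, since the sample data does not distinguish the two readings), grouping by rleid(Condition) is one way to express that. The small table dt below is made up for illustration and is not the data from the question.

```r
library(data.table)

# Illustrative data (not from the question): condition 1 appears in two
# separate blocks, and its first and last ingredients are both "E".
dt <- data.table(
  Condition  = factor(c(1, 1, 1, 2, 2, 1, 1)),
  Ingredient = factor(c("E", "A", "E", "E", "E", "E", "E"))
)

# rleid() numbers each contiguous run of Condition, so a comparison never
# crosses a block boundary. The first row of a run is compared with NA,
# which is then mapped to FALSE.
dt[, Consecutive := {
  same <- Ingredient == shift(Ingredient)
  !is.na(same) & same
}, by = rleid(Condition)]

dt
```

On the data in the question both groupings give the same Consecutive column; they differ only when a condition's blocks happen to end and start with the same ingredient.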

Upvotes: 1

jsv

Reputation: 740

Something like this?

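library(dplyr)

df %>% 
  group_by(Condition) %>% 
  # lag() returns NA for the first row of each group, so the comparison is NA
  # there and case_when() falls through to FALSE
  mutate(Consecutive = case_when(Ingredient == dplyr::lag(Ingredient) ~ TRUE,
                                 TRUE ~ FALSE)) %>% 
  ungroup()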
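
For comparison, the same grouped "equal to the previous row" check can be sketched in base R without extra packages. This is only an illustration of the idea; same_as_previous is a hypothetical helper name, not part of any library.

```r
# Data from the question.
Value <- c(1, 3, 4, 5, 0, 210, 2, 0.5, 7, 0, 201, 300,
           3, 0, 500, 6, 2, 1, 8, 0, 200, 137, 0.76, 2.3)
Ingredient <- factor(c("A", "B", "C", "D", "E", "E", "E", "F", "G", "H", "H", "H",
                       "A", "B", "B", "C", "D", "E", "E", "F", "F", "F", "G", "H"))
Condition <- factor(rep(c(rep(1, 6), rep(2, 6)), 2))
df <- data.frame(Condition, Ingredient, Value)

# Hypothetical helper: TRUE where an element equals the one before it;
# the first element has no predecessor, so it is FALSE.
same_as_previous <- function(x) c(FALSE, x[-1] == x[-length(x)])

# ave() applies the helper within each Condition. Because the input is
# integer (the factor codes), the result comes back as 0/1, so compare
# with 1 to obtain a logical column.
df$Consecutive <- ave(as.integer(df$Ingredient), df$Condition,
                      FUN = same_as_previous) == 1

df
```

As with group_by(Condition), all rows of a condition form a single group here, so the result matches the output requested in the question.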

Upvotes: 2
