Reputation: 27
I want to create the "turn" column shown in the example data frame below; my real dataset has thousands of rows. The column should indicate the current speaking turn. Consecutive rows spoken by the same speaker count as a single turn, even when the sentences sit on different rows. When that speaker talks again later, it counts as a new turn, so the counter increases every time the speaker changes.
df <- data.frame(
  line = 1:9,
  speaker = c("nick", "nick", "nick", "bob", "nick", "ann", "ann", "nick", "bob"),
  sentence = c("hi", "how are you?", "what's up?", "i'm good", "me too", "hi guys", "any plans for the weekend", "no", "ya, the movies"),
  turn = c(1, 1, 1, 2, 3, 4, 4, 5, 6)
)
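I have tried approaches based on cur_group_id() and seq_along() (the turn_curgroupid and turn_seqalong columns below), but neither reproduces the turn column I want: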
  line speaker     sentence turn turn_curgroupid
1    1    nick           hi    1               3
2    2    nick how are you?    1               3
3    3    nick   what's up?    1               3
4    4     bob     i'm good    2               2
5    5    nick       me too    3               3
6    6     ann      hi guys    4               1
  line speaker     sentence turn turn_seqalong
1    1    nick           hi    1             1
2    2    nick how are you?    1             2
3    3    nick   what's up?    1             3
4    4     bob     i'm good    2             1
5    5    nick       me too    3             4
6    6     ann      hi guys    4             1
Thanks for your help.
Upvotes: 0
Views: 41
Reputation: 66880
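library(dplyr)

df |>
  mutate(turn2 = cumsum(speaker != lag(speaker, 1, "")),  # add 1 whenever the speaker differs from the previous row
         turn3 = consecutive_id(speaker))                 # same idea; requires dplyr >= 1.1.0
# H/T @andre-wildberg for mentioning this useful dplyr 1.1.0 function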
Result
  line speaker                  sentence turn turn2 turn3
1    1    nick                        hi    1     1     1
2    2    nick              how are you?    1     1     1
3    3    nick                what's up?    1     1     1
4    4     bob                  i'm good    2     2     2
5    5    nick                    me too    3     3     3
6    6     ann                   hi guys    4     4     4
7    7     ann any plans for the weekend    4     4     4
8    8    nick                        no    5     5     5
9    9     bob            ya, the movies    6     6     6
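If dplyr 1.1.0 is not available, the same "new turn whenever the speaker changes" idea can be sketched in base R. This is only an illustration: it assumes the speaker values from the question, and the names turn_cumsum, turn_rle and runs are made up for the example.

```r
# Speaker column copied from the question's example data
speaker <- c("nick", "nick", "nick", "bob", "nick", "ann", "ann", "nick", "bob")

# Option 1: compare each row with the previous one.
# The first row always starts turn 1, hence the leading TRUE.
turn_cumsum <- cumsum(c(TRUE, speaker[-1] != speaker[-length(speaker)]))

# Option 2: run-length encode the column, then repeat each run's
# index as many times as the run is long.
runs <- rle(speaker)
turn_rle <- rep(seq_along(runs$lengths), times = runs$lengths)

print(turn_cumsum)
print(turn_rle)
```

Both vectors should match the turn column in the question and can be assigned back with something like df$turn2 <- turn_rle.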
Upvotes: 2