Reputation: 339
I have this dataset.
df <- data.frame(
  id = c("id1","id1","id1","id1","id2","id2","id2"),
  seq = c(1,2,3,4,1,2,3),
  obj = c("A","B","C","D","B","D","E")
)
id seq obj
1 id1 1 A
2 id1 2 B
3 id1 3 C
4 id1 4 D
5 id2 1 B
6 id2 2 D
7 id2 3 E
I want to transform the seq and obj variables into a from/to form, like this:
data.frame(
id = c("id1","id1","id1","id1","id1","id2","id2","id2","id2"),
from = c("start","A","B","C","D","start","B","D","E"),
to = c("A","B","C","D","end","B","D","E","end")
)
id from to
1 id1 start A
2 id1 A B
3 id1 B C
4 id1 C D
5 id1 D end
6 id2 start B
7 id2 B D
8 id2 D E
9 id2 E end
If we think of id as a runner's name, we can imagine that the runner passes through the checkpoints named in obj, in the order given by seq.
Does anyone have an idea how to do this? Thank you.
Upvotes: 0
Views: 202
Reputation: 545618
The following should work:
library(dplyr)

df %>%
  group_by(id) %>%
  arrange(seq) %>%  # sort by seq so each id's checkpoints are in order
  summarize(from = c('start', obj), to = c(obj, 'end'), .groups = 'drop')
# A tibble: 9 x 3
id from to
<chr> <chr> <chr>
1 id1 start A
2 id1 A B
3 id1 B C
4 id1 C D
5 id1 D end
6 id2 start B
7 id2 B D
8 id2 D E
9 id2 E end
If your initial data is already in the correct order (as in your given example), the arrange()
call is unnecessary. However, with tabular data it’s best not to assume a specific order.
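As a side note, more recent dplyr releases (1.1.0 and later, as far as I am aware) warn when summarize() returns more than one row per group and suggest reframe() instead. A minimal sketch of the same approach with reframe(), assuming dplyr >= 1.1.0 is installed and using the example data from the question (the name result is only for illustration):

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  id  = c("id1","id1","id1","id1","id2","id2","id2"),
  seq = c(1,2,3,4,1,2,3),
  obj = c("A","B","C","D","B","D","E")
)

result <- df %>%
  arrange(id, seq) %>%             # do not rely on the input row order
  reframe(
    from = c("start", obj),        # prepend "start" to each id's checkpoints
    to   = c(obj, "end"),          # append "end" to each id's checkpoints
    .by  = id                      # per-id grouping, assumes dplyr >= 1.1.0
  )

print(result)
```

Here .by = id replaces the group_by() / .groups = 'drop' pair, so the result comes back ungrouped.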
Upvotes: 3