h-y-jp

Reputation: 339

How to transform "sequence" data to "from & to" in R

I have this dataset.


df <- data.frame(
id = c("id1","id1","id1","id1","id2","id2","id2"),
seq = c(1,2,3,4,1,2,3),
obj = c("A","B","C","D","B","D","E")
)

   id seq obj
1 id1   1   A
2 id1   2   B
3 id1   3   C
4 id1   4   D
5 id2   1   B
6 id2   2   D
7 id2   3   E



I want to transform the seq and obj variables into a from/to form, like this:

data.frame(
  id = c("id1","id1","id1","id1","id1","id2","id2","id2","id2"),
  from = c("start","A","B","C","D","start","B","D","E"),
  to = c("A","B","C","D","end","B","D","E","end")
)

   id  from  to
1 id1 start   A
2 id1     A   B
3 id1     B   C
4 id1     C   D
5 id1     D end
6 id2 start   B
7 id2     B   D
8 id2     D   E
9 id2     E end

If we think of id as a runner's name, we can imagine the runner passing through checkpoints named obj in the order given by seq.

Do you have any ideas? Thank you.

Upvotes: 0

Views: 202

Answers (1)

Konrad Rudolph

Reputation: 545618

The following should work:

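library(dplyr)

df %>%
    group_by(id) %>%
    arrange(seq) %>%  # put the checkpoints in sequence order
    # prepend 'start' to the origins and append 'end' to the destinations
    summarize(from = c('start', obj), to = c(obj, 'end'), .groups = 'drop')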
# A tibble: 9 x 3
  id    from  to
  <chr> <chr> <chr>
1 id1   start A
2 id1   A     B
3 id1   B     C
4 id1   C     D
5 id1   D     end
6 id2   start B
7 id2   B     D
8 id2   D     E
9 id2   E     end

If your initial data is already in the correct order (as in your given example), the arrange() call is unnecessary. However, with tabular data it’s best not to assume a specific order.
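To illustrate the point about ordering, here is a minimal, self-contained sketch that runs the same idea on deliberately shuffled rows. It assumes dplyr 1.1.0 or later, where returning several rows per group from summarize() is deprecated and reframe() is the intended replacement; the shuffled `df` and the `result` name are made up for this example.

```r
library(dplyr)

# Same rows as in the question, but in shuffled order (illustrative data)
df <- data.frame(
  id  = c("id2", "id1", "id1", "id2", "id1", "id2", "id1"),
  seq = c(3, 2, 4, 1, 1, 2, 3),
  obj = c("E", "B", "D", "B", "A", "D", "C")
)

result <- df %>%
  arrange(id, seq) %>%   # restore the checkpoint order within each id
  group_by(id) %>%
  # reframe() may return any number of rows per group and drops the grouping
  reframe(from = c("start", obj), to = c(obj, "end"))

print(result)
```

Without the arrange() step, the from/to pairs would follow the shuffled row order rather than seq.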

Upvotes: 3
