user63230

Reputation: 4636

Order dataset by exact numeric sequence in R

I have what I think is a very simple question, but I can't figure it out or find the exact problem online. I want to order my dataset by id and time so that, within each id, time follows the sequence 1,2,3,4 rather than 1,1,1,2,2,2,3,4. See the example:

    dff <- data.frame(id = c(1,1,1,1,1,1,1,1,2,2,2,3),
                      time = c(1,1,2,2,3,3,4,4,1,1,2,1))
    R>dff
       id time
    1   1    1
    2   1    1
    3   1    2
    4   1    2
    5   1    3
    6   1    3
    7   1    4
    8   1    4
    9   2    1
    10  2    1
    11  2    2
    12  3    1

I want the resulting dataset to be ordered as follows:

    R>dff
       id time
    1   1    1
    2   1    2
    3   1    3
    4   1    4
    5   1    1
    6   1    2
    7   1    3
    8   1    4
    9   2    1
    10  2    2
    11  2    1
    12  3    1

I would prefer to use the arrange function in dplyr, but will take any solution. I believe I should create a vector v <- c(1,2,3,4) and order by it using %in%, but I'm not sure how. Something like this would, I think, just order 1,1,1, which is not what I want. Any help appreciated, thanks.

Upvotes: 2

Views: 954

Answers (2)

Krupa Kapadia

Reputation: 479

This builds slightly on @akrun's answer. With dplyr version 0.4.3, I think ungroup() needs to be called before arrange(), because the data is still grouped by id and time. It seems the rows are sorted at the level of the group first, and only then by the columns specified in arrange().

library(dplyr)
dff %>%
    group_by(id, time) %>% 
    mutate(ind = row_number()) %>%
    ungroup() %>%
    arrange(id, ind) %>%
    select(-ind)
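
To make it easier to see why this works, here is a minimal sketch that keeps the intermediate `ind` column in a separate step. It assumes the `dff` from the question; the names `with_ind` and `res` are made up for illustration, and adding `time` as a third sort key is my own addition so the result does not depend on the input already being sorted by time.

```r
library(dplyr)

dff <- data.frame(id   = c(1,1,1,1,1,1,1,1,2,2,2,3),
                  time = c(1,1,2,2,3,3,4,4,1,1,2,1))

# ind counts occurrences within each (id, time) pair:
# 1 for the first time a pair appears, 2 for the second, and so on
with_ind <- dff %>%
    group_by(id, time) %>%
    mutate(ind = row_number()) %>%
    ungroup()   # drop the grouping so arrange() sorts the whole table

# Sort by id, then by occurrence number, then by time (assumed extra key),
# so each pass through 1:4 within an id stays together
res <- with_ind %>%
    arrange(id, ind, time) %>%
    select(-ind)

as.data.frame(res)
```

As far as I know, more recent dplyr releases ignore grouping in arrange() unless `.by_group = TRUE` is set, so the explicit ungroup() may be redundant there, but it should be harmless.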

Upvotes: 4

akrun

Reputation: 886948

We can create a sequence column 'ind' grouped by 'id' and 'time', arrange by 'id' and 'ind', and then remove the column with select:

library(dplyr)
dff %>%
    group_by(id, time) %>% 
    mutate(ind = row_number()) %>%
    arrange(id, ind) %>%
    select(-ind)
#     id  time
#   <dbl> <dbl>
#1      1     1
#2      1     2
#3      1     3
#4      1     4
#5      1     1
#6      1     2
#7      1     3
#8      1     4
#9      2     1
#10     2     2
#11     2     1
#12     3     1

If we are using base R, the following one-liner serves the purpose:

dff[order(dff$id, with(dff, ave(time, id, time, FUN = seq_along))),]
#   id time
#1   1    1
#3   1    2
#5   1    3
#7   1    4
#2   1    1
#4   1    2
#6   1    3
#8   1    4
#9   2    1
#11  2    2
#10  2    1
#12  3    1
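
The one-liner can also be unpacked into steps, which may make the logic easier to follow. This is only a sketch of the same idea, assuming the `dff` from the question; the names `ind` and `sorted` are chosen here for illustration.

```r
dff <- data.frame(id   = c(1,1,1,1,1,1,1,1,2,2,2,3),
                  time = c(1,1,2,2,3,3,4,4,1,1,2,1))

# Occurrence counter within each (id, time) pair:
# seq_along() numbers the rows of every group 1, 2, ...
ind <- ave(dff$time, dff$id, dff$time, FUN = seq_along)

# Sort by id, then by occurrence number; remaining ties keep
# their original order, so time stays ascending within each pass
sorted <- dff[order(dff$id, ind), ]

# Optional: reset the row names to 1..n
rownames(sorted) <- NULL
sorted
```

Resetting the row names is optional; without it the original row indices are kept, as in the output above.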

Upvotes: 5
