Reputation: 4636
I have what I think is a very simple question, but I can't figure it out or find the exact problem online. I want to order my dataset by id so that, within each id, time runs through the sequence 1,2,3,4 and then starts over, rather than repeating each value as in 1,1,2,2,3,3,4,4. See the example:
dff <- data.frame(id = c(1,1,1,1,1,1,1,1,2,2,2,3),
                  time = c(1,1,2,2,3,3,4,4,1,1,2,1))
R>dff
id time
1 1 1
2 1 1
3 1 2
4 1 2
5 1 3
6 1 3
7 1 4
8 1 4
9 2 1
10 2 1
11 2 2
12 3 1
I want the resulting dataset to be ordered as follows:
R>dff
id time
1 1 1
2 1 2
3 1 3
4 1 4
5 1 1
6 1 2
7 1 3
8 1 4
9 2 1
10 2 2
11 2 1
12 3 1
I would prefer to use the arrange function in dplyr, but will take any solution. I believe I should create a vector v <- c(1,2,3,4) and order against it using %in%, but I'm not sure how. I think something like that would just sort as 1,1,1, which is not what I want.
Any help appreciated, thanks.
Upvotes: 2
Views: 954
Reputation: 479
A slight build on @akrun's answer. Using dplyr version 0.4.3, I think ungroup() needs to be called before arrange(), since the data is still grouped by id and time at that point. It seems that a grouped data frame is sorted by the grouping variables first and only then by the columns specified in arrange().
library(dplyr)
dff %>%
  group_by(id, time) %>%
  mutate(ind = row_number()) %>%
  ungroup() %>%
  arrange(id, ind) %>%
  select(-ind)
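Whether the ungroup() step changes anything may depend on the installed dplyr version. The following is only a sketch of how one could check this locally; the variable names grouped, still_grouped and ungrouped_first are made up for illustration and are not part of the original answer.

```r
library(dplyr)

dff <- data.frame(id   = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3),
                  time = c(1, 1, 2, 2, 3, 3, 4, 4, 1, 1, 2, 1))

# Helper column: position of each row within its (id, time) group
grouped <- dff %>%
  group_by(id, time) %>%
  mutate(ind = row_number())

# Variant A: arrange while the data is still grouped
still_grouped <- grouped %>% arrange(id, ind)

# Variant B: drop the grouping first, then arrange
ungrouped_first <- grouped %>% ungroup() %>% arrange(id, ind)

# Report the installed version and whether both variants give the same order
print(packageVersion("dplyr"))
print(identical(still_grouped$time, ungrouped_first$time))
```

If the last line prints FALSE, arrange() is taking the grouping into account on that installation, and the ungroup() call is needed to get the requested order.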
Upvotes: 4
Reputation: 886948
We can create a sequence column ('ind') that numbers the rows within each 'id', 'time' group, then arrange by 'id' and 'ind', and remove the helper column afterwards with select
library(dplyr)
dff %>%
  group_by(id, time) %>%
  mutate(ind = row_number()) %>%
  arrange(id, ind) %>%
  select(-ind)
# id time
# <dbl> <dbl>
#1 1 1
#2 1 2
#3 1 3
#4 1 4
#5 1 1
#6 1 2
#7 1 3
#8 1 4
#9 2 1
#10 2 2
#11 2 1
#12 3 1
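To see why this works, it can help to look at the helper column before it is dropped. This is a small sketch on the same data; the name with_ind is chosen here for illustration, and ungroup() is added only so that the result is a plain ungrouped table.

```r
library(dplyr)

dff <- data.frame(id   = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3),
                  time = c(1, 1, 2, 2, 3, 3, 4, 4, 1, 1, 2, 1))

# ind is 1 for the first occurrence of each (id, time) pair,
# 2 for the second occurrence, and so on
with_ind <- dff %>%
  group_by(id, time) %>%
  mutate(ind = row_number()) %>%
  ungroup()

print(with_ind$ind)
# 1 2 1 2 1 2 1 2 1 2 1 1
```

Sorting by id and then by ind therefore puts all first occurrences of an id (time 1,2,3,4) ahead of its second occurrences, which is the requested order.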
If we are using base R, the following one-liner would serve the purpose
dff[order(dff$id, with(dff, ave(time, id, time, FUN = seq_along))),]
# id time
#1 1 1
#3 1 2
#5 1 3
#7 1 4
#2 1 1
#4 1 2
#6 1 3
#8 1 4
#9 2 1
#11 2 2
#10 2 1
#12 3 1
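The one-liner can also be unpacked into separate steps, which may make it easier to follow. This sketch uses the same calls (ave, seq_along, order); the intermediate names ind, ord and result are introduced here for illustration only.

```r
dff <- data.frame(id   = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3),
                  time = c(1, 1, 2, 2, 3, 3, 4, 4, 1, 1, 2, 1))

# Step 1: running count within each (id, time) combination
ind <- ave(dff$time, dff$id, dff$time, FUN = seq_along)

# Step 2: row permutation that sorts by id first, then by that count
ord <- order(dff$id, ind)

# Step 3: reorder the rows; the original row names are kept
result <- dff[ord, ]

print(ord)
# 1 3 5 7 2 4 6 8 9 11 10 12
```

The values in ord are the row names shown in the output above, since subsetting with dff[ord, ] keeps the original row names.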
Upvotes: 5