Reputation: 1643
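This is a fairly simple problem, but I can't seem to come up with an elegant solution.
I'd like to sort a column of IDs in descending order within each group, while the groups themselves stay in ascending order. A plain descending sort gives me this: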
library(dplyr)
test <- data.frame(ID=c(19000,19001,19002,1,2))
test %>%
  arrange(desc(ID)) %>%
  mutate(ID = formatC(ID, width = 5, format = "d", flag = "0"))
     ID
1 19002
2 19001
3 19000
4 00002
5 00001
I want:
     ID
1 00002
2 00001
3 19002
4 19001
5 19000
This is for a pipeline, so more IDs will be added over time, e.g. 00003, 00004, and so on.
Here's something I came up with:
test %>%
  mutate(ID = formatC(ID, width = 5, format = "d", flag = "0")) %>%
  group_by(group = substr(ID, 1, 1)) %>%
  arrange(desc(ID)) %>%
  arrange(group) %>%
  ungroup() %>%
  select(ID)
Anything better than this?
EDIT--
library(microbenchmark)
test <- data.frame(ID=c(1:29999))
microbenchmark(
  group = test %>%
    mutate(ID = formatC(ID, width = 5, format = "d", flag = "0"),
           group = substr(ID, 1, 1)) %>%
    arrange(group, desc(ID)) %>%
    select(ID),
  mod = test %>%
    arrange(ID %/% 1000, desc(ID %% 1000)) %>%
    mutate(ID = formatC(ID, width = 5, format = "d", flag = "0"))
)
Unit: milliseconds
  expr      min        lq     mean    median       uq      max neval cld
 group 138.0480 152.21025 168.7705 160.41305 176.6362 352.4736   100   b
   mod  27.7697  29.94265  34.1312  31.92085  35.5323  88.8065   100  a
Thanks all! Looks like I have my answer.
Upvotes: 4
Views: 95
Reputation: 173793
You could sort ascending by the number of whole thousands (ID %/% 1000), then descending by the remainder modulo 1000 (ID %% 1000). That way you don't need to add a grouping column.
library(dplyr)
test <- data.frame(ID=c(19000,19001,19002,1,2))
test %>%
  arrange(ID %/% 1000, desc(ID %% 1000)) %>%
  mutate(ID = formatC(ID, width = 5, format = "d", flag = "0"))
#>      ID
#> 1 00002
#> 2 00001
#> 3 19002
#> 4 19001
#> 5 19000
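To see why this ordering works, it can help to write the two sort keys out as columns. This is only an illustrative sketch: the column names thousands and remainder are made up here and are not needed in the actual pipeline.

```r
library(dplyr)

test <- data.frame(ID = c(19000, 19001, 19002, 1, 2))

# `thousands` and `remainder` are hypothetical helper columns, added only
# to make the two sort keys visible.
keys <- test %>%
  mutate(thousands = ID %/% 1000,   # integer division: which block of 1000
         remainder = ID %% 1000) %>% # position inside that block
  arrange(thousands, desc(remainder))

keys
```

IDs 1 and 2 share thousands = 0 and IDs 19000 to 19002 share thousands = 19, so the blocks come out in ascending order while the remainder sorts each block in descending order.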
Upvotes: 6
Reputation: 2323
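Here is a small edit to your solution. arrange() accepts expressions directly, so the separate group column and the group_by()/ungroup() calls are not needed: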
library(dplyr)
test <- data.frame(ID=c(19000,19001,19002,1,2))
test %>%
  mutate(ID = formatC(ID, width = 5, format = "d", flag = "0")) %>%
  arrange(substr(ID, 1, 1), desc(ID))
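One thing that may be worth checking before picking an approach: the first character of a 5-wide padded ID corresponds to the ten-thousands digit, not the thousands, so this key and ID %/% 1000 agree on the sample data but can disagree on other IDs. The sketch below compares them on a few made-up IDs. The ids data frame and the pad() helper are assumptions for illustration, and the numeric counterpart ID %/% 10000 only matches for non-negative IDs below 100000.

```r
library(dplyr)

# Hypothetical IDs chosen to cross several thousand-boundaries.
ids <- data.frame(ID = c(1, 1500, 2500, 19000, 19001, 21000))

# Hypothetical helper wrapping the formatC() call used throughout the thread.
pad <- function(x) formatC(x, width = 5, format = "d", flag = "0")

# Key 1: first character of the zero-padded ID.
by_first_char <- ids %>%
  mutate(ID = pad(ID)) %>%
  arrange(substr(ID, 1, 1), desc(ID))

# Key 2: numeric counterpart of key 1 (ten-thousands digit).
by_ten_thousands <- ids %>%
  arrange(ID %/% 10000, desc(ID)) %>%
  mutate(ID = pad(ID))

# Key 3: whole thousands, then descending remainder.
by_thousands <- ids %>%
  arrange(ID %/% 1000, desc(ID %% 1000)) %>%
  mutate(ID = pad(ID))

identical(by_first_char$ID, by_ten_thousands$ID)  # keys 1 and 2
identical(by_first_char$ID, by_thousands$ID)      # keys 1 and 3
```

On these IDs the first comparison should be TRUE and the second FALSE, because 00001, 01500 and 02500 fall into one group under the first-character key but into three groups under the thousands key. Which grouping is right depends on how the IDs are meant to be blocked.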
Upvotes: 0