Reputation: 3
I would like to sort a column while keeping the order of another column in R. For instance, I have "year" and "region" variables in the dataset of "sales":
year region
2006 1
2006 101
2006 107
2006 11
2006 13
2007 1
2007 101
2007 107
2007 11
2007 13
What I want is to have an ascending order for the "year" variable for each region while keeping the order of "region", which is neither ascending nor descending.
year region
2006 1
2007 1
2006 101
2007 101
2006 107
2007 107
2006 11
2007 11
2006 13
2007 13
I tried to use the order
and arrange
functions but they do not give me the result I want. Anyone can help me with this?
Upvotes: 0
Views: 1067
Reputation: 2686
This should work:
library(dplyr)
dat <- data.frame(year = c(2006,2007,2006,2007,2006,2007,2006,2007,2006,2007),
region=c(1,1,101,101,107,107,11,11,13,13)
)
dat %>% arrange(year)
Upvotes: 0
Reputation: 886938
Another option is match
with arrange
library(dplyr)
df %>%
arrange(match(region, unique(region)), year)
-output
# year region
#1 2006 1
#2 2007 1
#3 2006 101
#4 2007 101
#5 2006 107
#6 2007 107
#7 2006 11
#8 2007 11
#9 2006 13
#10 2007 13
df <- structure(list(year = c(2006L, 2006L, 2006L, 2006L, 2006L, 2007L,
2007L, 2007L, 2007L, 2007L), region = c(1L, 101L, 107L, 11L,
13L, 1L, 101L, 107L, 11L, 13L)), class = "data.frame", row.names = c(NA,
-10L))
Upvotes: 0
Reputation: 173793
In base R you can do:
df[order(factor(df$region, levels = unique(df$region)), df$year),]
#> year region
#> 1 2006 1
#> 6 2007 1
#> 2 2006 101
#> 7 2007 101
#> 3 2006 107
#> 8 2007 107
#> 4 2006 11
#> 9 2007 11
#> 5 2006 13
#> 10 2007 13
Created on 2020-10-19 by the reprex package (v0.3.0)
Upvotes: 0
Reputation: 8844
Try this:
library(dplyr)
df %>% arrange(factor(region, as.character(unique(region))), year)
We fix the order of region
by redefining it as a factor
. The new factor
region will be dropped as soon as the arrange
operation is finished.
Output:
# A tibble: 10 x 2
year region
<dbl> <dbl>
1 2006 1
2 2007 1
3 2006 101
4 2007 101
5 2006 107
6 2007 107
7 2006 11
8 2007 11
9 2006 13
10 2007 13
df
looks like this
# A tibble: 10 x 2
year region
<dbl> <dbl>
1 2006 1
2 2006 101
3 2006 107
4 2006 11
5 2006 13
6 2007 1
7 2007 101
8 2007 107
9 2007 11
10 2007 13
Upvotes: 2