syoon
syoon

Reputation: 3

How to sort a column while keeping the order of another column in R?

I would like to sort a column while keeping the order of another column in R. For instance, I have "year" and "region" variables in the dataset of "sales":

year region
2006     1 
2006   101 
2006   107 
2006    11 
2006    13 
2007     1 
2007   101 
2007   107 
2007    11 
2007    13 

What I want is to have an ascending order for the "year" variable for each region while keeping the order of "region", which is neither ascending nor descending.

year  region
2006     1 
2007     1 
2006   101 
2007   101 
2006   107 
2007   107 
2006    11 
2007    11 
2006    13 
2007    13 

I tried to use the order and arrange functions but they do not give me the result I want. Anyone can help me with this?

Upvotes: 0

Views: 1067

Answers (4)

Raman Mishra
Raman Mishra

Reputation: 2686

This should work:

library(dplyr)

dat <- data.frame(year = c(2006,2007,2006,2007,2006,2007,2006,2007,2006,2007),
                  region=c(1,1,101,101,107,107,11,11,13,13)
                  )

dat %>% arrange(year)

Upvotes: 0

akrun
akrun

Reputation: 886938

Another option is match with arrange

library(dplyr)
df %>%
    arrange(match(region, unique(region)), year) 

-output

#    year region
#1  2006      1
#2  2007      1
#3  2006    101
#4  2007    101
#5  2006    107
#6  2007    107
#7  2006     11
#8  2007     11
#9  2006     13
#10 2007     13

data

df <- structure(list(year = c(2006L, 2006L, 2006L, 2006L, 2006L, 2007L, 
2007L, 2007L, 2007L, 2007L), region = c(1L, 101L, 107L, 11L, 
13L, 1L, 101L, 107L, 11L, 13L)), class = "data.frame", row.names = c(NA, 
-10L))

Upvotes: 0

Allan Cameron
Allan Cameron

Reputation: 173793

In base R you can do:

df[order(factor(df$region, levels = unique(df$region)), df$year),]
#>    year region
#> 1  2006      1
#> 6  2007      1
#> 2  2006    101
#> 7  2007    101
#> 3  2006    107
#> 8  2007    107
#> 4  2006     11
#> 9  2007     11
#> 5  2006     13
#> 10 2007     13

Created on 2020-10-19 by the reprex package (v0.3.0)

Upvotes: 0

ekoam
ekoam

Reputation: 8844

Try this:

library(dplyr)
df %>% arrange(factor(region, as.character(unique(region))), year)

We fix the order of region by redefining it as a factor. The new factor region will be dropped as soon as the arrange operation is finished.

Output:

# A tibble: 10 x 2
    year region
   <dbl>  <dbl>
 1  2006      1
 2  2007      1
 3  2006    101
 4  2007    101
 5  2006    107
 6  2007    107
 7  2006     11
 8  2007     11
 9  2006     13
10  2007     13

df looks like this

# A tibble: 10 x 2
    year region
   <dbl>  <dbl>
 1  2006      1
 2  2006    101
 3  2006    107
 4  2006     11
 5  2006     13
 6  2007      1
 7  2007    101
 8  2007    107
 9  2007     11
10  2007     13

Upvotes: 2

Related Questions