ronnydw

Reputation: 953

Why is dplyr arrange not ordering my data frame?

I have the following data frame:

> S
Source: local data frame [1,991 x 3]
Groups: exp

   exp year commval
1  alb 1995     186
2  alb 1997     232
3  alb 1998     244
4  alb 2000     251
5  alb 1996     275
6  alb 1999     290
7  alb 2001     313
8  alb 2002     358
9  alb 2003     471
10 alb 2004     608
.. ...  ...     ...

I want to filter on year == 1995 and then reorder on commval:

> S %>% filter(year == 1995) %>% arrange(commval)
Source: local data frame [130 x 3]
Groups: exp

   exp year commval
1  alb 1995     186
2  are 1995   20266
3  arg 1995   21178
4  arm 1995      60
5  aus 1995   49855
6  aut 1995   50115
7  aze 1995     102
8  bel 1995  150850
9  ben 1995     182
10 bfa 1995     231
.. ...  ...     ...

As you can see, the result is sorted on exp rather than on commval. What am I doing wrong here?

Some more info on conflicts() and sessionInfo():

> conflicts()
[1] "filter"    "body<-"    "intersect" "kronecker" "setdiff"   "setequal"  "union"    

> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dplyr_0.3.0.2  igraph_0.7.1   reshape2_1.4.1

loaded via a namespace (and not attached):
[1] assertthat_0.1  DBI_0.3.1       lazyeval_0.1.10 magrittr_1.5    parallel_3.1.2  plyr_1.8.1     
[7] Rcpp_0.11.3     stringr_0.6.2   tools_3.1.2 

Upvotes: 1

Views: 4636

Answers (2)

Hong Ooi

Reputation: 57686

The behaviour of arrange on grouped data has changed a few times across versions of dplyr. As of release 0.7 (September 2017), arrange ignores the grouping by default, thus

data %>% group_by(grp) %>% arrange(x)

will be sorted by x, without regard for grp (which makes the original question moot on current versions).

To change this, specify .by_group=TRUE in the call to arrange:

data %>% group_by(grp) %>% arrange(x, .by_group=TRUE)

This will be sorted by grp, and then by x within each grp.
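
To make the difference concrete, here is a minimal sketch on a toy data frame. The values are made up for illustration, and it assumes dplyr 0.7 or later, where the .by_group argument is available:

```r
library(dplyr)

# Hypothetical toy data, not the asker's real data set
toy <- data.frame(
  grp = c("alb", "alb", "arm", "arm"),
  x   = c(186, 50, 60, 300),
  stringsAsFactors = FALSE
)

grouped <- toy %>% group_by(grp)

# Default: the grouping is ignored and rows are sorted by x alone
plain <- grouped %>% arrange(x)
plain$x    # 50 60 186 300

# .by_group = TRUE: sort by grp first, then by x within each grp
by_grp <- grouped %>% arrange(x, .by_group = TRUE)
by_grp$grp # "alb" "alb" "arm" "arm"
by_grp$x   # 50 186 60 300
```

In both cases the result is still a grouped data frame; only the row order differs.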

Upvotes: 1

MrFlick

Reputation: 206197

From the output:

Source: local data frame [1,991 x 3]
Groups: exp

We can see that your data is grouped by exp. This means that when you arrange, the rows are sorted within the groups: by exp first, and only then by commval. If that's not what you want, do

S %>% filter(year == 1995) %>% ungroup() %>% arrange(commval)

to ungroup the data before arranging.
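
As a rough illustration, the same pipeline can be tried on a small stand-in for S. The four rows below are taken from the output in the question; the name toy_S is made up here, and the rest of the real data is not reproduced:

```r
library(dplyr)

# Small stand-in for S, grouped by exp as in the question
toy_S <- data.frame(
  exp     = c("alb", "arm", "aus", "alb"),
  year    = c(1995, 1995, 1995, 1996),
  commval = c(186, 60, 49855, 275),
  stringsAsFactors = FALSE
) %>% group_by(exp)

# Drop the grouping first, so arrange sorts on commval alone
result <- toy_S %>%
  filter(year == 1995) %>%
  ungroup() %>%
  arrange(commval)

result$exp     # "arm" "alb" "aus"
result$commval # 60 186 49855
```

Since ungroup() removes the grouping, the order no longer depends on which dplyr version handles grouped arrange in which way.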

Upvotes: 5
