Nader Hisham
Nader Hisham

Reputation: 5414

Order by group sum

take a diamond data set in ggplot as an example I want to order the data frame by the total price for each color , so if for example this is the total price for each color

H        5000
I        4000    
E        1000
J        3000

I want to sort the original data frame to display colors in the same order above

so if we have a sample from original data set like this

carat   cut        color    clarity depth   table   price
0.23    Ideal        E        SI2   61.5    55       326
0.21    Premium      E        SI1   59.8    61       326
0.23    Good         E        VS1   56.9    65       327
0.29    Premium      I        VS2   62.4    58       334
0.31    Good         J        SI2   63.3    58       335
0.24    Very Good    J        VVS2  62.8    57       336
0.24    Very Good    I        VVS1  62.3    57       336
0.26    Very Good    H         SI1  61.9    55       337

it should be sorted to be something like this

carat   cut        color    clarity depth   table   price
0.26    Very Good    H        SI1   61.9    55       337    
0.24    Very Good    I        VVS1  62.3    57       336
0.29    Premium      I        VS2   62.4    58       334
0.23    Ideal        E        SI2   61.5    55       326
0.21    Premium      E        SI1   59.8    61       326
0.23    Good         E        VS1   56.9    65       327
0.31    Good         J        SI2   63.3    58       335
0.24    Very Good    J        VVS2  62.8    57       336

because H color has the highest total price and then I and so on.

I can order the color by total price for each color , but I want to order the data set itself by color

as mentioned above .

and it will be better that we can do another ordering for by price for each record so it will be something like this

carat   cut        color    clarity depth   table   price
0.26    Very Good    H        SI1   61.9    55       337    
0.24    Very Good    I        VVS1  62.3    57       336
0.29    Premium      I        VS2   62.4    58       334
0.23    Good         E        VS1   56.9    65       327    
0.23    Ideal        E        SI2   61.5    55       326
0.21    Premium      E        SI1   59.8    61       326    
0.24    Very Good    J        VVS2  62.8    57       336    
0.31    Good         J        SI2   63.3    58       335

Upvotes: 0

Views: 79

Answers (2)

David Arenburg
David Arenburg

Reputation: 92292

With data.table I would do the following (using the data provided by you)

library(data.table)
# convert to `data.table` and assign a TotSum column (per color) by reference
setDT(df)[, TotSum := sum(price), by = color]
# sort your data by total sum (decreasing), color (in case two colors will have the same total price) and by price (decreasing)
setorder(df, -TotSum, color, -price) 
df
#    carat       cut color clarity depth table price TotSum
# 1:  0.23      Good     E     VS1  56.9    65   327    979
# 2:  0.23     Ideal     E     SI2  61.5    55   326    979
# 3:  0.21   Premium     E     SI1  59.8    61   326    979
# 4:  0.24 Very Good     J    VVS2  62.8    57   336    671
# 5:  0.31      Good     J     SI2  63.3    58   335    671
# 6:  0.24 Very Good     I    VVS1  62.3    57   336    670
# 7:  0.29   Premium     I     VS2  62.4    58   334    670
# 8:  0.26 Very Good     H     SI1  61.9    55   337    337

Here we created a new column of TotSum per color and the ordered df by reference by the total sum, color, and the price within each color.

Upvotes: 2

Bryan Hanson
Bryan Hanson

Reputation: 6213

I'm not sure I completely understand the ordering that you want, but if it is first by price and then by color, both in descending order, then you can do:

library('ggplot2')
library('dplyr')
tst <- dplyr::arrange(diamonds, desc(color), desc(price))

Which gives, for head(tst):

  carat     cut color clarity depth table price    x    y    z
1  3.01 Premium     J     SI2  60.7    59 18710 9.35 9.22 5.64
2  3.01 Premium     J     SI2  59.7    58 18710 9.41 9.32 5.59
3  2.22 Premium     J     VS1  60.0    60 18706 8.49 8.43 5.08
4  3.51 Premium     J     VS2  62.5    59 18701 9.66 9.63 6.03
5  2.43 Premium     J     VS2  62.2    57 18692 8.63 8.54 5.34
6  2.42 Premium     J     VS2  61.3    59 18615 8.61 8.58 5.27

Upvotes: 0

Related Questions