Nicholas
Nicholas

Reputation: 3737

Ranking within multiple groups - R

I have a dataframe like this:

country =  c("Austria", "Austria","Austria","Austria", "Brazil", "Brazil", "Brazil", "Brazil", "USA", "USA", "USA", "USA",
             "Austria", "Austria","Austria","Austria", "Brazil", "Brazil", "Brazil", "Brazil", "USA", "USA", "USA", "USA")
tech = c("cars", "cars","cars","cars","cars","cars","cars","cars","cars","cars","cars","cars","planes","planes","planes",
         "planes","planes","planes","planes","planes","planes","planes","planes","planes")
year =  c(2010, 2011, 2012, 2013, 2010, 2011, 2012, 2013, 2010, 2011, 2012, 2013,2010, 2011, 2012, 2013, 2010, 2011, 2012,
          2013, 2010, 2011, 2012, 2013)
value = c(42, 23, 13, 13, 646,454, 23, 234, 12, 123, 1, 23, 23, 54, 2, 77, 584, 66, 767, 6767, 23, 12, 12, 99)


df = data.frame(tech, country, year, value)

I want a new column where I have rankings for the countries by year within each tech using the values.

So, in 2013, for planes, I want to be able to say "USA" ranked 1st (had the highest value for that year and tech).

I know how to rank within one group e.g. if I didnt have tech, and I wanted to rank by country and year, I would do:

df = df %>% 
  group_by(year) %>% 
  arrange(country) %>% 
  mutate(`whatever` = order(order(value, decreasing = TRUE)))

However, I am stumped on how to add tech into the mix i.e. rank by tech for each year for the countries.

Does anyone have any guidance?

Upvotes: 1

Views: 107

Answers (1)

David
David

Reputation: 10152

Does this help you?

library(tibble)
library(dplyr)

country =  c("Austria", "Austria","Austria","Austria", "Brazil", "Brazil", "Brazil", "Brazil", "USA", "USA", "USA", "USA",
             "Austria", "Austria","Austria","Austria", "Brazil", "Brazil", "Brazil", "Brazil", "USA", "USA", "USA", "USA")
tech = c("cars", "cars","cars","cars","cars","cars","cars","cars","cars","cars","cars","cars","planes","planes","planes",
         "planes","planes","planes","planes","planes","planes","planes","planes","planes")
year =  c(2010, 2011, 2012, 2013, 2010, 2011, 2012, 2013, 2010, 2011, 2012, 2013,2010, 2011, 2012, 2013, 2010, 2011, 2012,
          2013, 2010, 2011, 2012, 2013)
value = c(42, 23, 13, 13, 646,454, 23, 234, 12, 123, 1, 23, 23, 54, 2, 77, 584, 66, 767, 6767, 23, 12, 12, 99)


df = tibble(tech, country, year, value)
df
#> # A tibble: 24 x 4
#>    tech  country  year value
#>    <chr> <chr>   <dbl> <dbl>
#>  1 cars  Austria  2010    42
#>  2 cars  Austria  2011    23
#>  3 cars  Austria  2012    13
#>  4 cars  Austria  2013    13
#>  5 cars  Brazil   2010   646
#>  6 cars  Brazil   2011   454
#>  7 cars  Brazil   2012    23
#>  8 cars  Brazil   2013   234
#>  9 cars  USA      2010    12
#> 10 cars  USA      2011   123
#> # … with 14 more rows

df %>% 
  group_by(tech, year) %>% 
  mutate(rank = rank(value)) %>% 
  arrange(tech, year, rank)
#> # A tibble: 24 x 5
#> # Groups:   tech, year [8]
#>    tech  country  year value  rank
#>    <chr> <chr>   <dbl> <dbl> <dbl>
#>  1 cars  USA      2010    12     1
#>  2 cars  Austria  2010    42     2
#>  3 cars  Brazil   2010   646     3
#>  4 cars  Austria  2011    23     1
#>  5 cars  USA      2011   123     2
#>  6 cars  Brazil   2011   454     3
#>  7 cars  USA      2012     1     1
#>  8 cars  Austria  2012    13     2
#>  9 cars  Brazil   2012    23     3
#> 10 cars  Austria  2013    13     1
#> # … with 14 more rows

Created on 2020-03-20 by the reprex package (v0.3.0)

Upvotes: 1

Related Questions