Reputation: 11

How to find the maximum value of every n rows in every column?

My dataset m:

date household energy
2012    a       0.2
2013    a       0.1
2014    a       0
2015    a       0.4
2012    b       0.4 
2013    b       0.7
2014    b       0.3
2015    b       0.2

I want to keep, for every block of 2 consecutive rows, the row with the maximum energy. The result should look like this:

date household energy
2012    a       0.2
2015    a       0.4
2013    b       0.7
2014    b       0.3

Upvotes: 1

Answers (4)

akrun

Reputation: 886938

An option that uses gl() to build the grouping index:

library(dplyr)
df %>%
    group_by(grp = as.integer(gl(n(), 2, n()))) %>%
    slice_max(order_by = energy, n = 1) %>%
    ungroup %>%
    select(-grp)

-output

# A tibble: 4 x 3
   date household energy
  <int> <chr>      <dbl>
1  2012 a            0.2
2  2015 a            0.4
3  2013 b            0.7
4  2014 b            0.3
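To see what the grouping index looks like, the gl() call can be run on its own. This is a minimal base R sketch, assuming the 8-row data from the question; the variable name grp is only illustrative.

```r
# gl(n, k, length) builds a factor with n levels, each repeated k times,
# cut to `length` elements; as.integer() returns the level codes.
grp <- as.integer(gl(8, 2, 8))
grp
# [1] 1 1 2 2 3 3 4 4
```

Each pair of consecutive rows shares one index, which is what group_by() then uses to pick the maximum per pair.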

Alternatively, arrange the rows and keep the first row of each pair:

df %>% 
   arrange(household, as.integer(gl(n(), 2, n())), desc(energy)) %>% 
   filter(!duplicated(gl(n(), 2, n())))

-output

  date household energy
1 2012         a    0.2
2 2015         a    0.4
3 2013         b    0.7
4 2014         b    0.3

data

df <- structure(list(date = c(2012L, 2013L, 2014L, 2015L, 2012L, 2013L, 
2014L, 2015L), household = c("a", "a", "a", "a", "b", "b", "b", 
"b"), energy = c(0.2, 0.1, 0, 0.4, 0.4, 0.7, 0.3, 0.2)), row.names = c(NA, 
-8L), class = "data.frame")

Upvotes: 0

Anoushiravan R

Reputation: 21908

This can also be used:

library(dplyr)

df %>%
  group_by(grp = rep(1:(n()/2), each = 2)) %>%
  slice_max(order_by = energy) %>%
  ungroup() %>%
  select(-grp)

# A tibble: 4 x 3
   date household energy
  <int> <chr>      <dbl>
1  2012 a            0.2
2  2015 a            0.4
3  2013 b            0.7
4  2014 b            0.3
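Note that rep(1:(n()/2), each = 2) assumes an even number of rows. A possible variation, sketched here in base R, builds the block index with ceiling() so that a leftover row simply forms its own block. The helper name block_id is hypothetical and not part of the answer above.

```r
# Hypothetical helper: block index for rows 1..n in chunks of `size`
block_id <- function(n, size = 2) ceiling(seq_len(n) / size)

block_id(8)  # 1 1 2 2 3 3 4 4
block_id(5)  # 1 1 2 2 3  (the last block holds the single leftover row)
```

Inside the pipeline this would correspond to group_by(grp = ceiling(row_number() / 2)).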

Upvotes: 2

Ronak Shah

Reputation: 388807

You can create an additional grouping column for each household and select the max energy row in them.

library(dplyr)

df %>%
  group_by(household, group = ceiling(row_number()/2)) %>%
  slice(which.max(energy)) %>%
  ungroup %>%
  select(-group)

#   date household energy
#  <int> <chr>      <dbl>
#1  2012 a            0.2
#2  2015 a            0.4
#3  2013 b            0.7
#4  2014 b            0.3
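The same idea (a pair index within each household, then the max-energy row per block) can also be sketched without dplyr. This is only an illustrative base R version under the assumption that rows are already ordered within each household; the names pair, idx, keep and res are made up for the example.

```r
df <- data.frame(
  date = c(2012L, 2013L, 2014L, 2015L, 2012L, 2013L, 2014L, 2015L),
  household = rep(c("a", "b"), each = 4),
  energy = c(0.2, 0.1, 0, 0.4, 0.4, 0.7, 0.3, 0.2)
)

# Pair index within each household: rows 1-2 -> 1, rows 3-4 -> 2
pair <- ave(seq_len(nrow(df)), df$household,
            FUN = function(i) ceiling(seq_along(i) / 2))

# Row numbers belonging to each household/pair block
idx <- split(seq_len(nrow(df)), list(df$household, pair), drop = TRUE)

# Within each block, keep the row number with the largest energy
keep <- vapply(idx, function(i) i[which.max(df$energy[i])], integer(1))

# Restore the original row order
res <- df[sort(unname(keep)), ]
res
```

As with slice(which.max(energy)), ties within a block resolve to the first row of the block.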

data

It is easier to help if you provide data in a reproducible format:

df <- structure(list(date = c(2012L, 2013L, 2014L, 2015L, 2012L, 2013L, 
2014L, 2015L), household = c("a", "a", "a", "a", "b", "b", "b", 
"b"), energy = c(0.2, 0.1, 0, 0.4, 0.4, 0.7, 0.3, 0.2)), 
row.names = c(NA, -8L), class = "data.frame")

Upvotes: 3

Karthik S

Reputation: 11584

Does this work?

library(dplyr)
df %>% group_by(household) %>% filter(dense_rank(desc(energy)) < 3)
# A tibble: 4 x 3
# Groups:   household [2]
   date household energy
  <dbl> <chr>      <dbl>
1  2012 a            0.2
2  2015 a            0.4
3  2012 b            0.4
4  2013 b            0.7

Upvotes: 1
