Reputation: 11
My dataset:
date household energy
2012 a 0.2
2013 a 0.1
2014 a 0
2015 a 0.4
2012 b 0.4
2013 b 0.7
2014 b 0.3
2015 b 0.2
For every 2 consecutive rows, I want to keep the row with the maximum energy.
The result should look like this:
date household energy
2012 a 0.2
2015 a 0.4
2013 b 0.7
2014 b 0.3
Upvotes: 1
Views: 317
Reputation: 886938
An option with gl(), which builds a grouping index for every 2 consecutive rows:
library(dplyr)
df %>%
group_by(grp = as.integer(gl(n(), 2, n()))) %>%
slice_max(order_by = energy, n = 1) %>%
ungroup %>%
select(-grp)
-output
# A tibble: 4 x 3
date household energy
<int> <chr> <dbl>
1 2012 a 0.2
2 2015 a 0.4
3 2013 b 0.7
4 2014 b 0.3
Or we may also use arrange and keep the first row of each pair:
df %>%
arrange(household, as.integer(gl(n(), 2, n())), desc(energy)) %>%
filter(!duplicated(gl(n(), 2, n())))
date household energy
1 2012 a 0.2
2 2015 a 0.4
3 2013 b 0.7
4 2014 b 0.3
data
df <- structure(list(date = c(2012L, 2013L, 2014L, 2015L, 2012L, 2013L,
2014L, 2015L), household = c("a", "a", "a", "a", "b", "b", "b",
"b"), energy = c(0.2, 0.1, 0, 0.4, 0.4, 0.7, 0.3, 0.2)), row.names = c(NA,
-8L), class = "data.frame")
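To see why gl(n(), 2, n()) works as a pairing key, it may help to look at what it returns for a fixed size. This is a minimal sketch in base R, assuming 8 rows as in the example data; the name pair_id is only illustrative and does not appear in the answer above.

```r
# gl(n, k, length) builds a factor with n levels, each repeated k times,
# cut to `length` elements. With n = 8, k = 2, length = 8 only the first
# four levels are actually used.
pair_id <- as.integer(gl(8, 2, 8))
print(pair_id)
# [1] 1 1 2 2 3 3 4 4

# Rows 1-2 share id 1, rows 3-4 share id 2, and so on, so grouping by
# this vector groups every 2 consecutive rows.
```

Inside a dplyr pipeline, n() plays the role of the hard-coded 8.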
Upvotes: 0
Reputation: 21908
This can also be used (it assumes an even number of rows):
library(dplyr)
df %>%
group_by(grp = rep(1:(n()/2), each = 2)) %>%
slice_max(order_by = energy) %>%
ungroup() %>%
select(-grp)
# A tibble: 4 x 3
date household energy
<int> <chr> <dbl>
1 2012 a 0.2
2 2015 a 0.4
3 2013 b 0.7
4 2014 b 0.3
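The grouping vector rep(1:(n()/2), each = 2) has the right length only when the number of rows is even. If an odd number of rows is possible, one way to build the index is sketched below in base R; n = 7 is a hypothetical row count chosen for illustration, and grp is an illustrative name.

```r
# Hypothetical odd number of rows (assumption for illustration only).
n <- 7

# ceiling(n / 2) pair ids, each repeated twice, cut to exactly n elements,
# so the last row ends up in a group of its own.
grp <- rep(seq_len(ceiling(n / 2)), each = 2, length.out = n)
print(grp)
# [1] 1 1 2 2 3 3 4
```

In the pipeline above this would correspond to rep(seq_len(ceiling(n() / 2)), each = 2, length.out = n()).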
Upvotes: 2
Reputation: 388807
You can create an additional grouping column within each household
and select the row with the max energy in each group.
library(dplyr)
df %>%
group_by(household, group = ceiling(row_number()/2)) %>%
slice(which.max(energy)) %>%
ungroup %>%
select(-group)
# date household energy
# <int> <chr> <dbl>
#1 2012 a 0.2
#2 2015 a 0.4
#3 2013 b 0.7
#4 2014 b 0.3
data
It is easier to help if you provide data in a reproducible format -
df <- structure(list(date = c(2012L, 2013L, 2014L, 2015L, 2012L, 2013L,
2014L, 2015L), household = c("a", "a", "a", "a", "b", "b", "b",
"b"), energy = c(0.2, 0.1, 0, 0.4, 0.4, 0.7, 0.3, 0.2)),
row.names = c(NA, -8L), class = "data.frame")
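For comparison, the same idea (number the rows within each household, pair them up, keep the row with the highest energy in each pair) can be sketched without dplyr. This is only an illustrative base R version, not part of the answer above; the names pair, key, keep and res are made up for the example.

```r
df <- data.frame(
  date      = c(2012L, 2013L, 2014L, 2015L, 2012L, 2013L, 2014L, 2015L),
  household = rep(c("a", "b"), each = 4),
  energy    = c(0.2, 0.1, 0, 0.4, 0.4, 0.7, 0.3, 0.2)
)

# Row position within each household (1, 2, 3, 4, ...), halved and
# rounded up to get a pair id (1, 1, 2, 2, ...).
pair <- ceiling(ave(seq_len(nrow(df)), df$household, FUN = seq_along) / 2)

# One key per household/pair combination.
key <- interaction(df$household, pair, drop = TRUE)

# For each key, keep the index of the row with the largest energy.
keep <- vapply(
  split(seq_len(nrow(df)), key),
  function(i) i[which.max(df$energy[i])],
  integer(1)
)

# Sort the indices so the result keeps the original row order.
res <- df[sort(unname(keep)), ]
print(res)
```

Like which.max in the dplyr version, this keeps only the first row when a pair has tied energy values.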
Upvotes: 3
Reputation: 11584
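Does this work? Note that it keeps the rows with the two highest energy values per household rather than the maximum of every 2 rows, so the result for household b differs from the expected output: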
library(dplyr)
df %>% group_by(household) %>% filter(dense_rank(desc(energy)) < 3)
# A tibble: 4 x 3
# Groups: household [2]
date household energy
<dbl> <chr> <dbl>
1 2012 a 0.2
2 2015 a 0.4
3 2012 b 0.4
4 2013 b 0.7
Upvotes: 1