Reputation: 86
I would like to know how to remove the rows with the 3 latest dates. I get my data through an API, so it's dynamic, and I can't just filter rows out of my data frame by specific dates because the latest dates keep changing.
My Data looks like this:
date ptppAll mAverage
<date> <dbl> <dbl>
1 2020-03-01 1.71 NA
2 2020-03-02 7.82 NA
3 2020-03-03 9.81 NA
4 2020-03-04 1.71 4.23
5 2020-03-05 3.42 4.72
6 2020-03-06 0 4.68
7 2020-03-07 5.13 6.19
8 2020-03-08 5.13 6.53
9 2020-03-09 7.54 6.53
10 2020-03-10 20.4 8.04
In the above data, assuming that 2020-03-10 is the max date, I'd like to remove it along with the rows containing the dates 2020-03-09 and 2020-03-08. So for my example, the result would look like this:
date ptppAll mAverage
<date> <dbl> <dbl>
1 2020-03-01 1.71 NA
2 2020-03-02 7.82 NA
3 2020-03-03 9.81 NA
4 2020-03-04 1.71 4.23
5 2020-03-05 3.42 4.72
6 2020-03-06 0 4.68
7 2020-03-07 5.13 6.19
Upvotes: 0
Views: 83
Reputation: 886948
Using base R
head(data[order(data$date),], -3)
data <- structure(list(date = c("2020-03-01", "2020-03-02", "2020-03-03",
"2020-03-04", "2020-03-05", "2020-03-06", "2020-03-07", "2020-03-08",
"2020-03-09", "2020-03-10"),
ptppAll = c(1.71, 7.82, 9.81, 1.71, 3.42, 0, 5.13, 5.13, 7.54, 20.4),
mAverage = c(NA, NA, NA, 4.23, 4.72, 4.68, 6.19, 6.53, 6.53, 8.04)), class = "data.frame",
row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"))
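One caveat with `head(..., -3)`: it drops the last three rows, which matches the three latest dates only when every date appears exactly once. If the API can return several rows per date, a sketch along these lines may be closer to what the question asks for. The data frame `df` and its values are hypothetical and only illustrate the duplicate case:

```r
# Hypothetical sample data: the latest date (2020-03-10) appears twice
df <- data.frame(
  date = as.Date(c("2020-03-07", "2020-03-08", "2020-03-09",
                   "2020-03-10", "2020-03-10")),
  ptppAll = c(5.13, 5.13, 7.54, 20.4, 1.0)
)

# The three most recent distinct dates
latest <- tail(sort(unique(df$date)), 3)

# Keep only the rows whose date is not among them
result <- df[!df$date %in% latest, ]
result
```

Here both rows for 2020-03-10 are removed, whereas `head(df, -3)` would leave the 2020-03-08 row in place.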
Upvotes: 0
Reputation: 5456
You can simply use filter:
library(tidyverse)
data <- structure(list(date = c("2020-03-01", "2020-03-02", "2020-03-03",
"2020-03-04", "2020-03-05", "2020-03-06", "2020-03-07", "2020-03-08",
"2020-03-09", "2020-03-10"),
ptppAll = c(1.71, 7.82, 9.81, 1.71, 3.42, 0, 5.13, 5.13, 7.54, 20.4),
mAverage = c(NA, NA, NA, 4.23, 4.72, 4.68, 6.19, 6.53, 6.53, 8.04)), class = "data.frame",
row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"))
data %>%
filter(date %in% head(sort(unique(.$date)), -3))
# date ptppAll mAverage
# 1 2020-03-01 1.71 NA
# 2 2020-03-02 7.82 NA
# 3 2020-03-03 9.81 NA
# 4 2020-03-04 1.71 4.23
# 5 2020-03-05 3.42 4.72
# 6 2020-03-06 0.00 4.68
# 7 2020-03-07 5.13 6.19
Upvotes: 0
Reputation: 41210
Try:
library(dplyr)
data <- structure(list(date = c("2020-03-01", "2020-03-02", "2020-03-03",
"2020-03-04", "2020-03-05", "2020-03-06", "2020-03-07", "2020-03-08",
"2020-03-09", "2020-03-10"),
ptppAll = c(1.71, 7.82, 9.81, 1.71, 3.42, 0, 5.13, 5.13, 7.54, 20.4),
mAverage = c(NA, NA, NA, 4.23, 4.72, 4.68, 6.19, 6.53, 6.53, 8.04)), class = "data.frame",
row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"))
data %>% arrange(date) %>% head(-3)
#> date ptppAll mAverage
#> 1 2020-03-01 1.71 NA
#> 2 2020-03-02 7.82 NA
#> 3 2020-03-03 9.81 NA
#> 4 2020-03-04 1.71 4.23
#> 5 2020-03-05 3.42 4.72
#> 6 2020-03-06 0.00 4.68
#> 7 2020-03-07 5.13 6.19
Created on 2020-08-20 by the reprex package (v0.3.0)
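Note that the sample data here stores `date` as character strings. Sorting them works because ISO `YYYY-MM-DD` strings happen to sort in chronological order, but that would not hold for other formats. A minimal base R sketch, assuming the API returns ISO date strings in arbitrary order (the `raw` data frame below is made up for illustration), converts the column to `Date` first:

```r
# Hypothetical API result: ISO date strings, not in chronological order
raw <- data.frame(
  date = c("2020-03-10", "2020-03-02", "2020-03-09", "2020-03-01", "2020-03-08"),
  ptppAll = c(20.4, 7.82, 7.54, 1.71, 5.13),
  stringsAsFactors = FALSE
)

# Convert to Date so ordering is chronological regardless of string format
raw$date <- as.Date(raw$date, format = "%Y-%m-%d")

# Sort by date, then drop the last three rows (one row per date assumed)
sorted <- raw[order(raw$date), ]
trimmed <- head(sorted, -3)
trimmed
```

If the API uses a different date format, the `format` argument would need to be adjusted accordingly.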
Upvotes: 1