Reputation: 35
I have a table with 5 columns.
Country Flow %Rec Date_Received(with timestamp) Date
DE DEF 10 2020-03-03 05:05:54 2020-03-03
DE DEF 15 2020-03-03 07:25:24 2020-03-03
DE DEF 20 2020-03-03 04:05:54 2020-03-02
DE ABC 40 2020-03-02 03:05:54 2020-03-02
DE ABC 50 2020-03-02 07:05:54 2020-03-02
DE ABC 20 2020-03-01 06:05:54 2020-03-01
I want to find the % received of the last date and first occurance of date_rec. Output required:
Country Flow %Rec Date_Received(with timestamp) Date
DE DEF 10 2020-03-03 05:05:54 2020-03-03
DE ABC 40 2020-03-02 03:05:54 2020-03-02
Upvotes: 1
Views: 49
Reputation: 887048
In R
, we can do slice
after grouping by 'Country', 'Flow'
library(dplyr)
df %>%
group_by(Country, Flow) %>%
slice(1)
# A tibble: 2 x 5
# Groups: Country, Flow [2]
# Country Flow `%Rec` `Date_Received(with timestamp)` Date
# <chr> <chr> <int> <chr> <chr>
#1 DE ABC 40 2020-03-02 03:05:54 2020-03-02
#2 DE DEF 10 2020-03-03 05:05:54 2020-03-03
The above assumes that the 'Date' are ordered
(in the OP's example it is already ordered). If not, then convert to Date
class and use which.max
df %>%
group_by(Country, Flow) %>%
slice(which.max(as.Date(Date)))
# A tibble: 2 x 5
# Groups: Country, Flow [2]
# Country Flow `%Rec` `Date_Received(with timestamp)` Date
# <chr> <chr> <int> <chr> <chr>
#1 DE ABC 40 2020-03-02 03:05:54 2020-03-02
#2 DE DEF 10 2020-03-03 05:05:54 2020-03-03
df <- structure(list(Country = c("DE", "DE", "DE", "DE", "DE", "DE"
), Flow = c("DEF", "DEF", "DEF", "ABC", "ABC", "ABC"), `%Rec` = c(10L,
15L, 20L, 40L, 50L, 20L), `Date_Received(with timestamp)` = c("2020-03-03 05:05:54",
"2020-03-03 07:25:24", "2020-03-03 04:05:54", "2020-03-02 03:05:54",
"2020-03-02 07:05:54", "2020-03-01 06:05:54"), Date = c("2020-03-03",
"2020-03-03", "2020-03-02", "2020-03-02", "2020-03-02", "2020-03-01"
)), class = "data.frame", row.names = c(NA, -6L))
Upvotes: 1
Reputation: 1269633
In SQL, you would use row_number()
or a correlated subquery:
select t.*
from t
where t.date_received = (select max(t2.date_received)
from t t2
where t2.country = t.country and t2.flow = t.flow
);
Upvotes: 0