Reputation: 339
I have a dataframe data
Date Marks
21-01-2019 18
22-01-2019 12
23-01-2019 14
24-01-2019 20
25-01-2019 16
19-02-2019 12
20-02-2019 11
21-02-2019 18
22-02-2019 15
23-02-2019 20
24-02-2019 14
I wanted to group the data by month and that particular year to get the minimum marks. My output looks as shown below:
Date Marks
22-01-2019 12
20-02-2019 11
Upvotes: 2
Views: 2020
Reputation: 40181
To get the minimum per months and years, you can do (it involves dplyr
):
df %>%
group_by(Date = format(as.Date(Date, format = "%d-%m-%Y"), "%m-%Y")) %>%
summarise_all(min)
Date Marks
<chr> <int>
1 01-2019 12
2 02-2019 11
And the same with base R
:
aggregate(Marks ~ cbind(Date = format(as.Date(Date, format = "%d-%m-%Y"), "%m-%Y")),
FUN = min,
data = df)
If you want to return also the corresponding date:
df %>%
group_by(month_year = format(as.Date(Date, format = "%d-%m-%Y"), "%m-%Y")) %>%
slice(which.min(Marks)) %>%
ungroup() %>%
select(-month_year)
Date Marks
<chr> <int>
1 22-01-2019 12
2 20-02-2019 11
And the same with base R
:
df[df$Marks == with(df, ave(Marks, format(as.Date(Date, format = "%d-%m-%Y"), "%m-%Y"),
FUN = function(x) x[which.min(x)])), ]
Upvotes: 2
Reputation: 5952
Since you know the fixed date format, you can just cut off the day part and group by that. You can group_by in order to add the minimum marks for every row, grouping by the year/month, and then filter out all rows which were not equal to the minimum.
table %>%
mutate(ym = substring(Date, 4)) %>%
group_by(ym) %>%
mutate(min_marks = min(Marks)) %>%
ungroup() %>%
filter(Marks == min_marks) %>%
select(-ym, -min_marks)
Edited to account for finding the day(s) with the minimum marks. Note that this may produce multiple rows if there are multiple dates with the same minimum marks.
Upvotes: 3
Reputation: 1364
data.table + lubridate example:
Data
dt = data.table(
Date = seq(as.Date('2001-01-27'), as.Date('2001-02-03'), 1),
Marks = rnorm(8)
)
> dt
Date Marks
1: 2001-01-27 -1.43603728
2: 2001-01-28 -1.02685408
3: 2001-01-29 0.37307426
4: 2001-01-30 -0.79033901
5: 2001-01-31 0.05463855
6: 2001-02-01 1.62495320
7: 2001-02-02 -0.44335638
8: 2001-02-03 -0.66227100
Code
dtsum = dt[, .(Date = Date[which.min(Marks)], MinMark = min(Marks)), .(Month = month(Date))][, 2:3]
> dtsum
Date MinMark
1: 2001-01-27 -1.436037
2: 2001-02-03 -0.662271
Upvotes: 0