Reputation: 79
I'm working with a time series in r which contains daily observations from Nordic Stock Exchanges. I only want to keep the last date of the month for each Company (column).
My data frame, OSE
looks something like this (but With several thousand rows and columns):
Date Statoil DNB
1987-09.16 0,21 1,2
1987-09.17 0,22 1,3
1987-09.18 0,15 1,1
1987-09.21 0,16 1,5
1987-09.22 0,27 1,7
1987-09.23 0,28 1,9
1987-09.24 0,30 1,6
1987-09.25 0,32 1,7
1987-09.28 0,29 1,8
1987-09.29 0,33 2,1
1987-09.30 0,34 1,9
1987-10.01 0,37 1,8
1987-10.02 0,38 2,1
1987-10.05 0,34 2,3
1987-10.06 0,28 2,4
1987-10.07 0,27 2,1
1987-10.08 0,25 2,2
1987-10.09 0,21 2,1
1987-10.12 0,31 1,9
1987-10.13 0,31 2,1
1987-10.14 0,32 2,3
1987-10.15 0,37 2,5
1987-10.16 0,41 2,6
1987-10.19 0,51 2,8
1987-10.20 0,62 3,1
1987-10.21 0,59 3,1
1987-10.22 0,58 3,5
1987-10.23 0,61 3,6
1987-10.26 0,62 3,7
1987-10.27 0,63 3,9
1987-10.28 0,57 4,0
1987-10.29 0,54 4,1
1987-10.30 0,64 4,1
1987-11.02 0,66 4,2
1987-11-03 0,67 3,9
I would want it to look like this:
Date Statoil DNB
1987-09.30 0,34 1,9
1987-10.30 0,64 4,1
Do you guys have any suggestions for deleting the redundant rows, i.e. the rows that are not the last date of the month?
All efforts to help are much appreciated!
Upvotes: 1
Views: 1218
Reputation: 490
You may want to consider converting the dataset to an xts format and then using the to.period() command, which works quickly and easily. For example, let's create a fake daily time series and then subset the last value at each month's end:
library(xts)
set.seed(78)
date.a <-seq(as.Date("2000/10/1"), as.Date("2000/12/31"), "days")
dat <-xts(rnorm(length(date.a)), date.a)
dat.month.end <-to.period(dat, period='months', indexAt='lastof', OHLC=F)
dat.month.end
[,1]
2000-10-31 1.00117650
2000-11-30 -1.15090619
2000-12-31 0.04944459
Upvotes: 0
Reputation: 10671
We could do the same strategy in the tidyverse
, grouping on Month and Year:
library(tidyverse)
library(lubridate)
tib$Date <- ymd(tib$Date) # parse .$Date to date class
tib %>% arrange(desc(Date)) %>% # order dates last to first
group_by(month(Date), year(Date)) %>%
slice(1)
# A tibble: 3 x 5
# Groups: month(Date), year(Date) [3]
Date Statoil DNB `month(Date)` `year(Date)`
<date> <fctr> <fctr> <dbl> <dbl>
1 1987-09-30 0,34 1,9 9 1987
2 1987-10-30 0,64 4,1 10 1987
3 1987-11-03 0,67 3,9 11 1987
Upvotes: 3
Reputation: 887391
In the example, there is no Company
column, so it seems like we need to group by 'month' and get the row of max
Date
library(data.table)
setDT(df1)[, Date := as.IDate(Date, "%Y-%m.%d")]
df1[df1[, .I[which.max(Date)] ,
.(month = month(Date), year = year(Date))]$V1]
# Date Statoil DNB
#1: 1987-09-30 0,34 1,9
#2: 1987-10-30 0,64 4,1
#3: 1987-11-03 0,67 3,9
Upvotes: 3