Reputation: 79
I have a data frame, Returns, which looks something like this:
Date Company LstPrice r
1987-02-27 NOVO NORDISK 'B' 2.29 0.031531532
1987-03-31 NOVO NORDISK 'B' 2.33 0.017467249
1987-04-30 NOVO NORDISK 'B' 2.25 -0.034334764
1987-05-29 NOVO NORDISK 'B' 2.22 -0.013333333
1987-06-30 NOVO NORDISK 'B' 2.47 0.112612613
1987-07-31 NOVO NORDISK 'B' 2.46 -0.004048583
1987-08-31 NOVO NORDISK 'B' 1.98 -0.195121951
1987-09-30 NOVO NORDISK 'B' 1.90 -0.040404040
1987-02-27 DANSKE BANK 24.29 -0.130637079
1987-03-31 DANSKE BANK 24.97 0.027995060
1987-04-30 DANSKE BANK 25.43 0.018422107
1987-05-29 DANSKE BANK 26.19 0.029885961
1987-06-30 DANSKE BANK 26.50 0.011836579
1987-07-31 DANSKE BANK 26.57 0.002641509
1987-08-31 DANSKE BANK 28.55 0.074520135
1987-09-30 DANSKE BANK 26.25 -0.080560420
I would like to create new data frames for different groups of months. For example, I want one data frame with the observations for the first three months, another for the next three months, and so on. They would look something like this:
Data frame FirstThreeMonths:
Date Company LstPrice r
1987-02-27 NOVO NORDISK 'B' 2.29 0.031531532
1987-03-31 NOVO NORDISK 'B' 2.33 0.017467249
1987-04-30 NOVO NORDISK 'B' 2.25 -0.034334764
1987-02-27 DANSKE BANK 24.29 -0.130637079
1987-03-31 DANSKE BANK 24.97 0.027995060
1987-04-30 DANSKE BANK 25.43 0.018422107
Data frame NextThreeMonths:
Date Company LstPrice r
1987-05-29 NOVO NORDISK 'B' 2.22 -0.013333333
1987-06-30 NOVO NORDISK 'B' 2.47 0.112612613
1987-07-31 NOVO NORDISK 'B' 2.46 -0.004048583
1987-05-29 DANSKE BANK 26.19 0.029885961
1987-06-30 DANSKE BANK 26.50 0.011836579
1987-07-31 DANSKE BANK 26.57 0.002641509
...and so on. (I have data for approx. 2200 companies for the last 30 years, so I will have to create a lot of data frames.)
I have tried several different ways, using both if and for loops and the subset command, but so far I can't get any of them to work. I also searched for similar questions, but couldn't find a solution that works for my exact problem. Is there an easy way to do something like this?
Every effort to help is much appreciated!
Upvotes: 1
Views: 64
Reputation: 4187
You need to make a split vector first. For example:
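# format(..., '%m') needs a Date column; as.Date() is a no-op if it already is one
df$Date <- as.Date(df$Date)
# Bin the month number (1-12) into calendar quarters: (0,3], (3,6], (6,9], (9,12]
splitter <- cut(as.integer(format(df$Date,'%m')),
                breaks = c(0,3,6,9,12),
                labels = c('First three','Second three','Third three','Fourth three'))
# split() returns a named list with one data frame per quarter
dflist <- split(df, splitter)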
The result:
> dflist
$`First three`
Date Company LstPrice r
1 1987-02-27 NOVO NORDISK B 2.29 0.03153153
2 1987-03-31 NOVO NORDISK B 2.33 0.01746725
9 1987-02-27 DANSKE BANK 24.29 -0.13063708
10 1987-03-31 DANSKE BANK 24.97 0.02799506
$`Second three`
Date Company LstPrice r
3 1987-04-30 NOVO NORDISK B 2.25 -0.03433476
4 1987-05-29 NOVO NORDISK B 2.22 -0.01333333
5 1987-06-30 NOVO NORDISK B 2.47 0.11261261
11 1987-04-30 DANSKE BANK 25.43 0.01842211
12 1987-05-29 DANSKE BANK 26.19 0.02988596
13 1987-06-30 DANSKE BANK 26.50 0.01183658
$`Third three`
Date Company LstPrice r
6 1987-07-31 NOVO NORDISK B 2.46 -0.004048583
7 1987-08-31 NOVO NORDISK B 1.98 -0.195121951
8 1987-09-30 NOVO NORDISK B 1.90 -0.040404040
14 1987-07-31 DANSKE BANK 26.57 0.002641509
15 1987-08-31 DANSKE BANK 28.55 0.074520135
16 1987-09-30 DANSKE BANK 26.25 -0.080560420
$`Fourth three`
[1] Date Company LstPrice r
<0 rows> (or 0-length row.names)
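With roughly 2200 companies and 30 years of data, it is usually easier to keep the pieces in this list than to create a separately named data frame for each period. A minimal sketch of how the list could be used, assuming a small stand-in `df` built from a few rows of the question's data (the names `first_three` and `mean_r` are only illustrative):

```r
# Small stand-in for the question's data (values copied from the post)
df <- data.frame(
  Date = as.Date(c("1987-02-27", "1987-05-29", "1987-08-31",
                   "1987-02-27", "1987-05-29", "1987-08-31")),
  Company = rep(c("NOVO NORDISK B", "DANSKE BANK"), each = 3),
  LstPrice = c(2.29, 2.22, 1.98, 24.29, 26.19, 28.55),
  r = c(0.031531532, -0.013333333, -0.195121951,
        -0.130637079, 0.029885961, 0.074520135)
)

# Same split as above: month number binned into calendar quarters
splitter <- cut(as.integer(format(df$Date, "%m")),
                breaks = c(0, 3, 6, 9, 12),
                labels = c("First three", "Second three",
                           "Third three", "Fourth three"))
dflist <- split(df, splitter)

# Pull out a single period by name
first_three <- dflist[["First three"]]

# Apply the same calculation to every period, here the mean return;
# periods without observations give NaN
mean_r <- sapply(dflist, function(d) mean(d$r))
```

Each element of `dflist` is an ordinary data frame, so anything that works on `df` also works on `dflist[["First three"]]`.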
split() keeps factor levels that have no observations, which is why the fourth element is empty. Removing empty data frames from that list can be done like this:
dflist <- split(df, splitter)
dflist <- dflist[sapply(dflist, nrow) > 0]
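Note that the split above uses only the month number, so it groups by calendar quarter (April lands in the second group) and pools the same quarter across all years. If the goal is consecutive three-month windows counted from the first month in the data, as in the question's expected output, one possible sketch is to put year and month on a single continuous scale first. The variable names (`month_index`, `window`, `label`) and the labelling of each window by its starting month are my own assumptions, not something from the question:

```r
# Stand-in data: the dates and prices from the question, two companies
dates <- as.Date(c("1987-02-27", "1987-03-31", "1987-04-30", "1987-05-29",
                   "1987-06-30", "1987-07-31", "1987-08-31", "1987-09-30"))
df <- data.frame(
  Date = rep(dates, times = 2),
  Company = rep(c("NOVO NORDISK 'B'", "DANSKE BANK"), each = length(dates)),
  LstPrice = c(2.29, 2.33, 2.25, 2.22, 2.47, 2.46, 1.98, 1.90,
               24.29, 24.97, 25.43, 26.19, 26.50, 26.57, 28.55, 26.25)
)

# Count months on one continuous scale so that different years stay apart
month_index <- as.integer(format(df$Date, "%Y")) * 12L +
  as.integer(format(df$Date, "%m"))

# Window 0 holds the first three months in the data, window 1 the next three, ...
window <- (month_index - min(month_index)) %/% 3L

# Label each window by its starting year and month, e.g. "1987-02"
start_index <- min(month_index) + window * 3L
label <- sprintf("%d-%02d",
                 (start_index - 1L) %/% 12L,
                 (start_index - 1L) %% 12L + 1L)

# One data frame per three-month window; only observed windows appear
dflist <- split(df, label)
```

Because the labels are built from the observed rows only, no empty data frames are produced here. The windows are anchored at the earliest date in the whole data set, so companies that start later simply have no rows in the early windows.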
Upvotes: 1