Vegard Dyran

Reputation: 79

How to create a new data frame with the rows for specific dates from an already existing data frame?

I have a data frame, Returns, which looks something like this:

Date         Company            LstPrice    r
1987-02-27   NOVO NORDISK 'B'   2.29        0.031531532
1987-03-31   NOVO NORDISK 'B'   2.33        0.017467249
1987-04-30   NOVO NORDISK 'B'   2.25        -0.034334764
1987-05-29   NOVO NORDISK 'B'   2.22        -0.013333333
1987-06-30   NOVO NORDISK 'B'   2.47        0.1126126137
1987-07-31   NOVO NORDISK 'B'   2.46        -0.004048583
1987-08-31   NOVO NORDISK 'B'   1.98        -0.195121951
1987-09-30   NOVO NORDISK 'B'   1.90        -0.040404040
1987-02-27   DANSKE BANK        24.29       -0.130637079
1987-03-31   DANSKE BANK        24.97       0.027995060
1987-04-30   DANSKE BANK        25.43       0.018422107
1987-05-29   DANSKE BANK        26.19       0.029885961
1987-06-30   DANSKE BANK        26.50       0.011836579
1987-07-31   DANSKE BANK        26.57       0.002641509
1987-08-31   DANSKE BANK        28.55       0.074520135
1987-09-30   DANSKE BANK        26.25       -0.080560420

I want to create new data frames for different groups of months. For example, one data frame with the observations for the first three months, another for the next three months, and so on. They would look something like this:

Data Frame, FirstThreeMonths:

Date         Company            LstPrice    r
1987-02-27   NOVO NORDISK 'B'   2.29        0.031531532
1987-03-31   NOVO NORDISK 'B'   2.33        0.017467249
1987-04-30   NOVO NORDISK 'B'   2.25        -0.034334764
1987-02-27   DANSKE BANK        24.29       -0.130637079
1987-03-31   DANSKE BANK        24.97       0.027995060
1987-04-30   DANSKE BANK        25.43       0.018422107

Data Frame, NextThreeMonths:

Date         Company            LstPrice    r
1987-05-29   NOVO NORDISK 'B'   2.22        -0.013333333
1987-06-30   NOVO NORDISK 'B'   2.47        0.1126126137
1987-07-31   NOVO NORDISK 'B'   2.46        -0.004048583
1987-05-29   DANSKE BANK        26.19       0.029885961
1987-06-30   DANSKE BANK        26.50       0.011836579
1987-07-31   DANSKE BANK        26.57       0.002641509

...and so on. (I have data for approx. 2200 companies over the last 30 years, so I will have to create a lot of data frames.)

I have tried several different approaches, using if statements, for loops, and the subset command, but so far I can't get any of them to work. I also searched for similar questions, but couldn't find a solution that works for my exact problem. Is there an easy way to do something like this?

Every effort to help is much appreciated!

Upvotes: 1

Views: 64

Answers (1)

h3rm4n

Reputation: 4187

You need to create a splitting vector first. For example, with your data frame stored in df and df$Date of class Date:

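# Month number (1-12) of each date, binned into (0,3], (3,6], (6,9], (9,12]
splitter <- cut(as.integer(format(df$Date, '%m')),
                breaks = c(0, 3, 6, 9, 12),
                labels = c('First three', 'Second three', 'Third three', 'Fourth three'))

# One data frame per bin, collected in a named list
dflist <- split(df, splitter)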

The result:

> dflist
$`First three`
         Date        Company LstPrice           r
1  1987-02-27 NOVO NORDISK B     2.29  0.03153153
2  1987-03-31 NOVO NORDISK B     2.33  0.01746725
9  1987-02-27    DANSKE BANK    24.29 -0.13063708
10 1987-03-31    DANSKE BANK    24.97  0.02799506

$`Second three`
         Date        Company LstPrice           r
3  1987-04-30 NOVO NORDISK B     2.25 -0.03433476
4  1987-05-29 NOVO NORDISK B     2.22 -0.01333333
5  1987-06-30 NOVO NORDISK B     2.47  0.11261261
11 1987-04-30    DANSKE BANK    25.43  0.01842211
12 1987-05-29    DANSKE BANK    26.19  0.02988596
13 1987-06-30    DANSKE BANK    26.50  0.01183658

$`Third three`
         Date        Company LstPrice            r
6  1987-07-31 NOVO NORDISK B     2.46 -0.004048583
7  1987-08-31 NOVO NORDISK B     1.98 -0.195121951
8  1987-09-30 NOVO NORDISK B     1.90 -0.040404040
14 1987-07-31    DANSKE BANK    26.57  0.002641509
15 1987-08-31    DANSKE BANK    28.55  0.074520135
16 1987-09-30    DANSKE BANK    26.25 -0.080560420

$`Fourth three`
[1] Date     Company  LstPrice r       
<0 rows> (or 0-length row.names)
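
Note that the splitter above looks at the month only, so with 30 years of data every January to March ends up in the same list element regardless of year. A minimal sketch of a year-aware variant, assuming Date is of class Date; the two 1988 rows are made up for illustration, and year_quarter and by_quarter are illustrative names:

```r
# Sample data in the shape of the question's Returns data frame.
# The two 1988 rows are invented purely to show that the year matters.
Returns <- data.frame(
  Date     = as.Date(c("1987-02-27", "1987-04-30", "1988-02-29", "1988-05-31")),
  Company  = "DANSKE BANK",
  LstPrice = c(24.29, 25.43, 30.10, 31.20),
  stringsAsFactors = FALSE
)

# Label such as "1987 Q1": the year keeps different years apart,
# quarters() gives the calendar quarter ("Q1" to "Q4") of each date
year_quarter <- paste(format(Returns$Date, "%Y"), quarters(Returns$Date))

# One data frame per year/quarter combination that occurs in the data
by_quarter <- split(Returns, year_quarter)

names(by_quarter)
```

Each element can then be accessed by its label, e.g. by_quarter[["1987 Q1"]].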

Empty data frames can be removed from that list like this:

dflist <- split(df, splitter)
# Keep only the list elements that contain at least one row
dflist <- dflist[sapply(dflist, nrow) > 0]
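
The calendar quarters above differ from the layout in the question, where the first data frame covers the first three months present in the data (February to April). A rough sketch of that variant, assuming Date is of class Date; month_no, window and by_window are illustrative names, not part of the answer above:

```r
# Rows taken from the question's sample (one company, for brevity)
Returns <- data.frame(
  Date     = as.Date(c("1987-02-27", "1987-03-31", "1987-04-30",
                       "1987-05-29", "1987-06-30", "1987-07-31",
                       "1987-08-31", "1987-09-30")),
  Company  = "DANSKE BANK",
  LstPrice = c(24.29, 24.97, 25.43, 26.19, 26.50, 26.57, 28.55, 26.25),
  stringsAsFactors = FALSE
)

# Count months on one continuous scale so that years do not wrap around
month_no <- as.integer(format(Returns$Date, "%Y")) * 12L +
  as.integer(format(Returns$Date, "%m"))

# Window 1 = first three months in the data, window 2 = the next three, ...
window <- (month_no - min(month_no)) %/% 3L + 1L

# split() only creates elements for windows that actually occur
by_window <- split(Returns, window)

by_window[["1"]]$Date
```

Because window only takes values that occur in the data, no empty elements are created here. For the cut()-based factor, split(df, splitter, drop = TRUE) should likewise drop the unused levels in one step.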

Upvotes: 1
