Max

Reputation: 117

Subsetting unbalanced panel data by date range

I have an unbalanced panel data set like this.

date          firms     return
2003-03-01      A       2
2003-04-01      A       5
2003-05-01      A       1
2003-06-01      A       4
2003-07-01      A       4
2003-08-01      A       3
2003-09-01      A       2
2003-10-01      A       5
2003-11-01      A       3
2003-12-01      A       2
2004-01-01      A       8
2004-02-01      A       4
2004-03-01      A       3
2004-04-01      A       5
2004-05-01      A       3
2004-06-01      A       2
2004-07-01      A       2
2004-08-01      A       5
2004-09-01      A       1
2004-10-01      A       4
2004-11-01      A       4
2004-12-01      A       3
2003-03-01      B       3
2003-04-01      B       5
2003-05-01      B       3
2003-06-01      B       2
2003-07-01      B       2
2003-08-01      B       5
2003-09-01      B       3
2003-10-01      B       2
2003-11-01      B       8
2003-12-01      B       4
2004-01-01      B       3
2004-02-01      B       3
2004-03-01      B       5
2004-04-01      B       3
2004-05-01      B       2
2004-06-01      B       2
2004-07-01      B       5
2004-08-01      B       1
2004-09-01      B       4
2004-10-01      B       4
2004-11-01      B       3
2004-12-01      B       3
2005-01-01      B       3
2005-02-01      B       3
2005-03-01      B       5
2005-04-01      B       3
2005-05-01      B       2
2005-06-01      B       2
2005-07-01      B       5
2005-08-01      B       3
2005-09-01      B       2
2005-10-01      B       8
2005-11-01      B       4
2005-12-01      B       4

The data form a monthly unbalanced panel: not all firms have the same number of observation dates. I want to split this data set into two parts by date. I tried the following code, but it does not work.

require(data.table)
df1<-testset[date %between% c("2003-01-01", "2004-06-01")]
df2<-testset[date %between% c("2004-07-01", "2006-06-01")]

Could you suggest better code that lets me subset by any date range I like?

Upvotes: 1

Views: 50

Answers (1)

jay.sf

Reputation: 72693

Assuming your data has this structure:

> str(testset)
'data.frame':   56 obs. of  3 variables:
 $ date  : Factor w/ 34 levels "2003-03-01","2003-04-01",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ firms : Factor w/ 2 levels "A","B": 1 1 1 1 1 1 1 1 1 1 ...
 $ return: int  2 5 1 4 4 3 2 5 3 2 ...

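Here `date` is stored as a factor, so a range comparison on it cannot work as intended. You could convert the date to POSIXct format to get your code running. Note also that `testset` is a plain data.frame rather than a data.table, so the column has to be referenced as `testset$date` and the row index needs a trailing comma.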

testset$date <- as.POSIXct(testset$date)

library(data.table)
df1 <- testset[testset$date %between% c("2003-01-01", "2004-06-01"), ]
df2 <- testset[testset$date %between% c("2004-07-01", "2006-06-01"), ]
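If you want to reuse this for any date range, one option is to wrap the comparison in a small function. The following is only a sketch in base R, using the `Date` class instead of POSIXct. The helper name `subset_by_date` and the four-row sample data are made up for illustration; they are not from your data set.

```r
# Hypothetical four-row sample with the same columns as the question's testset;
# dates are stored as a factor, as in the str() output above.
testset <- data.frame(
  date   = c("2003-03-01", "2004-06-01", "2004-07-01", "2005-12-01"),
  firms  = c("A", "A", "B", "B"),
  return = c(2L, 2L, 5L, 4L),
  stringsAsFactors = TRUE
)

# Convert the factor to Date (going through character to be explicit).
testset$date <- as.Date(as.character(testset$date))

# subset_by_date is a hypothetical helper, not part of any package.
# It keeps the rows whose date lies in [from, to], both ends inclusive.
subset_by_date <- function(data, from, to) {
  data[data$date >= as.Date(from) & data$date <= as.Date(to), ]
}

df1 <- subset_by_date(testset, "2003-01-01", "2004-06-01")
df2 <- subset_by_date(testset, "2004-07-01", "2006-06-01")
```

Because the filter only looks at the `date` column, it does not matter that the firms have different numbers of observations; each firm simply contributes whatever rows fall inside the range.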

Upvotes: 1
