Reputation: 95
I am analyzing some stock data and already calculated an illiquidity measure for each stock (cit). I now want to create portfolios from my stocks based on this illiquidity measure for each month in my data set.
I have already looked into the split()-function but have found no way to split my data for each month.
This is a sample of my data set. I have got a large data set behind this sample (22.000.000 rows containing 30 years of stock data).
date cit
1 1990-01-01 1
2 1990-01-01 2
3 1990-01-01 3
4 1990-01-01 4
5 1990-01-01 5
6 1990-01-01 6
7 1990-01-01 7
8 1990-01-01 8
9 1990-01-01 9
10 1990-01-01 10
11 1990-02-01 11
12 1990-02-01 12
13 1990-02-01 13
14 1990-02-01 14
15 1990-02-01 15
16 1990-02-01 16
17 1990-02-01 17
18 1990-02-01 18
19 1990-02-01 19
20 1990-02-01 20
I want to create portfolios of my stock data for each month and based on the cit-measure, e.g. from lowest to highest cit, with the average cit computed for each portfolio. The expected outcome looks like this:
date cit portfolio
1 1990-01-01 1.5 1
2 1990-01-01 3.5 2
3 1990-01-01 5.5 3
4 1990-01-01 7.5 4
5 1990-01-01 9.5 5
6 1990-01-01 11.5 1
7 1990-01-01 13.5 2
8 1990-01-01 15.5 3
9 1990-01-01 17.5 4
10 1990-01-01 19.5 5
Any help is greatly appreciated. Thank you so much!
Upvotes: 1
Views: 69
Reputation: 69
There is nothing pretty about this solution but maybe it helps, the lubridate package is useful to extract the month to split the data.
library(tidyverse);library(magrittr);library(lubridate)
df <- list("date" = c(rep("1990-01-01",10),rep("1990-02-01",10)),
"cit"=1:20) %>% as.data.frame()
df$date %<>% as.Date()
df.res <-
split(df, df$date) %>%
map_df(~{
.x %>%
mutate(portfolio = cut(cit, breaks = 5, labels = 1:5)) %>%
group_by(portfolio) %>%
summarise(meanCit = mean(cit)) %>%
transmute(year= year(.x$date)[1],
month= month(.x$date)[1],
meanCit,
portfolio)
})
df.res
Upvotes: 1