user113156

Reputation: 7107

creating a weight vector for groups of data

The data I am working with is financial. I want to assign weights to the firms within a given portfolio. For example, with 3 firms (as in the example below) I want to assign equal weights to each of the 3 firms in the portfolio, i.e. 0.33 (33%) for each firm. I think it would also be interesting to randomly assign weights to the firms within a portfolio, since typing the specific weights by hand becomes cumbersome as the portfolio grows.

The data can be created with the tidyquant package using the following code:

library(tidyquant)
library(dplyr)

stock_returns_monthly <- c("AAPL", "GOOG", "NFLX") %>%
  tq_get(get = "stock.prices",
         from = "2010-01-01",
         to = "2015-12-31") %>%
  group_by(symbol) %>%
  tq_transmute(select = adjusted,
               mutate_fun = periodReturn,
               period = "monthly",
               col_rename = "Ra")

stock_returns_monthly_multi <- stock_returns_monthly %>%
  tq_repeat_df(n = 3)

n = 3 sets the number of portfolios to create.

Output:

# A tibble: 6 x 4
# Groups:   portfolio [1]
  portfolio symbol date            Ra
      <int> <chr>  <date>       <dbl>
1         1 AAPL   2010-01-29 -0.103 
2         1 AAPL   2010-02-26  0.0654
3         1 AAPL   2010-03-31  0.148 
4         1 AAPL   2010-04-30  0.111 
5         1 AAPL   2010-05-28 -0.0161
6         1 AAPL   2010-06-30 -0.0208

I am trying to achieve two things:

1) Set equal weights across all the firms in each portfolio. The following code works:

weights <- c(0.33, 0.33, 0.33,
             0.33, 0.33, 0.33,
             0.33, 0.33, 0.33)

However, this becomes a problem when the number of firms and/or the number of portfolios increases.

2) Randomly assign weights to each firm in each portfolio.

The next step is to create the following table using:

stocks <- c("AAPL", "GOOG", "NFLX")
weights_table <-  tibble(stocks) %>%
  tq_repeat_df(n = 3) %>%
  bind_cols(tibble(weights)) %>%
  group_by(portfolio)

Output:

# A tibble: 9 x 3
# Groups:   portfolio [3]
  portfolio stocks weights
      <int> <chr>    <dbl>
1         1 AAPL     0.330
2         1 GOOG     0.330
3         1 NFLX     0.330
4         2 AAPL     0.330
5         2 GOOG     0.330
6         2 NFLX     0.330
7         3 AAPL     0.330
8         3 GOOG     0.330
9         3 NFLX     0.330

The above results are for the equally weighted data. Again, the problem occurs when the number of firms and the portfolio size increase.

Here's the dput link: dput data

Upvotes: 1

Views: 1702

Answers (1)

Mankind_2000

Reputation: 2208

Take the stock_returns_monthly_multi dataset as df. Note that df appears to be already grouped by portfolio, so with dplyr the weights are divided equally among the symbols in each portfolio independently.

library(dplyr)

df <- stock_returns_monthly_multi

df %>%
  distinct(portfolio, symbol) %>%
  mutate(weights = 1 / n())

# A tibble: 9 x 3
# Groups:   portfolio [3]
#  portfolio symbol weights
#      <int> <chr>    <dbl>
#1         1 AAPL     0.333
#2         1 GOOG     0.333
#3         1 NFLX     0.333
#4         2 AAPL     0.333
#5         2 GOOG     0.333
#6         2 NFLX     0.333
#7         3 AAPL     0.333
#8         3 GOOG     0.333
#9         3 NFLX     0.333
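
To illustrate that the same idea scales to any number of firms and portfolios, here is a minimal, dependency-free sketch in base R. The ticker list and the portfolio count are hypothetical values chosen for illustration, and ave() stands in for the grouped mutate() above; it is not the answer's original code.

```r
# Hypothetical inputs: any ticker vector and portfolio count will do.
symbols <- c("AAPL", "GOOG", "NFLX", "MSFT", "AMZN")
n_portfolios <- 4

# One row per (portfolio, symbol) pair, mirroring distinct(portfolio, symbol).
weights_table <- expand.grid(symbol = symbols,
                             portfolio = seq_len(n_portfolios),
                             stringsAsFactors = FALSE)
weights_table <- weights_table[, c("portfolio", "symbol")]

# Equal weights: each row gets 1 / (number of rows in its portfolio).
weights_table$weights <- ave(rep(1, nrow(weights_table)),
                             weights_table$portfolio,
                             FUN = function(x) x / sum(x))

# Sum of the weights within each portfolio (should be 1 for every portfolio).
weight_sums <- tapply(weights_table$weights, weights_table$portfolio, sum)
print(weight_sums)
```

Because the weight is derived from the group size rather than typed in, changing symbols or n_portfolios needs no other edits.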

EDIT: If you need to randomly assign weights that sum to 1 for each portfolio independently, you can compute the weights as w = x / sum(x) for each portfolio, where the elements of x are i.i.d. draws from Uniform(0, 1), i.e. runif(n). prop.table can be used to achieve this:

df %>%
  distinct(portfolio, symbol) %>%
  mutate(weights = prop.table(runif(n())))

# A tibble: 9 x 3
# Groups:   portfolio [3]
#  portfolio symbol weights
#      <int> <chr>    <dbl>
#1         1 AAPL     0.548
#2         1 GOOG     0.292
#3         1 NFLX     0.160
#4         2 AAPL     0.107
#5         2 GOOG     0.140
#6         2 NFLX     0.754
#7         3 AAPL     0.195
#8         3 GOOG     0.417
#9         3 NFLX     0.387

This is a quick and easy way to do it, but it has a statistical drawback: normalised uniform draws are not uniformly distributed over the set of weight vectors that sum to 1. See this very interesting post: Randomly generated weights sum to one. We can code the accepted answer into a function (gen_weight_vec) and use it with mutate. Something like:

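gen_weight_vec <- function(n) {
  x <- runif(n)          # i.i.d. Uniform(0, 1) draws
  y <- -log(x)           # transform to Exponential(1) draws
  return(y / sum(y))     # normalise so the weights sum to 1
}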

df_weight <- df %>%
  distinct(portfolio, symbol) %>%
  mutate(weights = gen_weight_vec(n()))

You can check that the weights sum to 1 for each portfolio:

summarise(df_weight, sum_weights = sum(weights))

# A tibble: 3 x 2
#  portfolio sum_weights
#      <int>       <dbl>
#1         1          1 
#2         2          1
#3         3          1 
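
To make the difference between the two random schemes concrete, the following is a rough simulation sketch in base R (the seed, the number of simulations and the 0.5 threshold are arbitrary choices for illustration). For 3-firm portfolios it estimates how often the first firm receives more than half of the weight under the prop.table(runif(n)) approach and under the -log(runif(n)) approach used in gen_weight_vec.

```r
set.seed(42)       # arbitrary seed, only for reproducibility
n_sims <- 100000   # number of simulated 3-firm portfolios

# Method 1: normalise uniform draws (the prop.table(runif(n)) approach).
u <- matrix(runif(3 * n_sims), ncol = 3)
w_unif <- u / rowSums(u)

# Method 2: normalise -log(uniform) draws (the gen_weight_vec approach).
e <- -log(matrix(runif(3 * n_sims), ncol = 3))
w_exp <- e / rowSums(e)

# Share of portfolios in which the first firm gets more than half the weight.
p_unif <- mean(w_unif[, 1] > 0.5)
p_exp  <- mean(w_exp[, 1] > 0.5)
print(c(normalised_uniform = p_unif, normalised_exponential = p_exp))
```

In theory, weights drawn uniformly from all 3-element vectors that sum to 1 put more than half of the weight on the first firm with probability 1/4, whereas normalised uniform draws do so with probability 1/6. The estimates should land close to those values, which shows that the simpler method produces concentrated portfolios less often.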

Upvotes: 2
