putty
putty

Reputation: 764

Divide column values by multiple values based on conditions

I have a csv file that contains indexes for various asset classes and most of them start on different dates. I would like to create new indexes for these asset classes that have the same base year. Below is a subset of the data I have.

indexes <- read.csv("AssetClassIndexes.csv")
indexes$Date <- as.Date(indexes$Date, '%m/%d/%Y')
indexes %>%
    filter(Date > as.Date('2013-01-01')) %>%
    select(Date, Large.Cap.Stocks, Mid.Cap.Stocks, Precious.Metals)   



     Date        Large.Cap.Stocks Mid.Cap.Stocks Precious.Metals
1  2013-01-31         130.9160       58.13547         651.1803
2  2013-02-28         132.6932       58.70621         658.3433
3  2013-03-31         137.6696       61.51427         690.4047
4  2013-04-30         140.3220       61.90042         684.9505
5  2013-05-31         143.6044       63.29899         720.4309
6  2013-06-30         141.6760       62.13056         723.7449
7  2013-07-31         148.8850       65.97987         777.3744
8  2013-08-31         144.5731       63.50743         750.3217
9  2013-09-30         149.1068       66.81690         803.2194
10 2013-10-31         155.9609       69.29937         831.1715
11 2013-11-30         160.7137       70.21606         877.3015
12 2013-12-31         164.7823       72.38485         893.8825
13 2014-01-31         159.0851       70.84785         854.2835
14 2014-02-28         166.3623       74.30846         890.2488
15 2014-03-31         167.7607       74.58250         898.8842
16 2014-04-30         169.0008       73.41721         868.2323
17 2014-05-31         172.9679       74.72066         869.1005
18 2014-06-30         176.5410       77.81163         906.8195
19 2014-07-31         174.1063       74.48576         853.8612
20 2014-08-31         181.0715       78.27180         892.6265
21 2014-09-30         178.5322       74.71220         841.8361

What I would like to do is create multiple base indexes based on various dates.

BaseDates <-
  c(
    '1973-12-31',
    '1981-06-30',
    '1984-03-31',
    '2001-03-31',
    '2007-12-31'
  )

I have the following line of code that allows me to create an index based on one date, but I can't figure out how to do all the base dates above. I'm guessing it involves some sort of apply function; any suggestions?

indexes %>%
  mutate_each(funs(BaseIdx(.,Date,as.Date('1984-06-30'))),-Date)

BaseIdx <- function(x, column, dte) {x / x[column == dte]}

Upvotes: 1

Views: 1273

Answers (1)

oshun
oshun

Reputation: 2349

There are multiple approaches you can take. Your suggested approach moves across each column (mutate_each) dividing values whose date matches a single date. You can iterate this over all your dates with _apply or another command.

An alternate approach below uses lapply to iterate across dates, dividing rows by a vector. The tricky part is the division of a dataframe by rows. Here, the dataframe is transposed (t) and divided by a vector (as.numeric), then retransposed back to the original format (additional methods here).

#indexes = the subsetted [21 x 4] data in your example
#Sample some dates based on your example data
BaseDates <- indexes[seq(1, 21, by=5), "Date"]  

IndexThemALL <- lapply(BaseDates, function(z) {  #z = each BaseDate
                  data.frame(
                    IndexDate = z,
                    Date = indexes$Date,
                    t(t(indexes[, cols])/as.numeric(indexes[indexes$Date == z, cols])) 
                    )
                  })

# Optional: collapse a list into a dataframe
IndexThemALL <- dplyr::rbind_all(IndexThemALL)

#Source: local data frame [105 x 5]
#IndexDate       Date Large.Cap.Stocks Mid.Cap.Stocks Precious.Metals
#1 2013-01-31 2013-01-31         1.000000       1.000000        1.000000
#2 2013-01-31 2013-02-28         1.013575       1.009817        1.011000
#3 2013-01-31 2013-03-31         1.051587       1.058119        1.060236
#4 2013-01-31 2013-04-30         1.071848       1.064762        1.051860

Upvotes: 1

Related Questions