Reputation: 1852
I’m trying to create a custom function which uses the maxdrawdown function from the tseries package. I want to apply this function to the correct range of values for each instrument, but even though this is probably a newbie question, I can’t see a possible solution.
Here’s what my dataframe looks like:
> subSetTrades
Instrument EntryTime ExitTime AccountValue
1 JPM 2007-03-01 2007-04-10 6997
2 JPM 2007-04-10 2007-05-29 7261
3 JPM 2007-05-29 2007-07-18 7545
4 JPM 2007-07-18 2007-07-19 7614
5 JPM 2007-07-19 2007-08-22 7897
6 JPM 2007-08-22 2007-08-28 7678
7 JPM 2007-08-28 2007-09-17 7587
8 JPM 2007-09-17 2007-10-17 7752
9 JPM 2007-10-17 2007-10-29 7717
10 JPM 2007-10-29 2007-11-02 7423
11 KFT 2007-04-13 2007-05-14 6992
12 KFT 2007-05-14 2007-05-21 6944
13 KFT 2007-05-21 2007-07-09 7069
14 KFT 2007-07-09 2007-07-16 6919
15 KFT 2007-07-16 2007-07-27 6713
16 KFT 2007-07-27 2007-09-07 6820
17 KFT 2007-09-07 2007-10-12 6927
18 KFT 2007-10-12 2007-11-28 6983
19 KFT 2007-11-28 2007-12-18 6957
20 KFT 2007-12-18 2008-02-20 7146
If I manually calculate the values I want my function to output, the results are correct:
# Apply the function to the dataframe
with(subSetTrades, tapply(AccountValue, Instrument, MDD_Duration))
JPM KFT
106 85
> # Check the function for JPM
> maxdrawdown(subSetTrades[1:10,4])$from
[1] 5
> maxdrawdown(subSetTrades[1:10,4])$to
[1] 10
> # Get the entry time for JPM on row 5
> # Get the exit time for JPM on row 10
> # Calculate the time difference
> difftime(subSetTrades[10,3], subSetTrades[5,2], units='days')
Time difference of 106 days
# Check the calculations for the other Instrument
> maxdrawdown(subSetTrades[11:20,4])$from
[1] 3
> maxdrawdown(subSetTrades[11:20,4])$to
[1] 5
> # Get the exittime on row 5 for KFT, get the entrytime for KFT on row 3,
# and calculate the time difference
> difftime(subSetTrades[15,3], subSetTrades[13,2])
Time difference of 67 days
As you can see in the above example, my custom function (MDD_Duration) gives the right value for JPM but the wrong value for KFT: instead of 85 the result should be 67. The function MDD_Duration is the following:
MDD_Duration <- function(x){
    require(tseries)
    # Get starting point
    mdd_Start <- maxdrawdown(x)$from
    mdd_StartDate <- subSetTrades$EntryTime[mdd_Start]
    # Get the endpoint
    mdd_End <- maxdrawdown(x)$to
    mdd_EndDate <- subSetTrades$ExitTime[mdd_End]
    return(difftime(mdd_EndDate, mdd_StartDate, units='days'))
}
Manually retracing the steps of this custom function shows there is a problem in the calculation with the ‘from’ and ‘to’ row numbers (i.e. R needs to offset the values for KFT by the number of rows of the instrument which preceded it, in this case JPM). For a possible solution, R needs to do something like:
Get the ‘from’ value of the maxdrawdown function directly if this instrument is the first (i.e. at the top of the list). However, if the current instrument is the second (or third, etc.), then take into account the length of the previous instruments. So, if instrument JPM has a length of 10, the search for the values of KFT should start at +10. And the search for the from and to values for instrument 3 should start at the length of instrument 1 + the length of instrument 2.
I tried using nrow inside the function (which seemed the obvious solution to this problem), which resulted in errors regarding ‘argument of length 0’, even though nrow was used correctly (i.e. the same statement outside the function did work). I also tried to subset the data inside the function, which also didn’t work out. Any ideas are highly welcome. :)
Upvotes: 1
Views: 435
Reputation: 174948
split is your friend here. If I modify your function so that it expects a data frame with the three variables of interest (AccountValue, EntryTime, ExitTime) like this:
MDD_Duration <- function(x){
    # require(tseries)
    # Get starting point
    mdd_Start <- maxdrawdown(x$AccountValue)$from
    mdd_StartDate <- x$EntryTime[mdd_Start]
    # Get the endpoint
    mdd_End <- maxdrawdown(x$AccountValue)$to
    mdd_EndDate <- x$ExitTime[mdd_End]
    return(difftime(mdd_EndDate, mdd_StartDate, units='days'))
}
Then we can apply it to the split version of your data:
> sapply(split(subSetTrades[,-1], subSetTrades[,1]), MDD_Duration)
JPM KFT
106 67
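For what it's worth, the 85 from the original version can be reproduced by hand. With tapply the function only receives the AccountValue vector of one instrument, so from and to are positions within that group (3 and 5 for KFT), but they were then used to index the full subSetTrades, where rows 3 and 5 belong to JPM. A small sketch using only the relevant dates from the table in the question (no tseries needed; the variable names are just for illustration):

```r
# Dates copied from the table in the question; only the rows involved are kept.
jpm_entry_row3  <- as.Date("2007-05-29")  # subSetTrades$EntryTime[3]
jpm_exit_row5   <- as.Date("2007-08-22")  # subSetTrades$ExitTime[5]
kft_entry_row13 <- as.Date("2007-05-21")  # EntryTime of the 3rd KFT row
kft_exit_row15  <- as.Date("2007-07-27")  # ExitTime of the 5th KFT row

# What the original function computed for KFT:
# group positions 3 and 5 applied to the full data frame
wrong <- as.numeric(difftime(jpm_exit_row5, jpm_entry_row3, units = "days"))

# What it should compute: positions 3 and 5 within the KFT subset
right <- as.numeric(difftime(kft_exit_row15, kft_entry_row13, units = "days"))

print(c(wrong = wrong, right = right))
```

This gives 85 and 67, matching the two numbers in the question.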
It might be helpful to see what split is doing to your data:
> split(subSetTrades[,-1], subSetTrades[,1])
$JPM
EntryTime ExitTime AccountValue
1 2007-03-01 2007-04-10 6997
2 2007-04-10 2007-05-29 7261
3 2007-05-29 2007-07-18 7545
4 2007-07-18 2007-07-19 7614
5 2007-07-19 2007-08-22 7897
6 2007-08-22 2007-08-28 7678
7 2007-08-28 2007-09-17 7587
8 2007-09-17 2007-10-17 7752
9 2007-10-17 2007-10-29 7717
10 2007-10-29 2007-11-02 7423
$KFT
EntryTime ExitTime AccountValue
11 2007-04-13 2007-05-14 6992
12 2007-05-14 2007-05-21 6944
13 2007-05-21 2007-07-09 7069
14 2007-07-09 2007-07-16 6919
15 2007-07-16 2007-07-27 6713
16 2007-07-27 2007-09-07 6820
17 2007-09-07 2007-10-12 6927
18 2007-10-12 2007-11-28 6983
19 2007-11-28 2007-12-18 6957
20 2007-12-18 2008-02-20 7146
So as long as you have a function that will accept and work with a data frame/subset of your data set, we can use split to form the subsets and lapply or sapply to apply our function to those subsets.
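The same pattern in miniature, as a rough sketch on made-up data (the toy data frame and the start_at_peak helper are hypothetical and only there to show the mechanics):

```r
# Hypothetical toy data, not the trades from the question
toy <- data.frame(
  group = c("a", "a", "a", "b", "b"),
  start = c(1, 2, 3, 10, 20),
  value = c(5, 9, 7, 4, 8)
)

# The function receives a whole sub-data frame, so a row position computed
# from one column can safely be used to index another column of that subset.
start_at_peak <- function(d) d$start[which.max(d$value)]

pieces <- split(toy, toy$group)        # named list with one data frame per group
res <- sapply(pieces, start_at_peak)   # named vector, one element per group
print(res)
```

Use lapply instead of sapply if you would rather get a list back than a simplified vector.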
You might want to incorporate this into your function MDD_Duration():
MDD_Duration2 <- function(x){
    FUN <- function(x) {
        # Get starting point
        mdd_Start <- maxdrawdown(x$AccountValue)$from
        mdd_StartDate <- x$EntryTime[mdd_Start]
        # Get the endpoint
        mdd_End <- maxdrawdown(x$AccountValue)$to
        mdd_EndDate <- x$ExitTime[mdd_End]
        return(difftime(mdd_EndDate, mdd_StartDate, units='days'))
    }
    sapply(split(x, droplevels(x[, "Instrument"])), FUN)
}
Where we use the new (in R 2.12.x) function droplevels on x[, "Instrument"] to allow the function to work even if we have a single level of data or operate on a subset of the data:
> MDD_Duration2(subSetTrades)
JPM KFT
106 67
> MDD_Duration2(subSetTrades[1:10,])
JPM
106
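If you want to try this without the original data at hand, here is a self-contained sketch. It rebuilds subSetTrades from the table in the question and uses a small base-R stand-in, mdd_bounds() (a hypothetical helper, not part of tseries), which I am assuming picks the same from/to positions as maxdrawdown for this data: the last running peak before the largest drop, and the position of that drop. It also computes the positions once per instrument rather than twice.

```r
subSetTrades <- data.frame(
  Instrument = factor(rep(c("JPM", "KFT"), each = 10)),
  EntryTime = as.Date(c(
    "2007-03-01", "2007-04-10", "2007-05-29", "2007-07-18", "2007-07-19",
    "2007-08-22", "2007-08-28", "2007-09-17", "2007-10-17", "2007-10-29",
    "2007-04-13", "2007-05-14", "2007-05-21", "2007-07-09", "2007-07-16",
    "2007-07-27", "2007-09-07", "2007-10-12", "2007-11-28", "2007-12-18")),
  ExitTime = as.Date(c(
    "2007-04-10", "2007-05-29", "2007-07-18", "2007-07-19", "2007-08-22",
    "2007-08-28", "2007-09-17", "2007-10-17", "2007-10-29", "2007-11-02",
    "2007-05-14", "2007-05-21", "2007-07-09", "2007-07-16", "2007-07-27",
    "2007-09-07", "2007-10-12", "2007-11-28", "2007-12-18", "2008-02-20")),
  AccountValue = c(6997, 7261, 7545, 7614, 7897, 7678, 7587, 7752, 7717, 7423,
                   6992, 6944, 7069, 6919, 6713, 6820, 6927, 6983, 6957, 7146)
)

# Hypothetical stand-in for tseries::maxdrawdown(): positions of the peak
# ("from") and the trough ("to") of the largest drop below the running maximum.
mdd_bounds <- function(v) {
  dd <- cummax(v) - v                        # drawdown at each position
  to <- which.max(dd)                        # first position of the largest drawdown
  from <- max(which(dd[seq_len(to)] == 0))   # last running peak at or before it
  list(from = from, to = to)
}

# Duration in days for one instrument's sub-data frame
mdd_days <- function(d) {
  b <- mdd_bounds(d$AccountValue)            # computed once per instrument
  as.numeric(difftime(d$ExitTime[b$to], d$EntryTime[b$from], units = "days"))
}

res <- sapply(split(subSetTrades, droplevels(subSetTrades$Instrument)), mdd_days)
print(res)
```

This should again give 106 for JPM and 67 for KFT. With the real package loaded you would simply swap mdd_bounds() for maxdrawdown().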
Upvotes: 2