Mike
Mike

Reputation: 1069

R: aggregating data between events

I have trade data as follows:

TradeNumber OpenTime               CloseTime       Profit     TradeHour Equity  
    1       01/01/2014 13:10    01/01/2014 14:40    10        13  520
    2       01/01/2014 13:25    01/01/2014 13:28    20        13  520
    3       01/01/2014 13:29    01/01/2014 15:40    -50       13  520
    4       01/01/2014 13:30    01/01/2014 14:05    -5        13  520
    5       01/01/2014 14:12    01/01/2014 14:40    12        14  560
    6       01/01/2014 14:21    01/01/2014 14:45    -16       14  560
    7       01/01/2014 14:50    01/01/2014 14:59    -14       14  560
    8       01/01/2014 14:58    01/01/2014 15:05    56        14  560

I am looking to find, for each trade, the sum of the profits of all other trades that closed in the same hour, but before that particular trade, and add it to equity at the time of the trade. So, in the given example, the result would be:

TradeNumber OpenTime              CloseTime       Profit    TradeHour   Equity
1           01/01/2014 13:10    01/01/2014 14:40    10         13   520
2           01/01/2014 13:25    01/01/2014 13:28    20         13   520
3           01/01/2014 13:29    01/01/2014 15:40    -50        13 520 + 20
4           01/01/2014 13:30    01/01/2014 14:05    -5         13   520 + 20
5           01/01/2014 14:12    01/01/2014 14:40    12         14   560
6           01/01/2014 14:21    01/01/2014 14:45    -16        14   560 - 5
7           01/01/2014 14:50    01/01/2014 14:59    -14        14   560+10-5+12-16
8           01/01/2014 14:58    01/01/2014 15:05    56         14   560+10-5+12-16

Trade number 8, for example, opened at 14:58 on 01/01/2014. Prior to it opening, there were 4 other trades that closed in that hour (trades 1, 4, 5 and 6). I would, therefore, like to add the profits from those 4 trades to the equity at the beginning of the hour and place that number in the equity column of the trades data.

       for (i in 1:nrow(tradeData))
        {
          tradeData$EquityUSD1 [i] = tradeData$Equity [i] + sum(tradeData$Profit[tradeData$CloseTime <= tradeData$OpenTime[i] & tradeData$CloseTime >= tradeData$tradeHour[i,1]])
  }

This works, but is quite slow and I would like to speed it up, as there are tens of thousands of trades.

Any ideas? Please let me know if I have omitted any important data/info

Thanks

Upvotes: 2

Views: 104

Answers (2)

xb.
xb.

Reputation: 1677

Haven't test the speed with a larger data set yet..

dt
##   TradeNumber         OpenTime        CloseTime Profit TradeHour Equity
## 1           1 01/01/2014 13:10 01/01/2014 14:40     10        13    520
## 2           2 01/01/2014 13:25 01/01/2014 13:28     20        13    520
## 3           3 01/01/2014 13:29 01/01/2014 15:40    -50        13    520
## 4           4 01/01/2014 13:30 01/01/2014 14:05     -5        13    520
## 5           5 01/01/2014 14:12 01/01/2014 14:40     12        14    560
## 6           6 01/01/2014 14:21 01/01/2014 14:45    -16        14    560
## 7           7 01/01/2014 14:50 01/01/2014 14:59    -14        14    560
## 8           8 01/01/2014 14:58 01/01/2014 15:05     56        14    560

require(data.table)
setDT(dt)

dt[,OpenTime:=as.POSIXct(OpenTime,format="%m/%d/%Y %H:%M")]
dt[,CloseTime:=as.POSIXct(CloseTime,format="%m/%d/%Y %H:%M")]

dt[,Equity.new:=Equity+sum(dt$Profit[hour(OpenTime)==hour(dt$CloseTime) & OpenTime > dt$CloseTime]), by="TradeNumber"]

dt
##    TradeNumber            OpenTime           CloseTime Profit TradeHour Equity Equity.new
## 1:           1 2014-01-01 13:10:00 2014-01-01 14:40:00     10        13    520        520
## 2:           2 2014-01-01 13:25:00 2014-01-01 13:28:00     20        13    520        520
## 3:           3 2014-01-01 13:29:00 2014-01-01 15:40:00    -50        13    520        540
## 4:           4 2014-01-01 13:30:00 2014-01-01 14:05:00     -5        13    520        540
## 5:           5 2014-01-01 14:12:00 2014-01-01 14:40:00     12        14    560        555
## 6:           6 2014-01-01 14:21:00 2014-01-01 14:45:00    -16        14    560        555
## 7:           7 2014-01-01 14:50:00 2014-01-01 14:59:00    -14        14    560        561
## 8:           8 2014-01-01 14:58:00 2014-01-01 15:05:00     56        14    560        561

Upvotes: 1

dwcoder
dwcoder

Reputation: 488

The following code seems to produce the output you want, assuming your data is in a data frame called tradedata:

ddply( tradedata , .(TradeHour) , 
       mutate, 
       Equity=Equity+ cumsum(Profit) - Profit ) 

If you want to include the profit of the particular trade, remove the -Profit. You could run this in parallel by giving ddply the .parallel=TRUE option. An answer with data.table may be quicker, however. It would be interesting to see which works best.

Upvotes: 1

Related Questions