Reputation: 1626
Here is my data.table
structure(list(index = structure(c(17007, 17008, 17009, 17010,
17011, 17014, 17015, 17016, 17017), tzone = "UTC", tclass = "Date", class = "Date"),
close = c(97.339996, 96.669998, 102.949997, 104.339996, 104.209999,
106.050003, 104.480003, 105.790001, 105.870003), daily_return = c(-0.688306993560994,
6.49632681279251, 1.35016905342893, -0.124589807344833, 1.76566933850562,
-1.48043371578217, 1.25382653367649, 0.0756234041438208,
1.5207329313101), group = c(0, 0, 3, 3, 3, 3, 3, 3, 3), qq = c(1,
0, 6, 5, 4, 3, 2, 1, 0)), class = c("data.table", "data.frame"
), row.names = c(NA, -9L), .internal.selfref = <pointer: 0x558d2914e9a0>, sorted = "index")
and it looks like as follows -
index close daily_return group qq
1: 2016-07-25 97.34 -0.6883070 0 1
2: 2016-07-26 96.67 6.4963268 0 0
3: 2016-07-27 102.95 1.3501691 3 6
4: 2016-07-28 104.34 -0.1245898 3 5
5: 2016-07-29 104.21 1.7656693 3 4
6: 2016-08-01 106.05 -1.4804337 3 3
7: 2016-08-02 104.48 1.2538265 3 2
8: 2016-08-03 105.79 0.0756234 3 1
9: 2016-08-04 105.87 1.5207329 3 0
In this data.table, I need to find the return (from column close) by the interval given in qq column for each group. E.g. in row 3, the look forwarding interval is 6 and the return is to be calculated between column 3 and column 9 (3 + 6 lookforward).
When the group changes, the return is the be calculated between the last element of the first group and the first element of the second group. e.g, as the group changes between row 2 and row3, the return is calculated between these two rows.
Following is the desired result -
index close daily_return group qq desired_result
1: 2016-07-25 97.34 -0.6883070 0 1 -0.68
2: 2016-07-26 96.67 6.4963268 0 0 6.49
3: 2016-07-27 102.95 1.3501691 3 6 2.83
4: 2016-07-28 104.34 -0.1245898 3 5 1.46
5: 2016-07-29 104.21 1.7656693 3 4 1.59
6: 2016-08-01 106.05 -1.4804337 3 3 -0.16
7: 2016-08-02 104.48 1.2538265 3 2 1.33
8: 2016-08-03 105.79 0.0756234 3 1 0.07
9: 2016-08-04 105.87 1.5207329 3 0 NA
I am sure there must be an easy way to accomplish this using data.table.
I have tried the following and it works only on specific rows as k is constant.
lead(Delt(close, k = 6, type = "arithmetic") * 100, n = 6)
Is there a way to pass the column qq dynamically in the k variable of Delt command? OR, is there any other solution to this problem?
Upvotes: 0
Views: 212
Reputation: 174328
This does what you asked, but the desired result you are showing doesn't quite match the results one gets by following the algorithm you described, as far as I can tell.
Anyway, the algorithm is simple to accomplish in a loop:
# Creates a zero-filled column to hold the results
df$desired_result <- numeric(length(df$close))
# Loops through each row
for(i in seq_along(df$desired_result))
{
# We don’t want to do anything in the final row
if(i == length(df$desired_result)) break;
# if the row’s qq value is 0, get the difference between this row and next row’s close value
# otherwise use the row’s qq value to look ahead that number of rows
if(df$qq[i] == 0) df$desired_result[i] <- df$close[i + 1] - df$close[i]
else df$desired_result[i] <- df$close[i + df$qq[i]] - df$close[i]
}
Now you can do:
#> df
#> index close daily_return group qq desired_result
#> 1 2016-07-25 97.34 -0.6883070 0 1 -0.670000
#> 2 2016-07-26 96.67 6.4963268 0 0 6.279997
#> 3 2016-07-27 102.95 1.3501691 3 6 2.920006
#> 4 2016-07-28 104.34 -0.1245898 3 5 1.530007
#> 5 2016-07-29 104.21 1.7656693 3 4 1.660004
#> 6 2016-08-01 106.05 -1.4804337 3 3 -0.180000
#> 7 2016-08-02 104.48 1.2538265 3 2 1.390000
#> 8 2016-08-03 105.79 0.0756234 3 1 0.080002
#> 9 2016-08-04 105.87 1.5207329 3 0 0.000000
Upvotes: 2
Reputation: 42564
Here is an approach which uses data.table
syntax without looping:
idx2 <- close[, .I + ifelse(group == shift(group, -1L), qq, 1L)]
close[, return := (close[idx2] - close) / close * 100][]
index close daily_return group qq return 1: 2016-07-25 97.34 -0.6883070 0 1 -0.6883070 2: 2016-07-26 96.67 6.4963268 0 0 6.4963268 3: 2016-07-27 102.95 1.3501691 3 6 2.8363342 4: 2016-07-28 104.34 -0.1245898 3 5 1.4663667 5: 2016-07-29 104.21 1.7656693 3 4 1.5929412 6: 2016-08-01 106.05 -1.4804337 3 3 -0.1697313 7: 2016-08-02 104.48 1.2538265 3 2 1.3303981 8: 2016-08-03 105.79 0.0756234 3 1 0.0756234 9: 2016-08-04 105.87 1.5207329 3 0 NA
idx2
contains the row indices which are computed from the local index .I
plus qq
except when a new group starts in the next row.
idx2
[1] 2 3 9 9 9 9 9 9 NA
The formula to compute return
is taken from help("Delt", "quantmod")
:
Arithmetic differences are used by default: Lag = (x2(t) - x1(t-k))/x1(t-k)
Therefore, it is unclear to me why OP's expected result is different. See also chinsoon's comment.
Upvotes: 3