Reputation: 21
I am a beginner in kdb. As I was practicing window join on test data from NYSE, I came across issues with window join across different dates.
Basically, my table looks like:
t:([] sym:10#`AAPL;date:2021.03.21 2021.03.21 2021.03.21 2021.03.21 2021.03.21 2021.03.22 2021.03.22 2021.03.22 2021.03.22 2021.03.22;price:100 101 105 110 120 130 140 150 160 170;time:10:01 10:04 10:07 10:10 10:13 10:01 10:04 10:07 10:10 10:13)
I am trying to create a sliding window for every 3 minutes on each date and calculate the sum of price in that window. However, I am not sure how to do window join on different groups.
I tried:
w3:-3 0+\:t[`minute];
newdata: wj1[w3;`minute;t;(t;(sum;`price)
but this does not give me the correct result. Could someone please help with this. Thank you!
Upvotes: 0
Views: 493
Reputation: 2775
To do a wj
across dates you need a timestamp column which you can create from date and time:
t:update timeStamp:"P"$"D" sv/: flip string (date;time) from t
t
sym date price time timeStamp
---------------------------------------------------------
AAPL 2021.03.21 100 10:01 2021.03.21D10:01:00.000000000
AAPL 2021.03.21 101 10:04 2021.03.21D10:04:00.000000000
AAPL 2021.03.21 105 10:07 2021.03.21D10:07:00.000000000
AAPL 2021.03.21 110 10:10 2021.03.21D10:10:00.000000000
AAPL 2021.03.21 120 10:13 2021.03.21D10:13:00.000000000
You can then use the timeStamp column like so:
w3:-00:03 00:00 +\:t[`timeStamp]
wj1[w3;`timeStamp;t;(t;(sum;`price))]
sym date price time timeStamp
---------------------------------------------------------
AAPL 2021.03.21 100 10:01 2021.03.21D10:01:00.000000000
AAPL 2021.03.21 201 10:04 2021.03.21D10:04:00.000000000
AAPL 2021.03.21 206 10:07 2021.03.21D10:07:00.000000000
AAPL 2021.03.21 215 10:10 2021.03.21D10:10:00.000000000
AAPL 2021.03.21 230 10:13 2021.03.21D10:13:00.000000000
AAPL 2021.03.22 130 10:01 2021.03.22D10:01:00.000000000
AAPL 2021.03.22 270 10:04 2021.03.22D10:04:00.000000000
AAPL 2021.03.22 290 10:07 2021.03.22D10:07:00.000000000
AAPL 2021.03.22 310 10:10 2021.03.22D10:10:00.000000000
AAPL 2021.03.22 330 10:13 2021.03.22D10:13:00.000000000
If you have more than 1 sym in the table you should apply the parted attribute:
t:update `p#sym from `sym`timeStamp xasc t
Then add sym before timeStamp in the 2nd argument of wj:
q)wj[w3;`sym`timeStamp;select sym, timeStamp from t;(t;(sum;`price))]
sym timeStamp price
----------------------------------------
AAPL 2021.03.21D10:01:00.000000000 100
AAPL 2021.03.21D10:04:00.000000000 201
AAPL 2021.03.21D10:07:00.000000000 206
AAPL 2021.03.21D10:10:00.000000000 215
AAPL 2021.03.21D10:13:00.000000000 230
AAPL 2021.03.22D10:01:00.000000000 250
AAPL 2021.03.22D10:04:00.000000000 270
AAPL 2021.03.22D10:07:00.000000000 290
AAPL 2021.03.22D10:10:00.000000000 310
AAPL 2021.03.22D10:13:00.000000000 330
MSFT 2021.03.21D10:01:00.000000000 468
MSFT 2021.03.21D10:04:00.000000000 915
MSFT 2021.03.21D10:07:00.000000000 668
MSFT 2021.03.21D10:10:00.000000000 403
MSFT 2021.03.21D10:13:00.000000000 604
MSFT 2021.03.22D10:01:00.000000000 775
MSFT 2021.03.22D10:04:00.000000000 697
MSFT 2021.03.22D10:07:00.000000000 829
MSFT 2021.03.22D10:10:00.000000000 799
MSFT 2021.03.22D10:13:00.000000000 382
Upvotes: 1