Alison Park
Alison Park

Reputation: 21

How do I window join across different dates in kdb

I am a beginner in kdb. As I was practicing window join on test data from NYSE, I came across issues with window join across different dates.

Basically, my table looks like:

t:([] sym:10#`AAPL;date:2021.03.21 2021.03.21 2021.03.21 2021.03.21 2021.03.21 2021.03.22 2021.03.22 2021.03.22 2021.03.22 2021.03.22;price:100 101 105 110 120 130 140 150 160 170;time:10:01 10:04 10:07 10:10 10:13 10:01 10:04 10:07 10:10 10:13)

I am trying to create a sliding window for every 3 minutes on each date and calculate the sum of price in that window. However, I am not sure how to do window join on different groups.

I tried:

w3:-3  0+\:t[`minute];
newdata: wj1[w3;`minute;t;(t;(sum;`price)

but this does not give me the correct result. Could someone please help with this. Thank you!

Upvotes: 0

Views: 493

Answers (1)

Matt Moore
Matt Moore

Reputation: 2775

To do a wj across dates you need a timestamp column which you can create from date and time:

t:update timeStamp:"P"$"D" sv/: flip string (date;time) from t
t
sym  date       price time  timeStamp
---------------------------------------------------------
AAPL 2021.03.21 100   10:01 2021.03.21D10:01:00.000000000
AAPL 2021.03.21 101   10:04 2021.03.21D10:04:00.000000000
AAPL 2021.03.21 105   10:07 2021.03.21D10:07:00.000000000
AAPL 2021.03.21 110   10:10 2021.03.21D10:10:00.000000000
AAPL 2021.03.21 120   10:13 2021.03.21D10:13:00.000000000

You can then use the timeStamp column like so:

w3:-00:03 00:00 +\:t[`timeStamp]

wj1[w3;`timeStamp;t;(t;(sum;`price))]
sym  date       price time  timeStamp
---------------------------------------------------------
AAPL 2021.03.21 100   10:01 2021.03.21D10:01:00.000000000
AAPL 2021.03.21 201   10:04 2021.03.21D10:04:00.000000000
AAPL 2021.03.21 206   10:07 2021.03.21D10:07:00.000000000
AAPL 2021.03.21 215   10:10 2021.03.21D10:10:00.000000000
AAPL 2021.03.21 230   10:13 2021.03.21D10:13:00.000000000
AAPL 2021.03.22 130   10:01 2021.03.22D10:01:00.000000000
AAPL 2021.03.22 270   10:04 2021.03.22D10:04:00.000000000
AAPL 2021.03.22 290   10:07 2021.03.22D10:07:00.000000000
AAPL 2021.03.22 310   10:10 2021.03.22D10:10:00.000000000
AAPL 2021.03.22 330   10:13 2021.03.22D10:13:00.000000000

If you have more than 1 sym in the table you should apply the parted attribute:

t:update `p#sym from `sym`timeStamp xasc t

Then add sym before timeStamp in the 2nd argument of wj:

q)wj[w3;`sym`timeStamp;select sym, timeStamp from t;(t;(sum;`price))]
sym  timeStamp                     price
----------------------------------------
AAPL 2021.03.21D10:01:00.000000000 100
AAPL 2021.03.21D10:04:00.000000000 201
AAPL 2021.03.21D10:07:00.000000000 206
AAPL 2021.03.21D10:10:00.000000000 215
AAPL 2021.03.21D10:13:00.000000000 230
AAPL 2021.03.22D10:01:00.000000000 250
AAPL 2021.03.22D10:04:00.000000000 270
AAPL 2021.03.22D10:07:00.000000000 290
AAPL 2021.03.22D10:10:00.000000000 310
AAPL 2021.03.22D10:13:00.000000000 330
MSFT 2021.03.21D10:01:00.000000000 468
MSFT 2021.03.21D10:04:00.000000000 915
MSFT 2021.03.21D10:07:00.000000000 668
MSFT 2021.03.21D10:10:00.000000000 403
MSFT 2021.03.21D10:13:00.000000000 604
MSFT 2021.03.22D10:01:00.000000000 775
MSFT 2021.03.22D10:04:00.000000000 697
MSFT 2021.03.22D10:07:00.000000000 829
MSFT 2021.03.22D10:10:00.000000000 799
MSFT 2021.03.22D10:13:00.000000000 382

Upvotes: 1

Related Questions