Reputation: 10136
I have 2 very large (billions of rows) splayed tables, Trades
and StockPrices
, on a remote server. I want to do an asof join
h:hopen `:RemoteServer:Port
h"aj[`Stock`Date`Time,
select from Trades where Date within 2014.04.01 2014.04.13,
StockPrices
]"
But I just get the error (I'm Studio for KDB+)
An error occurred during execution of the query.
The server sent the response:
splay
Studio Hint: Possibly this error refers to nyi op on splayed table
So what would be the correct way to do such a join?
Also, performance and efficiency is an issue with such a big table -- what should I be doing to ensure the query doesn't take hours and doesn't consume to much of the server's system resources?
Upvotes: 1
Views: 1544
Reputation: 1996
You need to map the splayed StockPrices
table into memory. This can be done by using a select
query:
q)(`::6060)"aj[`sym`time;select from trade;quote]" / bad
'splay
q)(`::6060)"aj[`sym`time;select from trade;select from quote]" / good
sym time prx bid ask
-------------------------------------------
aea 01:01:16.347 637.7554 866.0131 328.1476
aea 01:59:14.108 819.5301 115.053 208.1114
aea 02:42:44.724 69.38325 641.8554 333.3092
This page may be useful for looking up errors from Kdb+: http://code.kx.com/q/ref/error-list/
Regarding optimising performance of aj
see http://code.kx.com/q/ref/joins/#aj-aj0-asof-join
Also, if there isn't an overlap of data between days, it may be faster to run the query on a day by day basis, possibly in parallel.
If there is an overlap of data across days, combining the date & time columns into a single timestamp column would speed up the lookup.
Upvotes: 4