Reputation: 427
I have a splayed Kdb database of symbols, floats, and timestamps. I'd like to convert this to NumPy arrays. However, the following code...
>>> import numpy as np
>>> from pyq import q
>>> d = q.load(':alpha/HDB/')
>>> a = np.array(d)
returns this error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/marrowgari/py3/lib/python3.6/site-packages/pyq/_n.py", line 158, in array
return numpy.array(list(self), dtype)
TypeError: iteration over a K scalar, t=-11
Is this because Kdb symbol types do not have a direct analogue in NumPy? If so, how do I correct this?
Upvotes: 2
Views: 897
Reputation: 2268
Suppose your HDB was created as follows:
q)(` sv db,`t`)set .Q.en[db:`:alpha/HDB]([]sym:`A`B`C;a:1 2 3)
`:alpha/HDB/t/
q)\l alpha/HDB
q)t
sym a
-----
A 1
B 2
C 3
Then, first of all, you should load it using the \l command, not the load function:
>>> q('\\l alpha/HDB')
k('::')
This will load all your tables and enumeration domains.
Now you should be able to convert the sym column of your table to a numpy array of strings:
>>> np.array(q.t.sym)
array(['A', 'B', 'C'], dtype=object)
or to a numpy array of integers:
>>> np.array(q.t.sym.data)
array([0, 1, 2])
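The integers above are positions in the enumeration domain, which is why the domain has to be loaded before the symbols can be recovered. A minimal NumPy-only sketch of that relationship, assuming hypothetical stand-ins `sym_domain` and `indices` for the sym file and the column data (neither name comes from pyq):

```python
import numpy as np

# Hypothetical stand-ins: the enumeration domain (contents of the sym file)
# and the integer data stored in the enumerated column.
sym_domain = np.array(['A', 'B', 'C'], dtype=object)
indices = np.array([0, 1, 2])

# Fancy indexing maps each stored index back to its symbol.
decoded = sym_domain[indices]
print(decoded)  # ['A' 'B' 'C']
```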
You can also convert the entire table to a numpy record array in one go, but you will have to "map" it into the memory first:
>>> np.array(q.t.select())
array([('A', 1), ('B', 2), ('C', 3)], dtype=[('sym', 'O'), ('a', '<i8')])
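Once the table is a record array, each column can be pulled out by field name. A small sketch that rebuilds the array shown above by hand, so it runs without kdb+ (the variable names are illustrative only):

```python
import numpy as np

# Same contents and dtype as the record array printed above.
rec = np.array([('A', 1), ('B', 2), ('C', 3)],
               dtype=[('sym', 'O'), ('a', '<i8')])

# Indexing by field name returns that column as a plain array.
syms = rec['sym']
vals = rec['a']
print(vals.sum())  # 6

# A boolean mask on one field filters whole rows.
print(rec[rec['a'] > 1]['sym'])  # ['B' 'C']
```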
Upvotes: 3
Reputation: 2981
I don't think .q.load does what you're expecting - the return of this function is simply a null symbol. I think instead you need to use .q.get, e.g.
jmcmurray@host ~/hdb $ pyq
Python 3.5.2 (default, Nov 23 2017, 16:37:01)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> q.load("sym")
k('`sym')
>>> np.array(q.get(":2014.04.21/trades").select())
array([('AAPL', '2014-04-21T08:00:37.853000000', 'O', 25.33, 5048),
('AAPL', '2014-04-21T08:00:58.840000000', 'O', 25.35, 4580),
('AAPL', '2014-04-21T08:01:40.150000000', 'O', 25.35, 5432), ...,
('YHOO', '2014-04-21T16:29:06.868000000', 'L', 35.32, 4825),
('YHOO', '2014-04-21T16:29:43.655000000', 'L', 35.32, 6125),
('YHOO', '2014-04-21T16:29:57.229000000', 'L', 35.36, 41)],
dtype=[('sym', 'O'), ('time', '<M8[ns]'), ('src', 'O'), ('price', '<f8'), ('size', '<i4')])
>>>
Note here I first use .q.load to load the sym file, as the symbol columns are enumerated. Then I load one splayed table from my HDB, which should be equivalent to your splayed table.
I also use .select() on the table because .q.get() simply maps the table into memory (same as get in KDB); select is needed to pull the actual data into memory.
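As a rough illustration of what the resulting record array allows, here is a NumPy-only sketch that rebuilds the first three rows of the output above by hand and filters them on the time field. The cutoff time and the variable names are assumptions made for the example:

```python
import numpy as np

# First three rows of the output above, rebuilt by hand with the same dtype.
dtype = [('sym', 'O'), ('time', '<M8[ns]'), ('src', 'O'),
         ('price', '<f8'), ('size', '<i4')]
trades = np.array([
    ('AAPL', np.datetime64('2014-04-21T08:00:37.853'), 'O', 25.33, 5048),
    ('AAPL', np.datetime64('2014-04-21T08:00:58.840'), 'O', 25.35, 4580),
    ('AAPL', np.datetime64('2014-04-21T08:01:40.150'), 'O', 25.35, 5432),
], dtype=dtype)

# Timestamp columns become datetime64[ns], so they compare directly
# against other datetime64 values (the cutoff here is arbitrary).
mask = trades['time'] >= np.datetime64('2014-04-21T08:00:50')
late = trades[mask]

# Volume-weighted average price over the filtered rows.
vwap = (late['price'] * late['size']).sum() / late['size'].sum()
print(len(late), vwap)
```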
Upvotes: 2