azuric
azuric

Reputation: 2839

how do you read a pytables table into a pandas dataframe

I have built a pytables table and filled it using append like so:

h5file = open_file("FGBS.h5", mode = "a")
group = h5file.create_group("/", 'hybrid')
table = h5file.create_table(group, 'z4', Hybrid ,filters= tb.Filters(5, "blosc"))

using:

class Hybrid(IsDescription):
    dateTime = Time32Col()
    price = Float64Col()
    quantity = Float64Col()
    bidPrc = Float64Col()
    bidSize = Float64Col()
    askPrc = Float64Col()
    askSize = Float64Col()

and appending to the table:

           if 0 in dictInstrumentsData[message.symbol].bidPrice and 0 in dictInstrumentsData[message.symbol].askPrice:
                hybrid = table.row
                hybrid["dateTime"] = message.timestamp * 0.001
                hybrid["price"] = message.price
                hybrid["quantity"] = message.size
                hybrid["bidPrc"] = dictInstrumentsData[message.symbol].bidPrice[0]
                hybrid["bidSize"] = dictInstrumentsData[message.symbol].bidSize[0]
                hybrid["askPrc"] = dictInstrumentsData[message.symbol].askPrice[0]
                hybrid["askSize"] = dictInstrumentsData[message.symbol].askSize[0]

                hybrid.append()

now I am trying to read it back into a pandas dataframe like this:

a = tb.open_file("FGBS.h5")
table = a.root.quote.z4
c = pd.DataFrame.from_records(table)

but then when I look at c all I get is:

                                          0       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          1       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          2       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          3       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          4       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          5       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          6       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          7       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          8       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          9       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                       ...                        \
0                      ...                         

                                          164401  \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          164402  \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          164403  \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          164404  \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          164405  \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          164406  \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          164407  \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          164408  \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          164409  \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          164410  
0  /hybrid/z4.row (Row), pointing to row #164411  

[1 rows x 164411 columns]

not a dataframe with columns based on the Hybrid columns and rows for each append. Can anyone help show me what I am doing wrong

Upvotes: 1

Views: 2469

Answers (1)

tharen
tharen

Reputation: 1300

You need to explicitly read the data from the table. Table.read will pull in the entire table, and Table.read_where allows you to apply a conditional statement to filter the data that is returned.

a = tb.open_file("FGBS.h5")
table = a.root.quote.z4
c = pd.DataFrame.from_records(table.read())

Upvotes: 4

Related Questions