prl900
prl900

Reputation: 4179

Pandas TimeSeries into MongoDB

I have a general pandas TimeSeries which I want to store in MongoDB. The object ts looks like this:

>ts
2013-01-01 00:00:00     456.852985
2013-01-01 01:00:00     656.015532
2013-01-01 02:00:00     893.159043
...
2013-12-31 21:00:00    1116.526471
2013-12-31 22:00:00    1124.903600
2013-12-31 23:00:00    1065.315890
Freq: H, Length: 8760, dtype: float64

I want to convert this into an array of JSON documents, where one document is one row, to store it in MongoDB. Something like this:

[{"index": 2013-01-01 00:00:00, "col1": 456.852985},
{"index": 2013-01-01 01:00:00, "col1": 656.015532},
{"index": 2013-01-01 02:00:00, "col1": 893.159043},
...
]

I've ben looking into the TimeSeries.to_json() 'orient' options but I can't see they way of getting this format. Is there an easy way of performing this operation in pandas or should I look for a way of creating this structure using an external JSON library?

Upvotes: 3

Views: 1319

Answers (2)

James Blackburn
James Blackburn

Reputation: 4491

Using one row per document will be pretty inefficient - in space and query performance terms.

If you have flexibility on the schema, we've open sourced a library for storing pandas (and other numeric data) easily in MongoDB:

https://github.com/manahl/arctic

Upvotes: 0

Andy Hayden
Andy Hayden

Reputation: 375435

One way is to make it a frame with reset_index so as to use the record orient of to_json:

In [11]: df = s.reset_index(name='col1')

In [12]: df
Out[12]: 
                 index        col1
0  2013-01-01 00:00:00  456.852985
1  2013-01-01 01:00:00  656.015532
2  2013-01-01 02:00:00  893.159043

In [13]: df.to_json(orient='records')
Out[13]: '[{"index":"2013-01-01 00:00:00","col1":456.852985},{"index":"2013-01-01 01:00:00","col1":656.015532},{"index":"2013-01-01 02:00:00","col1":893.159043}]'

Upvotes: 2

Related Questions