Reputation: 370
I am trying to set up a tick database using TickStore from the Arctic package. When I run the example from the docs (link) and write a DataFrame with a datetime index in UTC, reading the data back returns the DataFrame with its index in 'Europe/Berlin', my PC's local timezone.
Is there a way to get the data back with the index in UTC?
Code example:
from arctic import Arctic, TICK_STORE
from datetime import datetime as dt
import pandas as pd
from arctic.date._mktz import mktz
db = Arctic('localhost')
db.delete_library('temp')
db.initialize_library('temp', lib_type=TICK_STORE)
tickstore_lib = db['temp']
data = [{'A': 120, 'D': 1}, {'A': 122, 'B': 2.0}, {'A': 3, 'B': 3.0, 'D': 1}]
tick_index = [dt(2013, 6, 1, 12, 00, tzinfo=mktz('UTC')),
              dt(2013, 6, 1, 11, 00, tzinfo=mktz('UTC')),  # Out-of-order
              dt(2013, 6, 1, 13, 00, tzinfo=mktz('UTC'))]
data = pd.DataFrame(data, index=tick_index)
tickstore_lib._chunk_size = 3
tickstore_lib.write('SYM', data)
print(tickstore_lib.read('SYM', columns=None).index)
I am running Windows 10, MongoDB Server 4.4, Arctic 1.80.0, and Python 3.7.
Upvotes: 0
Views: 208
Reputation: 11
My guess is that Arctic stores the index in UTC. When you read the data back, Arctic (as a wrapper) or MongoDB converts the index to the timezone of your operating system.
I do not think trying to prevent this conversion is a good solution, because you would probably have to change the source code of the Arctic library. A better option may be to drop the timezone information from the index of the DataFrame you read back, which yields naive UTC timestamps.
I changed the last line of your code to add .tz_convert(None), which converts the index to UTC and then removes the timezone information:
In[1]:
from arctic import Arctic, TICK_STORE
from datetime import datetime as dt
import pandas as pd
from arctic.date._mktz import mktz
db = Arctic('localhost')
db.delete_library('temp')
db.initialize_library('temp', lib_type=TICK_STORE)
tickstore_lib = db['temp']
data = [{'A': 120, 'D': 1}, {'A': 122, 'B': 2.0}, {'A': 3, 'B': 3.0, 'D': 1}]
tick_index = [dt(2013, 6, 1, 12, 00, tzinfo=mktz('UTC')),
              dt(2013, 6, 1, 11, 00, tzinfo=mktz('UTC')),  # Out-of-order
              dt(2013, 6, 1, 13, 00, tzinfo=mktz('UTC'))]
data = pd.DataFrame(data, index=tick_index)
tickstore_lib._chunk_size = 3
tickstore_lib.write('SYM', data)
print(tickstore_lib.read('SYM', columns=None).index.tz_convert(None))
Out[1]:
NB treating all values as 'exists' - no longer sparse
TimeSeries data is out of order, sorting!
DatetimeIndex(['2013-06-01 11:00:00', '2013-06-01 12:00:00',
               '2013-06-01 13:00:00'],
              dtype='datetime64[ns]', freq=None)
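If you would rather keep the index timezone-aware instead of naive, converting to 'UTC' should work the same way. The sketch below does not touch Arctic or MongoDB: it builds a stand-in DataFrame with a 'Europe/Berlin' index (an assumption that mimics what the read in the question returned) and shows both conversions using plain pandas.

```python
import pandas as pd

# Stand-in for the DataFrame returned by tickstore_lib.read(...):
# the same instants as in the question, expressed in Europe/Berlin.
# (Assumed data for illustration; no Arctic or MongoDB involved.)
local_index = pd.DatetimeIndex(
    ["2013-06-01 13:00", "2013-06-01 14:00", "2013-06-01 15:00"]
).tz_localize("Europe/Berlin")
df = pd.DataFrame({"A": [122, 120, 3]}, index=local_index)

# Option 1: keep the index timezone-aware, but expressed in UTC.
df_utc = df.tz_convert("UTC")

# Option 2: convert to UTC and drop the timezone (naive UTC),
# which is what .tz_convert(None) does in the answer above.
df_naive = df.tz_convert(None)

print(df_utc.index)
print(df_naive.index)
```

Both options describe the same instants; the only difference is whether the index still carries tz information. Keeping it aware (option 1) is usually safer if the data is later compared with other timezone-aware timestamps.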
I found the solution and a very good and detailed explanation in this post.
Upvotes: 0