Reputation: 351
I have 32 GB of RAM and I use Jupyter and pandas. My DataFrame isn't very big, but when I try to write it to an Arctic database I get a MemoryError:
df_q.shape
(157293660, 10)
def memory(df):
    mem = df.memory_usage(index=True).sum() / (1024 ** 3)
    print(mem)
memory(df_q)
12.8912200034
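(Note: the default memory_usage can understate the real footprint when there are object-dtype columns; a variant with deep=True, purely as a sanity check, would be:)

def memory_deep(df):
    # deep=True also counts the Python objects held by object-dtype columns
    mem = df.memory_usage(index=True, deep=True).sum() / (1024 ** 3)
    print(mem)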
And when I try to write it:
from arctic import Arctic
import arctic as arc
store = Arctic('.....')
lib = store['myLib']
lib.write('quotes', df_q)
MemoryError                               Traceback (most recent call last)
<ipython-input-...> in <module>()
      1 memory(df_q)
----> 2 lib.write('quotes', df_q)

/usr/local/lib/python2.7/dist-packages/arctic/decorators.pyc in f_retry(*args, **kwargs)
     48     while True:
     49         try:
---> 50             return f(*args, **kwargs)
     51         except (DuplicateKeyError, ServerSelectionTimeoutError) as e:
     52             # Re-raise errors that won't go away.

/usr/local/lib/python2.7/dist-packages/arctic/store/version_store.pyc in write(self, symbol, data, metadata, prune_previous_version, **kwargs)
    561
    562         handler = self._write_handler(version, symbol, data, **kwargs)
--> 563         mongo_retry(handler.write)(self._arctic_lib, version, symbol, data, previous_version, **kwargs)
    564
    565         # Insert the new version into the version DB

/usr/local/lib/python2.7/dist-packages/arctic/decorators.pyc in f_retry(*args, **kwargs)
     48     while True:
     49         try:
---> 50             return f(*args, **kwargs)
     51         except (DuplicateKeyError, ServerSelectionTimeoutError) as e:
     52             # Re-raise errors that won't go away.

/usr/local/lib/python2.7/dist-packages/arctic/store/_pandas_ndarray_store.pyc in write(self, arctic_lib, version, symbol, item, previous_version)
    301     def write(self, arctic_lib, version, symbol, item, previous_version):
    302         item, md = self.to_records(item)
--> 303         super(PandasDataFrameStore, self).write(arctic_lib, version, symbol, item, previous_version, dtype=md)
    304
    305     def append(self, arctic_lib, version, symbol, item, previous_version):

/usr/local/lib/python2.7/dist-packages/arctic/store/_ndarray_store.pyc in write(self, arctic_lib, version, symbol, item, previous_version, dtype)
    385         version['type'] = self.TYPE
    386         version['up_to'] = len(item)
--> 387         version['sha'] = self.checksum(item)
    388
    389         if previous_version:

/usr/local/lib/python2.7/dist-packages/arctic/store/_ndarray_store.pyc in checksum(self, item)
    370     def checksum(self, item):
    371         sha = hashlib.sha1()
--> 372         sha.update(item.tostring())
    373         return Binary(sha.digest())
    374

MemoryError:
WTF? And if I use df_q.to_csv() instead, I'll be waiting for years...
Upvotes: 1
Views: 145
Reputation:
Your issue actually is not a memory issue. If you read your errors, it seems that your library is having trouble accessing your data:

1st error: says your server has timed out (ServerSelectionTimeoutError).

2nd error: trying to update the MongoDB version.

3rd error: retries accessing your server, fails (ServerSelectionTimeoutError).

etc. So essentially your problem lies in the Arctic package itself (note that the last error is a checksum error). You can also deduce this from the fact that df_q.to_csv() works; however, it is very slow, since it is not optimized the way Arctic is. I would suggest trying to reinstall the Arctic package.
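If reinstalling doesn't help, one workaround worth trying is to write the DataFrame in slices, so Arctic never has to build and checksum the full ~13 GB record array in one go. A minimal sketch using VersionStore's write/append (the chunk size is an assumption; tune it to your available RAM):

# Sketch: write the frame in slices so each write stays small.
# chunk_rows is an assumption -- tune it to your available memory.
chunk_rows = 10000000
for start in range(0, len(df_q), chunk_rows):
    chunk = df_q.iloc[start:start + chunk_rows]
    if start == 0:
        lib.write('quotes', chunk)   # first slice creates the symbol
    else:
        lib.append('quotes', chunk)  # later slices are appended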
Upvotes: 0