Tal Barda

Reputation: 623

How can I implement a dictionary in Python that is persisted to disk so that I can recover from a failure?

I have a producer-consumer implementation built with pyzmq (I took the example from here). A single producer sends objects simultaneously to several different types of consumer, and each type of consumer processes the object in a different way.

In addition, I have a manager/supervisor that receives messages about the states of the processed objects (e.g. sent_from_producer_to_consumer_1, processed_by_consumer_2, etc.) and stores all of these states for each object in a dictionary whose key is the object's logical ID and whose value is the list of states.

I want to protect my system from total loss of information in case of a system failure by recovering it to the latest state it was in.

So my question is: how can I do that? How can I make sure the manager's record of object states is persisted to disk?

Upvotes: 0

Views: 327

Answers (3)

VDes

Reputation: 66

To the options using shelve, pickle or JSON, I would add persidict.

persidict is a Python library designed to make dictionaries persistent by writing their data to disk.

You can integrate persidict into your manager/supervisor component:

  1. Create a persistent dictionary. FileDirDict stores each key-value pair as a separate file within the specified directory.
    from persidict import FileDirDict
    
    # Initialize the persistent dictionary
    state_dict = FileDirDict(base_dir="path_to_storage_directory")
  2. Store and update object states.
    # Record the first state of an object as a one-element list
    object_id = "object_123"
    state_dict[object_id] = ["processed_by_consumer_1"]
    
    # Append a new state to the existing list of states
    if object_id in state_dict:
        current_states = state_dict[object_id]
        current_states.append("processed_by_consumer_2")
        # Reassign so the updated list is written back to disk
        state_dict[object_id] = current_states
    else:
        state_dict[object_id] = ["processed_by_consumer_2"]

Upvotes: 0

Tal Barda

Reputation: 623

I found a standard-library module called shelve that does what I need. From the documentation:

A “shelf” is a persistent, dictionary-like object ...
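As a rough illustration of how the manager's state dictionary might sit on top of shelve, here is a minimal sketch. The helper names `record_state` and `load_states`, the temporary path, and the example object ID are all hypothetical and chosen only for this example. Note that how well a shelf survives a crash in the middle of a write depends on the underlying dbm backend, so treat this as a starting point rather than a guarantee.

```python
import os
import shelve
import tempfile


def record_state(shelf_path, object_id, state):
    """Append one state to the list stored under object_id (hypothetical helper)."""
    with shelve.open(shelf_path) as shelf:
        states = shelf.get(object_id, [])
        states.append(state)
        # Reassign the key: mutating the list in place is not written back
        # unless the shelf was opened with writeback=True.
        shelf[object_id] = states
    # Leaving the with-block closes the shelf, which syncs it to disk.


def load_states(shelf_path):
    """Read every object's state list back, e.g. after a restart."""
    with shelve.open(shelf_path) as shelf:
        return dict(shelf)


# Hypothetical location; a real manager would use a fixed, known path.
shelf_path = os.path.join(tempfile.mkdtemp(), "object_states")

record_state(shelf_path, "object_123", "sent_from_producer_to_consumer_1")
record_state(shelf_path, "object_123", "processed_by_consumer_1")

recovered = load_states(shelf_path)
print(recovered)
```

Opening and closing the shelf on every update keeps the on-disk copy current at the cost of some speed; a long-lived shelf with periodic `sync()` calls is the usual alternative.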

Upvotes: 0

msvalkon

Reputation: 12077

For a trivial use case, you can serialize the data to disk using pickle or, if your objects are really simple, you can even use JSON. Databases are also a valid choice for such tasks and scale better.
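To make the database option concrete, here is a minimal sketch using the standard-library sqlite3 module as an append-only log of state messages. The table name `states`, its columns, and the helper functions are assumptions made for this example, not something from the question. Each insert runs in its own transaction, so a committed state should survive a crash of the manager process.

```python
import os
import sqlite3
import tempfile


def open_state_db(path):
    """Open (or create) the database holding the state log (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS states ("
        "seq INTEGER PRIMARY KEY AUTOINCREMENT, "
        "object_id TEXT NOT NULL, "
        "state TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def record_state(conn, object_id, state):
    """Append one state; the with-block commits on success and rolls back on error."""
    with conn:
        conn.execute(
            "INSERT INTO states (object_id, state) VALUES (?, ?)",
            (object_id, state),
        )


def load_states(conn):
    """Rebuild the in-memory dict of object_id -> list of states, in arrival order."""
    states = {}
    query = "SELECT object_id, state FROM states ORDER BY seq"
    for object_id, state in conn.execute(query):
        states.setdefault(object_id, []).append(state)
    return states


# Hypothetical location; a real manager would use a fixed, known path.
db_path = os.path.join(tempfile.mkdtemp(), "states.db")

conn = open_state_db(db_path)
record_state(conn, "object_123", "sent_from_producer_to_consumer_1")
record_state(conn, "object_123", "processed_by_consumer_2")
conn.close()

# Simulate a restart: reopen the file and rebuild the dictionary.
conn = open_state_db(db_path)
recovered = load_states(conn)
conn.close()
print(recovered)
```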

Pickle works as follows:

>>> import pickle
>>> d = dict(foo=1, bar=2, baz=3)
>>> with open("/tmp/test.pkl", "wb") as f:
...    pickle.dump(d, f)
...
>>> with open("/tmp/test.pkl", "rb") as f:
...    print(pickle.load(f))
...
{'foo': 1, 'bar': 2, 'baz': 3}
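Since the goal is recovery after a failure, one thing to watch with a plain `pickle.dump` is a crash halfway through the write, which can leave a truncated file. A common way around this is to write to a temporary file and then rename it over the real one. The sketch below assumes that approach; the function names `save_atomically` and `load_or_empty` and the file name are hypothetical.

```python
import os
import pickle
import tempfile


def save_atomically(data, path):
    """Pickle data to a temp file in the same directory, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(data, f)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the bytes to disk
        # os.replace swaps the file in one step, so readers see either
        # the old snapshot or the new one, never a half-written file.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_or_empty(path):
    """Return the last saved dictionary, or an empty one on first start."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return {}


# Hypothetical location; a real manager would use a fixed, known path.
snapshot_path = os.path.join(tempfile.mkdtemp(), "states.pkl")

states = load_or_empty(snapshot_path)
states.setdefault("object_123", []).append("processed_by_consumer_1")
save_atomically(states, snapshot_path)

restored = load_or_empty(snapshot_path)
print(restored)
```

This rewrites the whole dictionary on every save, which is fine for small state but is one reason a database scales better as the number of objects grows.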

Upvotes: 0
