S Andrew

Reputation: 7218

How to append new data to a pickle file using Python

I am extracting the face embedding of an image and appending it to an existing pickle file. It does not seem to work: when I unpickle the file, it does not contain the newly added data. Below is the code:

file = client_dir + '\embeddings.pickle'
data = {"embeddings": known_embeddings, "names": known_names}
with open(file, 'ab+') as fp:
    pickle.dump(data, fp)
    fp.close()
log("[INFO] Data appended to embeddings.pickle ")

Current pickle file contains below data:

{'embeddings': [array([-0.03656099,  0.11354745, -0.00438912,  0.0367547 ,  0.06391761,
        0.18440282,  0.06150107, -0.17380905,  0.03094344, -0.00182147,
        0.00969766,  0.06890091,  0.04974053, -0.0502388 , -0.03414046,
       -0.13550822, -0.02251128,  0.14556041, -0.04045469,  0.06500552,
        0.0726142 , -0.04139924, -0.04662199,  0.08869533, -0.00061307,
       -0.11912274,  0.13141112, -0.00648551,  0.00296356,  0.03682912,
       -0.15076959,  0.03989822,  0.02799555,  0.03429572,  0.09865954,
        0.14113557, -0.08355764,  0.09193961, -0.00819231, -0.01184336,
       -0.12519744,  0.00668721,  0.0816237 ,  0.00464355, -0.00339399,
        0.07501812,  0.11679655, -0.09211859,  0.06211261, -0.00543289,
        0.10347278,  0.06651585, -0.01512023,  0.09477805,  0.09886038,
       -0.03837246,  0.02265131, -0.14867221,  0.00781244,  0.04845129,
       -0.0363168 , -0.00186919, -0.16163988,  0.09539618,  0.14983718,
        0.09159472, -0.05315595, -0.05073383,  0.01501674, -0.03789762,
        0.07116041,  0.07650694, -0.02975985], dtype=float32)], 'names': ['rock']} 

New data which I am trying to append is below:

{'embeddings': [array([-0.03656099,  0.11354745, -0.00438912,  0.0367547 ,  0.06391761,
        0.18440282,  0.06150107, -0.17380905,  0.03094344, -0.00182147,
        0.00969766,  0.06890091,  0.04974053, -0.0502388 , -0.03414046,
        0.07501812,  0.11679655, -0.09211859,  0.06211261, -0.00543289,
        -0.13550822, -0.02251128,  0.14556041, -0.04045469,  0.06500552,
        0.0726142 , -0.04139924, -0.04662199,  0.08869533, -0.00061307,
       -0.11912274,  0.13141112, -0.00648551,  0.00296356,  0.03682912,
       -0.15076959,  0.03989822,  0.02799555,  0.03429572,  0.09865954,
        0.14113557, -0.08355764,  0.09193961, -0.00819231, -0.01184336,
       -0.12519744,  0.00668721,  0.0816237 ,  0.00464355, -0.00339399,
        0.10347278,  0.06651585, -0.01512023,  0.09477805,  0.09886038,
       -0.03837246,  0.02265131, -0.14867221,  0.00781244,  0.04845129,
       -0.0363168 , -0.00186919, -0.16163988,  0.09539618,  0.14983718,
        0.09159472, -0.05315595, -0.05073383,  0.01501674, -0.03789762,
        0.07116041,  0.07650694, -0.02975985], dtype=float32)], 'names': ['john']}

But when I unpickle the file, it only has the data for rock and not for john. Can anyone tell me what I am doing wrong? Below is the code I use to unpickle the file and check what was added. Maybe the way I am unpickling the file is wrong, because the file size does increase when I append the data.

import pickle

file = open('G:\\output\\embeddings.pickle', 'rb')

data = pickle.load(file)

file.close()

print(data)

Please help. Thanks

Updated code:

file_path = client_dir + '\embeddings.pickle'
file = open(file_path, 'rb')
old_data = pickle.load(file)
new_embeddings = old_data['embeddings']
new_names = old_data['names']
new_embeddings.append(known_embeddings[0])
new_names.append(known_names[0])
data1 = {"embeddings": new_embeddings, "names": new_names}
with open(file_path, 'ab+') as fp:
    pickle.dump(data1, fp)
    fp.close()
log.error("[INFO] Data appended to embeddings.pickle ")

In the above code, I first load the data from the pickle file, append the new data to the lists, and then write all the data (old + new) back to the pickle file. Can anyone tell me whether this is the correct way to do it?

Even after this change, I do not get all the data when I unpickle the file. Thanks

Upvotes: 1

Views: 9874

Answers (2)

Call Saul

Reputation: 41

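It can be done without loading the data first, which is faster: open the file with mode='ab', which creates the file if it does not exist and appends to it if it does. Each dump adds a separate pickled object, so reading everything back takes one pickle.load call per object.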

with open('data folder/' + filename2save + '.pkl', 'ab') as fp:
    pickle.dump(data, fp)
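When several objects have been dumped to one file in append mode, a single pickle.load call returns only the first of them. The following is a minimal sketch of reading them all back, assuming the file contains nothing but pickled objects written one after another. The helper name load_all is hypothetical and not part of the pickle module.

```python
import pickle


def load_all(path):
    """Return every object stored back-to-back in the pickle file at `path`.

    `load_all` is a hypothetical helper name, used here for illustration.
    """
    records = []
    with open(path, 'rb') as fp:
        while True:
            try:
                # Each call reads exactly one pickled object and advances
                # the file position to the start of the next one.
                records.append(pickle.load(fp))
            except EOFError:
                # Raised when no further pickled data remains in the file.
                break
    return records
```

With the question's data, this would return a list of two dicts (one for rock, one for john), which the caller would still have to merge into a single set of embeddings and names.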

Upvotes: 2

Kevin

Reputation: 76194

file_path = client_dir + '\embeddings.pickle'
file = open(file_path, 'rb')
old_data = pickle.load(file)
new_embeddings = old_data['embeddings']
new_names = old_data['names']
new_embeddings.append(known_embeddings[0])
new_names.append(known_names[0])
data1 = {"embeddings": new_embeddings, "names": new_names}
with open(file_path, 'ab+') as fp:
    pickle.dump(data1, fp)
    fp.close()
log.error("[INFO] Data appended to embeddings.pickle ")

This looks pretty close to being correct to me. You successfully load the pickled data and add new elements to it. The problem appears to be the with open(file_path, 'ab+') as fp: call. If you open the file in "a" mode, the pickle data you write gets added to the end, after the old pickle data. Then, on subsequent executions of your program, pickle.load reads only the first object in the file, which is the old pickle data.

Try overwriting the old pickle data completely with your new pickle data. You can do this by opening in "w" mode instead.

with open(file_path, 'wb') as fp:
    pickle.dump(data1, fp)

Incidentally, you don't need that fp.close() call. A with statement automatically closes the opened file at the end of the block.
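Putting the pieces together, a minimal sketch of the whole load, extend, and overwrite cycle might look like the following. The function name add_embedding and the fallback to an empty dict when the file does not exist yet are assumptions added for illustration; they are not in the original code.

```python
import os
import pickle


def add_embedding(file_path, embedding, name):
    """Load the stored dict (if any), add one entry, and rewrite the file.

    `add_embedding` is a hypothetical helper; starting from an empty dict
    when the file is missing is an assumption, not the asker's behaviour.
    """
    if os.path.exists(file_path):
        with open(file_path, 'rb') as fp:
            data = pickle.load(fp)
    else:
        data = {"embeddings": [], "names": []}

    data["embeddings"].append(embedding)
    data["names"].append(name)

    # 'wb' truncates the file, so afterwards it holds exactly one pickled dict.
    with open(file_path, 'wb') as fp:
        pickle.dump(data, fp)
    return data
```

Because the file is rewritten in full each time, a single pickle.load call always returns every embedding and name stored so far.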

Upvotes: 1
