penduDev

Reputation: 4785

Python - Datatype Retention in saving to Pickle file

I'm saving a dictionary of NumPy arrays to a pickle file and then unpickling it into new variables. The code looks like this:

Pickling:

# here the variables 'train_dataset', 'train_labels' etc are all np arrays.
save = {
    'train_dataset': train_dataset,
    'train_labels': train_labels,
    'valid_dataset': valid_dataset,
    'valid_labels': valid_labels,
    'test_dataset': test_dataset,
    'test_labels': test_labels,
    }
# f is a file object opened for binary writing ('wb')
pickle.dump(save, f, pickle.HIGHEST_PROTOCOL)

Unpickling:

# f is a file object opened for binary reading ('rb')
save = pickle.load(f)
train_dataset_new = save['train_dataset']
train_labels_new = save['train_labels']
valid_dataset_new = save['valid_dataset']
valid_labels_new = save['valid_labels']
test_dataset_new = save['test_dataset']
test_labels_new = save['test_labels']

Will the variables loaded from the pickle file also be NumPy arrays? Please elaborate a bit if you can.

Thanks

Upvotes: 1

Views: 1152

Answers (1)

devautor

Reputation: 2586

Yes. Quoting the documentation for pickle.load directly:

Read a string from the open file object file and interpret it as a pickle data stream, reconstructing and returning the original object hierarchy.

A small test confirms that the type of the loaded variable is still numpy.ndarray:

import numpy as np
import pickle

train_dataset = np.ones(5)
train_labels = np.ones(5)
valid_dataset = np.ones(5)
valid_labels = np.ones(5)
test_dataset = np.ones(5)
test_labels = np.ones(5)

print(type(train_dataset))  # <class 'numpy.ndarray'> (Python 2: <type 'numpy.ndarray'>)
print(train_dataset.shape)  # (5,)

# here the variables 'train_dataset', 'train_labels' etc are all np arrays.
save = {
    'train_dataset': train_dataset,
    'train_labels': train_labels,
    'valid_dataset': valid_dataset,
    'valid_labels': valid_labels,
    'test_dataset': test_dataset,
    'test_labels': test_labels,
    }
# Open the file in binary write mode; the with block closes it afterwards.
with open("save.p", "wb") as f:
    pickle.dump(save, f, pickle.HIGHEST_PROTOCOL)

# Read the dictionary back from the same file in binary read mode.
with open("save.p", "rb") as f:
    save = pickle.load(f)
train_dataset_new = save['train_dataset']
train_labels_new = save['train_labels']
valid_dataset_new = save['valid_dataset']
valid_labels_new = save['valid_labels']
test_dataset_new = save['test_dataset']
test_labels_new = save['test_labels']

print(type(train_dataset_new))  # <class 'numpy.ndarray'> (Python 2: <type 'numpy.ndarray'>)
print(train_dataset_new.shape)  # (5,)
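
Beyond the type, the dtype, shape and values should also survive the round trip. The following is a minimal sketch of how one might check that, assuming made-up sample arrays (they are not the asker's data) and an in-memory buffer in place of a file on disk:

```python
import io
import pickle

import numpy as np

# Hypothetical sample data with different dtypes and shapes (not from the question).
original = {
    "train_dataset": np.arange(6, dtype=np.float32).reshape(2, 3),
    "train_labels": np.array([0, 1], dtype=np.int64),
}

# Round-trip through an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
pickle.dump(original, buffer, pickle.HIGHEST_PROTOCOL)
buffer.seek(0)  # rewind so pickle.load reads from the start
restored = pickle.load(buffer)

# Compare each restored array with its original.
for key, array in original.items():
    loaded = restored[key]
    assert isinstance(loaded, np.ndarray)
    assert loaded.dtype == array.dtype
    assert loaded.shape == array.shape
    assert np.array_equal(loaded, array)

print("all arrays restored with matching type, dtype, shape and values")
```

The same checks can be applied to the dictionary loaded from save.p by comparing each entry with the array it was built from.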

Upvotes: 1
