erkevarol

Reputation: 57

Cifar - 10 / Unpickle

I am getting the following error when I try to unpickle the CIFAR-10 dataset. I need to train a model, but I can't even load the data. How can I fix this problem?

dict=cPickle.load(fo)

UnpicklingError: invalid load key, '\x06'.

import tensorflow as tf
import os
import numpy as np
import dataset_class
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import glob
from PIL import Image
from scipy.spatial.distance import pdist


def cifar_10_reshape(batch_arg):
    output=np.reshape(batch_arg,(10000,3,32,32)).transpose(0,2,3,1)
    return output

def unpickle(file):
    import _pickle as cPickle
    fo=open(file,'rb')
    dict=cPickle.load(fo)
    fo.close()
    return dict




#Loading cifar-10 data and reshaping it to be batch_sizex32x32x3
batch1=unpickle('cifar-10-batches-py/data_batch_1.bin')
batch2=unpickle('cifar-10-batches-py/data_batch_2.bin')
batch3=unpickle('cifar-10-batches-py/data_batch_3.bin')
batch4=unpickle('cifar-10-batches-py/data_batch_4.bin')
batch5=unpickle('cifar-10-batches-py/data_batch_5.bin')



batch1_data=cifar_10_reshape(batch1['data'])
batch2_data=cifar_10_reshape(batch2['data'])
batch3_data=cifar_10_reshape(batch3['data'])
batch4_data=cifar_10_reshape(batch4['data'])
batch5_data=cifar_10_reshape(batch5['data'])

batch1_labels=batch1['labels']
batch2_labels=batch2['labels']
batch3_labels=batch3['labels']
batch4_labels=batch4['labels']
batch5_labels=batch5['labels']

test_batch=unpickle('cifar-10-batches-py/test_batch')
test_images=cifar_10_reshape(test_batch['data'])
test_labels_data=test_batch['labels']


train_images=np.concatenate((batch1_data,batch2_data,batch3_data,batch4_data,batch5_data),axis=0)
train_labels_data=np.concatenate((batch1_labels,batch2_labels,batch3_labels,batch4_labels,batch5_labels),axis=0)

Upvotes: 1

Views: 3933

Answers (1)

Jack273

Reputation: 11

From what I understand of the CIFAR-10 dataset, the files you are trying to unpickle contain binary data, but you are not giving the unpickler any information about the encoding. You might have more luck with the loading function provided on the CIFAR-10 website (https://www.cs.toronto.edu/~kriz/cifar.html) for Python 3.x:

def unpickle(file):
    import pickle
    with open(file, 'rb') as fo:
        dict = pickle.load(fo, encoding='bytes')
    return dict
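
Two things may still trip you up after this change, so here is a rough sketch rather than a definitive fix. First, with encoding='bytes' the dictionary keys come back as bytes, so batch['data'] becomes batch[b'data']. Second, the '.bin' extension in your paths and the '\x06' load key suggest you may have downloaded the binary version of the dataset, which is not a pickle at all. The sketch assumes the record layout described on the CIFAR-10 page (1 label byte followed by 3072 pixel bytes per image); load_pickled_batch and load_binary_batch are hypothetical helper names, not part of any library.

```python
import pickle

import numpy as np

# Assumed layout of the binary version: 1 label byte + 3 * 32 * 32 pixel bytes.
RECORD_BYTES = 1 + 3 * 32 * 32


def load_pickled_batch(path):
    """Load one batch from the Python (pickled) version. Hypothetical helper."""
    with open(path, 'rb') as fo:
        batch = pickle.load(fo, encoding='bytes')
    # With encoding='bytes' the keys are bytes objects, hence b'data'.
    images = batch[b'data'].reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)
    labels = np.asarray(batch[b'labels'])
    return images, labels


def load_binary_batch(path):
    """Load one batch from the binary (.bin) version. Hypothetical helper."""
    raw = np.fromfile(path, dtype=np.uint8)
    if raw.size % RECORD_BYTES != 0:
        raise ValueError('file size is not a multiple of the record size')
    records = raw.reshape(-1, RECORD_BYTES)
    labels = records[:, 0]
    # Pixels are stored channel-first (R, G, B planes); move channels last.
    images = records[:, 1:].reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)
    return images, labels
```

Both helpers return images shaped (N, 32, 32, 3), so the np.concatenate calls in the question should work unchanged on their output. Use load_pickled_batch for files such as 'cifar-10-batches-py/data_batch_1' and load_binary_batch for '.bin' files.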

Upvotes: 1
