Reputation: 663
I have images 1750*1750 and I would like to label them and put them into a file in the same format as CIFAR10. I have seen a similar answer before that gave an answer:
label = [3]
im = Image.open(img)
im = (np.array(im))
print(im)
r = im[:,:,0].flatten()
g = im[:,:,1].flatten()
b = im[:,:,2].flatten()
array = np.array(list(label) + list(r) + list(g) + list(b), np.uint8)
array.tofile("info.bin")
but it doesn't include how to add multiple images in a single file. I have looked at CIFAR10 and tried to append the arrays in the same way, but all I got was the following error:
E tensorflow/core/client/tensor_c_api.cc:485] Read less bytes than requested
Note that I am using Tensorflow to do my computations, and I have been able to isolate the problem from the data.
Upvotes: 0
Views: 787
Reputation: 126184
The CIFAR-10 binary format represents each example as a fixed-length record with the following format:
Assuming you have a list of image filenames called images
, and a list of integers (less than 256) called labels
corresponding to their labels, the following code would write a single file containing these images in CIFAR-10 format:
with open(output_filename, "wb") as f:
for label, img in zip(labels, images):
label = np.array(label, dtype=np.uint8)
f.write(label.tostring()) # Write label.
im = np.array(Image.open(img), dtype=np.uint8)
f.write(im[:, :, 0].tostring()) # Write red channel.
f.write(im[:, :, 1].tostring()) # Write green channel.
f.write(im[:, :, 2].tostring()) # Write blue channel.
Upvotes: 2