How to save a picture array, along with information pertaining to it?

Question

I am scraping car data and will have many pictures; that part is not a problem. I also want to save each car's specifications, and I am wondering what the most efficient way to do this is. Ideally, I would have something like the built-in datasets that many libraries provide, such as:

print(dataset)

{
    'image': ([255, 203, 145, ...]),
    'info': (['Audi', '355 HP', ...])
}

That way, I could easily extract the images and info with something like dataset['info'], or assign both at once with x, y = dataset.

mathew gunther · Accepted Answer

There are several options, but for structured data like this, it's common to store dictionary-like data in HDF5.

See the Python tutorial and full documentation here:

http://docs.h5py.org/en/stable/quick.html

Here's a full Python example. Notice the dictionary-like interface.

import h5py
import numpy as np

#####
# writing output file
#####
my_file = h5py.File("output.h5", 'w')
# np.bytes_ stores a fixed-length byte string (np.string_ was removed in NumPy 2.0)
my_file['info'] = np.bytes_("some_random pixels")
my_file['image'] = np.random.rand(5, 5)
my_file.close()

#####
# reading input file
#####
loaded_file = h5py.File("output.h5", 'r')
# np.array() copies each dataset into memory; 'info' comes back as bytes
print(np.array(loaded_file['info']))
print(np.array(loaded_file['image']))
loaded_file.close()
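
The example above stores a single image and a single string. For many cars, one possible layout (a sketch, not the only way to do it) is one dataset per image, with the specifications attached as HDF5 attributes on that dataset. The records, the key names ("make", "power"), the dataset names ("car_0", ...), and the file name cars.h5 are all hypothetical placeholders, and random pixels stand in for real photos.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical records standing in for scraped cars (random pixels, made-up specs).
cars = [
    {"image": np.random.randint(0, 256, size=(4, 4, 3), dtype=np.uint8),
     "info": {"make": "Audi", "power": "355 HP"}},
    {"image": np.random.randint(0, 256, size=(4, 4, 3), dtype=np.uint8),
     "info": {"make": "BMW", "power": "300 HP"}},
]

# Write to a temporary directory so the sketch does not clutter the working dir.
path = os.path.join(tempfile.mkdtemp(), "cars.h5")

# One dataset per image; each specification becomes an attribute of that dataset.
with h5py.File(path, "w") as f:
    for i, car in enumerate(cars):
        dset = f.create_dataset(f"car_{i}", data=car["image"])
        for key, value in car["info"].items():
            dset.attrs[key] = value

# Read everything back as (image, info) pairs.
loaded = []
with h5py.File(path, "r") as f:
    for name in f:
        dset = f[name]
        loaded.append((dset[()], dict(dset.attrs)))

x, y = loaded[0]  # image array and its specification dictionary
print(x.shape, y)
```

Keeping the specifications as attributes means each image and its info travel together, which gives the x, y = ... style of access asked about in the question.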
