Reputation: 1525
I'm new to Caffe. I am trying to implement a Fully Convolutional Network (FCN-8s) for semantic segmentation. I have image data and label data, both of which are images, because the predictions are pixel-wise.
I tried using ImageData as the data layer type, but it expects a single integer label per image, which does not apply here. Kindly advise how I can give Caffe a 2D label. Should I use LMDB instead of ImageData? If so, how do I proceed? I could not find a good tutorial or documentation for this situation.
Upvotes: 4
Views: 1816
Reputation: 1644
Since you need to achieve pixel-wise predictions, you can't use a single label as ground-truth. Instead, you should use a ground-truth matrix of labels.
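To make "a ground-truth matrix of labels" concrete, here is a minimal sketch with NumPy. The 2x3 mask and the class meanings are made up for illustration; the only point is that the label has one class index per pixel and gets a singleton channel axis so it matches Caffe's Channel x Height x Width layout.

```python
import numpy as np

# Hypothetical 2x3 ground-truth mask: each entry is the class index of one pixel
# (e.g. 0 = background, 1 = person, 2 = car; these classes are illustrative only).
label = np.array([[0, 0, 1],
                  [2, 1, 1]], dtype=np.uint8)

# Caffe blobs are Channel x Height x Width, so add a singleton channel axis.
label_chw = label[np.newaxis, :, :]

print(label_chw.shape)
```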
One of the Caffe developers wrote a code snippet for creating an LMDB with image data:
import caffe
import lmdb
import numpy as np
from PIL import Image

# `inputs` is assumed to be a list of image file paths
in_db = lmdb.open('image-lmdb', map_size=int(1e12))
with in_db.begin(write=True) as in_txn:
    for in_idx, in_ in enumerate(inputs):
        # load image:
        # - as np.uint8 {0, ..., 255}
        # - in BGR (switch from RGB)
        # - in Channel x Height x Width order (switch from H x W x C)
        im = np.array(Image.open(in_))  # or load whatever ndarray you need
        im = im[:, :, ::-1]
        im = im.transpose((2, 0, 1))
        im_dat = caffe.io.array_to_datum(im)
        # zero-padded keys keep entries in insertion order; LMDB keys must be bytes
        in_txn.put('{:0>10d}'.format(in_idx).encode('ascii'), im_dat.SerializeToString())
in_db.close()
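The same pattern can probably be reused for the labels by writing them to a second LMDB. Below is a rough sketch, not a tested recipe. It assumes each label image is single-channel with pixel values that are already class indices. `write_label_lmdb`, `make_key`, `label_paths` and the `label-lmdb` path are hypothetical names introduced here. The key format is the same zero-padded one used in the snippet above.

```python
def make_key(idx):
    """Zero-padded key in the same format as the image snippet, as bytes."""
    return '{:0>10d}'.format(idx).encode('ascii')


def write_label_lmdb(label_paths, db_path='label-lmdb'):
    """Write one 1 x H x W uint8 label array per image (hypothetical helper)."""
    # Third-party imports are local so make_key can be used without Caffe installed.
    import numpy as np
    import lmdb
    import caffe
    from PIL import Image

    db = lmdb.open(db_path, map_size=int(1e12))
    with db.begin(write=True) as txn:
        for idx, path in enumerate(label_paths):
            # Assumption: each label image is single-channel and its pixel
            # values are already class indices.
            label = np.array(Image.open(path), dtype=np.uint8)
            label = label[np.newaxis, :, :]  # H x W -> 1 x H x W
            datum = caffe.io.array_to_datum(label)
            txn.put(make_key(idx), datum.SerializeToString())
    db.close()


# Zero-padding keeps LMDB's byte-wise key order equal to the numeric order,
# so entry i of the image database lines up with entry i of the label database.
keys = [make_key(i) for i in (2, 10, 1)]
print(sorted(keys))
```

In the network definition, two Data layers (one per database) would then supply the image blob and the label blob. Both databases must be written in the same order for the pairs to stay aligned.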
Upvotes: 5