Reputation: 421
Background: I need to load some 16-bit grayscale PNGs.
Does Caffe support loading 16-bit images through the ImageDataLayer?
After some googling, the answer seems to be that it doesn't.
The ImageDataLayer relies on this I/O routine:
    cv::Mat ReadImageToCVMat(const string& filename,
        const int height, const int width, const bool is_color) {
      cv::Mat cv_img;
      int cv_read_flag = (is_color ? CV_LOAD_IMAGE_COLOR :
        CV_LOAD_IMAGE_GRAYSCALE);
      cv::Mat cv_img_origin = cv::imread(filename, cv_read_flag);
      if (!cv_img_origin.data) {
        LOG(ERROR) << "Could not open or find file " << filename;
        return cv_img_origin;
      }
      if (height > 0 && width > 0) {
        cv::resize(cv_img_origin, cv_img, cv::Size(width, height));
      } else {
        cv_img = cv_img_origin;
      }
      return cv_img;
    }
This routine uses OpenCV's cv::imread function, which reads the input as 8-bit unless the appropriate flag is set:
CV_LOAD_IMAGE_ANYDEPTH - If set, return 16-bit/32-bit image when the input has the corresponding depth, otherwise convert it to 8-bit.
Simply adding that flag will not work, because later in the code (io.cpp) there is a check for 8-bit depth:
    void CVMatToDatum(const cv::Mat& cv_img, Datum* datum) {
      CHECK(cv_img.depth() == CV_8U) << "Image data type must be unsigned byte";
      ... }
I could just remove the check, but I'm afraid it's there for a reason and that removing it might produce unpredictable results. Can anybody shed light on this issue?
Upvotes: 3
Views: 1354
Reputation: 322
You can patch ImageDataLayer to read 16-bit images like this.
After
    int cv_read_flag = (is_color ? CV_LOAD_IMAGE_COLOR :
      CV_LOAD_IMAGE_GRAYSCALE);
add
    cv_read_flag |= CV_LOAD_IMAGE_ANYDEPTH;
The depth check
    CHECK(cv_img.depth() == CV_8U) << "Image data type must be unsigned byte";
becomes
    CHECK(cv_img.depth() == CV_8U || cv_img.depth() == CV_16U) << "Image data type must be uint8 or uint16";
    bool is16bit = cv_img.depth() == CV_16U;
Next to the existing row pointer
    const uchar* ptr = cv_cropped_img.ptr<uchar>(h);
add a second pointer of type uint16_t:
    const uint16_t* ptr_16 = cv_cropped_img.ptr<uint16_t>(h);
Then read each pixel through the appropriate pointer, so that
    Dtype pixel = static_cast<Dtype>(ptr[img_index++]);
becomes
    Dtype pixel;
    if (is16bit)
      pixel = static_cast<Dtype>(ptr_16[img_index++]);
    else
      pixel = static_cast<Dtype>(ptr[img_index++]);
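To show the depth-dependent read in isolation, here is a minimal sketch that does not depend on Caffe or OpenCV. The names PixelDepth and RowToFloat are hypothetical and only stand in for the CV_8U / CV_16U depth codes and the per-row loop in the patch above. It assumes the row bytes are in native byte order, as they are in a cv::Mat held in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for OpenCV's CV_8U / CV_16U depth codes.
enum class PixelDepth { kU8, kU16 };

// Convert one row of raw pixel bytes to float, choosing the element width
// from the depth. This mirrors the ptr / ptr_16 branch in the patch.
std::vector<float> RowToFloat(const unsigned char* row_bytes,
                              std::size_t num_values, PixelDepth depth) {
  assert(row_bytes != nullptr || num_values == 0);
  std::vector<float> out(num_values);
  for (std::size_t i = 0; i < num_values; ++i) {
    if (depth == PixelDepth::kU16) {
      // memcpy sidesteps alignment and aliasing concerns on the raw buffer.
      std::uint16_t v;
      std::memcpy(&v, row_bytes + i * sizeof(v), sizeof(v));
      out[i] = static_cast<float>(v);
    } else {
      out[i] = static_cast<float>(row_bytes[i]);
    }
  }
  return out;
}
```

For a 16-bit row, pass the buffer reinterpreted as bytes together with PixelDepth::kU16. Every uint16 value fits exactly in a float, so nothing is lost in the cast.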
Upvotes: 2
Reputation: 114876
Caffe works with float32 variables by default. An image is usually represented as a C-by-H-by-W blob, where C=3 for three color channels. So, working with three channels of type float32 allows you to handle uint16 images, provided you convert them properly to float32.
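As a rough illustration of what "convert properly" could look like, the sketch below repacks interleaved H-by-W-by-C uint16 pixels (the layout OpenCV uses in memory) into a planar C-by-H-by-W float buffer. The function name InterleavedU16ToPlanarFloat and the scale parameter are my own assumptions, not part of Caffe. A scale of 1.0f / 65535.0f would map the data to [0, 1]; whether you want that depends on your network.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: repack interleaved H x W x C uint16 pixels into a
// planar C x H x W float buffer, multiplying each value by `scale`.
std::vector<float> InterleavedU16ToPlanarFloat(
    const std::vector<std::uint16_t>& hwc,
    std::size_t channels, std::size_t height, std::size_t width,
    float scale) {
  assert(hwc.size() == channels * height * width);
  std::vector<float> chw(hwc.size());
  for (std::size_t h = 0; h < height; ++h) {
    for (std::size_t w = 0; w < width; ++w) {
      for (std::size_t c = 0; c < channels; ++c) {
        const std::size_t src = (h * width + w) * channels + c;  // H, W, C
        const std::size_t dst = (c * height + h) * width + w;    // C, H, W
        chw[dst] = static_cast<float>(hwc[src]) * scale;
      }
    }
  }
  return chw;
}
```

The resulting buffer can then be written out in whatever format your data layer expects.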
I do not have personal experience with the "ImageData" layer, so I cannot comment on whether you can load uint16 image data using this layer.
However, you might find the "HDF5Data" layer useful: you can read your images externally, convert them to the hdf5 data format (which supports float32), and then feed the converted data to Caffe via the "HDF5Data" layer.
You can find more information on the "HDF5Data" layer here and here.
Upvotes: 1