Enoon

Reputation: 421

Are 16 bit images supported by Caffe? If not, how to implement support?

Background information: I need to load some 16 bit grayscale PNGs.

Does Caffe support loading 16 bit images through the ImageDataLayer?

After some googling, the answer seems to be no. The ImageDataLayer relies on this I/O routine:

cv::Mat ReadImageToCVMat(const string& filename,
    const int height, const int width, const bool is_color) {
  cv::Mat cv_img;
  int cv_read_flag = (is_color ? CV_LOAD_IMAGE_COLOR :
    CV_LOAD_IMAGE_GRAYSCALE);
  cv::Mat cv_img_origin = cv::imread(filename, cv_read_flag);
  if (!cv_img_origin.data) {
    LOG(ERROR) << "Could not open or find file " << filename;
    return cv_img_origin;
  }
  if (height > 0 && width > 0) {
    cv::resize(cv_img_origin, cv_img, cv::Size(width, height));
  } else {
    cv_img = cv_img_origin;
  }
  return cv_img;
}

This routine uses OpenCV's cv::imread function, which reads the input as 8-bit unless the appropriate flag is set:

CV_LOAD_IMAGE_ANYDEPTH - If set, return 16-bit/32-bit image when the input has the corresponding depth, otherwise convert it to 8-bit.

Simply adding the appropriate flag will not work, because later in the code (io.cpp) there is a check for 8-bit depth:

void CVMatToDatum(const cv::Mat& cv_img, Datum* datum) {
  CHECK(cv_img.depth() == CV_8U) << "Image data type must be unsigned byte";
... }

I could just remove the check, but I'm afraid it's there for a reason and that removing it might produce unpredictable results. Can anybody shed light on this issue?
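To illustrate the concern, here is a minimal sketch of what might happen if the check were removed while the rest of the code kept indexing the data through an unsigned char pointer. It uses only the standard library; `ReadBytewise` is a hypothetical helper, not a Caffe or OpenCV function. Each 16-bit sample is seen as two unrelated byte values, and a loop sized for H*W samples would cover only half of the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: read `count` values from a 16-bit buffer the way
// 8-bit code would, one unsigned char at a time.
std::vector<float> ReadBytewise(const std::vector<std::uint16_t>& pixels,
                                std::size_t count) {
  // The buffer holds two bytes per pixel, so this is the upper bound.
  assert(count <= pixels.size() * sizeof(std::uint16_t));
  const unsigned char* ptr =
      reinterpret_cast<const unsigned char*>(pixels.data());
  std::vector<float> out;
  out.reserve(count);
  for (std::size_t i = 0; i < count; ++i) {
    out.push_back(static_cast<float>(ptr[i]));  // a byte, not a pixel value
  }
  return out;
}
```

For a single pixel with value 1000 (0x03E8), the two values read back are 232 and 3, in an order that depends on the machine's byte order; neither is the original pixel value.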

Upvotes: 3

Views: 1354

Answers (2)

Dzugaru

Reputation: 322

You can patch ImageDataLayer to read 16-bit images like this:

  1. Add the appropriate flag, as you mentioned (io.cpp):

after

int cv_read_flag = (is_color ? CV_LOAD_IMAGE_COLOR :
    CV_LOAD_IMAGE_GRAYSCALE);

add

cv_read_flag |= CV_LOAD_IMAGE_ANYDEPTH;

  2. Modify the check you mentioned (data_transformer.cpp):

this

CHECK(cv_img.depth() == CV_8U) << "Image data type must be unsigned byte";

becomes

CHECK(cv_img.depth() == CV_8U || cv_img.depth() == CV_16U) << "Image data type must be uint8 or uint16";
bool is16bit = cv_img.depth() == CV_16U;

  3. Modify the way DataTransformer reads the cv::Mat (same function, further down):

next to the existing uchar pointer:

const uchar* ptr = cv_cropped_img.ptr<uchar>(h);

add a pointer of type uint16_t:

const uint16_t* ptr_16 = cv_cropped_img.ptr<uint16_t>(h);

Then read through the pointer that matches the image depth:

Dtype pixel = static_cast<Dtype>(ptr[img_index++]);

becomes

Dtype pixel;
if(is16bit)
    pixel = static_cast<Dtype>(ptr_16[img_index++]);
else
    pixel = static_cast<Dtype>(ptr[img_index++]);
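The depth-dependent read from step 3 can be sketched outside of Caffe, using only the standard library. `RowView` and `RowToDtype` are hypothetical names introduced for illustration; in the real patch the row pointer and depth would come from the cv::Mat. The sketch uses memcpy rather than a uint16_t pointer so the read is well-defined for any byte buffer, which is a choice of this example and not part of the patch above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for one image row; in Caffe this would come from
// cv_cropped_img.ptr<uchar>(h) together with cv_img.depth().
struct RowView {
  const unsigned char* data;  // first byte of the row
  std::size_t samples;        // number of values in the row (width * channels)
  bool is16bit;               // true for CV_16U data, false for CV_8U
};

// Convert one row to Dtype, picking the sample width from the depth flag.
template <typename Dtype>
std::vector<Dtype> RowToDtype(const RowView& row) {
  assert(row.data != nullptr || row.samples == 0);
  std::vector<Dtype> out;
  out.reserve(row.samples);
  for (std::size_t i = 0; i < row.samples; ++i) {
    if (row.is16bit) {
      // Read two bytes in native byte order; equivalent to ptr_16[i].
      std::uint16_t value;
      std::memcpy(&value, row.data + i * sizeof(value), sizeof(value));
      out.push_back(static_cast<Dtype>(value));
    } else {
      // 8-bit path: one byte per sample, as in the unpatched code.
      out.push_back(static_cast<Dtype>(row.data[i]));
    }
  }
  return out;
}
```

Keeping the branch on is16bit inside the loop mirrors the patch; hoisting it out of the loop would avoid the per-pixel test if that ever matters.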

Upvotes: 2

Shai

Reputation: 114876

Caffe works with float32 variables by default. An image is usually represented as a C-by-H-by-W blob, where C=3 for three color channels (C=1 for grayscale). Since each channel is stored as float32, you can work with uint16 images provided you convert them properly to float32.
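As a rough sketch of what "convert properly" could look like, the following standard-library code widens uint16 samples to float32. `ToFloat32` and `ToUnitRange` are hypothetical helper names, and dividing by 65535 to reach the [0, 1] range is an assumption about the desired scaling, not something Caffe requires.

```cpp
#include <cstdint>
#include <vector>

// Widen 16-bit samples to float32 without changing their values.
// float32 has a 24-bit significand, so every value in 0..65535 is exact.
std::vector<float> ToFloat32(const std::vector<std::uint16_t>& src) {
  return std::vector<float>(src.begin(), src.end());
}

// Optional: map the full uint16 range onto [0, 1].
// The divisor 65535 is an assumed scaling choice for this example.
std::vector<float> ToUnitRange(const std::vector<std::uint16_t>& src) {
  std::vector<float> out;
  out.reserve(src.size());
  for (std::uint16_t v : src) {
    out.push_back(static_cast<float>(v) / 65535.0f);
  }
  return out;
}
```

Whether to rescale at all depends on how the network was or will be trained; the plain widening keeps the original intensity values intact.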

I do not have personal experience with the "ImageData" layer, so I cannot comment on whether you can load uint16 image data using this layer.

However, you might find the "HDF5Data" layer useful: you can read your images externally, convert them to the HDF5 data format (which supports float32), and then feed the converted data to Caffe via the "HDF5Data" layer.

You can find more information on "HDF5Data" layer here and here.

Upvotes: 1
