magneto

Reputation: 107

Failed to find HDF5 dataset data - Single Label Regression Using Caffe and HDF5 data

I was using @Shai's code for creating my HDF5 file, which is available here:

Test labels for regression caffe, float not allowed?

My data consists of grayscale images (1000 images for a start), and each label is a single one-dimensional float.

So I modified his code as follows:

import h5py, os
import caffe
import numpy as np

SIZE = 224 
with open( 'train.txt', 'r' ) as T :
    lines = T.readlines()
X = np.zeros( (len(lines), 1, SIZE, SIZE), dtype='f4' )  #Changed 3 to 1
y = np.zeros( (len(lines)), dtype='f4' ) #Removed the "1,"
for i,l in enumerate(lines):
    sp = l.split(' ')
    img = caffe.io.load_image( sp[0], color=False ) #Added , color=False
    img = caffe.io.resize( img, (SIZE, SIZE, 1) )   #Changed 3 to 1
    # you may apply other input transformations here...
    X[i] = img
    y[i] = float(sp[1])
with h5py.File('train.h5','w') as H:
    H.create_dataset( 'X', data=X ) # note the name X given to the dataset!
    H.create_dataset( 'y', data=y ) # note the name y given to the dataset!
with open('train_h5_list.txt','w') as L:
    L.write( 'train.h5' ) # list all h5 files you are going to use
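
For reference, the datasets that actually land in the file can be inspected with plain h5py; a quick sanity check, nothing Caffe-specific:

import h5py

# Open the file read-only and list every dataset with its shape and dtype.
with h5py.File('train.h5', 'r') as f:
    for name, dset in f.items():
        print(name, dset.shape, dset.dtype)
# With the arrays created above, this prints:
#   X (1000, 1, 224, 224) float32
#   y (1000,) float32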

But I was getting this error:

ValueError                                Traceback (most recent call last)
<ipython-input-19-8148f7b9e03d> in <module>()
     13     img = caffe.io.resize( img, (SIZE, SIZE, 1) )   #Changed 3 to 1
     14     # you may apply other input transformations here...
---> 15     X[i] = img
     16     y[i] = float(sp[1])

ValueError: could not broadcast input array from shape (224,224,1) into shape (1,224,224)

So I changed line 13 from:

img = caffe.io.resize( img, (SIZE, SIZE, 1) ) 

to:

img = caffe.io.resize( img, (1, SIZE, SIZE) ) 

And the code ran fine.
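
As an aside, a more explicit way to get the same channel-first layout is to resize spatially first and then move the channel axis; a minimal sketch, assuming load_image with color=False returns an H x W x 1 array:

img = caffe.io.load_image(sp[0], color=False)   # H x W x 1, floats in [0, 1]
img = caffe.io.resize(img, (SIZE, SIZE, 1))     # resize height/width only
X[i] = img.transpose(2, 0, 1)                   # reorder axes to 1 x SIZE x SIZE

This keeps the height/width resize and the axis reordering as two separate, explicit steps instead of asking resize to do both at once.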

For training, I used this solver.prototxt file:

net: "MyCaffeTrain/train_test.prototxt"

# Note: 1 iteration = 1 forward pass over all the images in one batch

# Carry out a validation test every 500 training iterations.
test_interval: 500 

# test_iter specifies how many forward passes the validation test should carry out
#  a good number is num_val_imgs / batch_size (see batch_size in Data layer in phase TEST in train_test.prototxt)
test_iter: 100 

# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9 
weight_decay: 0.0005

# We want to initially move fast towards the local minimum and as we approach it, we want to move slower
# To this end, there are various learning rates policies available:
#  fixed: always return base_lr.
#  step: return base_lr * gamma ^ (floor(iter / step))
#  exp: return base_lr * gamma ^ iter
#  inv: return base_lr * (1 + gamma * iter) ^ (- power)
#  multistep: similar to step but it allows non-uniform steps defined by stepvalue
#  poly: the effective learning rate follows a polynomial decay, reaching zero at max_iter: return base_lr * (1 - iter/max_iter) ^ (power)
#  sigmoid: the effective learning rate follows a sigmoid decay: return base_lr * ( 1/(1 + exp(-gamma * (iter - stepsize))))
lr_policy: "inv"
gamma: 0.0001
power: 0.75 
#stepsize: 10000 # Drop the learning rate in steps by a factor of gamma every stepsize iterations

# Display every 100 iterations
display: 100 

# The maximum number of iterations
max_iter: 10000

# snapshot intermediate results, that is, every 5000 iterations it saves a snapshot of the weights
snapshot: 5000
snapshot_prefix: "MyCaffeTrain/lenet_multistep"

# solver mode: CPU or GPU
solver_mode: CPU
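
For completeness, the same solver can also be driven from pycaffe instead of the caffe train command line; a minimal sketch, assuming the file above is saved as MyCaffeTrain/solver.prototxt:

import caffe

caffe.set_mode_cpu()                                      # matches solver_mode: CPU
solver = caffe.SGDSolver('MyCaffeTrain/solver.prototxt')
solver.solve()                                            # runs up to max_iter iterations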

And my train_test.prototxt file is:

name: "LeNet"
layer {
  name: "mnist"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "MyCaffeTrain/train_h5_list.txt"
    batch_size: 1000
  }
  include: { phase: TRAIN }
}

layer {
  name: "mnist"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "MyCaffeTrain/test_h5_list.txt"
    batch_size: 1000
  }
  include: { phase: TEST }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
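
The net definition can be sanity-checked on its own by instantiating it from pycaffe; building the TRAIN-phase net runs every layer's setup, so the HDF5Data layer opens train.h5 immediately and any data problem surfaces before the solver is involved (a minimal sketch, assuming the paths above):

import caffe

# Constructing the TRAIN-phase net runs LayerSetUp for every layer,
# so HDF5DataLayer loads the first file from train_h5_list.txt right away.
net = caffe.Net('MyCaffeTrain/train_test.prototxt', caffe.TRAIN)
print({name: blob.data.shape for name, blob in net.blobs.items()})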

But when I train, I get this error:

I0914 13:59:33.198423  8251 layer_factory.hpp:74] Creating layer mnist
I0914 13:59:33.198452  8251 net.cpp:96] Creating Layer mnist
I0914 13:59:33.198467  8251 net.cpp:415] mnist -> data
I0914 13:59:33.198510  8251 net.cpp:415] mnist -> label
I0914 13:59:33.198532  8251 net.cpp:160] Setting up mnist
I0914 13:59:33.198549  8251 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: MyCaffeTrain/train_h5_list.txt
I0914 13:59:33.198884  8251 hdf5_data_layer.cpp:94] Number of HDF5 files: 1
F0914 13:59:33.200848  8251 io.cpp:237] Check failed: H5LTfind_dataset(file_id, dataset_name_) Failed to find HDF5 dataset data
*** Check failure stack trace: ***
    @     0x7fcfa9fb05cd  google::LogMessage::Fail()
    @     0x7fcfa9fb2433  google::LogMessage::SendToLog()
    @     0x7fcfa9fb015b  google::LogMessage::Flush()
    @     0x7fcfa9fb2e1e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7fcfaa426b13  caffe::hdf5_load_nd_dataset_helper<>()
    @     0x7fcfaa423ec5  caffe::hdf5_load_nd_dataset<>()
    @     0x7fcfaa34bd3d  caffe::HDF5DataLayer<>::LoadHDF5FileData()
    @     0x7fcfaa345ae6  caffe::HDF5DataLayer<>::LayerSetUp()
    @     0x7fcfaa3fdd75  caffe::Net<>::Init()
    @     0x7fcfaa4001ff  caffe::Net<>::Net()
    @     0x7fcfaa40b935  caffe::Solver<>::InitTrainNet()
    @     0x7fcfaa40cd6e  caffe::Solver<>::Init()
    @     0x7fcfaa40cf36  caffe::Solver<>::Solver()
    @           0x411980  caffe::GetSolver<>()
    @           0x4093a6  train()
    @           0x406de0  main
    @     0x7fcfa9049830  __libc_start_main
    @           0x407319  _start
    @              (nil)  (unknown)

I have tried my best but still cannot find the reason behind this error. Is my created database in the correct format? I read somewhere that the data format should be:

N, C, H, W (number of images, channels, height, width)
For my case: 1000, 1, 224, 224

On checking X.shape, I get the same result: (1000, 1, 224, 224).
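
In code, the check amounts to this (X and y as created above):

assert X.shape == (1000, 1, 224, 224)                    # N, C, H, W
assert y.shape == (1000,)                                # one float label per image
assert X.dtype == np.float32 and y.dtype == np.float32   # dtype 'f4'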

I cannot see where I am going wrong. Any help would be appreciated. Thanks in advance.

Upvotes: 1

Views: 487

Answers (1)

magneto

Reputation: 107

I solved the problem by making the following changes to the code. The HDF5Data layer looks up datasets in the .h5 file by the names of its top blobs ("data" and "label"), not by "X" and "y":

H.create_dataset( 'data', data=X )  # renamed the dataset from 'X' to 'data'
H.create_dataset( 'label', data=y ) # renamed the dataset from 'y' to 'label'
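
In context, the file-writing block becomes (only the dataset names change):

with h5py.File('train.h5', 'w') as H:
    H.create_dataset('data', data=X)    # name must match the "data" top blob
    H.create_dataset('label', data=y)   # name must match the "label" top blob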

The error was gone.

I'm still having a problem with the EuclideanLoss layer, though; I'll look into it and post another question if required.

Upvotes: 0
