Biggest Rigby

Reputation: 53

Matterport Mask R-CNN - Unpredictable loss values and strange detection results for larger images

I'm having trouble achieving viable results with Mask R-CNN and I can't pinpoint why. I'm working with a fairly limited dataset of 13 large greyscale images (2560 x 2160) in which the detection targets are very small (mean area of 26 pixels). I've run inspect_nucleus_data.ipynb across my data and verified that the masks and images are being interpreted correctly. I've also followed the wiki guide (https://github.com/matterport/Mask_RCNN/wiki) so that my images are read and handled as greyscale rather than simply converted to RGB (my load_image override is sketched below the image). Here is one of the images with the detection targets labelled:

annotated image
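
For context, my greyscale handling is essentially the load_image override the wiki describes (a sketch; the dataset class name is just what I use):

import numpy as np
import skimage.io
from mrcnn import utils

class NucleusDataset(utils.Dataset):
    def load_image(self, image_id):
        # Read the image as stored (single channel) instead of converting to RGB.
        image = skimage.io.imread(self.image_info[image_id]['path'])
        # Add a channel axis so the shape matches IMAGE_SHAPE [1024 1024 1].
        if image.ndim == 2:
            image = image[..., np.newaxis]
        return image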

During training, the loss values are pretty unpredictable, bouncing between roughly 1 and 2 without ever settling into the steady decline I'd expect from a converging model. I'm using the config values below at the moment; they're the best I've been able to come up with while fighting off OOM errors:

Configurations:
BACKBONE                       resnet101
BACKBONE_STRIDES               [4, 8, 16, 32, 64]
BATCH_SIZE                     1
BBOX_STD_DEV                   [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE         None
DETECTION_MAX_INSTANCES        450
DETECTION_MIN_CONFIDENCE       0
DETECTION_NMS_THRESHOLD        0.3
FPN_CLASSIF_FC_LAYERS_SIZE     1024
GPU_COUNT                      1
GRADIENT_CLIP_NORM             5.0
IMAGES_PER_GPU                 1
IMAGE_CHANNEL_COUNT            1
IMAGE_MAX_DIM                  1024
IMAGE_META_SIZE                14
IMAGE_MIN_DIM                  1024
IMAGE_MIN_SCALE                0
IMAGE_RESIZE_MODE              square
IMAGE_SHAPE                    [1024 1024    1]
LEARNING_MOMENTUM              0.9
LEARNING_RATE                  0.001
LOSS_WEIGHTS                   {'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0, 'rpn_class_loss': 1.0}
MASK_POOL_SIZE                 14
MASK_SHAPE                     [28, 28]
MAX_GT_INSTANCES               450
MEAN_PIXEL                     [16.49]
MINI_MASK_SHAPE                (56, 56)
NAME                           nucleus
NUM_CLASSES                    2
POOL_SIZE                      7
POST_NMS_ROIS_INFERENCE        1000
POST_NMS_ROIS_TRAINING         2000
PRE_NMS_LIMIT                  6000
ROI_POSITIVE_RATIO             0.33
RPN_ANCHOR_RATIOS              [0.5, 1, 2]
RPN_ANCHOR_SCALES              (2, 4, 8, 16, 32)
RPN_ANCHOR_STRIDE              1
RPN_BBOX_STD_DEV               [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD              0.9
RPN_TRAIN_ANCHORS_PER_IMAGE    512
STEPS_PER_EPOCH                11
TOP_DOWN_PYRAMID_SIZE          256
TRAIN_BN                       False
TRAIN_ROIS_PER_IMAGE           256
USE_MINI_MASK                  True
USE_RPN_ROIS                   True
VALIDATION_STEPS               1
WEIGHT_DECAY                   0.0001
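
For reference, these values come from a Config subclass along the usual Matterport lines. Roughly, my setup looks like this (a sketch showing only the non-default values; the epoch count is illustrative):

import numpy as np
from mrcnn.config import Config
from mrcnn import model as modellib

class NucleusConfig(Config):
    NAME = "nucleus"
    IMAGES_PER_GPU = 1
    NUM_CLASSES = 2                         # background + nucleus
    IMAGE_CHANNEL_COUNT = 1
    MEAN_PIXEL = np.array([16.49])
    RPN_ANCHOR_SCALES = (2, 4, 8, 16, 32)   # small anchors for ~26 px targets
    RPN_TRAIN_ANCHORS_PER_IMAGE = 512
    MAX_GT_INSTANCES = 450
    DETECTION_MAX_INSTANCES = 450
    DETECTION_MIN_CONFIDENCE = 0
    STEPS_PER_EPOCH = 11
    VALIDATION_STEPS = 1

config = NucleusConfig()
model = modellib.MaskRCNN(mode="training", config=config, model_dir="logs")
# dataset_train / dataset_val are prepared NucleusDataset instances.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=40,      # illustrative
            layers='all')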

I'm training on all layers. The output I'm getting generally looks like the image below: grid-like detections in odd spots that never seem to line up with an actual nucleus. I've added the red square to highlight a very obvious cluster of nuclei that has been missed:

output image

Here is a binary mask of these same detections so you can see their shape: output mask

Could anyone shed some light on what might be going wrong here?

Upvotes: 3

Views: 1581

Answers (1)

Phan Nguyen

Reputation: 11

Some information is lost when you downscale the training images to well under half their original size. Square mode also uses a lot of memory, so you might want to use crop mode instead: rather than padded 1024x1024 images in square mode, you would train on 512x512 crops. You might encounter NaN losses due to bounding boxes falling outside the crop; in that case you need to make some adjustments to your data feed.

You also want to turn off the mini mask option, because the downsampling it applies hurts mask accuracy. Using crop mode should help a ton with memory, so you shouldn't need it; see the sketch below.
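
Concretely, the changes would look something like this (a sketch building on the question's config; the class names are illustrative):

class CropConfig(NucleusConfig):
    # Train on random 512x512 crops instead of padded 1024x1024 squares.
    IMAGE_RESIZE_MODE = "crop"
    IMAGE_MIN_DIM = 512   # in crop mode, training picks random 512x512 crops
    IMAGE_MAX_DIM = 512
    # Keep full-size masks; mini masks lose detail on ~26 px instances.
    USE_MINI_MASK = False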

Upvotes: 1
