ai.devmir

Reputation: 3

How to use the MeanAveragePrecision metric from torchmetrics.detection on an object detection model

I have finetuned the "fasterrcnn_resnet50_fpn" model from PyTorch for an object detection task, and then I wanted to calculate the mAP metric for the trained model on a validation dataset. I used MeanAveragePrecision from torchmetrics.detection.

The function that uses the trained model for inference looks as follows:

@torch.no_grad()
def generate_bboxes_on_one_img(image, model, device):
    model.to(device)
    model.eval()
    # torchvision detection models expect a list of CHW image tensors
    x = [image.to(device)]
    # in eval mode the model returns one dict per image with the keys
    # "boxes", "labels" and "scores"
    pred_boxes, pred_labels, pred_scores = model(x)[0].values()
    return pred_boxes, pred_labels, pred_scores
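
For context, a torchvision detection model in eval mode returns a list with one dict per input image, each holding all detections for that image. Here is a quick usage sketch of the helper above; the pretrained weights and the random image are stand-ins for illustration, not my actual finetuned setup:

import torch
import torchvision

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# stand-in model; in my case this is the finetuned fasterrcnn_resnet50_fpn
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

image = torch.rand(3, 600, 800)  # dummy CHW image with values in [0, 1]
boxes, labels, scores = generate_bboxes_on_one_img(image, model, device)
# boxes:  (N, 4) float tensor in (xmin, ymin, xmax, ymax) format
# labels: (N,)   integer tensor of class ids
# scores: (N,)   float tensor of confidences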

To be able to use the previous function with a DataLoader, I set up the dataset instance and the DataLoader as follows:

def collate_fn(batch):
    # images can have different numbers of boxes, so the default collate
    # cannot stack the targets; return tuples of images and targets instead
    return list(zip(*batch))

val_dataset = VisDroneDataset(val_images_path, val_annotations_df, transforms=val_transform)
val_data_loader = DataLoader(val_dataset, batch_size=1, shuffle=False, num_workers=0, collate_fn=collate_fn)
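
With this collate function, each batch is a pair of tuples rather than stacked tensors. A quick sanity check, assuming the loader defined above (the exact target keys come from my VisDroneDataset):

images, targets = next(iter(val_data_loader))
# with batch_size=1 both tuples have length 1
image = images[0]    # the transformed image tensor
target = targets[0]  # the annotation dict produced by VisDroneDataset
print(len(images), image.shape, target.keys())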

Then I wrote the following code to calculate mAP between the ground truths and the predictions on each image:

mAP = MeanAveragePrecision(iou_type="bbox")
mAP.to(device)

for image, target in val_data_loader:
    original_boxes, original_labels, image_idx, _, _ = target[0].values()
    model.eval()
    x = [img.to(device) for img in image]
    pred_boxes, pred_labels, pred_scores = model(x)[0].values()

    image_PIL = val_dataset.get_image(image_idx)
    upscaled_image, pred_boxes_upscaled, labels = get_inverse_transform(image[0], 
                                                                    pred_boxes, 
                                                                    pred_labels, 
                                                                    *image_PIL.size)
    pred_to_mAP = [
            dict(
                boxes=torch.tensor(box, dtype=torch.float32),
                scores=score,
                labels=label
            )   for box, label, score in zip(pred_boxes_upscaled, pred_labels.clone().detach(), pred_scores.clone().detach())
            ]
            
    gt_to_mAP = [
            dict(
                boxes=original_boxes,
                labels=original_labels
            )   for box, label in zip(original_boxes, original_labels)
            ]
    mAP.update(pred_to_mAP, gt_to_mAP)
    pprint(mAP.compute())

    break

I got the following error: ValueError: Expected argument preds and target to have the same length, but got 100 and 127.

I don't understand why preds and target should have the same length.

I read the documentation and it was not that helpful. Please help me understand how MeanAveragePrecision works!

Upvotes: 0

Views: 568

Answers (1)

ai.devmir

Reputation: 3

The cause of the error was the incorrect construction of pred_to_mAP. Instead of

pred_to_mAP = [
        dict(
            boxes=torch.tensor(box, dtype=torch.float32),
            scores=score,
            labels=label
        )   for box, label, score in zip(pred_boxes_upscaled, 
                                         pred_labels.clone().detach(), 
                                         pred_scores.clone().detach())
        ]

it should be

pred_to_mAP = [
        dict(
            boxes=torch.stack([torch.tensor(box, dtype=torch.float32)
                               for box in pred_boxes_upscaled]).to(device),
            scores=pred_scores.to(device),
            labels=pred_labels.to(device)
        )
        ]
What led to this mistake was my misunderstanding of the example in the documentation: it shows only one bounding box in preds and one bounding box in target, so I was confused about how to apply MeanAveragePrecision to multiple bboxes per image at once. The key point is that preds and target must be lists with one dict per image (hence the equal-length requirement), and each dict holds all the boxes for that image. I have seen an example of multiple boxes per image here.
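
To make the expected format concrete, here is a minimal self-contained sketch with made-up boxes (the numbers are purely illustrative; the import path is the one used in recent torchmetrics versions):

import torch
from pprint import pprint
from torchmetrics.detection import MeanAveragePrecision

metric = MeanAveragePrecision(iou_type="bbox")

# one dict per image; each dict holds ALL boxes for that image
preds = [
    dict(
        boxes=torch.tensor([[10.0, 10.0, 50.0, 50.0],
                            [60.0, 60.0, 120.0, 120.0]]),   # (N, 4)
        scores=torch.tensor([0.9, 0.6]),                    # (N,)
        labels=torch.tensor([1, 2]),                        # (N,)
    ),
]
target = [
    dict(
        # a different number of ground-truth boxes is fine
        boxes=torch.tensor([[12.0, 12.0, 48.0, 48.0],
                            [58.0, 62.0, 118.0, 122.0],
                            [200.0, 200.0, 260.0, 260.0]]),
        labels=torch.tensor([1, 2, 1]),
    ),
]

# len(preds) == len(target) == number of images (1 here), even though the
# per-image box counts differ between predictions and ground truth
metric.update(preds, target)
pprint(metric.compute())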

Upvotes: 0
