Different number of bounding boxes per image, tried padding, but boxes with zeros are invalid for training object detection model

Question

I am trying to train a Faster R-CNN model for object detection. I have an annotated dataset with bounding boxes and labels, but each image contains a different number of objects and therefore a different number of bounding boxes. I got the error "stack expects each tensor to be equal size but got [16] at entry 0 and [15] at entry 1", so I padded the boxes and labels so that every image in a batch has the same number of bounding boxes. For this I used the following collate function:

import torch

def collate_fn(batch):
    images = []
    labels_list = []
    bboxes_list = []

    for image, labels, bboxes in batch:
        images.append(image)

        labels = torch.tensor(labels, dtype=torch.long)
        bboxes = torch.tensor(bboxes, dtype=torch.float)

        labels_list.append(labels)
        bboxes_list.append(bboxes)

    # Pad the bounding boxes to have a fixed number of bounding boxes per image
    max_num_bboxes = max(bbox.size(0) for bbox in bboxes_list)
    padded_bboxes = []
    padded_labels = []
    for bbox, label in zip(bboxes_list, labels_list):
        num_bboxes = bbox.size(0)
        pad_size = max_num_bboxes - num_bboxes
        if pad_size > 0:
            padding_bbox = torch.zeros(pad_size, 4)   
            padding_label = torch.zeros(pad_size, dtype=torch.long)  # Background label for padded boxes
            bbox = torch.cat((bbox, padding_bbox), dim=0)
            label = torch.cat((label, padding_label), dim=0)
        padded_bboxes.append(bbox)
        padded_labels.append(label)

    return torch.stack(images), (torch.stack(padded_labels), torch.stack(padded_bboxes))

However, I now get the error: "all bounding boxes should have positive height and width. Found invalid box [0.0, 0.0, 0.0, 0.0]." The zero-filled padding boxes are being rejected as invalid. Can anyone help me solve this problem?

I tried padding so that every image has the same number of bounding boxes, which led to the error above.
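
One possible way around this, sketched under the assumption that the model is torchvision's Faster R-CNN (the post does not say so, but the error text matches that library's box validation): those detection models take a list of image tensors and a list of per-image target dicts with "boxes" and "labels" keys, so nothing has to be stacked or padded. The function name detection_collate_fn is made up for illustration, and boxes are assumed to be in [x1, y1, x2, y2] format.

```python
import torch


def detection_collate_fn(batch):
    """Collate (image, labels, bboxes) samples without padding.

    Assumption: the model accepts a list of images and a list of
    target dicts, as torchvision's detection models do.
    """
    images = []
    targets = []
    for image, labels, bboxes in batch:
        images.append(image)
        targets.append({
            # Shape (N, 4); N may differ from image to image.
            "boxes": torch.as_tensor(bboxes, dtype=torch.float32),
            # Shape (N,); one class index per box.
            "labels": torch.as_tensor(labels, dtype=torch.int64),
        })
    # Plain lists are returned, so torch.stack is never called
    # and no zero-size padding boxes are created.
    return images, targets
```

This would be passed to the DataLoader through its collate_fn argument, and in training mode the model would then be called as model(images, targets). Note that torchvision reserves label 0 for the background class, so real object labels should start at 1.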

Answers (0)
