Reputation: 494
I have COCO-style annotations (JSON format) with both segmentations and bboxes.
Most of the segmentations are given as list-of-lists of the pixels (polygon).
The problem is that some segmentations are given as a dictionary (with 'counts' and 'size' keys) that represent RLE values, and in these cases the 'iscrowd' key is equal to 1 (normally it is equal to 0).
I would like to convert all the 'annotations' with iscrowd==1 to be represented as polygons instead of RLE.
I do not need the mask as suggested here, but just the json file to have only polygon shaped segmentations.
Here is an example of a few annotations (from the same image). Note how the first two segmentations are polygons, while the latter two are RLE:
{'id': 53, 'image_id': 4, 'category_id': 2037037930, 'segmentation': [[344.51, 328.83, 316.02, 399.73, 358.3, 399.78, 375.85, 336.07]], 'area': 2561.4049499999965, 'bbox': [316.02, 328.83, 59.83000000000004, 70.94999999999999], 'iscrowd': 0, 'extra': {}}
{'id': 54, 'image_id': 4, 'category_id': 2037037930, 'segmentation': [[376.43, 233.52, 368.93, 250.71, 375.96, 252.89, 369.4, 269.76, 378.62, 273.83, 372.21, 292.42, 400.09, 302.34, 400.09, 302.11, 400.1, 242.04]], 'area': 1596.5407000000123, 'bbox': [368.93, 233.52, 31.170000000000016, 68.81999999999996], 'iscrowd': 0, 'extra': {}}
{'id': 67, 'image_id': 4, 'category_id': 2037037930, 'segmentation': {'counts': [55026, 2, 396, 4, 394, 7, 391, 9, 389, 12, 386, 14, 384, 17, 381, 19, 379, 21, 377, 24, 374, 26, 372, 29, 369, 31, 367, 33, 365, 36, 362, 38, 360, 41, 357, 43, 355, 46, 352, 48, 350, 50, 348, 53, 345, 55, 343, 58, 340, 38, 1, 21, 338, 37, 5, 21, 335, 37, 7, 21, 335, 34, 10, 19, 338, 32, 12, 16, 340, 33, 11, 14, 342, 33, 11, 11, 346, 33, 11, 8, 348, 33, 10, 7, 350, 33, 8, 8, 351, 34, 5, 11, 351, 33, 3, 13, 351, 49, 351, 49, 352, 49, 351, 49, 351, 49, 352, 48, 352, 49, 351, 49, 352, 46, 354, 44, 356, 41, 359, 39, 362, 36, 364, 35, 365, 35, 366, 35, 365, 35, 365, 35, 366, 34, 366, 34, 366, 35, 366, 34, 366, 34, 366, 32, 368, 29, 372, 25, 375, 23, 377, 20, 381, 18, 382, 19, 381, 19, 382, 18, 382, 18, 382, 19, 382, 18, 382, 18, 382, 19, 381, 19, 382, 16, 384, 13, 387, 9, 392, 5, 395, 2, 73808], 'size': [400, 400]}, 'area': 2598.0, 'bbox': [137, 174, 79, 65], 'iscrowd': 1, 'extra': {}}
{'id': 68, 'image_id': 4, 'category_id': 2037037930, 'segmentation': {'counts': [76703, 2, 396, 4, 394, 7, 391, 9, 389, 11, 387, 14, 384, 16, 382, 19, 379, 21, 377, 23, 375, 26, 372, 28, 370, 30, 368, 33, 365, 35, 364, 37, 363, 37, 364, 36, 364, 37, 364, 36, 364, 36, 364, 37, 364, 36, 364, 37, 363, 37, 364, 36, 364, 37, 364, 36, 364, 36, 364, 37, 364, 15, 1, 20, 364, 13, 4, 19, 365, 10, 6, 20, 363, 9, 8, 20, 361, 9, 11, 20, 358, 9, 13, 20, 356, 11, 14, 19, 354, 14, 13, 20, 351, 16, 13, 20, 348, 20, 13, 19, 346, 22, 13, 20, 343, 24, 13, 20, 341, 27, 13, 20, 338, 29, 13, 20, 336, 32, 13, 19, 334, 34, 13, 20, 331, 37, 12, 20, 331, 37, 13, 19, 332, 36, 12, 21, 331, 37, 8, 24, 332, 36, 5, 28, 331, 37, 1, 31, 331, 69, 332, 69, 331, 69, 332, 68, 332, 69, 331, 69, 332, 68, 332, 69, 332, 68, 332, 69, 331, 69, 332, 68, 332, 48, 1, 20, 331, 45, 5, 19, 332, 41, 8, 19, 332, 38, 12, 19, 332, 36, 13, 19, 332, 37, 12, 20, 331, 37, 13, 19, 332, 36, 13, 19, 332, 37, 13, 19, 332, 36, 13, 19, 332, 37, 12, 19, 332, 37, 13, 19, 332, 36, 13, 19, 332, 37, 13, 19, 332, 36, 12, 20, 332, 36, 10, 22, 332, 37, 6, 26, 332, 36, 4, 28, 332, 37, 1, 28, 335, 63, 337, 61, 339, 59, 342, 56, 344, 53, 348, 50, 350, 48, 352, 46, 355, 43, 357, 40, 360, 38, 363, 35, 365, 33, 368, 30, 370, 28, 372, 25, 376, 22, 378, 20, 381, 17, 383, 15, 385, 12, 389, 9, 391, 7, 394, 4, 396, 2, 40521], 'size': [400, 400]}, 'area': 4551.0, 'bbox': [191, 253, 108, 82], 'iscrowd': 1, 'extra': {}}
I already tried the following:
for annotation in coco_data['annotations']:
    if type(annotation['segmentation']) == dict:
        # Get the values of the dictionary
        height = annotation['segmentation']['size'][0]
        width = annotation['segmentation']['size'][1]
        counts = annotation['segmentation']['counts']
        # Decode the RLE encoded counts
        rle = np.array(counts).reshape(-1, 2)
        starts, lengths = rle[:, 0], rle[:, 1]
        starts -= 1
        ends = starts + lengths
        pixels = []
        for lo, hi in zip(starts, ends):
            pixels.extend(range(lo, hi))
        pixels = np.array(pixels)
        # Convert the 1D pixels array to a 2D array
        segments = np.zeros((height, width), dtype=np.uint8)
        segments[pixels // width, pixels % width] = 1
        segments = np.where(segments == 1)
        # Update the segmentation and iscrowd fields
        annotation['segmentation'] = [segments[1].tolist(), segments[0].tolist()]
        annotation['iscrowd'] = 0
But got the following error:
ValueError Traceback (most recent call last)
<ipython-input-29-1bf7f4af292c> in <module>
17 # Decode the RLE encoded counts
---> 18 rle = np.array(counts).reshape(-1, 2)
19 starts, lengths = rle[:, 0], rle[:, 1]
20 starts -= 1
ValueError: cannot reshape array of size 183 into shape (2)
As far as I can tell, it expects the RLE to have an even length? I am not sure where the problem is or how to solve it.
Then I tried something a bit different, with import pycocotools.mask as mask and import skimage.measure as measure, and the following function:
def rle_to_polygon(rle, height, width):
    if isinstance(rle, list):
        rle = mask.frPyObjects(rle, height, width)
    rle = mask.decode(rle)
    contours = measure.find_contours(rle, 0.5)
    polygon = []
    for contour in contours:
        contour = np.fliplr(contour) - 1
        contour = contour.clip(min=0)
        contour = contour.astype(int)
        if len(contour) >= 4:
            polygon.append(contour.ravel().tolist())
    return polygon
I receive
<ipython-input-43-84d17a601509> in rle_to_polygon(rle, height, width)
79 def rle_to_polygon(rle, height, width):
80 if isinstance(rle, list):
---> 81 rle = mask.frPyObjects(rle, height, width)
82 rle = mask.decode(rle)
83 contours = measure.find_contours(rle, 0.5)
pycocotools/_mask.pyx in pycocotools._mask.frPyObjects()
TypeError: object of type 'int' has no len()
Any suggestions would be highly appreciated!
Upvotes: 1
Views: 7554
Reputation: 455
This is my code for the task:
import copy
import logging

import cv2
from pycocotools import mask as cocomask


def rle_to_coco(annotation: dict) -> list[dict]:
    """Transform the rle coco annotation (a single one) into coco style.

    In this case, one mask can contain several polygons, later leading to several `Annotation` objects.
    In case of not having a valid polygon (the mask is a single pixel) it will be an empty list.

    Parameters
    ----------
    annotation : dict
        rle coco style annotation

    Returns
    -------
    list[dict]
        list of coco style annotations (in dict format)
    """
    rle = annotation["segmentation"]
    # Uncompressed RLE (counts is a list of ints) has to be compressed first;
    # a counts string is already compressed and can be decoded directly.
    if isinstance(rle["counts"], list):
        annotation["segmentation"] = cocomask.frPyObjects(
            rle, rle["size"][0], rle["size"][1]
        )

    maskedArr = cocomask.decode(annotation["segmentation"])
    contours, _ = cv2.findContours(maskedArr, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    segmentation = []
    for contour in contours:
        # A valid polygon needs at least 3 points, i.e. 6 coordinates
        if contour.size >= 6:
            segmentation.append(contour)

    if len(segmentation) == 0:
        logging.debug(
            f"Annotation with id {annotation['id']} is not valid, it has no segmentations."
        )
        annotations = []
    else:
        annotations = list()
        for i, seg in enumerate(segmentation):
            single_annotation = copy.deepcopy(annotation)
            single_annotation["segmentation_coords"] = (
                seg.astype(float).flatten().tolist()
            )
            single_annotation["bbox"] = list(cv2.boundingRect(seg))
            single_annotation["area"] = cv2.contourArea(seg)
            single_annotation["instance_id"] = annotation["id"]
            single_annotation["annotation_id"] = f"{annotation['id']}_{i}"
            annotations.append(single_annotation)

    return annotations
You need opencv and pycocotools to use this code:
pip install opencv-python
pip install pycocotools
Note that the input annotation is one of the items inside the coco dict annotations key. Something like this:
{
    "image_id": 1,
    "category_id": 1,
    "bbox": [...],
    "score": 0.8787025809288025,
    "segmentation": {
        "size": [...],
        "counts": "jXX>1mm03O1N10000000000001NY]mf0"
    },
    "id": 1,
    "iscrowd": 0,
    "attributes": {
        "occluded": false
    }
}
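A side note on the counts field, since it explains the reshape error in the question. In the example above counts is a compressed string, while in the question it is an uncompressed list of ints. As far as I know, that list is not a sequence of (start, length) pairs: it holds alternating run lengths of 0s and 1s, always starting with a run of 0s, over the mask flattened in column-major order. That is why its length can be odd (183 in the question) and reshape(-1, 2) fails. Below is a minimal pure-Python sketch of that decoding, only to illustrate the layout; the function name decode_uncompressed_rle and the toy data are made up for this example, and pycocotools does the same job much faster.

```python
def decode_uncompressed_rle(counts, size):
    """Decode COCO uncompressed RLE into a 2D list mask[row][col] of 0/1."""
    height, width = size
    if sum(counts) != height * width:
        raise ValueError("run lengths do not cover the whole image")
    flat = []
    value = 0  # runs alternate 0s, 1s, 0s, ... and always start with 0s
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value
    # The flat sequence is column-major: index = col * height + row.
    return [[flat[col * height + row] for col in range(width)]
            for row in range(height)]


# Toy 3x4 mask (hypothetical data): 5 zeros, 2 ones, 5 zeros.
# Note that the counts list has an odd length, just like in the question.
mask = decode_uncompressed_rle([5, 2, 5], size=[3, 4])
for row in mask:
    print(row)
# [0, 0, 1, 0]
# [0, 0, 0, 0]
# [0, 1, 0, 0]
```

The two foreground pixels sit at flat indices 5 and 6, which in column-major order are (row 2, col 1) and (row 0, col 2).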
If a mask in RLE format contains several regions that are not connected, the function returns a list with one COCO-format annotation per region.
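To get a JSON file that contains only polygons, the function still has to be applied to every RLE annotation and the results written back. Here is a rough sketch of that bookkeeping, under a few assumptions: the helper name replace_rle_annotations is made up, the converter is passed in as an argument (rle_to_coco from above would be the natural choice), the polygon is read from the segmentation_coords key that rle_to_coco fills, and annotation ids are simply renumbered, which is only safe if nothing else refers to the old ids.

```python
import copy


def replace_rle_annotations(coco, convert):
    """Return a copy of `coco` in which every RLE annotation is replaced by
    the polygon annotations produced by `convert` (e.g. rle_to_coco)."""
    result = copy.deepcopy(coco)
    converted = []
    for ann in result["annotations"]:
        if not isinstance(ann["segmentation"], dict):
            converted.append(ann)  # already a polygon, keep it untouched
            continue
        for piece in convert(ann):
            new_ann = dict(piece)
            # Move the flat [x1, y1, x2, y2, ...] list into the standard key,
            # wrapped in a list as COCO expects a list of polygons.
            new_ann["segmentation"] = [new_ann.pop("segmentation_coords")]
            new_ann["iscrowd"] = 0
            converted.append(new_ann)
    # Renumber ids so that annotations split from one mask stay unique
    # (assumption: nothing else refers to the old annotation ids).
    for new_id, ann in enumerate(converted, start=1):
        ann["id"] = new_id
    result["annotations"] = converted
    return result
```

With the question's variable names this would be called as new_coco = replace_rle_annotations(coco_data, rle_to_coco), after which new_coco can be written out with json.dump. An RLE mask that yields no valid polygon is dropped, because the converter returns an empty list for it.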
Hope it helps!
Upvotes: 3