gustavz

Reputation: 3170

Bounding box definition for OpenCV object tracking

How is the bounding box that OpenCV's tracker.init() function takes defined? Is it (xcenter, ycenter, boxwidth, boxheight), (xmin, ymin, xmax, ymax), (ymin, xmin, ymax, xmax), or something completely different?

I am using Python and OpenCV 3.3, and I basically do the following for each object I want to track, for each frame of a video:

tracker = cv2.TrackerKCF_create()
ok = tracker.init(previous_frame, bbox)
# update() returns a (success flag, bounding box) pair
ok, bbox = tracker.update(current_frame)

Upvotes: 7

Views: 8354

Answers (2)

Dan Mašek

Reputation: 19071

The other post states the answer as a fact, so let's look at how to figure it out on your own.

The Python version of OpenCV is a wrapper around the main C++ API, so when in doubt, it's always useful to consult either the main documentation, or even the source code. There is a short tutorial providing some basic information about the Python bindings.

First, let's look at cv::TrackerKCF. The init member takes the bounding box as an instance of cv::Rect2d (i.e. a variant of cv::Rect_ which represents the parameters using double values):

bool cv::Tracker::init(InputArray image, const Rect2d& boundingBox)

Now, the question is: how is a cv::Rect2d (or, in general, any of the variants of cv::Rect_) represented in Python? I haven't found any part of the documentation that states this clearly (although I think it's hinted at in the tutorials), but there is some useful information in the bindings tutorial mentioned earlier:

...
But there may be some basic OpenCV datatypes like Mat, Vec4i, Size. They need to be extended manually. For example, a Mat type should be extended to Numpy array, Size should be extended to a tuple of two integers etc.
...
All such manual wrapper functions are placed in modules/python/src2/cv2.cpp.

Not much, so let's look at the code they point us at. Lines 941-954 are what we're after:

template<>
bool pyopencv_to(PyObject* obj, Rect2d& r, const char* name)
{
    (void)name;
    if(!obj || obj == Py_None)
        return true;
    return PyArg_ParseTuple(obj, "dddd", &r.x, &r.y, &r.width, &r.height) > 0;
}

template<>
PyObject* pyopencv_from(const Rect2d& r)
{
    return Py_BuildValue("(dddd)", r.x, r.y, r.width, r.height);
}

The PyArg_ParseTuple call in the first function is quite self-explanatory: it expects a 4-tuple of double (floating-point) values, in the order x, y, width and height, where (x, y) is the top-left corner of the rectangle.
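To make the ordering concrete, here is a rough pure-Python sketch of what those two converters do with the tuple. It is only an illustration: Rect2d below is a hypothetical namedtuple standing in for cv::Rect2d, and parse_rect2d / build_rect2d are made-up names, not functions of the cv2 module. The strict tuple check is a simplification of what PyArg_ParseTuple does.

```python
from collections import namedtuple

# Hypothetical stand-in for cv::Rect2d; this is NOT part of the cv2 module.
Rect2d = namedtuple("Rect2d", ["x", "y", "width", "height"])


def parse_rect2d(obj):
    """Roughly mimic pyopencv_to: read four doubles in x, y, width, height order."""
    if obj is None:
        # The C++ converter returns early for None and leaves the rect untouched.
        return None
    if not isinstance(obj, tuple) or len(obj) != 4:
        raise TypeError("expected a 4-tuple (x, y, width, height)")
    # The "dddd" format string converts each item to a C double.
    return Rect2d(*(float(v) for v in obj))


def build_rect2d(r):
    """Roughly mimic pyopencv_from: hand back a plain 4-tuple of floats."""
    return (r.x, r.y, r.width, r.height)


rect = parse_rect2d((10, 20, 30, 40))
print(rect)                # fields are named in the order the binding reads them
print(build_rect2d(rect))  # same order on the way back to Python
```

So a tuple such as (10, 20, 30, 40) would be read as x=10, y=20, width=30, height=40, and the box coming back from the tracker uses the same order.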

Upvotes: 6

gustavz

Reputation: 3170

The answer is: (xmin, ymin, boxwidth, boxheight)
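If your boxes come in one of the other layouts mentioned in the question, a small conversion step is enough. The helpers below are a minimal sketch; the function names are my own and not part of OpenCV, and they assume the usual image coordinates with the origin at the top-left.

```python
def from_corners(xmin, ymin, xmax, ymax):
    """(xmin, ymin, xmax, ymax) -> (xmin, ymin, boxwidth, boxheight)."""
    return (xmin, ymin, xmax - xmin, ymax - ymin)


def from_center(xcenter, ycenter, boxwidth, boxheight):
    """(xcenter, ycenter, boxwidth, boxheight) -> (xmin, ymin, boxwidth, boxheight)."""
    return (xcenter - boxwidth / 2.0, ycenter - boxheight / 2.0, boxwidth, boxheight)


def to_corners(bbox):
    """(xmin, ymin, boxwidth, boxheight) -> (xmin, ymin, xmax, ymax)."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


# A box given by its corners, converted to the layout the tracker expects.
bbox = from_corners(10, 20, 50, 80)
print(bbox)
```

The resulting tuple is what you would pass as bbox to tracker.init(); to_corners() goes the other way, for example when drawing the box returned by tracker.update().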

Upvotes: 15
