Reputation: 11
I'm trying to find the matching score between the boundary of shapes that do not exactly match. I use Fourier descriptor to represent these boundaries and find the similarity based on DTW as follows: (X,Y)
is the coordinate points of the boundary sorted in clockwise order.
Z = complex(X, Y);
FD = fft(Z);
FD(1) = 0;
FD = FD/FD(2);
Then for similarity measure:
Dist = dtw(abs(FD1),abs(FD2));
The similarity result is not accurate therefore I would like to check if I applied the Fourier descriptor correctly and any recommendation of a similarity measure.
When I try ifft
to get the shape back, the resulting shape is the reflection of the original shape. How do I solve this issue?
Upvotes: 1
Views: 1165
Reputation: 60760
Note the normalizations you apply to the Fourier components:
FD = fft(Z);
FD(1) = 0; % ignore translation
FD = FD/FD(2); % normalize for size, and rotate
FD(1)
is the DC component, and encodes the translation. FD(2)
is the lowest frequency, it represents (together with FD(end)
) a single sine + cosine component for the outline. With this component you can represent an ellipse. The phase encodes the orientation of the ellipse, the magnitude encodes the size, and the difference in magnitude between FD(2)
and FD(end)
represents the aspect ratio. By dividing by FD(2)
, you normalize for size and impose a specific rotation determined only by the best fit ellipse. This explains why your shape seemed reflected when using the IFFT, it gets rotated to some standard orientation.
In your error measure, you compare only the magnitude of the components of the two shapes. By ignoring the phase, you throw away a lot of what makes the shape. For example, try plotting the IFFT of the magnitude of the Fourier descriptor. You'll see something that does not at all look like your original shape.
However, this is the common thing to do, since it is the easiest way of obtaining rotation invariance. The standardization of the orientation based on the best fit ellipse is very noise-sensitive (with noise being e.g. the location of the samples along the contour). In fact, the value of most of the Fourier descriptor elements is very sensitive to these locations, if the contour is sampled at different locations, the Fourier descriptor changes significantly.
Thus, ideally, the error measure takes both magnitude and phase into account (or both real and imaginary components). But, in practice, it is hard to use the phase information and people mostly compare amplitudes, even if many different shapes will thereby become similar.
Next, since FD(1)
was set to 0 and FD(2)
to 1, these two components are not useful. Other than that, you should look only at the first few components (let's say n
):
FD(1) % ignore, set to 0 above
FD(2) % ignore, set to 1 above
FD(3)
FD(4)
...
FD(n+2)
These frequency components have a related element, which can be found at:
FD(1) -> none
FD(2) -> FD(end)
FD(3) -> FD(end-1)
FD(4) -> FD(end-2)
...
FD(n+2) -> FD(end-n)
I can imagine most people simply ignoring those, but I would certainly include them into the comparison, as they provide additional relevant information.
This leads to 2*n-1
numbers (3:n+2
and end-n:end
). n
is typically 10, but do experiment with smaller values here as well.
The simplest way to compare these is to compute the Euclidean distance, though I'm sure other distance measures can be useful too.
TL;DR:
The dtw
distance measure seems wrong to me, try using a normal Euclidean distance (mean square difference). Further, compare only the first ~10 components, subsequent components will add more noise than useful information. But leave out the first two elements, since these have been make useless by the normalization.
Upvotes: 0