Reputation: 475
I have two "set" of graphs (that is, "xy" graphs on the Cartesian plane). All are similar in shape— effectively noise, except for a "peak" at some unknown point along the horizontal axis in each. For example:
In practice, these graphs consist of two C
-style arrays (of type float
), storing the points along the x
and y
axis, respectively. My goal is to pair the graphs in each "set" such that the peaks in each pair align as closely as possible with respect to the horizontal axis (that is, I want to iterate over one "set" and find the "best match" from the second "set"). I'm not concerned about the "breadth" of the peak, only that they are "well aligned".
In practice, I'm working with hundreds of thousands of such graphs and need to "pair them up" efficiently, time-wise. I've tried a few things, such as sorting the graphs based on the mean of their x
or y
coordinates, but didn't end up with "nicely-aligned" pairs (where I know that better pairings are possible).
For example, this pairing is not very "well aligned" (the pair consists of one graph, in yellow, and the pairing, in blue):
We can assume that the peaks are nearly always of roughly the same "height". The horizontal axis of the graphs are roughly the same range, but are of the precisely same "length".
EDIT:
It may help to clarify that I know that the peaks of the graphs will fall at approximately one of a handful of points along the x-axis (e.g. they may center roughly on x=5
, x=10
, or x=12
). I don't, unfortunately, know where before-hand, nor precisely how many "points" there are (although on the order of 3 or 4). The goal is to group the graphs according to which point they (roughly) center on.
Upvotes: 0
Views: 382
Reputation: 39277
Cross-correlation may be sufficient for what you describe. You can either slide one across the other summing the product of the values at the matching points to find the best alignment, or if as you describe you have a few known points at which the peak will occur, creating those known graph curves and then sliding them some delta amount left or right to see which of the peaks they align with.
As for efficiency, yes this sounds slow: thousands of sums of products, but modern CPUs are incredibly quick at processing such 'dumb' algorithms and may be able to vectorize the code to execute quickly.
Upvotes: 2