Average between arrays of different length

Question

I'm trying to build a very simple machine learning example that recognizes similarity between arrays. To do that, I need to calculate the average of 2 arrays of different lengths.

For example if I have:

array_1 = [0, 4, 5];
array_2 = [4, 2, 7];

The average is:

average_array = [2, 3, 6];

But how can I manage to calculate the average if I have the following situation:

array_1 = [0, 4, 5, 10, 7];
array_2 = [4, 2, 7];

As you can see, the arrays have different lengths. Is there an algorithm that I can apply to solve this problem? Does anyone have an idea or a suggestion?

Of course I can consider the missing values of the second array as 0, and evaluate the average as, for example:

average_array = [2, 3, 6, 5, 3.5];

or consider the values as "null" and have:

average_array = [2, 3, 6, 10, 7];

But are these two approaches good? Or is there something smarter?
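For reference, the two fallbacks described above can be sketched in Python roughly as follows. The function names are made up for illustration, and this is only one way to write each option, not a recommendation of either.

```python
from itertools import zip_longest


def average_zero_padded(a, b):
    """Treat missing positions in the shorter list as 0 (hypothetical helper)."""
    return [(x + y) / 2 for x, y in zip_longest(a, b, fillvalue=0)]


def average_ignore_missing(a, b):
    """Average only the values actually present at each position (hypothetical helper)."""
    result = []
    for x, y in zip_longest(a, b, fillvalue=None):
        present = [v for v in (x, y) if v is not None]
        result.append(sum(present) / len(present))
    return result


array_1 = [0, 4, 5, 10, 7]
array_2 = [4, 2, 7]

print(average_zero_padded(array_1, array_2))     # [2.0, 3.0, 6.0, 5.0, 3.5]
print(average_ignore_missing(array_1, array_2))  # [2.0, 3.0, 6.0, 10.0, 7.0]
```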

Thanks for your help!!

Sumeet · Accepted Answer

If, after taking the average of the arrays, you intend to take the absolute value (mod) of the difference between each array and the average array, then you are probably headed in the right direction, as long as you measure dissimilarity by the magnitude of that difference.
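As a rough sketch of that idea, using the equal-length example from the question: the element-wise absolute differences can be summed into a single score. Summing them is an assumption made here for illustration, since the answer does not say how to aggregate, and the name `dissimilarity` is hypothetical.

```python
def dissimilarity(array, average_array):
    """Sum of element-wise absolute differences (aggregation by sum is an assumption).

    zip() stops at the shorter input, so extra trailing elements are ignored here.
    """
    return sum(abs(x - m) for x, m in zip(array, average_array))


array_1 = [0, 4, 5]
array_2 = [4, 2, 7]
average_array = [2, 3, 6]

print(dissimilarity(array_1, average_array))  # 4
print(dissimilarity(array_2, average_array))  # 4
```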

But for arrays of different lengths, I propose that you also take the index of the extra elements into consideration.

For

array_1 = [0, 4, 5, 10, 7];
array_2 = [4, 2, 7];

average should be average_array = [2, 3, 6, 6.5, 5.5];

6.5 = (10 + 3 (zero-based index) + 0 (missing element)) / 2

and

5.5 = (7 + 4 (zero-based index) + 0 (missing element)) / 2
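A minimal Python sketch of this index-based rule, assuming zero-based indices and that the missing element counts as 0, as in the two calculations above. The function name is hypothetical.

```python
def average_with_index(a, b):
    """Average two lists; for positions only the longer list has,
    add the zero-based index to the value before halving."""
    longer, shorter = (a, b) if len(a) >= len(b) else (b, a)
    result = []
    for i, value in enumerate(longer):
        if i < len(shorter):
            # Both lists have a value here: plain average.
            result.append((value + shorter[i]) / 2)
        else:
            # Extra element: value + index + 0 (missing element), divided by 2.
            result.append((value + i + 0) / 2)
    return result


array_1 = [0, 4, 5, 10, 7]
array_2 = [4, 2, 7]

print(average_with_index(array_1, array_2))  # [2.0, 3.0, 6.0, 6.5, 5.5]
```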

The reason for taking the index into consideration is that this approach also accounts for the difference in length. However, this is just my 2 cents. Maybe there are better algorithms out there.

You should also take a look at this post

Average between arrays of different length
