maynull

Reputation: 2046

Is there any better way to calculate the covariance of two lists than this?

I have two lists, one of which consists of x coordinates and the other consists of y coordinates.

x_coordinates = [1, 2, 3, 4, 5]
y_coordinates = [1, 2, 3, 4, 5]

For example, the first point is (1, 1).

I want to calculate the covariance of the two lists. I have written some code for it, but I think it is unnecessarily long and messy. I know that I can calculate this with a library function such as numpy.cov (the math module has no cov function), but I wonder whether it can be written neatly in plain Python, maybe with map and lambda functions.

The formula is like this:

((x1 - average_x)*(y1 - average_y) + ... + (xn - average_x)*(yn - average_y)) / n, where n is the number of items in each list

The code:

import math

x_coordinates = [1, 2, 3, 4, 5]
y_coordinates = [1, 2, 3, 4, 5]
covariance_temp_sum = 0

x_mean = math.fsum(x_coordinates) / len(x_coordinates)
y_mean = math.fsum(y_coordinates) / len(y_coordinates)

for n in range(len(x_coordinates)):
    covariance_temp_sum += (x_coordinates[n] - x_mean) * (y_coordinates[n] - y_mean)

covariance = covariance_temp_sum / len(x_coordinates)

Upvotes: 1

Views: 5426

Answers (2)

Carles Mitjans

Reputation: 4866

If you don't want to use external modules, you can approach it like this:

x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 4, 5]
mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

sum((a - mean_x) * (b - mean_y) for (a,b) in zip(x,y)) / len(x)

Output (Python 3): 2.0
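If the calculation is needed in more than one place, the same approach can be wrapped in a function. This is only a sketch: the function name `covariance` and the input checks are assumptions added for illustration, not part of the original answer. It divides by n (population covariance), matching the formula in the question, and uses math.fsum as the question's code does.

```python
import math


def covariance(xs, ys):
    """Population covariance of two equal-length sequences (divides by n)."""
    n = len(xs)
    # Assumed validation: the original snippets do not check their inputs.
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n == 0:
        raise ValueError("need at least one data point")

    mean_x = math.fsum(xs) / n
    mean_y = math.fsum(ys) / n
    # Sum the products of paired deviations from the means, then average.
    return math.fsum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / n


print(covariance([1, 2, 3, 4, 5], [1, 2, 3, 4, 5]))  # 2.0
```

The length check matters because zip silently stops at the shorter list, which would otherwise give a wrong result without any error.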

Upvotes: 6

Denziloe

Reputation: 8131

You can do these things elegantly using comprehensions, zip, and multiple variable assignment:

sum((x - x_mean)*(y - y_mean) for x, y in zip(x_coordinates, y_coordinates)) / len(x_coordinates)
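Since the question asks about map and lambda specifically, here is a sketch of that variant, assuming Python 3 and the variable names from the question. map accepts several iterables and passes one item from each to the lambda, so zip is not needed. Most readers will probably find the generator expression easier to follow.

```python
x_coordinates = [1, 2, 3, 4, 5]
y_coordinates = [1, 2, 3, 4, 5]

n = len(x_coordinates)
x_mean = sum(x_coordinates) / n
y_mean = sum(y_coordinates) / n

# map pairs the items of both lists and applies the lambda to each pair.
covariance = sum(
    map(lambda x, y: (x - x_mean) * (y - y_mean), x_coordinates, y_coordinates)
) / n

print(covariance)  # 2.0
```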

Upvotes: 2
