Reputation: 2046
I have two lists, one of which consists of x coordinates and the other consists of y coordinates.
x_coordinates = [1, 2, 3, 4, 5]
y_coordinates = [1, 2, 3, 4, 5]
For example, point 1 is (1, 1).
I want to calculate the covariance of the two lists. I have written code for it, but I think it is unnecessarily long and messy. I know I could compute this with a library function such as numpy.cov, but I wonder whether it can be written neatly in plain Python, maybe with map and lambda functions.
The formula is like this:
((x1 - average_x)*(y1 - average_y) + ... + (xn - average_x)*(yn - average_y)) / n, where n is the number of items in each list
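As a quick check of the formula on the example lists above: both means are 3 and n = 5, so the sum works out as follows (a worked sketch only, using the sample data from this post):

```latex
\frac{(1-3)(1-3) + (2-3)(2-3) + (3-3)(3-3) + (4-3)(4-3) + (5-3)(5-3)}{5}
= \frac{4 + 1 + 0 + 1 + 4}{5} = 2
```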
The code:
import math
x_coordinates = [1, 2, 3, 4, 5]
y_coordinates = [1, 2, 3, 4, 5]
covariance_temp_sum = 0
x_mean = math.fsum(x_coordinates) / len(x_coordinates)
y_mean = math.fsum(y_coordinates) / len(y_coordinates)
for n in range(len(x_coordinates)):
    covariance_temp_sum += (x_coordinates[n] - x_mean) * (y_coordinates[n] - y_mean)
covariance = covariance_temp_sum / len(x_coordinates)
Upvotes: 1
Views: 5426
Reputation: 4866
If you don't want to use external modules, you can approach it like this:
x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 4, 5]
mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / len(x)
print(cov)
Output: 2.0
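Since the question asks specifically about map and lambda, the same computation could also be sketched that way. This is only an illustration, assuming Python 3; the names products and cov_map are made up for the example. Passing two iterables to map hands one item from each to the lambda, so zip is not needed.

```python
x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# map() with two iterables calls the lambda with one item from each list,
# yielding the product of deviations for every point.
products = map(lambda a, b: (a - mean_x) * (b - mean_y), x, y)

# Divide by n (population covariance), matching the formula in the question.
cov_map = sum(products) / len(x)
print(cov_map)  # 2.0 on Python 3
```

The generator expression with zip is usually considered more readable than map with a lambda, but both do the same work in a single pass.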
Upvotes: 6
Reputation: 8131
You can do this concisely using a generator expression, zip, and tuple unpacking:
sum((x - x_mean)*(y - y_mean) for x, y in zip(x_coordinates, y_coordinates)) / len(x_coordinates)
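The one-liner above relies on x_mean and y_mean already being defined, as in the question's code. A self-contained version might look like the sketch below. The function name population_covariance and the length and emptiness checks are assumptions added for illustration (zip silently stops at the shorter list, so a mismatch would otherwise go unnoticed).

```python
def population_covariance(xs, ys):
    """Population covariance (divides by n) of two equal-length sequences."""
    # Hypothetical guard: zip() would silently truncate to the shorter input.
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    if not xs:
        raise ValueError("need at least one point")
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum the products of deviations pairwise, then divide by n.
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / n


x_coordinates = [1, 2, 3, 4, 5]
y_coordinates = [1, 2, 3, 4, 5]
print(population_covariance(x_coordinates, y_coordinates))  # 2.0 on Python 3
```

Note that this divides by n, as in the question's formula. Library routines such as statistics.covariance (Python 3.10+) and numpy.cov with its default settings divide by n - 1 instead, so they would report 2.5 for the same data.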
Upvotes: 2