Reputation: 15
I have been learning machine learning in Python and am currently studying the basics. I'm now working on linear regression and trying to implement some mathematical formulas in Python code. I managed to write some of the formulas successfully, but there is one I'm having a hard time with: (X - Xmeans) * (Y - Ymeans). It always gives me the error "list indices must be integers or slices, not numpy.float64" when I try to print it.
I've looked for similar cases and their solutions on the web, but none of them worked.
import numpy
import matplotlib.pyplot as plt
X_positions = numpy.array([2,3,4,5,6])
y_positions = numpy.array([4,5,6,5,7])
plt.plot([X_positions], [y_positions], 'ro')
plt.axis([0,10,0,10])
X_means = sum(X_positions) / len(X_positions)
y_means = sum(y_positions) / len(y_positions)
plt.plot([X_means], [y_means], 'go')
plt.axis([0,10,0,10])
X_minus_X_means = []
y_minus_y_means = []
X_minus_X_means_squared = []
for i in X_positions:
    X_minus_X_means.append(i - X_means)
for i in y_positions:
    y_minus_y_means.append(i - y_means)
for i in X_minus_X_means:
    X_minus_X_means_squared.append(i ** 2)
X_minus_X_means_times_y_minus_y_means = []
#HERE IS THE PROBLEM
for i in X_minus_X_means and y_minus_y_means:
    X_minus_X_means_times_y_minus_y_means.append(X_minus_X_means[i] * y_minus_y_means[i])
Upvotes: 1
Views: 129
Reputation: 39072
While I completely support and prefer vectorized operations, you should also know about the built-in function zip, which is helpful when you are iterating over two (or more) lists in parallel. In your case, the problematic part can be changed to the following, which avoids the index i altogether:
for x, y in zip(X_minus_X_means, y_minus_y_means):
    X_minus_X_means_times_y_minus_y_means.append(x * y)
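As a rough, self-contained sketch of the zip approach, the snippet below rebuilds the two deviation lists from the question's data and multiplies them pairwise. The list comprehensions and the use of .mean() are assumptions made here to keep the example short; they are not part of the original answer.

```python
import numpy as np

# Data from the question
X_positions = np.array([2, 3, 4, 5, 6])
y_positions = np.array([4, 5, 6, 5, 7])

# Deviations from the mean, kept as plain Python lists as in the question
X_minus_X_means = [x - X_positions.mean() for x in X_positions]
y_minus_y_means = [y - y_positions.mean() for y in y_positions]

# zip pairs up the elements, so no index is needed
X_minus_X_means_times_y_minus_y_means = []
for x, y in zip(X_minus_X_means, y_minus_y_means):
    X_minus_X_means_times_y_minus_y_means.append(x * y)

print(X_minus_X_means_times_y_minus_y_means)
```

If the two lists had different lengths, zip would stop at the shorter one, which is worth keeping in mind when the inputs come from different sources.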
Upvotes: 0
Reputation: 532
Both X_minus_X_means and y_minus_y_means are lists. y_minus_y_means contains the values
[-1.4000000000000004,
-0.40000000000000036,
0.5999999999999996,
-0.40000000000000036,
1.5999999999999996]
Because the first list is non-empty, the expression X_minus_X_means and y_minus_y_means evaluates to its second operand, y_minus_y_means. So in for i in X_minus_X_means and y_minus_y_means: each value of i is one of the numbers above, of type numpy.float64. Inside the loop you then use i as an index into X_minus_X_means and y_minus_y_means, but i is a float value rather than an integer position, which raises the error.
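A minimal sketch of what happens in that loop header, using two small made-up lists (a and b are hypothetical names, not from the question): the and operator returns one of its operands instead of combining them, and indexing a list with a float raises a TypeError.

```python
# Two small stand-in lists (hypothetical values for illustration)
a = [1.5, 2.5]
b = [10.0, 20.0]

# "a and b" returns b when a is non-empty; it does not pair the lists up
result = a and b
print(result is b)

# An empty first operand is falsy, so it is returned instead
print([] and b)

# Using a float taken from the list as an index fails
try:
    a[b[0]]
except TypeError as exc:
    message = str(exc)
    print(message)
```

With NumPy scalars the message names numpy.float64 instead of float, which matches the error reported in the question.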
Upvotes: 0
Reputation: 1537
You should really just use NumPy's built-in vectorized operations when possible.
Try something like this:
import numpy as np
import matplotlib.pyplot as plt
X_positions = np.array([2,3,4,5,6])
y_positions = np.array([4,5,6,5,7])
plt.plot([X_positions], [y_positions], 'ro')
plt.axis([0,10,0,10])
X_means = X_positions.mean()
y_means = y_positions.mean()
plt.plot([X_means], [y_means], 'go')
plt.axis([0,10,0,10])
X_minus_X_means = X_positions-X_means
y_minus_y_means = y_positions-y_means
X_minus_X_means_squared = X_minus_X_means**2
X_minus_X_means_times_y_minus_y_means = X_minus_X_means*y_minus_y_means
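Assuming the goal of these intermediate arrays is the usual least-squares line (the question does not state this explicitly), a sketch of the remaining step might look like the following. The names slope and intercept are introduced here for illustration, and np.polyfit is used only as a cross-check.

```python
import numpy as np

X_positions = np.array([2, 3, 4, 5, 6])
y_positions = np.array([4, 5, 6, 5, 7])

# Vectorized deviations from the mean
X_minus_X_means = X_positions - X_positions.mean()
y_minus_y_means = y_positions - y_positions.mean()

# Assumed least-squares formulas:
# slope = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean) ** 2)
slope = (X_minus_X_means * y_minus_y_means).sum() / (X_minus_X_means ** 2).sum()
intercept = y_positions.mean() - slope * X_positions.mean()

# Cross-check against NumPy's degree-1 polynomial fit
check_slope, check_intercept = np.polyfit(X_positions, y_positions, 1)

print(slope, intercept)
```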
Upvotes: 3
Reputation: 530
Instead of
for i in X_minus_X_means and y_minus_y_means:
try to write
for i in range(len(X_minus_X_means)):
Otherwise, i is not an integer and cannot be used as an index.
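A short sketch of the index-based loop, using rounded stand-in values for the two deviation lists (the literals below are assumptions chosen to approximate the question's data):

```python
# Rounded stand-ins for the deviation lists from the question
X_minus_X_means = [-2.0, -1.0, 0.0, 1.0, 2.0]
y_minus_y_means = [-1.4, -0.4, 0.6, -0.4, 1.6]

X_minus_X_means_times_y_minus_y_means = []
# range(len(...)) yields the integer positions 0, 1, 2, 3, 4
for i in range(len(X_minus_X_means)):
    X_minus_X_means_times_y_minus_y_means.append(
        X_minus_X_means[i] * y_minus_y_means[i]
    )

print(X_minus_X_means_times_y_minus_y_means)
```

Here i is always an int, so it is a valid list index for both lists as long as they have the same length.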
Upvotes: 1
Reputation: 51934
Perhaps the division is yielding a float, which is causing the index to be a non-integer?
X_means = sum(X_positions) / len(X_positions)
For integer division in Python 3, the double slash operator // is available:
X_means = sum(X_positions) // len(X_positions)
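You could also use ceil, floor, round, or int(val). Keep in mind that rounding the mean changes the deviations computed from it, so the results will differ from those obtained with the true mean.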
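To see what the two operators actually return, here is a small sketch on the question's y data (true_mean and floored_mean are names made up for this example):

```python
import numpy as np

y_positions = np.array([4, 5, 6, 5, 7])

# True division keeps the fractional part of the mean
true_mean = sum(y_positions) / len(y_positions)

# Floor division discards it
floored_mean = sum(y_positions) // len(y_positions)

print(true_mean, floored_mean)

# The deviations from the two means are not the same
print(y_positions - true_mean)
print(y_positions - floored_mean)
```

The deviations from the floored mean are integers, but they no longer describe the distance from the actual mean of the data.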
Upvotes: 0