Reputation: 11
This code is supposed to recalculate the centroids, but it sometimes returns null (NaN) values. The problem started when I moved to large datasets; it had been working fine before that.
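import numpy as np

def Calc_New_cent(centroids, Data):
    # k and number_of_features are defined elsewhere in the script.
    for m in range(k):
        for n in range(number_of_features):
            # Mean of feature column 'n + 1' over the rows assigned to cluster m + 1.
            centroids.iloc[m, n] = np.mean(Data[Data['clusters'] == m + 1]['{}'.format(n + 1)])
    return centroids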
Upvotes: 0
Views: 113
Reputation: 21
If it works fine on small datasets, then the problem may be the large datasets themselves. They often contain some invalid data, which can result in NaN values. You should check the dataset.
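One way to check the dataset is sketched below. It is only an illustration: the helper name `diagnose` and the toy data are hypothetical, and it assumes the layout from the question (feature columns named '1', '2', ... and a 'clusters' column with labels 1..k). It counts NaN values per feature column and also lists clusters that have no rows assigned. The second check is an addition not raised above: the mean of an empty selection is NaN, so an empty cluster is another possible source of NaN centroids.

```python
import numpy as np
import pandas as pd

def diagnose(data, k, cluster_col="clusters"):
    """Return (NaN count per feature column, labels of clusters with no rows)."""
    # Count missing values in every column except the cluster label column.
    nan_counts = data.drop(columns=[cluster_col]).isna().sum()
    # Count rows per cluster label; labels that never occur are absent here.
    sizes = data[cluster_col].value_counts()
    # Labels are assumed to run from 1 to k, as in the question's code.
    empty = [c for c in range(1, k + 1) if sizes.get(c, 0) == 0]
    return nan_counts, empty

# Hypothetical toy data mimicking the question's column naming.
data = pd.DataFrame({
    "1": [1.0, 2.0, np.nan, 4.0],
    "2": [0.5, 1.5, 2.5, 3.5],
    "clusters": [1, 1, 2, 2],
})

nan_counts, empty_clusters = diagnose(data, k=3)
print(nan_counts)       # NaN count for each feature column
print(empty_clusters)   # cluster labels with no assigned rows
```

If either result is non-empty, cleaning the affected rows or re-initialising the empty cluster's centroid before recomputing would be a reasonable next step.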
Upvotes: 1