Reputation: 95
I have a DataFrame with zone-wise sales records, and I need to cluster the rows based on average sales.
Zone Consumption
North 1
South 3
East 10
North 8
North2 0
South 5
I used the code below:
def Clustering(row):
    if row['Consumption']<.5*np.mean(['Consumption']):
        val='E'
    elif row['Consumption']<.75*np.mean(['Consumption']):
        val='D'
    elif row['Consumption']<1*np.mean(['Consumption']):
        val='C'
    elif row['Consumption']<1.5*np.mean(['Consumption']):
        val='B'
    elif row['Consumption']<2.5*np.mean(['Consumption']):
        val='A'
    else:
        val='Z'
    return val
Traceback
<ipython-input-21-f08d8263edc0> in Clustering(row)
1 def Clustering(row):
----> 2 if row['Consumption']<.5*np.mean(['Consumption']):
3 val='E'
4 elif row['Consumption']<.75*np.mean(['Consumption']):
5 val='D'
<__array_function__ internals> in mean(*args, **kwargs)
~\anaconda3\lib\site-packages\numpy\core\fromnumeric.py in mean(a, axis, dtype, out, keepdims)
3333
3334 return _methods._mean(a, axis=axis, dtype=dtype,
-> 3335 out=out, **kwargs)
3336
3337
~\anaconda3\lib\site-packages\numpy\core\_methods.py in _mean(a, axis, dtype, out, keepdims)
149 is_float16_result = True
150
--> 151 ret = umr_sum(arr, axis, dtype, out, keepdims)
152 if isinstance(ret, mu.ndarray):
153 ret = um.true_divide(
TypeError: cannot perform reduce with flexible type
My assumption was that the error is caused by the Sales column containing some str values, but that isn't the case. How should I go about fixing this?
Upvotes: 0
Views: 127
Reputation: 93181
Have you tried pd.cut? Assuming df['Consumption'].mean() > 0 (the bin edges must be strictly increasing):
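# Define the bins, open-ended on both sides with -inf and +inf
# (np.NINF was removed in NumPy 2.0, so -np.inf is used instead)
bins = np.array([.5, .75, 1, 1.5, 2.5]) * df['Consumption'].mean()
bins = np.hstack((-np.inf, bins, np.inf))
# right=False makes each bin [low, high), matching the strict < comparisons in the question
df['Cluster'] = pd.cut(df['Consumption'], bins, right=False, labels=list('EDCBAZ')).astype('str')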
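As for the error itself: np.mean(['Consumption']) averages a list holding the literal string 'Consumption', not the column, so NumPy ends up trying to reduce a string array. The column does not need to contain any str values for this to fail. Here is a minimal sketch on the sample data from the question, comparing a corrected row-wise function with pd.cut. The names clustering, Cluster_apply and Cluster_cut are made up for illustration, and right=False is my assumption that the strict < semantics of the original function should be kept.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Zone": ["North", "South", "East", "North", "North2", "South"],
    "Consumption": [1, 3, 10, 8, 0, 5],
})

# Compute the column mean once, from the column itself (not from a list of strings)
mean = df["Consumption"].mean()

def clustering(value, mean):
    """Hypothetical corrected version of the question's function."""
    if value < 0.5 * mean:
        return "E"
    elif value < 0.75 * mean:
        return "D"
    elif value < 1 * mean:
        return "C"
    elif value < 1.5 * mean:
        return "B"
    elif value < 2.5 * mean:
        return "A"
    return "Z"

# Row-wise approach: works, but calls a Python function for every value
df["Cluster_apply"] = df["Consumption"].apply(lambda v: clustering(v, mean))

# Vectorized approach: same thresholds expressed as bin edges
edges = np.array([0.5, 0.75, 1, 1.5, 2.5]) * mean
bins = np.concatenate(([-np.inf], edges, [np.inf]))
df["Cluster_cut"] = pd.cut(
    df["Consumption"], bins=bins, labels=list("EDCBAZ"), right=False
).astype(str)

print(df)
```

Both columns should agree on this data. On a large frame the pd.cut version is usually preferable, since it avoids the per-row Python call.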
Upvotes: 1