Reputation: 1227
I am a Python newbie and somehow I cannot manage to get a simple histogram of a column in my dataframe. This is what df['col'].describe() returns:
count 2.905430e+05
mean 2.732126e+06
std 5.743739e+08
min 3.095194e-03
25% 2.341733e+03
50% 5.092117e+03
75% 1.092925e+04
max 2.089247e+11
Name: avg_power_in_w, dtype: float64
I tried:
df['col'].hist(bins=10)
plt.plot()
Some solutions suggested using np.histogram(...), but that does not feel natural.
Ideally I would like a bin size of e.g. 1000, with everything above 10000 collected in one bin.
Thanks, I'd appreciate a hint.
Upvotes: 0
Views: 182
Reputation: 1227
As mentioned in the comments, it seems that some outliers made the range of values too big, so almost everything ended up in a single bin. What worked for me was to cap the values before plotting:
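import matplotlib.pyplot as plt
# make a copy of the dataframe so the original data stays untouched
df_copy = df.copy()
# cap the values at 10000 (only in this column, not in the whole row)
df_copy.loc[df_copy['col'] > 10000, 'col'] = 10000
# then plot it as usual
df_copy['col'].hist(bins=10)
plt.show()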
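For the "bin size 1000, everything above 10000 in one bin" part of the question, here is a rough sketch of one way to do it, not necessarily the best one. The helper name clip_for_histogram, its parameters, and the sample values are made up for illustration. It assumes the column has no negative values, which matches the describe() output above. Here np.histogram is only used to check the counts without drawing anything.

```python
import numpy as np
import pandas as pd


def clip_for_histogram(values: pd.Series, bin_width: int = 1000, cap: int = 10000):
    """Hypothetical helper: cap values at `cap` and build fixed-width bin edges."""
    # Everything above the cap is set to the cap, so it falls into the last bin.
    clipped = values.clip(upper=cap)
    # Edges 0, 1000, ..., 10000, 11000: ten regular bins plus one overflow bin.
    # Note: values exactly equal to the cap also land in the overflow bin.
    edges = np.arange(0, cap + 2 * bin_width, bin_width)
    return clipped, edges


# Made-up sample data standing in for df['col'].
sample = pd.Series([0.003, 500.0, 2341.7, 5092.1, 10929.3, 2.089e11])
clipped, edges = clip_for_histogram(sample)

# Count the values per bin without plotting.
counts, _ = np.histogram(clipped, bins=edges)
print(counts)
```

To actually draw it, passing the same edges to the pandas call should work, i.e. clipped.hist(bins=edges) followed by plt.show(). The last bar then represents everything at or above 10000.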
Upvotes: 0