AndreasInfo

Reputation: 1227

Can't manage to plot a histogram of pandas series

I am a Python newbie and somehow I cannot manage to get a simple histogram of a column in my dataframe. This is what df['col'].describe() returns:

count    2.905430e+05
mean     2.732126e+06
std      5.743739e+08
min      3.095194e-03
25%      2.341733e+03
50%      5.092117e+03
75%      1.092925e+04
max      2.089247e+11
Name: avg_power_in_w, dtype: float64

I tried:

df['col'].hist(bins=10)
plt.plot()

which results in: [image: resulting histogram]

Some solutions suggested using np.histogram(...), but that does not feel natural.

Ideally, I would like a bin size of e.g. 1000, with everything above 10000 collected in one bin.

Thanks, I'd appreciate a hint.

Upvotes: 0

Views: 182

Answers (1)

AndreasInfo

Reputation: 1227

As mentioned in the comments, it seems that some outliers made the range of values too big. So the approach that worked was to cap the values before plotting:

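import matplotlib.pyplot as plt

# make a copy of the dataframe, so the original data stays untouched
df_copy = df.copy()

# cap the values in the column at 10000 (only that column, only in the copy)
df_copy.loc[df_copy['col'] > 10000, 'col'] = 10000

# then plot it as usual
df_copy['col'].hist(bins=10)
plt.show()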
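
To get the layout asked for in the question (bins of width 1000, with everything above 10000 in a single bin), one option is to combine Series.clip with explicit bin edges. This is only a minimal sketch: the sample data is made up to stand in for df['col'], and the names cap, bin_width, clipped and edges are introduced here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for df['col']:
# mostly moderate values plus two huge outliers
rng = np.random.default_rng(0)
values = np.concatenate([rng.uniform(0, 9000, size=1000), [5e6, 2e11]])
df = pd.DataFrame({"col": values})

cap = 10000       # values above this are collected in the last bin
bin_width = 1000

# clip returns a new Series, so df itself stays untouched
clipped = df["col"].clip(upper=cap)

# Edges 0, 1000, ..., 10000, 11000: the final bin [10000, 11000] holds the capped values
edges = np.arange(0, cap + 2 * bin_width, bin_width)

ax = clipped.hist(bins=edges)
ax.set_xlabel("col (values above 10000 capped)")
ax.set_ylabel("count")
```

In an interactive session, drop the matplotlib.use("Agg") line and call plt.show() at the end. Note that values exactly equal to 10000 also land in the last bin.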

Upvotes: 0
