Bamboo

Reputation: 41

Boxplots from count table in Python

I have a count table as a DataFrame in Python and I want to plot its distribution as a boxplot. E.g.:

df = pandas.DataFrame({'Quality': [29,30,31,32,33,34,35,36,37,38,39,40], 'Count': [3,38,512,2646,9523,23151,43140,69250,107597,179374,840596,38243]})

I 'solved' it by repeating each quality value by its count. But I don't think it's a good way, and my dataframe is getting very, very big.
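For reference, the repetition workaround described above might look roughly like the sketch below. The asker's actual code is not shown, so the use of `numpy.repeat` and the variable name `expanded` are assumptions. It illustrates the memory problem: the 12-row table expands to one array element per observation.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'Quality': [29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40],
    'Count': [3, 38, 512, 2646, 9523, 23151, 43140, 69250,
              107597, 179374, 840596, 38243],
})

# Repeat every Quality value Count times (assumed reconstruction of the workaround).
expanded = np.repeat(df['Quality'].to_numpy(), df['Count'].to_numpy())

# The expanded array has as many elements as the sum of all counts.
n_observations = len(expanded)

fig, ax = plt.subplots()
ax.boxplot(expanded)
ax.set_ylabel('Quality')
```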

In R it is a one-liner:

ggplot(df, aes(x=1,y=Quality,weight=Count)) + geom_boxplot()

This will output: [image: boxplot from R]

My aim is to compare the distributions of different groups, and it should look like this. Can Python solve it like this too?

Upvotes: 4

Views: 2953

Answers (1)

datahero

Reputation: 101

What are you trying to look at here? The boxplot code below plots Quality grouped by Count.


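import matplotlib.pyplot as plt
import pandas as pd
# The next line is an IPython/Jupyter magic; leave it out in a plain Python script
%matplotlib inline
# DataFrame.from_items is no longer available in current pandas; build the frame from a dict instead
df = pd.DataFrame({'Quality': [29,30,31,32,33,34,35,36,37,38,39,40], 'Count': [3,38,512,2646,9523,23151,43140,69250,107597,179374,840596,38243]})
plt.figure()
# One box per distinct Count value
df_box = df.boxplot(column='Quality', by='Count', return_type='axes')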

If you want to look at your Quality distribution weighted by Count, you can try plotting a histogram:

plt.figure()
df_hist = plt.hist(df.Quality, bins=10, range=None, density=False, weights=df.Count)

[image: histogram]
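If a box plot is really what you need, one possible approach is to compute the box statistics directly from the counts and hand them to matplotlib's `Axes.bxp`, which draws boxes from precomputed statistics instead of raw data. The sketch below is only an illustration under some assumptions: the helper name `weighted_box_stats` is made up, quantiles are taken as the smallest value whose cumulative count reaches the requested share (no interpolation), and whiskers follow the usual 1.5 x IQR rule. It may not match R's weighted `geom_boxplot` exactly.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'Quality': [29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40],
    'Count': [3, 38, 512, 2646, 9523, 23151, 43140, 69250,
              107597, 179374, 840596, 38243],
})

def weighted_box_stats(values, weights, label=None, whis=1.5):
    """Hypothetical helper: box plot statistics from a value/count table."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    keep = weights > 0                      # ignore values that never occur
    values, weights = values[keep], weights[keep]
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    cum = np.cumsum(weights)                # cumulative counts, sorted by value
    total = cum[-1]

    def quantile(q):
        # Smallest value whose cumulative count reaches q * total.
        return values[np.searchsorted(cum, q * total, side='left')]

    q1, med, q3 = quantile(0.25), quantile(0.5), quantile(0.75)
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - whis * iqr, q3 + whis * iqr
    inside = values[(values >= lo_fence) & (values <= hi_fence)]
    stats = {
        'q1': q1, 'med': med, 'q3': q3,
        'whislo': inside.min(), 'whishi': inside.max(),
        # Each outlying value is drawn once, regardless of its count.
        'fliers': values[(values < lo_fence) | (values > hi_fence)],
    }
    if label is not None:
        stats['label'] = label
    return stats

stats = weighted_box_stats(df['Quality'], df['Count'], label='all')

fig, ax = plt.subplots()
ax.bxp([stats], showfliers=True)            # one dict per box
ax.set_ylabel('Quality')
# Call plt.show() (or rely on %matplotlib inline) to display the figure.
```

To compare several groups, build one stats dict per group and pass them together as a list to `ax.bxp`. Nothing is expanded, so memory use stays proportional to the size of the count table.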

Upvotes: 1
