Juma Hamdan

Reputation: 69

How to construct a side-by-side boxplot for a pandas dataframe

Men and women are stored in a column labeled 'sex'. I want to plot the two groups according to their happiness levels, side by side in one figure (one row, two columns).

I have tried to extract each gender:

men = df[df['sex'] == 'Men']
women = df[df['sex'] == 'Women']
df_happy_sex = df[['happy', 'sex']].copy()

![https://ibb.co/yqmWKkf]

Upvotes: 4

Views: 5054

Answers (2)

Trenton McKinney

Reputation: 62543

  • Boxplots in python
  • Boxplots require a numeric component, as they are a visualization of statistical data, specifically spread.
  • Use seaborn to make your plots look nicer
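
If the column you want to plot holds text labels, it has to be turned into numbers before a boxplot makes sense. A minimal sketch of that check, assuming a hypothetical 'happy' column with the labels 'low', 'medium' and 'high' (the real labels and their ordering are not shown in the question):

```python
import pandas as pd

# Hypothetical data: 'happy' is stored as text labels, not numbers
df = pd.DataFrame({'sex': ['Men', 'Women', 'Men', 'Women'],
                   'happy': ['low', 'high', 'medium', 'high']})

print(pd.api.types.is_numeric_dtype(df['happy']))  # False

# Assumed ordering of the labels; adjust it to the real scale
levels = {'low': 1, 'medium': 2, 'high': 3}
df['happy_score'] = df['happy'].map(levels)

print(pd.api.types.is_numeric_dtype(df['happy_score']))  # True
```

The new 'happy_score' column (a made-up name) can then be passed as the y value of a boxplot.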

Code:

import pandas as pd
import matplotlib.pyplot as plt  # doesn't have color by hue
import seaborn as sns
import numpy as np  # for generating random data
import random  # for random gender selection

np.random.seed(10)
random.seed(10)
df = pd.DataFrame({'age': [x for x in np.random.randint(20, 70, 100)],
                   'feeling': [random.choice(['happy', 'sad']) for _ in range(100)], 
                   'gender': [random.choice(['male', 'female']) for _ in range(100)]})

# display(df)
   age feeling  gender
0   29     sad  female
1   56     sad    male
2   35     sad  female
3   20     sad  female
4   69   happy  female

sns.boxplot(y='age', x='feeling', data=df, hue='gender')
plt.show()

Using groupby with only categorical data:

df = pd.DataFrame({'feeling': [random.choice(['happy', 'sad']) for _ in range(100)],
                   'gender': [random.choice(['male', 'female']) for _ in range(100)]})

df.groupby(['feeling','gender'])['gender'].count().plot(kind='bar')
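
If you would rather see the genders next to each other within each feeling, one option is to move 'gender' into the columns before plotting. A rough sketch using the same kind of random data as above (the seed and the variable name counts are just for illustration):

```python
import random
import pandas as pd

random.seed(10)
df = pd.DataFrame({'feeling': [random.choice(['happy', 'sad']) for _ in range(100)],
                   'gender': [random.choice(['male', 'female']) for _ in range(100)]})

# size() counts rows per (feeling, gender) pair; unstack() moves gender into columns
counts = df.groupby(['feeling', 'gender']).size().unstack(fill_value=0)
print(counts)

# One cluster of bars per feeling, one bar per gender inside each cluster
ax = counts.plot(kind='bar', rot=0)
ax.set_ylabel('count')
```

Call plt.show() (after import matplotlib.pyplot as plt) to display the figure when running this as a script.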

Alternate data - feeling as a numeric value:

df = pd.DataFrame({'feeling': [x for x in np.random.randint(0, 101, 100)],
                   'gender': [random.choice(['male', 'female']) for _ in range(100)]})

plt.figure(figsize=(8, 7))
sns.boxplot(y='feeling', x='gender', data=df)
plt.show()
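
To see the numbers behind the boxes, you can compute the quartiles per group yourself. This is only a sketch on freshly generated random data (the seed, the generator and the column names q1, median, q3, iqr are my own choices), not the exact values in the plot above:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(10)
df = pd.DataFrame({'feeling': rng.integers(0, 101, 100),
                   'gender': rng.choice(['male', 'female'], 100)})

# Quartiles per gender: the box spans q1..q3 and the line inside it is the median
summary = df.groupby('gender')['feeling'].quantile([0.25, 0.5, 0.75]).unstack()
summary.columns = ['q1', 'median', 'q3']

# Interquartile range, i.e. the height of the box
summary['iqr'] = summary['q3'] - summary['q1']
print(summary)
```

With the default settings, the whiskers reach to the most extreme data points that lie within 1.5 times the IQR of the box edges.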

Upvotes: 2

MrCorote

Reputation: 563

import pandas as pd
import matplotlib.pyplot as plt

I've created an imaginary sample of your dataframe.

data = [['men', 55], ['men', 77], ['women', 85],
        ['men', 70], ['women', 68], ['women', 64],
        ['men', 86], ['men', 64], ['women', 54],
        ['men', 43], ['women', 86],  ['women', 91]]

df = pd.DataFrame(data, columns = ['sex', 'happy'])

You can just:

df.boxplot(by=['sex'], sym ='', figsize = [6, 6])

It yields the following, which I guess is what you want:

Happiness
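
If you literally need one figure with one row and two separate columns (one axes per sex), a possible sketch with plain matplotlib subplots is below. It reuses the sample data from this answer; the figure size and the shared y axis are my own assumptions:

```python
import pandas as pd
import matplotlib.pyplot as plt

data = [['men', 55], ['men', 77], ['women', 85],
        ['men', 70], ['women', 68], ['women', 64],
        ['men', 86], ['men', 64], ['women', 54],
        ['men', 43], ['women', 86], ['women', 91]]
df = pd.DataFrame(data, columns=['sex', 'happy'])

# One figure, one row, two columns; sharey keeps the two boxes comparable
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(8, 4), sharey=True)

# groupby iterates over the groups in sorted order: 'men', then 'women'
for ax, (label, group) in zip(axes, df.groupby('sex')):
    ax.boxplot(group['happy'].to_numpy())
    ax.set_title(label)
    ax.set_xticks([])  # a single box needs no x tick

axes[0].set_ylabel('happy')
fig.tight_layout()
```

Add plt.show() at the end to display it. Sharing the y axis is a design choice: without it each panel gets its own scale, which makes the two groups harder to compare.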

Upvotes: 2
