Create multiple boxplots from dataframe

I want to create multiple (two in this case) boxplots based on the data in a dataframe.

I have the following dataframe:

    Country   Fund                                   R^2            Style
0   Austria  BG EMCore Convertibles Global CHF R T   0.739131   Allocation
1   Austria  BG EMCore Convertibles Global R T       0.740917   Allocation
2   Austria  BG Trend A T                            0.738376   Fixed Income
3   Austria  Banken Euro Bond-Mix A                  0.71161    Fixed Income
4   Austria  Banken KMU-Fonds T                      0.778276   Allocation
5   Brazil   Banken Nachhaltigkeitsfonds T           0.912808   Allocation
6   Brazil   Banken Portfolio-Mix A                  0.857019   Allocation
7   Brazil   Banken Portfolio-Mix T                  0.868856   Fixed Income
8   Brazil   Banken Sachwerte-Fonds T                0.730626   Fixed Income
9   Brazil   Banken Strategie Wachstum T             0.918684   Fixed Income

I want to create one boxplot chart per country, grouped by Style and showing the distribution of R^2. I was thinking of a groupby operation, but I can't manage to produce a separate chart for each country.

Thanks in advance

Upvotes: 0

Views: 3850

Answers (3)

I came up with a solution myself.

import pandas as pd
import matplotlib.pyplot as plt

# df is the table from the original question

uniquenames = df.Country.unique()

# create a dictionary of sub-frames with the countries set as keys
diction = {key: df[df.Country == key] for key in uniquenames}

# plot the data, one figure per country
for key, frame in diction.items():
    frame.boxplot(column="R^2", by="Style",
                  figsize=(15, 6), patch_artist=True, fontsize=12)
    plt.xticks(rotation=90)
    plt.title(key, fontsize=12)

plt.show()
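
For what it's worth, the dictionary step can probably be skipped, since iterating over a groupby already yields (country, sub-frame) pairs. A minimal sketch, assuming the question's table is rebuilt by hand as below (the Fund column is left out, and the Agg backend line is only there so it runs without a display):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for the question's table, values copied from it (Fund column omitted)
df = pd.DataFrame({
    "Country": ["Austria"] * 5 + ["Brazil"] * 5,
    "R^2": [0.739131, 0.740917, 0.738376, 0.71161, 0.778276,
            0.912808, 0.857019, 0.868856, 0.730626, 0.918684],
    "Style": ["Allocation", "Allocation", "Fixed Income", "Fixed Income",
              "Allocation", "Allocation", "Allocation", "Fixed Income",
              "Fixed Income", "Fixed Income"],
})

axes_by_country = {}
# groupby yields (key, sub-frame) pairs, one per country
for country, group in df.groupby("Country"):
    fig, ax = plt.subplots(figsize=(8, 5))
    # one box per Style, drawn on the axes created above
    group.boxplot(column="R^2", by="Style", ax=ax)
    ax.set_title(country)
    fig.suptitle("")  # clear the automatic "Boxplot grouped by Style" title
    axes_by_country[country] = ax
```

Call plt.show() afterwards (or fig.savefig(...) inside the loop) to actually see the two figures.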

Upvotes: 1

Zaraki Kenpachi

Reputation: 5730

Here you go. The explanation is in the code comments.

=^..^=

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from io import StringIO

data = StringIO("""
Country R^2 Style
Austria 0.739131 Allocation
Austria 0.740917 Allocation
Austria 0.738376 Fixed_Income
Austria 0.71161 Fixed_Income
Austria 0.778276 Allocation
Brazil 0.912808 Allocation
Brazil 0.857019 Allocation
Brazil 0.868856 New_Style
Brazil 0.730626 Fixed_Income
Brazil 0.918684 Fixed_Income
Brazil 0.618684 New_Style
""")

# load data into data frame
df = pd.read_csv(data, sep=' ')

# group data by Country
grouped_data = df.groupby('Country')

# plot one box plot figure for each Country
for country, country_df in grouped_data:
    country_df = country_df[['Style', 'R^2']]
    # pivot rows into columns: one column per Style,
    # with rows numbered inside each Style by cumcount
    pivoted = (country_df
               .assign(g=country_df.groupby('Style').cumcount())
               .pivot(index='g', columns='Style', values='R^2'))
    # plot box
    pivoted.boxplot(column=list(pivoted.columns))
    plt.title(country)
    plt.show()

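
The "pivot rows into columns" step is the least obvious part of this answer, so here is a small sketch of what it produces for the Austria rows alone. The variable names (austria, numbered, wide) are made up for illustration; the values are the ones from the sample data above.

```python
import pandas as pd

# Austria rows from the answer's sample data
austria = pd.DataFrame({
    "Style": ["Allocation", "Allocation", "Fixed_Income",
              "Fixed_Income", "Allocation"],
    "R^2": [0.739131, 0.740917, 0.738376, 0.71161, 0.778276],
})

# cumcount numbers the rows inside each Style group: 0, 1, 2, ...
numbered = austria.assign(g=austria.groupby("Style").cumcount())

# pivot makes each Style a column, with g as the row index
wide = numbered.pivot(index="g", columns="Style", values="R^2")

# Allocation has three values and Fixed_Income only two, so one cell
# of the Fixed_Income column is NaN; boxplot ignores missing values
print(wide)
```

The result has one column per Style, which is the wide layout that DataFrame.boxplot draws as one box per column.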

Upvotes: 2

KRKirov

Reputation: 4004

Use seaborn for this kind of task. Here are a couple of options:

Use seaborn's boxplot

import seaborn as sns
sns.set()

# Note - the data is stored in a data frame df
sns.boxplot(x='Country', y='R^2', hue='Style', data=df)


Alternatively, you can use seaborn's FacetGrid.

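g = sns.FacetGrid(df, col="Country", row="Style")
# pass the variable as y so the boxes are drawn vertically
# (recent seaborn versions ignore orient='v' when only x is given)
g = g.map_dataframe(sns.boxplot, y='R^2')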

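
If the goal is exactly one panel per country with one box per Style, seaborn's catplot may be the closest fit. A rough sketch, assuming a data frame shaped like the one in the question (rebuilt by hand here, Fund column omitted) and a headless backend so it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line for interactive use
import pandas as pd
import seaborn as sns

# Stand-in for the question's table, values copied from it (Fund column omitted)
df = pd.DataFrame({
    "Country": ["Austria"] * 5 + ["Brazil"] * 5,
    "R^2": [0.739131, 0.740917, 0.738376, 0.71161, 0.778276,
            0.912808, 0.857019, 0.868856, 0.730626, 0.918684],
    "Style": ["Allocation", "Allocation", "Fixed Income", "Fixed Income",
              "Allocation", "Allocation", "Allocation", "Fixed Income",
              "Fixed Income", "Fixed Income"],
})

# col="Country" gives one panel per country; x="Style" gives one box per style
g = sns.catplot(data=df, x="Style", y="R^2", col="Country", kind="box")
```

catplot returns a FacetGrid, so the panels share a y axis by default and can be saved together with g.savefig(...).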

Upvotes: 0
