JuliusDariusBelosarius

Reputation: 101

Grouped Boxplot in Seaborn

With the help of some wonderful people around here, I was able to generate my first box plots in seaborn. I have 2 separate seaborn plots, each showing one comparison from an Excel sheet. What I want to do now is present both comparisons (the 2 columns shown below) on the same plot, essentially creating a grouped boxplot. I tried to convert the data to DataFrames, concat them, and melt the result, but was unsuccessful. I am pretty new to Python, so I was wondering if you all could help me out. Below is the code I have so far.

import pandas as pd
import numpy as np
import xlrd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
from pandas import ExcelWriter
from pandas import ExcelFile
from pandas import DataFrame


excel_file =  'Project File Merger.xlsm'

list_dfs = []

xls = xlrd.open_workbook(excel_file,on_demand=True)

sheet_names = xls.sheet_names()

d_data = {}
for sheet_name in sheet_names:
    # read one sheet and keep only the two columns of interest
    df = pd.read_excel(excel_file, sheet_name=sheet_name)
    d_data[sheet_name] = df.loc[:, ['HMB', 'PSPPM']]


keys = list(d_data.keys())
values_list1 = list(d_data.values())

print(keys[0])
print(values_list1[0])

Which returns

Check1.xlsm
                             HMB                                  PSPPM
0                            0.141005                             0.429498
1                            0.141005                             0.429498
2                            0.066071                             0.706797
3                                 NaN                             0.080378
4                            0.045815                             0.004076
5                                 NaN                             0.630156
6                                 NaN                             0.723957
7                                 NaN                             0.712118
8                            0.391531                             0.791329
9                            0.036823                             0.506834
10                           0.391531                             0.791329

Now this is where I am stuck. I have a values_list1 that has 17 elements (one for each sheet in the Excel file). I would like the data from each sheet to be grouped together in the plot. I think I might be running into a problem because each list element is a DataFrame with 2 columns? Any suggestions would be appreciated!

Upvotes: 0

Views: 623

Answers (1)

Diziet Asahi

Reputation: 40667

I'm not entirely sure I understand your problem fully, in particular in relation to boxplots. But, as far as I understand, you have a dictionary with the names of your Excel sheets as the keys and a DataFrame as each value, and you want to merge all these DataFrames into a single one so you can plot all the values together?

If that's correct, then pd.concat can accept a dictionary directly and generate a new DataFrame with the keys as the outer index level. You can then use reset_index() to flatten the index into ordinary columns (the sheet names end up in a column called level_0):

new_df = pd.concat(d_data).reset_index()
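To make the effect of that line concrete, here is a minimal sketch on made-up data. The two small DataFrames and the sheet names "Sheet1" and "Sheet2" are assumptions standing in for the real d_data, and the rename to "sheet" is optional, only there to give level_0 a more readable name.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for d_data: one DataFrame per sheet name.
d_data = {
    "Sheet1": pd.DataFrame({"HMB": [0.14, np.nan, 0.07],
                            "PSPPM": [0.43, 0.08, 0.71]}),
    "Sheet2": pd.DataFrame({"HMB": [0.39, 0.04],
                            "PSPPM": [0.79, 0.51]}),
}

# The dict keys become the outer level of a MultiIndex;
# reset_index() turns both index levels into columns level_0 and level_1.
new_df = pd.concat(d_data).reset_index()

# Optional: give the sheet-name column a descriptive name.
new_df = new_df.rename(columns={"level_0": "sheet"})

print(new_df)
```

Every row now carries the name of the sheet it came from, which is what seaborn needs in order to group by sheet.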

After that, I don't know exactly how you want to draw your boxplot, but you could, for example, draw the values of one of your columns for each of the sheets:

sns.boxplot(x='level_0', y='HMB', data=new_df)
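If the goal is a grouped boxplot with HMB and PSPPM side by side for each sheet, one possible approach is to melt the two columns into long format and pass the resulting column to hue. This is only a sketch on random, made-up data: the sheet names and the column names "sheet", "measure" and "value" are assumptions chosen for illustration, not something taken from your file.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; leave this out in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for d_data: three sheets of random values.
rng = np.random.default_rng(0)
d_data = {
    name: pd.DataFrame({"HMB": rng.random(20), "PSPPM": rng.random(20)})
    for name in ["Sheet1", "Sheet2", "Sheet3"]
}

# Stack all sheets and keep the sheet name as an ordinary column.
new_df = pd.concat(d_data).reset_index().rename(columns={"level_0": "sheet"})

# Wide -> long: one row per (sheet, measure, value) observation.
long_df = new_df.melt(
    id_vars="sheet",
    value_vars=["HMB", "PSPPM"],
    var_name="measure",
    value_name="value",
)

# One group per sheet on the x axis, one box per measure within each group.
ax = sns.boxplot(x="sheet", y="value", hue="measure", data=long_df)
```

With the real data, replacing the made-up d_data with your own dictionary should be enough; NaN values in HMB should not need any special handling, since seaborn leaves missing values out when drawing the boxes.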

Upvotes: 1
