jolykwwl

Reputation: 1

Producing multiple plots from data-frame using columns as y-axis values (looping through dataframe)

I have a data frame such as below:

https://i.sstatic.net/OKKrz.png

It has two categorical variables, impulsivity and treatment, and multiple dependent variables (prot_width etc.).

I have managed to produce a boxplot of one dependent variable grouped by treatment and impulsivity:

sns.boxplot(x='treatment', y='prot_width', hue='impulsivity',
            palette=['b','r'], data=data)
sns.despine(offset=10, trim=True)

which produces the graph below:

https://i.sstatic.net/YfXn1.png

Now I want to produce the exact same graph for each dependent variable. That is, I want to loop over the dependent-variable columns, with the y-axis changing to match each column.

I have read up on for loops, but I can't work out how to refer to the columns and, more importantly, how to change the y-axis inside the loop.

Upvotes: 0

Views: 1653

Answers (1)

Parfait

Reputation: 107567

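Simply loop over the numeric data columns using DataFrame.columns, which is iterable, and pass the loop variable (here col) as the y argument of boxplot. Seaborn labels the y-axis with the column name automatically. The slice below assumes the dependent variables start at the fifth column.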

for col in data.columns[4:]:
    sns.boxplot(x='treatment', y=col, hue='impulsivity',
                palette=['b','r'], data=data)
    sns.despine(offset=10, trim=True)

    plt.show()
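
If the raw column name is not what you want on the y-axis, one option is to draw each plot on its own Axes and set the label explicitly. The following is only a sketch on made-up data: the frame, the y_labels mapping and its display names are hypothetical, and the Agg backend is selected just so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data with the same grouping columns as in the question
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "treatment": rng.choice(["drug", "placebo"], 60),
    "impulsivity": rng.choice(["high", "low"], 60),
    "prot_width": rng.normal(size=60),
    "prot_length": rng.uniform(size=60),
})

# Hypothetical display names for the y-axis
y_labels = {"prot_width": "Protrusion width", "prot_length": "Protrusion length"}

axes = {}
for col in ["prot_width", "prot_length"]:
    fig, ax = plt.subplots()  # a fresh figure for every column
    sns.boxplot(x="treatment", y=col, hue="impulsivity",
                palette=["b", "r"], data=df, ax=ax)
    ax.set_ylabel(y_labels[col])  # override the default label (the column name)
    sns.despine(ax=ax, offset=10, trim=True)
    axes[col] = ax
    plt.close(fig)  # release the figure; call fig.savefig(...) before this to keep it
```

In an interactive session you would drop the matplotlib.use line and call plt.show() instead of plt.close(fig).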

Alternatively, use select_dtypes to pick up all numeric columns regardless of their position:

for col in data.select_dtypes(include='number').columns:
    ...
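
To check which columns a dtype-based selection actually picks up before plotting, here is a minimal sketch on a small made-up frame (the column values are invented; only the column names follow the question):

```python
import pandas as pd

# Hypothetical frame mirroring the question's layout
df = pd.DataFrame({
    "subject": ["s1", "s2", "s3"],
    "impulsivity": ["high", "low", "high"],
    "treatment": ["a", "b", "a"],
    "prot_width": [1.5, 2.0, 2.5],
    "prot_length": [10, 12, 11],
})

# include="number" covers every integer and float dtype
numeric_cols = df.select_dtypes(include="number").columns.tolist()
print(numeric_cols)  # ['prot_width', 'prot_length']
```

Note that a numeric ID column would also be selected this way, so it is worth printing the list once before looping.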

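Or even use filter with a regex to leave out the non-numeric columns by name. Note that the alternatives must go in a negative lookahead; placed inside square brackets they would be read as a set of single characters, not as whole names:

for col in data.filter(regex=r"^(?!(?:subject|protrusion|impulsivity|treatment)$)").columns: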
    ...
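
If a regex feels fragile here, a plainer option is to list the grouping columns and exclude them explicitly. This is a sketch on a made-up frame; the id_cols list is an assumption about which columns are not dependent variables:

```python
import pandas as pd

# Hypothetical frame; only the column names matter for this example
df = pd.DataFrame({
    "subject": ["s1", "s2"],
    "protrusion": ["p1", "p2"],
    "impulsivity": ["high", "low"],
    "treatment": ["a", "b"],
    "prot_width": [1.5, 2.0],
    "prot_length": [0.3, 0.4],
})

# Columns to skip (assumed to be the identifier and grouping columns)
id_cols = ["subject", "protrusion", "impulsivity", "treatment"]

# Keep the original column order while dropping the listed names
value_cols = [c for c in df.columns if c not in id_cols]

# The same selection with DataFrame.filter and a negative lookahead
regex_cols = df.filter(
    regex=r"^(?!(?:subject|protrusion|impulsivity|treatment)$)"
).columns.tolist()

print(value_cols)  # ['prot_width', 'prot_length']
```

The list comprehension preserves column order, whereas data.columns.difference(id_cols) would return the names sorted.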

To demonstrate with random data:

import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
import seaborn as sns

np.random.seed(9192018)
demo_df = pd.DataFrame({'tool': np.random.choice(['pandas', 'r', 'julia', 'sas', 'stata', 'spss'],500),
                        'os': np.random.choice(['windows', 'mac', 'linux'],500), 
                        'prot_width': np.random.randn(500)*100,
                        'prot_length': np.random.uniform(0,1,500),                   
                        'prot_lwr': np.random.randint(100, size=500)
                       }, columns=['tool', 'os', 'prot_width', 'prot_length', 'prot_lwr'])

for col in demo_df.columns[2:]:
    # 'os' has three levels, so the palette needs three colors
    sns.boxplot(x='tool', y=col, hue='os', palette=['b','r','g'], data=demo_df)
    sns.despine(offset=10, trim=True)

    plt.legend(loc='center', ncol = 3, bbox_to_anchor=(0.5, 1.10))
    plt.show()
    plt.clf()

plt.close()

Plot Output

Upvotes: 1
