ARJ

Reputation: 2080

Iterating over columns in data frame by skipping first column and drawing multiple plots

I have a data frame like the following (output of df.head()):

ID    AS_FP     AC_FP  RP11_FP   RP11_be   AC_be     AS_be     Info
AE02  0.060233  0      0.682884  0.817115  0.591182  0.129252  SAP
AE03  0         0      0         0.889181  0.670113  0.766243  SAP
AE04  0         0      0.033256  0.726193  0.171861  0.103839  others
AE05  0         0      0.034988  0.451329  0.431836  0.219843  others

What I am aiming for is to plot each column pair, starting from AS_FP up to RP11_be, as an lmplot: the x axis is the column ending in FP and the y axis is its corresponding column ending in be. I want to save each plot as a separate file, so I started iterating through the columns, skipping the first column ID, like this:

for ind, column in enumerate(df.columns):
    if column.split('_')[0] == column.split('_')[0]:

But I got lost on how to continue. I need to plot

sns.lmplot(x, y, data=df, hue='Info',palette=colors, fit_reg=False,
           size=10,scatter_kws={"s": 700},markers=["o", "v"])

and save each image as a separate file.

Upvotes: 0

Views: 92

Answers (1)

Alexey Trofimov

Reputation: 5007

Straightforward solution:

1) Toy data:

import pandas as pd
from collections import OrderedDict
import matplotlib.pyplot as plt
import seaborn as sns

dct = OrderedDict()
dct["ID"] = ["AE02", "AE03", "AE04", "AE05"]
dct["AS_FP"] = [0.060233, 0, 0, 0]
dct["AC_FP"] = [0, 0, 0, 0]
dct["RP11_FP"] = [0.682884, 0, 0.033256, 0.034988]
dct["AS_be"] = [0.129252, 0.766243, 0.103839, 0.219843]
dct["AC_be"] = [0.591182, 0.670113, 0.171861, 0.431836]
dct["RP11_be"] = [0.817115, 0.889181, 0.726193, 0.451329]
dct["Info"] = ["SAP", "SAP", "others", "others"]

df = pd.DataFrame(dct)

2) Iterating through pairs, saving each figure with unique filename:

graph_cols = [col for col in df.columns if ("_FP" in col) or ("_be" in col)]

fps = sorted([col for col in graph_cols if "_FP" in col])
bes = sorted([col for col in graph_cols if "_be" in col])

for x, y in zip(fps, bes):
    # Recent seaborn versions require keyword x/y and use `height` instead of `size`
    snsplot = sns.lmplot(x=x, y=y, data=df, fit_reg=False, hue='Info',
                         height=10, scatter_kws={"s": 700})
    snsplot.savefig(x.split("_")[0] + ".png")

You can pass any additional parameters to lmplot as needed.
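Note that zipping two sorted lists only lines up correctly when every prefix has both an _FP and a _be column. If that might not hold, one option is to match the pairs by prefix explicitly, which is close to what the split('_')[0] attempt in the question was heading towards. A minimal sketch, assuming the suffixes are exactly "_FP" and "_be" (pair_columns is a hypothetical helper name, not part of pandas or seaborn):

```python
def pair_columns(columns, x_suffix="_FP", y_suffix="_be"):
    """Return (prefix, x_col, y_col) for each prefix that has both suffixes.

    Hypothetical helper: the suffix defaults are assumptions taken from the
    column names shown in the question.
    """
    columns = list(columns)
    available = set(columns)
    pairs = []
    for col in columns:
        if col.endswith(x_suffix):
            prefix = col[: -len(x_suffix)]
            partner = prefix + y_suffix
            # Keep the pair only if the matching y column really exists
            if partner in available:
                pairs.append((prefix, col, partner))
    return pairs


# Column names in the order shown in the question's df.head()
cols = ["ID", "AS_FP", "AC_FP", "RP11_FP", "RP11_be", "AC_be", "AS_be", "Info"]
pairs = pair_columns(cols)
```

With this, the plotting loop could iterate over pair_columns(df.columns) and use the prefix for the filename; columns such as ID and Info are skipped automatically because they carry neither suffix.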

Upvotes: 1
