Marc-Olivier Duceppe

Reputation: 63

Pandas groupby result split into two columns?

I have a pandas dataframe and I want to summarize/reorganize it to produce a figure. I think what I'm looking for involves groupby.

Here's what my dataframe df looks like:

Channel Flag
1       pass
2       pass
3       pass
1       pass
2       pass
3       pass
1       fail
2       fail
3       fail

And this is what I want my dataframe to look like:

Channel pass    fail
1       2       1
2       2       1
3       2       1

Running the following code gives something "close", but not in the format I would like:

In [12]: df.groupby(['Channel', 'Flag']).size()
Out[12]:
Channel Flag         
1       fail        1
        pass        2
2       fail        1
        pass        2
3       fail        1
        pass        2

Maybe this output is actually fine to make my plot. It's just that I already have the code to plot the data with the previous format. I'm adding the code in case it would be relevant:

df_all = pd.DataFrame()
df_all['All'] = df['Pass'] + df['Fail']
df_pass = df[['Pass']]  # The double square brackets keep the column name
df_fail = df[['Fail']]

maxval = max(df_pass.index)  # maximum channel value
layout = FastqPlots.make_layout(maxval=maxval)
value_cts = pd.Series(df_pass['Pass'])
for entry in value_cts.keys():
    layout.template[np.where(layout.structure == entry)] = value_cts[entry]
sns.heatmap(data=pd.DataFrame(layout.template, index=layout.yticks, columns=layout.xticks),
            xticklabels="auto", yticklabels="auto",
            square=True,
            cbar_kws={"orientation": "horizontal"},
            cmap='Blues',
            linewidths=0.20)
ax.set_title("Pass reads output per channel")
plt.tight_layout()  # Get rid of extra margins around the plot
fig.savefig(out + "/channel_output_all.png")

Any help/advice would be much appreciated. Thanks!

Upvotes: 1

Views: 42

Answers (1)

EBDS

Reputation: 1734

df.groupby(['Channel', 'Flag'], as_index=False).size().pivot(index='Channel', columns='Flag', values='size')
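
For reference, here is a minimal, self-contained sketch of the same reshaping, assuming the sample data from the question. It takes a slightly different route than the one-liner above: unstack on the grouped counts, with pd.crosstab as an alternative. The variable names counts and alt are only illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "Channel": [1, 2, 3, 1, 2, 3, 1, 2, 3],
    "Flag": ["pass"] * 6 + ["fail"] * 3,
})

# Count rows per (Channel, Flag), then move the Flag level into columns.
# fill_value=0 covers channels that have no rows for one of the flags.
counts = (
    df.groupby(["Channel", "Flag"])
      .size()
      .unstack(fill_value=0)
)

# Put the columns in the order asked for in the question
counts = counts[["pass", "fail"]]

# pd.crosstab builds the same table of counts in one call
alt = pd.crosstab(df["Channel"], df["Flag"])[["pass", "fail"]]

print(counts)
```

With this layout, counts['pass'] is a Series indexed by channel, so it could probably stand in for df_pass['Pass'] in the plotting code from the question (the column names are lowercase here because they come from the Flag values).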


Upvotes: 1
