komodovaran_

Reputation: 2012

Loop over grouped pandas df and export individual plots

The documentation is a little sparse on how each element works, so here goes:

I have a bunch of files, and I would like to iterate over them and export one plot per file.

df_all.head()

Returns

    Dem-Dexc    Aem-Dexc    Aem-Aexc    S       E     fit     frame filename
0   18150.0595  18548.2451  15263.7451  0.7063  0.5054  0.879   1.0 Traces_exp22_tif_pair16.txt
1   596.9286    7161.7353   1652.8922   0.8244  0.9231  0.879   2.0 Traces_exp22_tif_pair16.txt
2   93.2976     3112.3725   2632.6667   0.5491  0.9709  0.879   3.0 Traces_exp22_tif_pair16.txt
3   1481.1310   4365.4902   769.3333    0.8837  0.7467  0.879   4.0 Traces_exp22_tif_pair16.txt
4   583.1786    6192.6373   1225.5392   0.8468  0.9139  0.879   5.0 Traces_exp22_tif_pair16.txt

And now I would like to group and iterate:

for group in df_all.groupby("filename"):
    plot = sns.regplot(data = group, x = "Dem-Dexc", y = "frame")

But I get TypeError: tuple indices must be integers or slices, not str. Why do I get this?

Upvotes: 1

Views: 150

Answers (1)

jezrael

Reputation: 863146

I think you need to change:

for group in df_all.groupby("filename"):

to:

for i, group in df_all.groupby("filename"):
    plot = sns.regplot(data = group, x = "Dem-Dexc", y = "frame")

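to unpack the tuples. Iterating over a groupby object yields (key, DataFrame) tuples, so in your loop group is a tuple, and looking up a column name on a tuple raises the TypeError.

Or select the second value of the tuple with [1]: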

for group in df_all.groupby("filename"):
    plot = sns.regplot(data = group[1], x = "Dem-Dexc", y = "frame")

You can check the tuple output with:

for group in df_all.groupby("filename"):
    print (group)

('Traces_exp22_tif_pair16.txt',      Dem-Dexc    Aem-Dexc    Aem-Aexc       S       E    fit  frame  \
0  18150.0595  18548.2451  15263.7451  0.7063  0.5054  0.879    1.0   
1    596.9286   7161.7353   1652.8922  0.8244  0.9231  0.879    2.0   
2     93.2976   3112.3725   2632.6667  0.5491  0.9709  0.879    3.0   
3   1481.1310   4365.4902    769.3333  0.8837  0.7467  0.879    4.0   
4    583.1786   6192.6373   1225.5392  0.8468  0.9139  0.879    5.0   

                      filename  
0  Traces_exp22_tif_pair16.txt  
1  Traces_exp22_tif_pair16.txt  
2  Traces_exp22_tif_pair16.txt  
3  Traces_exp22_tif_pair16.txt  
4  Traces_exp22_tif_pair16.txt  )

vs:

for i, group in df_all.groupby("filename"):
    print (group)

     Dem-Dexc    Aem-Dexc    Aem-Aexc       S       E    fit  frame  \
0  18150.0595  18548.2451  15263.7451  0.7063  0.5054  0.879    1.0   
1    596.9286   7161.7353   1652.8922  0.8244  0.9231  0.879    2.0   
2     93.2976   3112.3725   2632.6667  0.5491  0.9709  0.879    3.0   
3   1481.1310   4365.4902    769.3333  0.8837  0.7467  0.879    4.0   
4    583.1786   6192.6373   1225.5392  0.8468  0.9139  0.879    5.0   

                      filename  
0  Traces_exp22_tif_pair16.txt  
1  Traces_exp22_tif_pair16.txt  
2  Traces_exp22_tif_pair16.txt  
3  Traces_exp22_tif_pair16.txt  
4  Traces_exp22_tif_pair16.txt  
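
To reproduce the difference without the original data, here is a minimal sketch. The small frame, its values, and the file names a.txt and b.txt are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Made-up stand-in for df_all; values and file names are illustrative only.
df_all = pd.DataFrame({
    "Dem-Dexc": [18150.0595, 596.9286, 93.2976, 1481.1310],
    "frame": [1.0, 2.0, 1.0, 2.0],
    "filename": ["a.txt", "a.txt", "b.txt", "b.txt"],
})

# Without unpacking, each item of the groupby is a (key, DataFrame) tuple.
first = list(df_all.groupby("filename"))[0]
print(type(first).__name__, len(first))

# Indexing that tuple with a column name raises the TypeError from the question.
error_message = None
try:
    first["Dem-Dexc"]
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# With unpacking, name is the group key and group is the sub-frame.
sizes = {}
for name, group in df_all.groupby("filename"):
    sizes[name] = len(group)
    print(name, group["Dem-Dexc"].tolist())
```

The try/except is only there to show the error without stopping the script; in real code you would simply unpack the tuple.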

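To save each plot as a PNG file, create a new figure for every group; otherwise regplot keeps drawing on the same axes and each saved file also contains the earlier plots:

import matplotlib.pyplot as plt

for i, group in df_all.groupby("filename"):
    fig, ax = plt.subplots()  # new figure per file
    sns.regplot(data=group, x="Dem-Dexc", y="frame", ax=ax)
    fig.savefig("{}.png".format(i.split('.')[0]))
    plt.close(fig)  # free the figure once it is saved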
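
One thing to watch with i.split('.')[0]: it keeps only the text before the first dot, so a file name that contains extra dots gets cut short. A possible alternative, sketched below, is pathlib's stem, which drops only the final extension. The helper name png_name and the file name run.v2.txt are hypothetical, used just for the comparison.

```python
from pathlib import Path


def png_name(filename):
    """Return the PNG file name for a data file, dropping only the last extension."""
    return "{}.png".format(Path(filename).stem)


# The file name from the question gives the same result either way.
original = "Traces_exp22_tif_pair16.txt"
print(original.split('.')[0] + ".png", png_name(original))

# A hypothetical name with an extra dot shows where the two approaches differ.
dotted = "run.v2.txt"
print(dotted.split('.')[0] + ".png", png_name(dotted))
```

If your file names never contain more than one dot, the split version from the answer works just as well.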

Upvotes: 1
