Liza

Reputation: 971

Pandas groupby results on the same plot

I am dealing with the following data frame (only for illustration, actual df is quite large):

   seq      x1      y1
0    2  0.7725  0.2105
1    2  0.8098  0.3456
2    2  0.7457  0.5436
3    2  0.4168  0.7610
4    2  0.3181  0.8790
5    3  0.2092  0.5498
6    3  0.0591  0.6357
7    5  0.9937  0.5364
8    5  0.3756  0.7635
9    5  0.1661  0.8364

I am trying to plot multiple lines from the above coordinates ("x1" on the x-axis against "y1" on the y-axis).

Rows with the same "seq" form one path and have to be plotted as one separate line: all the x, y coordinates with seq = 2 belong to one line, and so on.

I am able to plot them, but each one ends up on a separate graph. I want all the lines on the same graph. I tried using subplots but could not get it right.

import matplotlib as mpl
import matplotlib.pyplot as plt

%matplotlib notebook

df.groupby("seq").plot(kind = "line", x = "x1", y = "y1")

This creates hundreds of graphs (one per unique seq). Please suggest a way to get all the lines on the same graph.

**UPDATE**

To resolve the above problem, I implemented the following code:

     fig, ax = plt.subplots(figsize=(12,8))
     df.groupby('seq').plot(kind='line', x = "x1", y = "y1", ax = ax)
     plt.title("abc")
     plt.show()

Now I want to plot the lines in specific colors. The paths with seq = 2 and seq = 5 belong to cluster 1, and the path with seq = 3 belongs to cluster 2.

So the two lines in cluster 1 should be red, and the one line in cluster 2 can be green.

How should I proceed with this?

Upvotes: 8

Views: 10374

Answers (5)

Anthony

Reputation: 161

Based on Serenity's answer, I improved the legend.

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# random df
df = pd.DataFrame(np.random.randint(0,10,size=(25, 3)), columns=['ProjID','Xcoord','Ycoord'])

# plot groupby results on the same canvas 
grouped = df.groupby('ProjID')
fig, ax = plt.subplots(figsize=(8,6))
grouped.plot(kind='line', x = "Xcoord", y = "Ycoord", ax=ax)
ax.legend(labels=grouped.groups.keys()) ## better legend
plt.show()

You can also do it like this:

grouped = df.groupby('ProjID')
fig, ax = plt.subplots(figsize=(8,6))
g_plot = lambda x:x.plot(x = "Xcoord", y = "Ycoord", ax=ax, label=x.name)
grouped.apply(g_plot)
plt.show()


Upvotes: 1

brian_ds

Reputation: 367

Here is a working example including the ability to adjust legend names.

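import matplotlib.pyplot as plt

fig, ax = plt.subplots()  # create the shared axes that every group is drawn on
grp = df.groupby('groupCol')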

legendNames = grp.apply(lambda x: x.name)  #Get group names using the name attribute.
#legendNames = list(grp.groups.keys())  #Alternative way to get group names. Someone else might be able to speak on speed. This might iterate through the grouper and find keys which could be slower? Not sure

plots = grp.plot('x1','y1',legend=True, ax=ax)

for txt, name in zip(ax.legend_.texts, legendNames):
    txt.set_text(name)

Explanation: The legend is stored in the attribute ax.legend_, which in turn contains a list of Text objects, one per group. The Text class lives in the matplotlib.text module. To set the text of each entry, you can use the setter method set_text(self, s).

As a side note, the Text class has a number of set_X() methods that allow you to change the font sizes, fonts, colors, etc. I haven't used those, so I don't know for sure they work, but can't see why not.
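As a quick check of that side note, here is a minimal sketch using two of those setters, set_fontsize and set_color, on the legend texts. The two lines and their labels ("path a", "path b") are made up purely for illustration.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# Two hypothetical lines, only there so the legend has entries.
ax.plot([0, 1], [0, 1], label="path a")
ax.plot([0, 1], [1, 0], label="path b")
legend = ax.legend()

# Each legend entry is a matplotlib.text.Text object,
# so the usual Text setters can be applied to it.
for txt in legend.get_texts():
    txt.set_fontsize(12)
    txt.set_color("gray")
```

Outside a notebook, call plt.show() afterwards to display the figure.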

Upvotes: 0

Serenity

Reputation: 36695

You need to create the axes before plotting, as in this example:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# random df
df = pd.DataFrame(np.random.randint(0,10,size=(25, 3)), columns=['ProjID','Xcoord','Ycoord'])

# plot groupby results on the same canvas 
fig, ax = plt.subplots(figsize=(8,6))
df.groupby('ProjID').plot(kind='line', x = "Xcoord", y = "Ycoord", ax=ax)
plt.show()


Upvotes: 9

mechanical_meat

Reputation: 169444

Another way:

for k, g in df.groupby('ProjID'):
    plt.plot(g['Xcoord'], g['Ycoord'])

plt.show()
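An explicit loop like this also makes the color-by-cluster part of the question fairly easy, since each group gets its own plot call. Below is a rough sketch using the sample rows from the question. The two dictionaries, seq_to_cluster and cluster_colors, are assumptions for illustration; in practice the cluster assignment would come from wherever the clustering is done.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "seq": [2, 2, 2, 2, 2, 3, 3, 5, 5, 5],
    "x1": [0.7725, 0.8098, 0.7457, 0.4168, 0.3181,
           0.2092, 0.0591, 0.9937, 0.3756, 0.1661],
    "y1": [0.2105, 0.3456, 0.5436, 0.7610, 0.8790,
           0.5498, 0.6357, 0.5364, 0.7635, 0.8364],
})

# Assumed lookups (hypothetical names): seq -> cluster and cluster -> color.
seq_to_cluster = {2: 1, 5: 1, 3: 2}
cluster_colors = {1: "red", 2: "green"}

fig, ax = plt.subplots(figsize=(12, 8))
for seq, group in df.groupby("seq"):
    # Pick the color of the cluster this path belongs to.
    color = cluster_colors[seq_to_cluster[seq]]
    ax.plot(group["x1"], group["y1"], color=color, label=f"seq {seq}")
ax.legend()
ax.set_title("abc")
```

With this data, the paths for seq 2 and 5 are drawn in red and the path for seq 3 in green. Outside a notebook, call plt.show() to display the figure.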

Upvotes: 5

piRSquared

Reputation: 294498

Consider the dataframe df

df = pd.DataFrame(dict(
        ProjID=np.repeat(range(10), 10),
        Xcoord=np.random.rand(100),
        Ycoord=np.random.rand(100),
    ))

Then we create abstract art like this

df.set_index('Xcoord').groupby('ProjID').Ycoord.plot()


Upvotes: 7
