Reputation: 1587
I'm struggling with what should be a fairly easy task: creating a line plot with 2 series. I have managed it, but I don't think my approach is the fastest one. Does anyone know a faster/smarter way?
The problem is that the values of the 2 series sit in the same column, 'values', and to get the series I have to split them according to the 'category' column. So far I do that with a few transformations before plotting. Does anyone know a way to make this plot without the transformations in my code below?
My code:
import numpy.random as r
import pandas as pn
#generate values
values= r.random_sample(200)
labels = list(range(1, 101)) * 2  # range objects cannot be added with + in Python 3
category = list(100*'a' + 100*'b')
#create dataframe
df =pn.DataFrame({'labels': labels,
'values': values,
'category': category})
### I tried to create the plot here but was unsuccessful so far, so I needed the transformation below.
#transformation
df =df.set_index('labels')
dfA= df[df['category']=='a']
del dfA['category']
dfA.columns=['values_a']
dfB=df[df['category']=='b']
del dfB['category']
dfB.columns=['values_b']
#joining
frames=[dfA,dfB]
dff= pn.concat(frames, axis=1)
#plotting
dff.plot()
Thank you in advance for help!
Upvotes: 3
Views: 2977
Reputation: 1603
You can use seaborn to achieve this as a scatter plot (lmplot also draws a regression fit per category):
import seaborn as sns
# recent seaborn versions require x and y as keyword arguments
sns.lmplot(x='labels', y='values', data=df, hue='category')
If you prefer a line plot:
import seaborn as sns
sns.pointplot(x='labels', y='values', data=df, hue='category')
Upvotes: 1
Reputation: 12590
You do have to transform your data since you do not want to plot your columns as they are. But there is an easier way:
>>> df.pivot(index='labels', columns='category', values='values').head()
category a b
labels
1 0.133046 0.762676
2 0.717739 0.774000
3 0.059960 0.547297
4 0.464269 0.951537
5 0.227428 0.987621
>>> df.pivot(index='labels', columns='category', values='values').plot()
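For anyone who wants to try the reshape in isolation, here is a minimal sketch on a tiny hand-made frame. The data and the names `df_long` and `wide` are assumptions made up for illustration; they are not the question's random values.

```python
import pandas as pd

# Small long-format frame (illustrative data, not from the question)
df_long = pd.DataFrame({
    'labels':   [1, 2, 3, 1, 2, 3],
    'values':   [0.1, 0.2, 0.3, 0.9, 0.8, 0.7],
    'category': ['a', 'a', 'a', 'b', 'b', 'b'],
})

# Reshape to wide format: one row per label, one column per category
wide = df_long.pivot(index='labels', columns='category', values='values')
print(wide)
```

Calling `wide.plot()` should then draw one line per column. Note that `pivot` raises a `ValueError` when a (labels, category) pair occurs more than once; in that case `pivot_table` with an aggregation function is the usual fallback.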
Upvotes: 2
Reputation: 862581
IIUC you can use concat with the parameter keys as column names:
#transformation
df = df.set_index('labels')
dff = pn.concat([df.loc[df['category']=='a', 'values'],
df.loc[df['category']=='b', 'values']],
axis=1,
keys=['values_a', 'values_b'])
print(dff)
values_a values_b
labels
1 0.240131 0.083861
2 0.137078 0.788497
3 0.017947 0.985262
4 0.053830 0.882618
5 0.772023 0.753158
6 0.258116 0.322541
7 0.837611 0.188269
8 0.551581 0.599734
... ... ...
... ... ...
... ... ...
93 0.413466 0.794807
94 0.791670 0.186960
95 0.033857 0.070732
96 0.805209 0.570014
97 0.691454 0.125113
98 0.564201 0.104882
99 0.656381 0.176520
100 0.007758 0.340838
[100 rows x 2 columns]
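If there are more than two categories, writing one `.loc` selection per category gets tedious. A possible generalisation, sketched here on made-up data, is to build the pieces with `groupby` and hand `concat` a dict, whose keys then become the column names. The frame `df_long` and the `values_` prefix are assumptions for illustration.

```python
import pandas as pd

# Small long-format frame (illustrative data, not from the question)
df_long = pd.DataFrame({
    'labels':   [1, 2, 3, 1, 2, 3],
    'values':   [0.1, 0.2, 0.3, 0.9, 0.8, 0.7],
    'category': ['a', 'a', 'a', 'b', 'b', 'b'],
})

indexed = df_long.set_index('labels')

# One Series per category, keyed by the column name it should get
pieces = {'values_' + name: group['values']
          for name, group in indexed.groupby('category')}

# With axis=1 the dict keys are used as column names
dff = pd.concat(pieces, axis=1)
print(dff)
```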
EDIT: You can omit concat and then set the legend with ax.legend:
import matplotlib.pyplot as plt
plt.figure()
df.loc[df['category']=='a', 'values'].plot()
ax = df.loc[df['category']=='b', 'values'].plot()
ax.legend(['values_a','values_b'])
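The same idea can be written as a loop over `groupby`, which avoids hard-coding the category names and leaves the original frame untouched. This is only a sketch on made-up data; `df_long`, the `values_` label prefix and the `Agg` backend choice (so it runs without a display) are assumptions, not part of the answer above.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Small long-format frame (illustrative data, not from the question)
df_long = pd.DataFrame({
    'labels':   [1, 2, 3, 1, 2, 3],
    'values':   [0.1, 0.2, 0.3, 0.9, 0.8, 0.7],
    'category': ['a', 'a', 'a', 'b', 'b', 'b'],
})

fig, ax = plt.subplots()

# Draw one line per category directly from the long-format frame
for name, group in df_long.groupby('category'):
    ax.plot(group['labels'], group['values'], label='values_' + name)

ax.set_xlabel('labels')
ax.legend()
```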
Upvotes: 2