Reputation: 2506
I have a pandas DataFrame that looks like this training.head()
The DataFrame has been sorted by date. I'd like to make a scatterplot where the date of the campaign is on the x axis and the rate of success is on the y axis. I was able to get a line graph by using training.plot(x='date',y='rate'). However, when I changed that to training.plot(kind='scatter',x='date',y='rate') I get an error: KeyError: u'no item named date'
Why does my index column go away when I try to make a scatterplot? Also, I bet I need to do something with that date field so that it doesn't get treated like a simple string, don't I?
Extra credit, what would I do if I wanted each of the account numbers to plot with a different color?
Upvotes: 20
Views: 33726
Reputation: 21
The plotting code only considers numeric columns, so a column converted as in the code below will still give you an error in a scatter plot:
df['Date'] = pd.to_datetime(df.Date)
Try pd.to_numeric as below, and finally use the scatter plot. It worked for me!
df['Date'] = pd.to_numeric(df.Date)
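To make the steps concrete, here is a minimal sketch of that conversion. The sample data and the Date_num column name are made up for illustration, and the numeric values are just integer timestamps, so the x axis will show large numbers rather than formatted dates. Newer pandas versions may handle datetime columns in scatter plots differently, so treat this as one possible workaround rather than the required one.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Hypothetical sample data; the real frame would come from your own source.
df = pd.DataFrame({
    "Date": ["2014-01-01", "2014-01-02", "2014-01-03"],
    "rate": [0.10, 0.25, 0.15],
})

# Parse the strings into datetimes first, then view them as integers.
df["Date"] = pd.to_datetime(df["Date"])
df["Date_num"] = pd.to_numeric(df["Date"])  # integer timestamps

# Both columns are numeric now, so kind='scatter' can find them.
ax = df.plot(kind="scatter", x="Date_num", y="rate")
```

Keeping the original Date column next to the numeric copy (instead of overwriting it, as the one-liner above does) means the readable dates are still available for labels later.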
Upvotes: 2
Reputation: 4021
I've found it simpler to change the style of a line chart to not include the connecting lines:
cb_df.plot(figsize=(16, 6), style='o')
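For anyone who wants to try this, here is a small self-contained sketch. The contents of cb_df are invented for the example (a date index and a single rate column); only the plot call itself comes from the line above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Hypothetical data: dates as the index, one numeric column to plot.
cb_df = pd.DataFrame(
    {"rate": [0.10, 0.25, 0.15, 0.30]},
    index=pd.to_datetime(["2014-01-01", "2014-01-02", "2014-01-03", "2014-01-04"]),
)

# A regular line plot, but style='o' draws circle markers with no connecting line.
ax = cb_df.plot(figsize=(16, 6), style="o")
```

Because this is still a line plot under the hood, the datetime index is used for the x axis directly and no numeric conversion is needed.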
Upvotes: 9
Reputation: 28946
If I remember correctly, the plotting code only considers numeric columns. Internally it selects just the numeric columns, so that's why you get the key error.
What's the dtype of date? If it's a datetime64, you can recast it as an np.int64:
df['date_int'] = df.date.astype(np.int64)
And then you can plot.
For the color part, make a dictionary of {account number: color}. For example:
color_d = {1: 'k', 2: 'b', 3: 'r'}
Then when you plot:
df.plot(kind='scatter', x='date_int', y='rate', color=df.account.map(color_d))
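Putting both parts of this answer together, a rough end-to-end sketch could look like the following. The data is made up, the account and rate column names are assumed from the question, and I pass the colors as a plain list through the c argument, which is my choice here rather than something the answer requires.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's frame.
training = pd.DataFrame({
    "date": pd.to_datetime(["2014-01-01", "2014-01-02", "2014-01-03", "2014-01-04"]),
    "account": [1, 2, 3, 1],
    "rate": [0.10, 0.25, 0.15, 0.30],
})

# Recast the datetime column as integers so the scatter code treats it as numeric.
training["date_int"] = training["date"].astype(np.int64)

# One color per account number, then one color per row.
color_d = {1: "k", 2: "b", 3: "r"}
point_colors = training["account"].map(color_d).tolist()

ax = training.plot(kind="scatter", x="date_int", y="rate", c=point_colors)
```

Any account number missing from color_d would map to NaN, so the dictionary should cover every account that appears in the data.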
Upvotes: 13