Kevin Thompson

Reputation: 2506

matplotlib plot datetime in pandas DataFrame

I have a pandas DataFrame; the output of training.head() looks like this:

[image: output of training.head()]

The DataFrame has been sorted by date. I'd like to make a scatterplot with the date of the campaign on the x axis and the rate of success on the y axis. I was able to get a line graph with training.plot(x='date',y='rate'). However, when I change that to training.plot(kind='scatter',x='date',y='rate'), I get an error: KeyError: u'no item named date'

Why does my index column go away when I try to make a scatterplot? Also, I bet I need to do something with that date field so that it doesn't get treated like a simple string, don't I?

Extra credit: how would I plot each account number in a different color?

Upvotes: 20

Views: 33726

Answers (3)

Samira Khodai

Reputation: 21

The plotting code only considers numeric columns, so converting the column to datetime as below will still give you an error:

df['Date'] = pd.to_datetime(df.Date) 

Try pd.to_numeric as below, and then make the scatter plot. It worked for me!

df['Date'] = pd.to_numeric(df.Date)
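A minimal sketch of that conversion, assuming the Date column starts out as strings (the sample values and the Date_num column name are made up for illustration). The strings are parsed to datetimes first, and pd.to_numeric then turns the datetimes into integer timestamps that the scatter plot can treat as numbers:

```python
import pandas as pd

# Hypothetical sample data; the column name "Date" follows the answer above.
df = pd.DataFrame({
    "Date": ["2014-01-01", "2014-01-02", "2014-01-05"],
    "rate": [0.10, 0.25, 0.40],
})

# Parse the strings into datetimes, then convert the datetimes to integers.
dates = pd.to_datetime(df["Date"])
df["Date_num"] = pd.to_numeric(dates)

# Date_num is now an integer column, ordered the same way as the dates.
print(df.dtypes)
```

Writing to a separate column keeps the original dates around for labelling; overwriting df['Date'] as in the answer works the same way.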

Upvotes: 2

MarkNS

Reputation: 4021

I've found it simpler to change the style of a line chart to not include the connecting lines:

cb_df.plot(figsize=(16, 6), style='o')
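For anyone who wants to try this, here is a small self-contained sketch. The contents of cb_df are not shown in this answer, so the frame below is a hypothetical stand-in with one numeric column on a datetime index:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Hypothetical stand-in for cb_df: one numeric column on a datetime index.
cb_df = pd.DataFrame(
    {"rate": [0.10, 0.25, 0.40, 0.30]},
    index=pd.to_datetime(
        ["2014-01-01", "2014-01-02", "2014-01-05", "2014-01-09"]
    ),
)

# style='o' draws a circle marker at each point and no connecting line.
ax = cb_df.plot(figsize=(16, 6), style="o")
ax.set_ylabel("rate")
```

Because this is still a line plot under the hood, the datetime index is used for the x axis without any numeric conversion.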


Upvotes: 9

TomAugspurger

Reputation: 28946

If I remember correctly, the plotting code only considers numeric columns. Internally it selects just the numeric columns, so that's why you get the key error.
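To illustrate the effect (this is not the actual pandas plotting internals, only a sketch using select_dtypes on a made-up frame shaped like the one in the question), keeping only the numeric columns leaves no 'date' column to look up:

```python
import pandas as pd

# Hypothetical frame shaped like the one in the question.
training = pd.DataFrame({
    "date": pd.to_datetime(["2014-01-01", "2014-01-02"]),
    "account": [1, 2],
    "rate": [0.10, 0.25],
})

# Keeping only numeric columns drops the datetime column.
numeric_only = training.select_dtypes(include="number")
print(list(numeric_only.columns))
```

Exact behavior may differ between pandas versions, so treat this as an explanation of the error rather than a description of current releases.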

What's the dtype of date? If it's a datetime64, you can recast it as an np.int64:

df['date_int'] = df.date.astype(np.int64)

And then you can plot with date_int on the x axis.
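Put together, a minimal sketch of the cast followed by the scatter plot might look like this (the sample rows are invented; only the date, rate, and date_int names come from the thread):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd

# Hypothetical sample data with a datetime64 column.
df = pd.DataFrame({
    "date": pd.to_datetime(["2014-01-01", "2014-01-02", "2014-01-05"]),
    "rate": [0.10, 0.25, 0.40],
})

# Recast the datetimes as 64-bit integer timestamps.
df["date_int"] = df["date"].astype(np.int64)

# Scatter plot against the integer column instead of the datetime column.
ax = df.plot(kind="scatter", x="date_int", y="rate")
```

The x axis will show raw integer timestamps rather than readable dates, which is the trade-off of this approach.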

For the color part, make a dictionary of {account number: color}. For example:

color_d = {1: 'k', 2: 'b', 3: 'r'}

Then when you plot:

training.plot(kind='scatter', x='date_int', y='rate', color=training.account.map(color_d))
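As an alternative sketch of the same color-mapping idea, here it is with matplotlib's Axes.scatter called directly, which accepts datetime values on the x axis. The sample rows and the 'account' column name are assumptions, since the real data is only shown as an image in the question:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data; the 'account' column name is an assumption.
training = pd.DataFrame({
    "date": pd.to_datetime(
        ["2014-01-01", "2014-01-02", "2014-01-05", "2014-01-09"]
    ),
    "account": [1, 2, 3, 1],
    "rate": [0.10, 0.25, 0.40, 0.30],
})

# Map each account number to a matplotlib color code.
color_d = {1: "k", 2: "b", 3: "r"}
colors = training["account"].map(color_d).tolist()

fig, ax = plt.subplots()
# One point per row, colored by account.
ax.scatter(training["date"].to_numpy(), training["rate"], c=colors)
fig.autofmt_xdate()  # tilt the date tick labels so they do not overlap
```

Any account number missing from color_d would map to NaN and break the color list, so the dictionary needs an entry for every account in the data.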

Upvotes: 13
