user202987

Reputation: 1246

Python - Trouble plotting datetime index with pandas and matplotlib

I'm loading a data file, extracting certain columns, and plotting them to PDF with matplotlib.

When I load my data file into pandas, I get a DatetimeIndex. If I plot the data in this form, all goes well.

The problem arises when I select a subset of the data based on time, i.e.:

data = data.ix[data.index >= start_time]
data = data.ix[data.index <= end_time]

Now when I go to plot the data, pandas seems to have changed something: the DatetimeIndex is now an array of numpy.datetime64 values, which are apparently unsupported by matplotlib and raise an error (something in datetime.fromordinal).

How can I get around this problem?

I've tried plotting:

data.index.values.astype(datetime)

But this still throws an error within matplotlib! (Python int can't be converted to C long)

Is there a way to prevent pandas from changing the index type in the first place when I select with .ix?

I'm using Python 2.7, Numpy 1.7, pandas 0.11, matplotlib 1.2.1.

EDIT: It seems that I am experiencing the same problem as seen here: Plot numpy datetime64 with matplotlib

Upvotes: 2

Views: 4990

Answers (1)

Nipun Batra

Reputation: 11387

I created a minimal working example in an IPython notebook here.

The trick is to use df.ix as follows:

df_new=df.ix[start_time:end_time]
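
A note for readers on current pandas: .ix was deprecated in pandas 0.20 and removed in 1.0, so the same label-based slice is written with .loc today. The following is a minimal sketch, not the code from the notebook. The DataFrame is synthetic (random values, with a span and 10-minute frequency chosen to match the summary shown in this answer), and the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data (an assumption, not the original file):
# a 10-minute DatetimeIndex covering the same span as the df in this answer.
index = pd.date_range("2013-10-12 07:50", "2013-10-23 21:40", freq="10min")
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "column_1": rng.standard_normal(len(index)),
        "column_2": rng.standard_normal(len(index)),
    },
    index=index,
)

# Label-based slice on the DatetimeIndex; both endpoints are included.
start_time = "2013-10-14 09:30"
end_time = "2013-10-16 09:30"
df_new = df.loc[start_time:end_time]

# The result still carries a DatetimeIndex, so df_new.plot() can format the x-axis.
print(type(df_new.index).__name__)
print(df_new.index[0], df_new.index[-1])
```

Slicing with labels keeps the selection in one step and leaves the index type untouched, which is the property the plot depends on.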

For reference, I am posting part of the answer from the notebook here:


df

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1668 entries, 2013-10-12 07:50:00 to 2013-10-23 21:40:00
Freq: 10T
Data columns (total 2 columns):
column_1    1668  non-null values
column_2    1668  non-null values    
dtypes: float64(2)

As you can see, the df spans from 07:50 on 12th Oct 2013 to 21:40 on 23rd Oct 2013. The following plots the entire span of the df.


df.plot()



Now, we select the data from 14th Oct 9:30 hrs to 16th Oct 9:30 hrs.

df2=df.ix['2013-10-14 09:30':'2013-10-16 09:30']

df2.plot()



This shows how .ix can be used to select an interval. You can also do the same operation as follows:

df['2013-10-14 09:30':'2013-10-16 09:30'].plot()

This gives the same result as before.
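
For comparison with the boolean filtering used in the question, here is a hedged sketch on a recent pandas version. It uses synthetic data and illustrative names (data, start_time, end_time, x_values are assumptions, not the asker's actual file). It combines both conditions into one mask, checks that the mask and the label slice select the same timestamps, and shows DatetimeIndex.to_pydatetime() as one possible fallback if the dates ever need to be handed to matplotlib as plain datetime objects.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# Synthetic data (an assumption): one float column on a 10-minute DatetimeIndex.
index = pd.date_range("2013-10-12 07:50", "2013-10-23 21:40", freq="10min")
data = pd.DataFrame({"column_1": np.arange(len(index), dtype=float)}, index=index)

start_time = pd.Timestamp("2013-10-14 09:30")
end_time = pd.Timestamp("2013-10-16 09:30")

# Boolean mask: both conditions combined with & instead of two separate steps.
mask = (data.index >= start_time) & (data.index <= end_time)
by_mask = data.loc[mask]

# Label slice: the approach recommended in this answer.
by_slice = data.loc[start_time:end_time]

# Both selections should cover the same timestamps.
print(list(by_mask.index) == list(by_slice.index))

# Fallback: convert the index to an array of Python datetime objects.
x_values = by_mask.index.to_pydatetime()
print(isinstance(x_values[0], datetime))
```

The label slice is shorter and reads more directly, while the mask form is useful when the condition is not a simple contiguous interval.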

For more details, you may refer to Chang She's talk and the accompanying IPython notebook on Time Series with Pandas. The following two talks from Wes McKinney should also be very helpful:

  1. Time series data analysis with Pandas
  2. Time series manipulation with Pandas

Upvotes: 5
