smoothjabz

Reputation: 15

Plotting Time vs Date in matplotlib

I have a .csv file with only two columns in it, date and time:

    04-02-15,11:15
    04-03-15,09:35
    04-04-15,09:10
    04-05-15,18:05
    04-06-15,10:30
    04-07-15,09:20

I need this data to be plotted (preferably in an area graph, haven't gotten that far yet) using matplotlib. I need the y-axis to be time, and the x-axis to be date. I'm having trouble wrapping my head around some of the usage for time/date, and was hoping someone could take a look at my code and offer some guidance:

import numpy as np
from pylab import *
import matplotlib.pyplot as plt
import datetime as DT

data= np.loadtxt('daily_count.csv', delimiter=',',
         dtype={'names': ('date', 'time'),'formats': ('S10', 'S10')} )

x = [DT.datetime.strptime(key,"%m-%d-%y") for (key, value) in data ]
y = [DT.datetime.strptime(key,"%h:%m") for (key, value) in data]

fig = plt.figure()
ax = fig.add_subplot(111)
ax.grid()


fig.autofmt_xdate()
fig.autofmt_ytime()
plt.plot(x,y)
plt.xlabel('Date')
plt.ylabel('Time')
plt.title('Peak Time')
plt.show()

Each time I try to run it, I get this error:

ValueError: time data '04-02-15' does not match format '%h:%m'

I've also got a suspicion about the ticks for the y-axis, which thus far don't seem to be established. I'm very open to suggestions for the rest of this code as well - thanks in advance, internet heroes!

Upvotes: 1

Views: 2143

Answers (1)

jonnybazookatone

Reputation: 2268

The traceback tells you the problem: strptime is being asked to parse a date string with the time format. That is a result of the way the data is unpacked in these lines:

data= np.loadtxt('daily_count.csv', delimiter=',',
         dtype={'names': ('date', 'time'),'formats': ('S10', 'S10')} )

x = [DT.datetime.strptime(key,"%m-%d-%y") for (key, value) in data ]
y = [DT.datetime.strptime(key,"%h:%m") for (key, value) in data]

There are multiple solutions, but the root of the problem is that when you use loadtxt and define the names and dtypes, it gives you back a structured array in which each record is a (date, time) pair, i.e.,

[('04-02-15', '11:15') ('04-03-15', '09:35') ('04-04-15', '09:10')
('04-05-15', '18:05') ('04-06-15', '10:30') ('04-07-15', '09:20')]

So when you looped over it, both list comprehensions used key, which is always the date:

>>> print([key for (key, value) in data])
['04-02-15', '04-03-15', '04-04-15', '04-05-15', '04-06-15', '04-07-15']

So you were trying to parse '04-02-15' with the format '%h:%m', which of course will not work.
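One of those solutions, as a rough sketch: keep the loop as it is and parse value for the times instead of key. The rows are written inline here as a stand-in for what loadtxt returns, and the time format is already the corrected "%H:%M".

```python
import datetime as DT

# Stand-in for the records loadtxt returns (first three rows of the CSV).
data = [("04-02-15", "11:15"), ("04-03-15", "09:35"), ("04-04-15", "09:10")]

# In each record, key is the date string and value is the time string.
x = [DT.datetime.strptime(key, "%m-%d-%y") for (key, value) in data]
y = [DT.datetime.strptime(value, "%H:%M") for (key, value) in data]

print(x[0].date(), y[0].time())
```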

Alternatively, you can separate the two columns using the zip function. For example,

>>> dates, times = map(list, zip(*data))
>>> print(dates)
['04-02-15', '04-03-15', '04-04-15', '04-05-15', '04-06-15', '04-07-15']
>>> print(times)
['11:15', '09:35', '09:10', '18:05', '10:30', '09:20']

Also, you need to check the format strings you pass to strptime. For example, "%h:%m" won't work: %h is not the hour directive (the 24-hour hour is %H), and %m means month (the minute is %M). You can find a nice summary in the docs, or here: http://strftime.org/.
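A quick way to check a format string is to try it on a single value first. This is only a small illustration with values taken from the CSV above:

```python
import datetime as DT

# %H is the 24-hour hour and %M is the minute; lowercase %m is the month.
parsed = DT.datetime.strptime("18:05", "%H:%M")
print(parsed.hour, parsed.minute)

# No date was given, so strptime fills in 1900-01-01 for the date part.
print(parsed)

# A format that does not describe the string raises ValueError.
try:
    DT.datetime.strptime("04-02-15", "%H:%M")
    failed = False
except ValueError as err:
    failed = True
    print("ValueError:", err)
```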

Putting it all together:

import numpy as np
import matplotlib.pyplot as plt
import datetime as DT

# Read both columns as text ('U10' gives str on Python 3, where 'S10' gives bytes).
data = np.loadtxt('daily_count.csv', delimiter=',',
                  dtype={'names': ('date', 'time'), 'formats': ('U10', 'U10')})

# Transpose the (date, time) records into one list per column.
dates, times = map(list, zip(*data))
print(dates, times)

# %H:%M is hour:minute; strptime fills in 1900-01-01 as the date part of y.
x = [DT.datetime.strptime(date, "%m-%d-%y") for date in dates]
y = [DT.datetime.strptime(time, "%H:%M") for time in times]

fig = plt.figure()
ax = fig.add_subplot(111)
ax.grid()

plt.plot(x, y)
plt.xlabel('Date')
plt.ylabel('Time')
plt.title('Peak Time')
plt.show()

which gives a line plot of the peak time against the date.
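On the y-axis ticks and the area graph asked about in the question, here is one possible sketch rather than the only way to do it. It swaps the datetime y-values for plain hours after midnight, so fill_between can shade down to zero, and labels the ticks as HH:MM with a FuncFormatter. The helper names to_hours, format_hours and plot_peak_times are made up for this example, and rows is assumed to be a list of (date, time) string pairs in the same formats as the CSV.

```python
import datetime as DT


def to_hours(time_text):
    """Convert an 'HH:MM' string to hours after midnight as a float."""
    t = DT.datetime.strptime(time_text, "%H:%M")
    return t.hour + t.minute / 60.0


def format_hours(value, pos=None):
    """Tick formatter: turn 11.25 back into the label '11:15'."""
    hours, minutes = divmod(int(round(value * 60)), 60)
    return "%02d:%02d" % (hours, minutes)


def plot_peak_times(rows):
    """Draw an area graph of time of day (y) against date (x)."""
    # Imported here so the two helpers above work without matplotlib installed.
    import matplotlib.pyplot as plt
    from matplotlib.ticker import FuncFormatter, MultipleLocator

    x = [DT.datetime.strptime(d, "%m-%d-%y") for d, _ in rows]
    y = [to_hours(t) for _, t in rows]

    fig, ax = plt.subplots()
    ax.fill_between(x, y, alpha=0.4)    # shaded area from 0 up to each time
    ax.plot(x, y)
    ax.set_ylim(0, 24)
    ax.yaxis.set_major_locator(MultipleLocator(3))   # one tick every 3 hours
    ax.yaxis.set_major_formatter(FuncFormatter(format_hours))
    ax.set_xlabel("Date")
    ax.set_ylabel("Time")
    ax.set_title("Peak Time")
    ax.grid()
    fig.autofmt_xdate()                 # tilt the date labels so they fit
    plt.show()
```

Calling plot_peak_times(list(zip(dates, times))) with the two lists from the script above should draw it. The trade-off is that y is no longer a datetime, so the HH:MM labels come entirely from format_hours.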

Upvotes: 2
