gatheloc

Reputation: 13

Date format issues in plot ticks with matplotlib.dates (and datestr2num)

I'm using matplotlib.dates to plot a bar chart with instances occurring on specific dates (presented as a list of strings), and using matplotlib.dates.datestr2num to display two sets of data per date (as per the top answer in Python matplotlib multiple bars).

However, for dates on or before the 12th day of the month, the plot interprets the dates as MM/DD/YY, while for dates after the 12th it interprets them as DD/MM/YY, causing the data to jump around the plot. I think the issue might be in how I'm using datestr2num, but I'm not sure how to force it to use one format or the other.

import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter, AutoDateLocator, AutoDateFormatter, datestr2num

days = ['30/01/2019', '31/01/2019', '01/02/2019', '02/02/2019', '03/02/2019', '04/02/2019']
adata = [1, 9, 10, 3, 7, 6]
bdata = [1, 2, 11, 3, 6, 2]

x=datestr2num(days)

w=0.25
fig = plt.figure(figsize = (8,4))
ax = fig.add_subplot(111)
ax.bar(x-w, adata, width=2*w, color='g', align='center', tick_label = days)
ax.bar(x+w, bdata, width=2*w, color='r', align='center', tick_label = days)
ax.xaxis_date()
ax.xaxis.set_major_locator(AutoDateLocator(minticks=3, interval_multiples=False))
ax.xaxis.set_major_formatter(DateFormatter("%d/%m/%y"))

plt.show()

In the example above, adata and bdata fall on consecutive dates (30th Jan to 4th Feb), so all the bars should appear next to each other. Instead, the plot spreads the data between the 2nd of Jan and the 2nd of Apr, making it appear out of order.

Any help or clarification would be appreciated, thanks!

[Image: plot produced by the code above]

Upvotes: 1

Views: 1843

Answers (1)

Cohan

Reputation: 4564

It appears that datestr2num() calls dateutil.parser.parse(d, default=default). dateutil.parser.parse has a dayfirst keyword argument, but datestr2num() does not provide a way to pass it through. As a result, datestr2num() assumes month first whenever the string is ambiguous, and falls back to day first only when the first field cannot be a month (that is, when it is greater than 12).

>>> import dateutil.parser
>>> [dateutil.parser.parse(day, dayfirst=True) for day in days]
[
    datetime.datetime(2019, 1, 30, 0, 0), 
    datetime.datetime(2019, 1, 31, 0, 0), 
    datetime.datetime(2019, 2, 1, 0, 0), 
    datetime.datetime(2019, 2, 2, 0, 0), 
    datetime.datetime(2019, 2, 3, 0, 0), 
    datetime.datetime(2019, 2, 4, 0, 0)
]

>>> datestr2num(days)
[737089. 737090. 737061. 737092. 737120. 737151.]

>>> datestr2num(days, dayfirst=True)
Traceback (most recent call last):
  File "C:\Users\bcohan\Downloads\so.py", line 22, in <module>
    print(datestr2num(days, dayfirst=True))
TypeError: datestr2num() got an unexpected keyword argument 'dayfirst'
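
The mixed interpretation can be reproduced with dateutil alone, without matplotlib. The sketch below is an illustration rather than part of the original session; the variable names are made up for the example. It parses one ambiguous and one unambiguous string from the question's list.

```python
from dateutil import parser

ambiguous = '01/02/2019'    # both fields could be a month
unambiguous = '30/01/2019'  # 30 cannot be a month

# Default behaviour: an ambiguous string is read month-first (2 January).
month_first = parser.parse(ambiguous)

# With dayfirst=True the same string is read as 1 February.
day_first = parser.parse(ambiguous, dayfirst=True)

# When the first field is greater than 12, dateutil falls back to
# day-first because no other reading is valid (30 January).
fallback = parser.parse(unambiguous)

print(month_first.date(), day_first.date(), fallback.date())
```

This is why only some of the dates in the question end up in the wrong place: the strings starting with 30 and 31 are parsed as intended, while the rest are read month-first.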

I would suggest using datetime to rewrite the strings in month-first order before passing them to datestr2num(), and then running the rest unchanged. (Unless it's reasonable to rewrite the strings in the original days list.)

x = datestr2num([
    datetime.strptime(day, '%d/%m/%Y').strftime('%m/%d/%Y')
    for day in days
])

Here is a fully working script.

from datetime import datetime
import matplotlib.pyplot as plt
from matplotlib.dates import (
    DateFormatter, AutoDateLocator, AutoDateFormatter, datestr2num
)

days = [
    '30/01/2019', '31/01/2019', '01/02/2019',
    '02/02/2019', '03/02/2019', '04/02/2019'
]
adata = [1, 9, 10, 3, 7, 6]
bdata = [1, 2, 11, 3, 6, 2]

x = datestr2num([
    datetime.strptime(day, '%d/%m/%Y').strftime('%m/%d/%Y')
    for day in days
])
w = 0.25

fig = plt.figure(figsize=(8, 4))
ax = fig.add_subplot(111)
ax.bar(x - w, adata, width=2 * w, color='g', align='center', tick_label=days)
ax.bar(x + w, bdata, width=2 * w, color='r', align='center', tick_label=days)
ax.xaxis_date()
ax.xaxis.set_major_locator(
    AutoDateLocator(minticks=3, interval_multiples=False))
ax.xaxis.set_major_formatter(DateFormatter("%d/%m/%y"))

plt.show()
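
As a possible variation (a sketch, not something the script above does), the string round trip can be skipped by parsing each date with an explicit format and converting the resulting datetime objects with matplotlib.dates.date2num. This assumes every string in days really is in DD/MM/YYYY form; the intermediate name dates is introduced only for this example.

```python
from datetime import datetime
from matplotlib.dates import date2num

days = [
    '30/01/2019', '31/01/2019', '01/02/2019',
    '02/02/2019', '03/02/2019', '04/02/2019'
]

# Parse with an explicit day-first format so nothing is left to guessing.
dates = [datetime.strptime(day, '%d/%m/%Y') for day in days]

# date2num converts datetime objects to matplotlib's float day numbers,
# so x can be used in place of the datestr2num() result.
x = date2num(dates)

# Consecutive calendar days should be exactly one unit apart.
print([float(b - a) for a, b in zip(x, x[1:])])
```

The resulting x can be passed to ax.bar(x - w, ...) and ax.bar(x + w, ...) exactly as before. An explicit format string also raises a ValueError on any entry that does not match, instead of silently reinterpreting it.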


Upvotes: 3
