mburke05

Reputation: 1471

Matplotlib(Seaborn) set_xticks working unexpectedly with datetime and timedelta

I should preface this by saying it is all being done in an IPython kernel, and the only code I have run is what appears below.

I have the following chart which is produced by the following code:

from queries import TOTAL, DEMO, DB_CREDENTIALS, TOTAL_USA_EX, TOTAL_ESPN_EX
import pandas as pd
import pyodbc
pd.options.mode.chained_assignment = None  # default='warn'    
import seaborn as sns
import matplotlib as mpl  # needed for the mpl.rc call below
from matplotlib import pyplot as plt
from datetime import datetime, timedelta
mpl.rc('font',family='Arial Rounded MT Bold')
y_label = {'fontsize':14}
title = {'fontsize':30}
s_legend = {'fontsize':14, 'handlelength':7}

with pyodbc.connect(DB_CREDENTIALS) as cnxn:
    df = pd.read_sql(sql=TOTAL_USA_EX, con=cnxn)
    df['date'] = pd.to_datetime(df['date'])
    df_e = pd.read_sql(sql=TOTAL_ESPN_EX, con=cnxn)
    df_e['date'] = pd.to_datetime(df_e['date'])

ex_ = df
ex_['subject'] = ex_['date'] - ex_['date'].min()
ex_['subject'] = ex_['subject'].apply(lambda x: x.days)
ex_['hour'] = ex_['datetime'].apply(lambda x: x.hour)
ex_['minute'] = ex_['datetime'].apply(lambda x: x.minute)
ex_['minute'] = ex_['minute'] // 15
ex_['qh'] = ex_.apply(lambda x: x['minute'] + (x['hour']*4), axis=1)
ex_['imp'] = ex_['imp'].apply(lambda x: round(x/1000000.0,3))
ex_['station'] = 'USA'

ex_e = df_e
ex_e['subject'] = ex_e['date'] - ex_e['date'].min()
ex_e['subject'] = ex_e['subject'].apply(lambda x: x.days)
ex_e['hour'] = ex_e['datetime'].apply(lambda x: x.hour)
ex_e['minute'] = ex_e['datetime'].apply(lambda x: x.minute)
ex_e['minute'] = ex_e['minute'] // 15
ex_e['qh'] = ex_e.apply(lambda x: x['minute'] + (x['hour']*4), axis=1)
ex_e['imp'] = ex_e['imp'].apply(lambda x: round(x/1000000.0,3))
ex_e['station'] = 'ESPN'

data = pd.concat([ex_, ex_e])        

fig, ax = plt.subplots()
fig.set_size_inches(14, 7)
sns.tsplot(time='qh', value='imp', unit='subject', condition='station', 
           ci=80, data=data, ax=ax, linewidth=2, color=["#21A0A0", "#E53D00"])
ax.set_ylabel('IMPRESSIONS (M)', **y_label)
ax.set_xlabel('TIME', **y_label)
ax.set_title('STATION IMPRESSIONS: 80% CONFIDENCE INTERVAL')
ax.set_xticks([x for x in xrange(0,96,8)])
ax.set_xticklabels([(datetime(year=2015,month=12,day=28)+timedelta(minutes=15*(x))).strftime('%H:%M') for x in ax.get_xticks()]);

The x values are quarter-hour indices (15-minute intervals), so placing a tick at every 8th index should give one tick per 2-hour increment (e.g. xticklabel[0] = 00:00, xticklabel[1] = 02:00, and so on).
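As a sanity check on that expectation, here is a minimal sketch in Python 3 that maps quarter-hour indices to labels using plain Python ints rather than the ticks returned by matplotlib. The `tick_label` helper and `BASE` constant are names I made up for illustration.

```python
from datetime import datetime, timedelta

# Arbitrary anchor date; only the time of day ends up in the label.
BASE = datetime(year=2015, month=12, day=28)

def tick_label(qh):
    """Map a quarter-hour index (0..95) to an 'HH:MM' label."""
    return (BASE + timedelta(minutes=15 * qh)).strftime('%H:%M')

ticks = list(range(0, 96, 8))  # every 8 quarter-hours = every 2 hours
labels = [tick_label(t) for t in ticks]
print(labels)
```

With plain ints this yields twelve labels running from 00:00 to 22:00 in 2-hour steps, which is what I expected to see on the axis.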

However, for some reason the following is produced:

[Image: chart with tick labels showing only the time.]

I add the date and month below to see what exactly is going on, still confusing.

[Image: wrong graph, showing the dates are acting wonky.]

So I tried to reproduce the error by reading the ticks back from the axes after it was created and redoing the calculation by hand, which revealed some very confusing behavior:

In [19]: 
i = ax.get_xticks()
[(timedelta(minutes=15*(j)), j) for j in i ]

Out [19]:
[(datetime.timedelta(0), 0),
 (datetime.timedelta(-1, 85010, 65408), 8),
 (datetime.timedelta(0, 1515, 98112), 16),
 (datetime.timedelta(0, 125, 163520), 24),
 (datetime.timedelta(-1, 85135, 228928), 32),
 (datetime.timedelta(0, 1640, 261632), 40),
 (datetime.timedelta(0, 250, 327040), 48),
 (datetime.timedelta(-1, 85260, 392448), 56),
 (datetime.timedelta(0, 1765, 425152), 64),
 (datetime.timedelta(0, 375, 490560), 72),
 (datetime.timedelta(-1, 85385, 555968), 80),
 (datetime.timedelta(0, 1890, 588672), 88)]

For my sanity, what is i?

In [20]:
i
Out [20]:
array([ 0,  8, 16, 24, 32, 40, 48, 56, 64, 72, 80, 88])

So I opened a separate kernel in Jupyter to see whether I could replicate the same error in isolation. I couldn't:

new kernel

In [1]:
from datetime import datetime, timedelta
i = [x*15 for x in xrange(0,96,8)]
[timedelta(minutes=x) for x in i]

Out [1]:
[datetime.timedelta(0),
 datetime.timedelta(0, 7200),
 datetime.timedelta(0, 14400),
 datetime.timedelta(0, 21600),
 datetime.timedelta(0, 28800),
 datetime.timedelta(0, 36000),
 datetime.timedelta(0, 43200),
 datetime.timedelta(0, 50400),
 datetime.timedelta(0, 57600),
 datetime.timedelta(0, 64800),
 datetime.timedelta(0, 72000),
 datetime.timedelta(0, 79200)]

Can anybody help keep me from going crazy here?

Two quick edits:

1) The date 12-28-2015 is entirely arbitrary. I don't need the date, just the time associated with it for my axis. Any date would do, and it shouldn't matter here given the behavior I'm expecting.

2) Just to make sure it wasn't some sort of weird syntax error, similarly in the new kernel this works fine:

In [2]:
from datetime import datetime, timedelta
i = [x for x in xrange(0,96,8)]
[timedelta(minutes=(x)*15) for x in i]
Out [2]:
[datetime.timedelta(0),
 datetime.timedelta(0, 7200),
 datetime.timedelta(0, 14400),
 datetime.timedelta(0, 21600),
 datetime.timedelta(0, 28800),
 datetime.timedelta(0, 36000),
 datetime.timedelta(0, 43200),
 datetime.timedelta(0, 50400),
 datetime.timedelta(0, 57600),
 datetime.timedelta(0, 64800),
 datetime.timedelta(0, 72000),
 datetime.timedelta(0, 79200)]

Upvotes: 1

Views: 2777

Answers (1)

mburke05

Reputation: 1471

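Thanks to darkf in the #learnprogramming channel: this is caused by the type of the items returned by the ax.get_xticks() method, which is numpy.int32 rather than a Python int. The values in Out [19] are consistent with a 32-bit integer overflow inside the timedelta calculation: 120 minutes is 7,200,000,000 microseconds, which wraps to -1,389,934,592 as a signed 32-bit value, i.e. datetime.timedelta(-1, 85010, 65408), exactly the second entry in the output. Casting each tick to a Python int first avoids the problem.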

Corrected line of code:

ax.set_xticklabels([(datetime(year=2015,month=12,day=28)+timedelta(minutes=15*(int(x)))).strftime('%H:%M') for x in ax.get_xticks()]);
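To see where the odd values in Out [19] come from, the sketch below emulates signed 32-bit wraparound in pure Python 3. The `to_int32` and `overflowed_delta` helpers are hypothetical, written only for illustration; they are not numpy or datetime functions. It assumes the minutes were converted to microseconds in 32-bit arithmetic, which is what the numbers suggest.

```python
from datetime import timedelta

def to_int32(n):
    """Wrap an arbitrary Python int to a signed 32-bit value."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n

def overflowed_delta(qh):
    """timedelta for 15*qh minutes if the microsecond count wrapped at 32 bits."""
    micros = 15 * qh * 60 * 1000000  # assumed internal unit: microseconds
    return timedelta(microseconds=to_int32(micros))

for qh in (0, 8, 16, 24, 32):
    # Left: emulated overflow; right: the result with a plain Python int.
    print(qh, repr(overflowed_delta(qh)), repr(timedelta(minutes=15 * qh)))
```

For qh = 8 the wrapped result equals datetime.timedelta(-1, 85010, 65408), the same as the second entry in Out [19], while the plain-int result is the expected two hours. That is why wrapping the tick in int() before building the timedelta fixes the labels.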

And graph:

[Image: the corrected graph.]

Thanks!

Upvotes: 1
