brohjoe

Reputation: 934

Issue with Matplotlib rendering dates, image

I'm having an issue with Matplotlib v3.1.3 from conda-forge with Python 3.7. I have all of the dependencies required for Matplotlib. When I run this code, which should work, I get splatter art. It's based on this YouTube tutorial: https://www.youtube.com/watch?v=LWjaAiKaf8&list=PL-osiE80TeTvipOqomVEeZ1HRrcEvtZB&index=8

import matplotlib.pyplot as plt
import pandas as pd

df_train = pd.read_csv('mydata.csv', date_parser=True)
df_train.columns = ['date', 'col1', 'col2', 'col3', 'col4', 'col5']
df_train['date'] = pd.to_datetime(df_train['date'])
df_train.set_index(['date'])

x_value = df_train['date']
y_value = df_train['col4']
plt.plot_date(x_value, y_value )
plt.gcf().autofmt_xdate()
plt.show()

The Matplotlib chart rendered from this code looks like this: [image]

I tried another approach using Matplotlib's DateFormatter and a date locator. I got something resembling a line chart underneath a child's scribbling, but it had dates:

import matplotlib.dates as mpl_dates
import matplotlib.pyplot as plt
import pandas as pd

df_train = pd.read_csv('mydata.csv', date_parser=True)
df_train.columns = ['date', 'col1', 'col2', 'col3', 'col4', 'col5']
df_train['date'] = pd.to_datetime(df_train['date'])
df_train.set_index(['date'])

# Visualize data
x_values = df_train['date']
y_values = df_train['col4']
ax = plt.gca()
plt.figure(figsize=(16, 8))
formatter = mpl_dates.DateFormatter("%Y-%m-%d")
ax.xaxis.set_major_formatter(formatter)
locator = mpl_dates.DayLocator()
ax.xaxis.set_major_locator(locator)
plt.plot(x_values, y_values)
plt.show()


Finally, if I change the code to exclude the dates, I get a perfectly rendered chart with no dates:

import matplotlib.pyplot as plt
import pandas as pd

df_train = pd.read_csv('mydata.csv', date_parser=True)
df_train.columns = ['date', 'col1', 'col2', 'col3', 'col4', 'col5']
df_train['date'] = pd.to_datetime(df_train['date'])
df_train.set_index(['date'])

x_value = df_train['date']
y_value = df_train['col4']
plt.plot(df_train['col4'])
plt.gcf().autofmt_xdate()
plt.show()

I've tried closing the plots at the end, to no avail. I checked the Matplotlib docs and followed them to a T, including using the wheel build, creating the conda channel, installing the dependencies, and setting the path and includes per the documentation. I'm at my wits' end. Can someone more educated on the subject give me a hand? Thanks in advance.


Upvotes: 2

Views: 148

Answers (1)

GinTonic

Reputation: 1000

It seems that plot_date() draws markers only by default, like a scatter plot, because its format argument defaults to 'o' (see https://www.geeksforgeeks.org/matplotlib-pyplot-plot_date-in-python/).

To get a continuous line based on dates, pass a line style as the format string in the third argument: plt.plot_date(x_value, y_value, '-').

This code works for me:

import matplotlib.pyplot as plt
import pandas as pd

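# date_parser expects a callable (and is deprecated in pandas 2.x), so it is
# dropped here; the date column is converted explicitly below instead.
df_train = pd.read_csv('test.csv')
df_train.columns = ['date', 'col1', 'col2', 'col3', 'col4', 'col5', 'col6']
df_train['date'] = pd.to_datetime(df_train['date'])
# set_index returns a new DataFrame; the result is not assigned here,
# so 'date' stays a regular column and can still be selected below.
df_train.set_index(['date'])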

x_value = df_train['date']
y_value = df_train['col4']
plt.plot_date(x_value, y_value, '-')
plt.gcf().autofmt_xdate()
plt.show()

Output:

Graph

Not drawing a line by default may look questionable, given that the plot also switches from markers to a line when you only change the color: plt.plot_date(x_value, y_value, 'g').

This is most likely not a bug in mpl, though: the third argument replaces the default format string 'o', and a format string that names a color but no marker is drawn as a solid line.

Upvotes: 1
