BruceWayne

Reputation: 23285

Data on axis is not in expected order

I'm trying to plot some data on a graph, but when I use data_list, the result is all wonky.

import numpy as np
from numpy import random 
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as ticker

data_list = [('January', 1645480), ('February', 1608476), ('March', 1557113), ('April', 1391652), ('May', 1090298), ('July', 1150535), ('August', 1125931), ('September', 1158741), ('October', 1305849), ('November', 1407438), ('December', 1501733)]
working_list = [('April', 1391652), ('August', 1125931), ('December', 1501733), ('February', 1608476), ('January', 1645480), ('July', 1150535), ('March', 1557113), ('May', 1090298), ('November', 1407438), ('October', 1305849), ('September', 1158741)]

#### GRAPHING
def create_graph(data):
    x, y = zip(*data)
    plt.plot(x,y)
    axes = plt.gca() # Get the Current Axes
    axes.get_yaxis().get_major_formatter().set_scientific(False)  # Turn off scientific 
    # Show data on Y axis points
    for i, j in zip(x,y):
        plt.annotate(str(j),xy=(i,j))
    plt.show()

def main():
## Graphing
    create_graph(working_list)

if __name__ == "__main__":
    main()

[image: plot produced with data_list]

But if I use working_list, it's correct! (It's in the order of the list. I'm simply trying to get the data to show from January to December on the x-axis.)

[image: plot produced with working_list]

I've been staring at this for too long -- the data looks to be the exact same in each list, only data_list has "January" first, through December...I'm not sure why that would throw off the graph like it does. I did notice that the X-Axis months and the data on the Y axis are correct -- but the line connector is all off, as is the order I expected...

Upvotes: 4

Views: 2688

Answers (2)

Paul H

Reputation: 68246

Matplotlib draws lines from point to point in the order that you provide them.

Ticks on the Axes are sorted according to the type of x-values passed (e.g., numerically for quantities, alphabetically for categories, chronologically for dates).

If you want to plot these values as dates, you need to pass actual dates to the plot method.

So reading between the lines and assuming you want your x-axis to go from January to December and the lines drawn accordingly, here's how I'd do it:

from datetime import datetime

from matplotlib import pyplot
from matplotlib import dates

raw_list = [('April', 1391652), ('August', 1125931), ('December', 1501733), ('February', 1608476), ('January', 1645480), ('July', 1150535), ('March', 1557113), ('May', 1090298), ('November', 1407438), ('October', 1305849), ('September', 1158741)]
sorted_list = sorted(raw_list, key=lambda x: datetime.strptime(x[0], '%B'))

def create_graph(data):
    fig, ax = pyplot.subplots()
    _x, y = zip(*data)
    x = [datetime.strptime(month, '%B') for month in _x]
    ax.plot(x, y)

    ax.yaxis.get_major_formatter().set_scientific(False)  # Turn off scientific
    ax.xaxis.set_major_locator(dates.MonthLocator(interval=1))
    ax.xaxis.set_major_formatter(dates.DateFormatter('%B'))

    for tick in ax.xaxis.get_ticklabels():
        tick.set_rotation(45)
        tick.set_rotation_mode('anchor')
        tick.set_horizontalalignment('right')

    for i, j in zip(x,y):
        ax.annotate(str(j), xy=(i, j))
    return fig

So create_graph(raw_list) does this:

[image: plot produced by create_graph(raw_list)]

And create_graph(sorted_list) does this:

[image: plot produced by create_graph(sorted_list)]
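
If you want to check the ordering without drawing anything, here is a minimal sketch of the same sort key. The helper names month_key and is_chronological are mine, not part of matplotlib, and the '%B' parsing assumes an English locale for the month names.

```python
from datetime import datetime

raw_list = [('April', 1391652), ('August', 1125931), ('December', 1501733),
            ('February', 1608476), ('January', 1645480), ('July', 1150535),
            ('March', 1557113), ('May', 1090298), ('November', 1407438),
            ('October', 1305849), ('September', 1158741)]


def month_key(item):
    """Parse the month name of a (month, value) pair into a datetime.

    '%B' matches full month names in the current locale (assumed English).
    The year defaults to 1900, which does not matter for ordering.
    """
    return datetime.strptime(item[0], '%B')


def is_chronological(data):
    """Return True if the pairs are already in calendar order."""
    keys = [month_key(item) for item in data]
    return keys == sorted(keys)


sorted_list = sorted(raw_list, key=month_key)

# raw_list is alphabetical, so a line drawn in that order jumps back and forth.
print(is_chronological(raw_list))
# sorted_list is in calendar order, so the line runs left to right.
print(is_chronological(sorted_list))
print([month for month, _ in sorted_list])
```

The point of the check is that the tick order and the line order are decided separately: the ticks come from the x-values' type, the line from the list order.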

Upvotes: 2

ImportanceOfBeingErnest

Reputation: 339755

Matplotlib currently (as of version 2.1) has a problem with the order of categories on the axes. It will always sort the categories prior to plotting and you have no chance of changing that order. This will hopefully be fixed for the next release, but until then you would need to stick to plotting numeric values on the axes.

In this case, this would mean you plot the data against some index and later set the tick labels accordingly. Of course you could also use datetimes, but that seems a bit overkill if you already have a list of the months available.

import matplotlib.pyplot as plt


data_list = [('January', 1645480), ('February', 1608476), ('March', 1557113), 
             ('April', 1391652), ('May', 1090298), ('July', 1150535), 
             ('August', 1125931), ('September', 1158741), ('October', 1305849), 
             ('November', 1407438), ('December', 1501733)]

#### GRAPHING
def create_graph(data):
    months, y = zip(*data)
    plt.plot(range(len(months)),y)
    axes = plt.gca() # Get the Current Axes
    axes.get_yaxis().get_major_formatter().set_scientific(False)  
    axes.set_xticks(range(len(months)))
    axes.set_xticklabels(months, rotation=45, ha="right")
    # Show data on Y axis points
    for i, j in enumerate(y):
        plt.annotate(str(j),xy=(i,j))
    plt.show()


create_graph(data_list)

[image: plot produced by create_graph(data_list)]
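
Note that data_list has no entry for June, so with a plain 0..n-1 index May and July sit next to each other. If the gap should be visible, one possible variant (an assumption on my part, not something the code above does) is to use the calendar month number as the x-position. MONTH_NUMBER and month_positions are hypothetical names for this sketch, and the lookup assumes English month names.

```python
import calendar

data_list = [('January', 1645480), ('February', 1608476), ('March', 1557113),
             ('April', 1391652), ('May', 1090298), ('July', 1150535),
             ('August', 1125931), ('September', 1158741), ('October', 1305849),
             ('November', 1407438), ('December', 1501733)]

# calendar.month_name[0] is an empty string, so skip it:
# the result maps 'January' -> 1, ..., 'December' -> 12.
MONTH_NUMBER = {name: number
                for number, name in enumerate(calendar.month_name) if name}


def month_positions(data):
    """Split (month, value) pairs into numeric x-positions and y-values."""
    months, values = zip(*data)
    positions = [MONTH_NUMBER[month] for month in months]
    return positions, list(values)


positions, values = month_positions(data_list)
print(positions)  # 6 (June) is absent, leaving a gap on the axis
```

These positions can then go into plt.plot(positions, values), with axes.set_xticks(range(1, 13)) and axes.set_xticklabels(list(calendar.month_name)[1:], rotation=45, ha="right") to label all twelve months.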

Upvotes: 2
