JoelAffolter1997

Reputation: 47

Color Map of Date as String in Python

I want to plot (in Python) the distance I have run on the x axis and the pace at which I ran that distance on the y axis. For each data point, I have a string that represents the date on which the run occurred. Now I want to make a scatter plot in which the color of each data point represents the date on which it occurred. Additionally, the colorbar should be sensibly labeled, i.e. with a date or year, not a number. The data looks something like this:

x = [2.803480599999999, 5.5502475000000056, 6.984381300000002, 4.115224099999998, 5.746583699999995, 8.971469500000019, 12.028179500000032, 13.451193300000014, 12.457393999999972, 12.027555199999998, 16.077930800000015, 5.021229700000006, 11.206380399999999, 7.903262600000004, 11.98195070000001, 12.21701, 10.35045, 10.231890000000002]

y = [11.961321698938578, 5.218986480632915, 5.211628408660906, 4.847852635777481, 4.936266162218553, 5.233256380128127, 5.441388698929861, 5.461721129728066, 5.722170570613203, 5.2698434785261545, 5.645419662253215, 4.617062894639794, 4.973357261130752, 5.906843248930297, 5.256517482861392, 5.537361432952908, 5.339542403148332, 5.376979880224148]

t_string = ['2019-10-7', '2019-10-13', '2019-11-10', '2019-11-16', '2019-11-17', '2019-11-23', '2019-11-24', '2019-11-27', '2019-12-1', '2019-12-4', '2019-12-8', '2019-12-21', '2019-12-23', '2019-12-25', '2019-12-27', '2020-1-2', '2020-1-5', '2020-1-9']

What I tried was to convert t_string into numbers using Unix time and then express each date as a number of days counted from day 0. The result looks like this:

t_num = [0, 6, 34, 40, 41, 47, 48, 51, 55, 58, 62, 75, 77, 79, 81, 87, 90, 94]
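For reference, a conversion along these lines can be sketched as follows. This is a minimal sketch, not necessarily the exact code used: it subtracts datetime objects instead of going through Unix timestamps, and the names dates and start are made up for illustration.

```python
from datetime import datetime

# First few dates from the list above
t_string = ['2019-10-7', '2019-10-13', '2019-11-10', '2019-11-16']

# Parse the strings; "%m" and "%d" also accept values without a leading zero
dates = [datetime.strptime(s, "%Y-%m-%d") for s in t_string]

# Days elapsed since the earliest run, which becomes day 0
start = min(dates)
t_num = [(d - start).days for d in dates]

print(t_num)  # [0, 6, 34, 40]
```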

I then plotted the data using matplotlib, but the colorbar labels (plain day numbers) are not really meaningful:

import matplotlib.pyplot as plt

plt.scatter(x,y,
            c=t_num,
            cmap='viridis')
plt.colorbar()


If I try to use t_string instead of t_num, I get the following error message:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/matplotlib/axes/_axes.py", line 4284, in _parse_scatter_color_args
    colors = mcolors.to_rgba_array(c)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/matplotlib/colors.py", line 294, in to_rgba_array
    result[i] = to_rgba(cc, alpha)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/matplotlib/colors.py", line 177, in to_rgba
    rgba = _to_rgba_no_colorcycle(c, alpha)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/matplotlib/colors.py", line 233, in _to_rgba_no_colorcycle
    raise ValueError("Invalid RGBA argument: {!r}".format(orig_c))
ValueError: Invalid RGBA argument: '2019-10-7'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Volumes/JOEL USB/AnalyseMyRun 0.2/scripts/main.py", line 16, in <module>
    plot_distributionby_distance(2)
  File "/Volumes/JOEL USB/AnalyseMyRun 0.2/scripts/data_plot_types.py", line 133, in plot_distributionby_distance
    plt.scatter(list_DistanceTot_sorted,list_PaceAverage_sorted,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/matplotlib/pyplot.py", line 2836, in scatter
    __ret = gca().scatter(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/matplotlib/__init__.py", line 1599, in inner
    return func(ax, *map(sanitize_sequence, args), **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/matplotlib/axes/_axes.py", line 4451, in scatter
    self._parse_scatter_color_args(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/matplotlib/axes/_axes.py", line 4302, in _parse_scatter_color_args
    raise ValueError(
ValueError: 'c' argument must be a mpl color, a sequence of mpl colors or a sequence of numbers, not ['2019-10-7', '2019-10-13', '2019-11-10', '2019-11-16', '2019-11-17', '2019-11-23', '2019-11-24', '2019-11-27', '2019-12-1', '2019-12-4', '2019-12-8', '2019-12-21', '2019-12-23', '2019-12-25', '2019-12-27', '2020-1-2', '2020-1-5', '2020-1-9'].

Does anybody have a way to work around this issue? The graph does not need to be done in matplotlib; it is simply the plotting package I know best.

Upvotes: 1

Views: 3089

Answers (1)

ImportanceOfBeingErnest

Reputation: 339310

  1. Convert your date strings to matplotlib's internal date format (a floating-point number of days).
  2. Use the locators and formatters from matplotlib.dates on the colorbar.

Example:

import matplotlib.pyplot as plt
from datetime import datetime
import matplotlib.dates as mdates

x = [2.803480599999999, 5.5502475000000056, 6.984381300000002, 4.115224099999998, 5.746583699999995, 8.971469500000019, 12.028179500000032, 13.451193300000014, 12.457393999999972, 12.027555199999998, 16.077930800000015, 5.021229700000006, 11.206380399999999, 7.903262600000004, 11.98195070000001, 12.21701, 10.35045, 10.231890000000002]

y = [11.961321698938578, 5.218986480632915, 5.211628408660906, 4.847852635777481, 4.936266162218553, 5.233256380128127, 5.441388698929861, 5.461721129728066, 5.722170570613203, 5.2698434785261545, 5.645419662253215, 4.617062894639794, 4.973357261130752, 5.906843248930297, 5.256517482861392, 5.537361432952908, 5.339542403148332, 5.376979880224148]

t_string = ['2019-10-7', '2019-10-13', '2019-11-10', '2019-11-16', '2019-11-17', '2019-11-23', '2019-11-24', '2019-11-27', '2019-12-1', '2019-12-4', '2019-12-8', '2019-12-21', '2019-12-23', '2019-12-25', '2019-12-27', '2020-1-2', '2020-1-5', '2020-1-9']
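# Parse each date string and convert it to matplotlib's internal date number
t = [mdates.date2num(datetime.strptime(i, "%Y-%m-%d")) for i in t_string]

fig, ax = plt.subplots()
# Color each point by its date number
sc = ax.scatter(x, y, c=t)

# Let matplotlib pick the tick positions and matching date labels
loc = mdates.AutoDateLocator()
fig.colorbar(sc, ticks=loc,
             format=mdates.AutoDateFormatter(loc))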

plt.show()
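If you would rather keep the day offsets from the question (t_num), a possible variation is to leave the data as it is and only translate the colorbar tick values back into dates. The following is a minimal sketch under that assumption; offset_to_label and plot_with_date_colorbar are hypothetical helper names, not matplotlib functions, and start is assumed to be the date of day 0.

```python
from datetime import datetime, timedelta


def offset_to_label(day_offset, start, fmt="%Y-%m-%d"):
    """Turn a day offset (as in t_num) back into a formatted date string."""
    return (start + timedelta(days=float(day_offset))).strftime(fmt)


def plot_with_date_colorbar(x, y, t_num, start):
    """Scatter plot colored by day offset, with dates on the colorbar."""
    # Imported here so the label helper above works without matplotlib
    import matplotlib.pyplot as plt
    from matplotlib.ticker import FuncFormatter

    fig, ax = plt.subplots()
    sc = ax.scatter(x, y, c=t_num, cmap="viridis")
    # FuncFormatter calls the function with (tick value, tick position)
    fig.colorbar(sc, format=FuncFormatter(
        lambda value, pos: offset_to_label(value, start)))
    return fig


print(offset_to_label(94, datetime(2019, 10, 7)))  # 2020-01-09
```

Calling plot_with_date_colorbar(x, y, t_num, datetime(2019, 10, 7)) and then matplotlib.pyplot.show() should give the same kind of plot, but here the tick positions are still chosen on the plain number scale, so the labels will not necessarily fall on month boundaries the way the date locator's ticks do.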


Upvotes: 3
