WolfiG

Reputation: 1163

Matplotlib problem: tick re-mapping of x-axis of time-series data

I want to plot time-series data using Matplotlib. The data is stored in CSV format, which I read into a pandas DataFrame using pd.read_csv(); this works fine. A data set comprises one timestamp column and around 10 value columns. I convert the timestamp (initially a string of format yyyy-MM-dd hh:mm:ss) to datetime via pd.to_datetime(dataFrame['TIMESTAMP'], format='%Y-%m-%d %H:%M:%S').
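For reference, the loading step described above might look roughly like the sketch below. The inline CSV text and its values are made up for illustration (the real files are not shown here); only the column name TIMESTAMP and the format string come from the description.

```python
import io

import pandas as pd

# Made-up sample standing in for one of the real CSV files (assumption).
csv_text = """TIMESTAMP,Y1,Y2
2020-01-16 08:00:00,101.5,98.2
2020-01-16 08:01:12,97.3,110.4
2020-01-16 08:02:31,105.8,93.7
"""

dataFrame = pd.read_csv(io.StringIO(csv_text))

# Parse the string column into real datetime values.
dataFrame['TIMESTAMP'] = pd.to_datetime(dataFrame['TIMESTAMP'], format='%Y-%m-%d %H:%M:%S')

print(dataFrame.dtypes)
```

After the conversion, the .dt accessor (for example dataFrame['TIMESTAMP'].dt.time) becomes available on the column.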

To plot the data I use the following code (generation of sample data is not part of my code):

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns

N = 30
timestamps = pd.date_range('2020-01-16 8:00', periods=N, freq='72s')
# note: the original timestamps aren't evenly spaced, this is just data to test
dataFrame = pd.DataFrame({'TIMESTAMP': timestamps, 'Y1': np.random.normal(100, 30, N), 'Y2': np.random.normal(100, 30, N)})
acqFieldName = 'Y1'

fig = sns.pointplot(x='TIMESTAMP', y=acqFieldName, data=dataFrame, scale=0.75)
timestamps = dataFrame['TIMESTAMP'].dt.time
fig.axes.set_xticklabels(labels=timestamps, rotation=45)
plt.show()

Which results in the following:

Plot resulting from code above

Still, I would like to change the x-axis: the ticks are too dense, so I'd like to have, say, 10 ticks, and I'd like to see the elapsed time in minutes and seconds, format 'mm:ss'.

I tried the following:

fig = sns.pointplot(x='TIMESTAMP', y=acqFieldName, data=dataFrame, scale=0.75)
timestamps = dataFrame['TIMESTAMP'].dt.time

xmin = dataFrame['TIMESTAMP'][0]
xmax = dataFrame['TIMESTAMP'][len(dataFrame['TIMESTAMP']) - 1]

timeDiff: timedelta = xmax - xmin
customTicks = np.linspace(0., timeDiff.seconds, 10)
fig.axes.set_xticklabels(labels=customTicks, rotation=45)
fig.axes.set_xticks(customTicks)
plt.show()

Which results in the following:

Plot resulting from the second attempt

obviously not what I want.

My problem would be solved if I could reduce the number of ticks formatted as time, or - better - if the points align with the ticks given as time spent.

Update: suggestion of Zaraki Kenpachi yields

    fig, ax = plt.subplots()
    ax.plot(dataFrame.set_index('TIMESTAMP'), dataFrame[acqFieldName])
    plt.show()

Plot resulting from the suggested code


Working solution based on JohanC's answer:

import datetime
import glob

for fileName in glob.glob('*.csv'):
    plt.close()
    # NOTE: CsvFileProcessor is a custom class doing the readout of CSV and conversion to pandas.DataFrame
    dataFrame, acqFieldName, settingParameterCount = CsvFileProcessor.processFile(fileName)

    fig, ax = plt.subplots()
    ax: plt.Subplot = sns.pointplot(x='TIMESTAMP', y=acqFieldName, data=dataFrame, scale=0.75, ax=ax)
    startTime = dataFrame['TIMESTAMP'][0]

    timeProgress = []

    for timeStamp in dataFrame['TIMESTAMP']:
        timePassed = timeStamp - startTime
        timeProgress.append(timePassed)

    custom_ticks = range(0, len(timeProgress), 5)
    timestamps = [f"{datetime.timedelta(seconds=timeProgress[t].seconds)}" for t in custom_ticks]

    # for manipulating the x-axis tick labels:
    # https://stackoverflow.com/questions/51105648/ordering-and-formatting-dates-on-x-axis-in-seaborn-bar-plot
    # set the tick positions before the labels, so that each label is attached to its tick
    ax.set_xticks(custom_ticks)
    ax.set_xticklabels(labels=timestamps, rotation=45)
    ax.set_xlabel(xlabel="Processing Time")
    plt.title('Setting Parameters: ' + str(settingParameterCount))
    outFileName = fileName.upper()
    outFileName = outFileName.replace('_DATA.CSV', '')
    outFileName = outFileName + '_READOUT.PNG'
    fig.tight_layout()
    #plt.savefig(outFileName)
    plt.show()

results in:

Final plot
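A side note on the label format: str(datetime.timedelta(...)) renders as 'H:MM:SS', not as the 'mm:ss' asked for above. If plain 'mm:ss' is wanted, a small helper along the following lines could be used instead. The name format_elapsed is made up for this sketch; it uses total_seconds() so that runs longer than a day are not truncated the way .seconds would truncate them.

```python
import datetime


def format_elapsed(delta: datetime.timedelta) -> str:
    """Format a time difference as 'mm:ss' (hypothetical helper, not a library function).

    Minutes are not wrapped at 60, so one hour and five seconds gives '60:05'.
    """
    total_seconds = int(delta.total_seconds())
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes:02d}:{seconds:02d}"


# str() on a timedelta includes the hours field:
print(str(datetime.timedelta(seconds=365)))             # 0:06:05
# the helper gives minutes and seconds only:
print(format_elapsed(datetime.timedelta(seconds=365)))  # 06:05
```

Since pandas Timedelta objects are a subclass of datetime.timedelta, the entries of timeProgress can be passed to the helper directly.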

Upvotes: 0

Views: 1770

Answers (2)

JohanC

Reputation: 80429

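The main confusion comes from the Seaborn point plot treating the x values as categories: the ticks sit at the integer positions 0, 1, 2, ..., and each of them gets a label generated from the timestamp, which makes it look like a time axis.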
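A quick way to see this is to print the tick positions right after plotting. The snippet below is only a sketch on random test data; the names sample and tick_positions are made up for the illustration.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Random test data in the same shape as in the question.
N = 30
sample = pd.DataFrame({
    'TIMESTAMP': pd.date_range('2020-01-16 08:00:00', periods=N, freq='73s'),
    'Y1': np.random.normal(100, 30, N),
})

ax = sns.pointplot(x='TIMESTAMP', y='Y1', data=sample)

# The tick positions are category indices, not times.
tick_positions = [int(t) for t in ax.get_xticks()]
print(tick_positions[:5])
```

So any tick position passed to set_xticks has to be an index into the rows, not a number of seconds.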

To get what you want, you could set a tick only at every fifth position, say, and provide custom labels for those ticks. You can also add minor ticks, so that every entry still has a tick.

Demo code:

import matplotlib.pyplot as plt
from matplotlib.ticker import AutoMinorLocator
import pandas as pd
import numpy as np
import seaborn as sns

N = 30
timestamps = pd.date_range('2020-01-16 8:00:00', periods=N, freq='73s')
dataFrame = pd.DataFrame({'TIMESTAMP': timestamps, 'Y1': np.random.normal(100, 30, N), 'Y2': np.random.normal(100, 30, N)})

ax = sns.pointplot(x='TIMESTAMP', y='Y1', data=dataFrame, scale=0.75)  # pointplot returns the Axes

custom_ticks = range(0, len(dataFrame), 5) # ticks every 5
timestamps = [f"{dataFrame['TIMESTAMP'][t].minute:02}:{dataFrame['TIMESTAMP'][t].second:02}" for t in custom_ticks]

# set the tick positions before the labels, so that each label is attached to its tick
ax.set_xticks(custom_ticks)
ax.set_xticklabels(timestamps)
ax.xaxis.set_minor_locator(AutoMinorLocator())

plt.tight_layout()
plt.show()

demo plot
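The demo keeps Seaborn's evenly spaced categorical positions and labels them with the wall-clock minute and second. If the points should instead sit at their true elapsed time (the timestamps in the question are not evenly spaced), one alternative is to plot against elapsed seconds on a numeric axis and only format the ticks as 'mm:ss'. The following is a sketch under that assumption, using plain Matplotlib rather than pointplot; format_mmss and plot_elapsed are names made up here.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import FuncFormatter, MaxNLocator


def format_mmss(seconds, pos=None):
    """Tick formatter: number of seconds -> 'mm:ss' (hypothetical helper)."""
    total = int(round(seconds))
    sign = '-' if total < 0 else ''
    minutes, secs = divmod(abs(total), 60)
    return f"{sign}{minutes:02d}:{secs:02d}"


def plot_elapsed(data, column, max_ticks=10):
    """Plot `column` against the seconds elapsed since the first timestamp."""
    elapsed = (data['TIMESTAMP'] - data['TIMESTAMP'].iloc[0]).dt.total_seconds()
    fig, ax = plt.subplots()
    ax.plot(elapsed.to_numpy(), data[column].to_numpy(), marker='o')
    # Let Matplotlib pick at most `max_ticks` intervals at round second values.
    ax.xaxis.set_major_locator(MaxNLocator(nbins=max_ticks))
    ax.xaxis.set_major_formatter(FuncFormatter(format_mmss))
    ax.tick_params(axis='x', labelrotation=45)
    ax.set_xlabel('Elapsed time (mm:ss)')
    fig.tight_layout()
    return fig, ax


# Random test data in the same shape as above.
N = 30
sample = pd.DataFrame({
    'TIMESTAMP': pd.date_range('2020-01-16 08:00:00', periods=N, freq='73s'),
    'Y1': np.random.normal(100, 30, N),
})
fig, ax = plot_elapsed(sample, 'Y1')
# plt.show()  # uncomment to display the figure
```

MaxNLocator chooses round numbers of seconds, which do not necessarily fall on whole minutes; a MultipleLocator with a fixed step (for example 300 for five minutes) would be an alternative if whole minutes matter.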

Upvotes: 2

Zaraki Kenpachi

Reputation: 5740

Try a simple plot:

import matplotlib.pyplot as plt
fig, ax = plt.subplots()
# pass the timestamp column as x and the value column as y
ax.plot(dataFrame['TIMESTAMP'], dataFrame[acqFieldName])
plt.show()
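With a plain Matplotlib line plot the x-axis is a real datetime axis, so tick density and label format can be controlled through matplotlib.dates. The sketch below assumes a datetime column named TIMESTAMP as in the question; the function name plot_with_time_axis and the five-minute spacing are choices made for this example.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_with_time_axis(data, column, minutes_between_ticks=5):
    """Line plot on a datetime axis with one 'MM:SS' tick every few minutes."""
    fig, ax = plt.subplots()
    ax.plot(data['TIMESTAMP'], data[column], marker='o')
    # Major ticks at minute 0, 5, 10, ... of every hour.
    ax.xaxis.set_major_locator(
        mdates.MinuteLocator(byminute=range(0, 60, minutes_between_ticks)))
    # Show only minutes and seconds of the wall-clock time.
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%M:%S'))
    ax.tick_params(axis='x', labelrotation=45)
    fig.tight_layout()
    return fig, ax


# Random test data in the shape used in the question.
N = 30
sample = pd.DataFrame({
    'TIMESTAMP': pd.date_range('2020-01-16 08:00', periods=N, freq='72s'),
    'Y1': np.random.normal(100, 30, N),
})
fig, ax = plot_with_time_axis(sample, 'Y1')
# plt.show()  # uncomment to display the figure
```

Note that '%M:%S' prints the minute and second of the clock time, which equals the elapsed time only when the series starts on a full hour, as the test data here does.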

Upvotes: 0
