BRat

Reputation: 83

How to plot multiple timeseries data with different start date on the same x-axis in Python Matplotlib?

I am trying to plot three timeseries datasets with different start dates on the same x-axis, similar to the question "How to plot timeseries with different start date on the same x axis", except that my x-axis has dates instead of days.

My data frame is structured as:

Date ColA Label
01/01/2019 1.0 Training
02/01/2019 1.0 Training
...
14/09/2020 2.0 Test1
..
06/01/2021 4.0 Test2
...

I have defined each time series as:

train = df.loc['01/01/2019':'05/08/2020', 'ColA']  
test1 = df.loc['14/09/2020':'20/12/2020', 'ColA']  
test2 = df.loc['06/01/2021':'18/03/2021', 'ColA']  

This is how the individual time series plot: [images: data1, data2, data3]

But when I try to plot them on the same x-axis, they are not plotted in date order: [image: data_all]. I am hoping to produce something like this (from MS Excel): [image]

Any help would be great!

Thank you

Upvotes: 1

Views: 4648

Answers (2)

user2688158

Reputation: 417

Make sure that 'Date' column in your dataframe is imported as datetime variable and not as string.
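To see why the type matters, here is a minimal sketch using only the standard library. The three date strings are hypothetical values written in the day-first style of the question; the point is only that text compares character by character, not by calendar order.

```python
from datetime import datetime

# Hypothetical day-first date strings (dd/mm/yyyy), as in the question's sample.
raw = ["14/09/2020", "01/01/2019", "06/01/2021"]

# Sorted as plain text, the day field is compared first,
# so the result is not in calendar order.
as_text = sorted(raw)

# Parsed into datetime objects, the same values sort chronologically.
as_dates = sorted(datetime.strptime(s, "%d/%m/%Y") for s in raw)

print(as_text)
print([d.strftime("%Y-%m-%d") for d in as_dates])
```

Any plotting library that receives the dates as text has no way to know they are dates, so it cannot place them on a proper time axis.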

If the dtype shows up as "object", the dates are being stored as strings:

df = pd.read_csv('data.csv')
df['Date']
0      2019-01-01
1      2019-01-02
2      2019-01-03
          ...
Name: Date, Length: 830, dtype: object

You need to convert the column to datetime. You can do this in two ways:

  1. df = pd.read_csv('data.csv', parse_dates=['Date'])
    

OR

  2. df = pd.read_csv('data.csv')
    df['Date'] = pd.to_datetime(df['Date'])
    

Both options will give you the same result.

df = pd.read_csv('data.csv', parse_dates=['Date'])
df['Date']
0      2019-01-01
1      2019-01-02
2      2019-01-03
       ...

    Name: Date, Length: 830, dtype: datetime64[ns]
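One caveat, as a hedged sketch: the sample in the question shows day-first dates (for example 14/09/2020). If the file really stores them that way, it may be safer to pass an explicit format so that 02/01/2019 is read as 2 January rather than 1 February. The in-memory CSV below is a hypothetical stand-in for data.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv, using the day-first dates from the question.
csv_text = """Date,ColA,Label
01/01/2019,1.0,Training
02/01/2019,1.0,Training
14/09/2020,2.0,Test1
06/01/2021,4.0,Test2
"""

df = pd.read_csv(io.StringIO(csv_text))
print(df["Date"].dtype)  # still text at this point

# An explicit format removes any day/month guessing.
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")
print(df["Date"].dtype)  # now a datetime dtype
```

Checking `df["Date"].dtype` before and after the conversion is a quick way to confirm that the column is no longer text.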

Then, you can just plot:

plt.plot(df['Date'], df['ColA'])

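When you define the individual time series, check the date format. pandas displays datetimes as YYYY-MM-DD, and slicing .loc with date strings only works when the dates are the dataframe's index (for example after df = df.set_index('Date')). So, use this instead: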

train = df.loc['2019-01-01':'2020-08-05', 'ColA'] and so on...
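A minimal sketch of that slicing, assuming the Date column has already been converted to datetime. The daily data here is made up purely to cover the date ranges from the question.

```python
import pandas as pd

# Hypothetical daily data covering the date ranges mentioned in the question.
dates = pd.date_range("2019-01-01", "2021-03-18", freq="D")
df = pd.DataFrame({"Date": dates, "ColA": 1.0})

# Label-based date slicing needs the dates in the index, sorted.
df = df.set_index("Date").sort_index()

# Both endpoints of a .loc slice are included.
train = df.loc["2019-01-01":"2020-08-05", "ColA"]
test1 = df.loc["2020-09-14":"2020-12-20", "ColA"]
test2 = df.loc["2021-01-06":"2021-03-18", "ColA"]

print(train.index.min(), train.index.max())
print(test1.index.min(), test1.index.max())
print(test2.index.min(), test2.index.max())
```

Each resulting series keeps its own datetime index, so the three can later be drawn on one axis and will land at their true positions in time.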

I am assuming that your data is stored as csv (or excel). If so, be careful: MS Excel may change the formatting of the Date column whenever you open the data file in Excel. Best practice would be to always check the dtype of the 'Date' column using

df['Date'].dtype after importing the dataframe.

Upvotes: 1

Amri Rasyidi

Reputation: 191

I assume you have a dataframe that contains at least a date, a record, and a label (training, test #1 or test #2).
Would sharex = True do the trick?

fig, ax = plt.subplots(3,1, sharex = True)

for i,j in zip(df['label'].unique(), range(3)):
    ax[j].plot(df[df['label'] == i]['date'], 
               df[df['label'] == i]['record'])

EDIT

This should do it

fig, ax = plt.subplots(figsize = (14,6))
color = ['blue','red','orange']

for i,j in zip(df.Label.unique().tolist(), color):
    ax.plot(df['Date'][df.Label == i], df['ColA'][df.Label == i], 
            color = j, label = i)
plt.legend(loc = 'best')
plt.show()

You basically want to draw several lines on the same matplotlib axes. Just use the original dataframe (which includes all the labels); there is no need to split it into separate series first.
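For reference, a self-contained sketch of the same idea that can be run without the original file. It uses groupby in place of the boolean masks above, which is a swapped-in but equivalent way to split by label; the six rows of data are hypothetical, and the Agg backend is chosen only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data shaped like the question's dataframe.
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["01/01/2019", "02/01/2019", "14/09/2020",
         "15/09/2020", "06/01/2021", "07/01/2021"],
        format="%d/%m/%Y",
    ),
    "ColA": [1.0, 1.0, 2.0, 2.0, 4.0, 4.0],
    "Label": ["Training", "Training", "Test1", "Test1", "Test2", "Test2"],
})

fig, ax = plt.subplots(figsize=(14, 6))

# One line per label, all on the same datetime x-axis.
# sort=False keeps the labels in their order of appearance.
for label, group in df.groupby("Label", sort=False):
    ax.plot(group["Date"], group["ColA"], label=label)

ax.legend(loc="best")
fig.autofmt_xdate()  # tilt the date tick labels so they do not overlap
```

With an interactive backend you would finish with plt.show(); in a script, fig.savefig with a file name of your choice writes the figure to disk instead.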

Upvotes: 0
