sogu

Reputation: 3076

How to order the X axis by YearMonth in Seaborn

I use the current version of the Air Quality dataset from http://archive.ics.uci.edu/ml/datasets/Air+quality. I want to plot monthly aggregates of several features, one graph per feature, with the X axis ordered chronologically by year and month.

YearMonth Creation for X Axis

INPUT:
df['DateTime'] = df['Date'].astype(str) + ' ' + df['Time'].astype(str)
df['DateTime'] = pd.to_datetime(df['DateTime'], format='%m/%d/%Y %H:%M:%S')
print(df['DateTime'].iloc[:2])

OUTPUT:
0   2004-11-23 19:00:00
1   2004-11-23 20:00:00
Name: DateTime, dtype: datetime64[ns]



INPUT:
df['Date'] = pd.to_datetime(df['Date'].astype(str), format='%m/%d/%Y')

df['Year'] = df['DateTime'].map(lambda x: x.year)
print(df['Year'].iloc[:2])

OUTPUT:
0    2004
1    2004
Name: Year, dtype: int64



INPUT:
df['YearMonth'] = df['DateTime'].dt.to_period('M')
print(df['YearMonth'].iloc[:2])

OUTPUT:
0    2004-11
1    2004-11
Name: YearMonth, dtype: period[M]

The project I am trying to reproduce has the same results and format.

My Plotting

plt.figure(figsize=(30,60))
#fig, axes = plt.subplots(1, 1, figsize=(30, 60), dpi=100)

gasList = ['CO_GT', 'C6H6_GT', 'Nox_GT', 'NO2_GT']
for i, col in enumerate(gasList, start=1):
    plt.subplot(len(gasList), 1, i)
    sns.pointplot(x='YearMonth', y=col, hue='Year', data=df)
    plt.title(col, y=0.5, loc='right')
    #axes.set_xticks(year_month_day)
plt.show()

[Image: my current plot]

Ideal plotting

I am trying to achieve the same result as this project:

[Image: plot from the reference project]

What I tried to solve the problem

OUTPUT of df.info():

<class 'pandas.core.frame.DataFrame'>
Int64Index: 9357 entries, 0 to 9356
Data columns (total 17 columns):
 #   Column        Non-Null Count  Dtype         
---  ------        --------------  -----         
 0   Date          9357 non-null   datetime64[ns]
 1   Time          9357 non-null   object        
 2   CO_GT         9357 non-null   float64       
 3   PT08_S1_CO    9357 non-null   float64       
 4   C6H6_GT       9357 non-null   float64       
 5   PT08_S2_NMHC  9357 non-null   float64       
 6   Nox_GT        9357 non-null   float64       
 7   PT08_S3_Nox   9357 non-null   float64       
 8   NO2_GT        9357 non-null   float64       
 9   PT08_S4_NO2   9357 non-null   float64       
 10  PT08_S5_O3    9357 non-null   float64       
 11  T             9357 non-null   float64       
 12  RH            9357 non-null   float64       
 13  AH            9357 non-null   float64       
 14  DateTime      9357 non-null   datetime64[ns]
 15  Year          9357 non-null   int64         
 16  YearMonth     9357 non-null   period[M]     
dtypes: datetime64[ns](2), float64(12), int64(1), object(1), period[M](1)
memory usage: 1.3+ MB

First attempt, passing the whole YearMonth column as order:

col_one_list = df['YearMonth'].tolist()
plt.figure(figsize=(30,60))

gasList = ['CO_GT', 'C6H6_GT', 'Nox_GT', 'NO2_GT']
for i, col in enumerate(gasList, start=1):
    plt.subplot(len(gasList), 1, i)
    sns.pointplot(x='YearMonth', y=col, hue='Year', data=df, order=col_one_list)
    plt.title(col, y=0.5, loc='right')

plt.show()

Second attempt, passing a hand-written list of month strings as order:

plt.figure(figsize=(30,60))

col_two_list = ['2004-03','2004-04', '2004-05', '2004-06', '2004-07', '2004-08', '2004-09', '2004-10', '2004-11','2004-12', '2005-01','2005-02','2005-03', '2005-04']

gasList = ['CO_GT', 'C6H6_GT', 'Nox_GT', 'NO2_GT']
for i, col in enumerate(gasList, start=1):
    plt.subplot(len(gasList), 1, i)
    sns.pointplot(x='YearMonth', y=col, hue='Year', data=df, order=col_two_list)
    plt.title(col, y=0.5, loc='right')

plt.show()

Upvotes: 3

Views: 2601

Answers (1)

Valdi_Bo

Reputation: 30971

Short answer

When you generate your pointplot, pass a DataFrame sorted by YearMonth, and the plot should come out just as you wish.

Without this sort, the picture is as you presented (wrong).
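
To see why the sort matters, here is a minimal sketch. It assumes (this is my reading of seaborn's behaviour, not something verified for every version) that a non-numeric x column is laid out in order of first appearance in the data. The small frame below is made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical mini-frame: months deliberately out of chronological order.
df = pd.DataFrame({
    "DateTime": pd.to_datetime(
        ["2004-11-01", "2005-01-01", "2004-03-01", "2004-11-15"]
    ),
    "CO_GT": [2.7, 2.0, 2.2, 2.5],
})
df["YearMonth"] = df["DateTime"].dt.to_period("M")

# Order of first appearance in the frame as loaded.
unsorted_order = [str(p) for p in df["YearMonth"].unique()]

# Order of first appearance after sorting chronologically.
sorted_order = [str(p) for p in df.sort_values("DateTime")["YearMonth"].unique()]

print(unsorted_order)
print(sorted_order)
```

Only the second list is chronological, which is the order the x axis should follow.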

Long answer

I prepared a test input file, for just 2 columns, as follows:

DateTime    CO_GT  C6H6_GT
2004-11-01  2.7    12.4
2004-12-01  2.6    10.6
2004-10-01  3.0    13.8
2005-01-01  2.0    9.0
2005-02-01  2.2    8.0
2004-03-01  2.2    10.0
2004-09-01  2.2    12.0
2005-03-01  2.0    8.6
2004-04-01  2.1    10.2
2004-05-01  1.95   10.5
2004-06-01  1.85   10.4
2004-07-01  1.7    10.5
2005-04-01  1.3    4.5
2004-08-01  1.4    6.8

Then I read it, converting the DateTime column to datetime type as early as possible, i.e. while reading:

df = pd.read_fwf('Input.csv', widths=[12, 7, 7], parse_dates=[0])

The first step is to create "auxiliary" columns:

df['Year'] = df.DateTime.dt.year
df['YearMonth'] = df.DateTime.dt.to_period('M')
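
If you would rather control the axis with an explicit list of month strings, one option is to keep a string version of the period column next to it, so that the list and the column hold the same kind of values. This is only a sketch: YearMonthStr is a hypothetical column name, the sample timestamps are made up, and whether seaborn's order parameter accepts Period objects directly is something I have not checked across versions.

```python
import pandas as pd

# Hypothetical sample timestamps, out of chronological order.
dates = pd.to_datetime(
    ["2004-11-23 19:00:00", "2005-01-05 08:00:00", "2004-03-10 18:00:00"]
)
df = pd.DataFrame({"DateTime": dates})
df["YearMonth"] = df["DateTime"].dt.to_period("M")

# Monthly periods render as 'YYYY-MM' strings.
df["YearMonthStr"] = df["YearMonth"].astype(str)

# 'YYYY-MM' strings sort lexicographically in chronological order.
month_order = sorted(df["YearMonthStr"].unique())
print(month_order)
```

A list such as month_order could then be tried as order=month_order together with x='YearMonthStr'.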

And to generate the picture, I ran:

gasList = ['CO_GT', 'C6H6_GT']
plt.figure(figsize=(14, 8))
for i, col in enumerate(gasList, start=1):
    plt.subplot(len(gasList), 1, i)    
    sns.pointplot(x='YearMonth', y=col, hue='Year', data=df.sort_values('DateTime'))
    plt.title(col, y=0.5, loc='right')
plt.show()

The result is:

[Image: resulting plot]
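
Since the question asks about monthly aggregates, another route is to compute them yourself with groupby, which returns the result indexed by YearMonth in sorted order. The sketch below uses made-up values and only two of the gas columns. It produces plain monthly means, without the confidence intervals that pointplot draws.

```python
import pandas as pd

# Hypothetical sample rows, with two readings in November 2004.
df = pd.DataFrame({
    "DateTime": pd.to_datetime(
        ["2004-11-01", "2004-11-20", "2004-03-05", "2005-01-10"]
    ),
    "CO_GT": [2.0, 3.0, 2.2, 2.0],
    "C6H6_GT": [12.0, 14.0, 10.0, 9.0],
})
df["YearMonth"] = df["DateTime"].dt.to_period("M")

# One row per month; groupby sorts the group keys by default.
monthly = df.groupby("YearMonth")[["CO_GT", "C6H6_GT"]].mean()
print(monthly)
```

The monthly frame can then be plotted column by column, with its index already in chronological order.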

Upvotes: 3
