Reputation: 1563
I have a dataframe called df1, loaded like this:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from matplotlib import style
import pandas as pd
%matplotlib inline
import seaborn as sns
sns.set_style('darkgrid')
import io
style.use('ggplot')
from datetime import datetime
import time
df1 = pd.read_csv('C:/Users/Demonstrator/Downloads/Listeequipement.csv', delimiter=';', parse_dates=[0])
df1.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 17 entries, 145 to 161
Data columns (total 6 columns):
TIMESTAMP 17 non-null datetime64[ns]
ACT_TIME_AERATEUR_1_F1 17 non-null float64
ACT_TIME_AERATEUR_1_F3 17 non-null float64
ACT_TIME_AERATEUR_1_F5 17 non-null float64
ACT_TIME_AERATEUR_1_F6 17 non-null float64
ACT_TIME_AERATEUR_1_F7 17 non-null float64
dtypes: datetime64[ns](1), float64(5)
memory usage: 952.0 bytes
# Build the heatmap
# TIMESTAMP is already datetime64 after read_csv; the format matches the data (YYYY-MM-DD HH:MM:SS)
df1['TIMESTAMP'] = pd.to_datetime(df1['TIMESTAMP'], format='%Y-%m-%d %H:%M:%S')
df1['date'] = df1['TIMESTAMP'].dt.date
df1['time'] = df1['TIMESTAMP'].dt.time
date_debut = pd.to_datetime('2015-08-01 23:10:00')
date_fin = pd.to_datetime('2015-08-02 02:00:00')
# Keep only the rows inside the time window
df1 = df1[(df1['TIMESTAMP'] >= date_debut) & (df1['TIMESTAMP'] < date_fin)]
# Columns 1 to 5 hold the five ACT_TIME_AERATEUR measurements
ax = sns.heatmap(df1.iloc[:, 1:6], annot=True, linewidths=.5)
ax.set_yticklabels([i.strftime("%Y-%m-%d %H:%M:%S") for i in df1.TIMESTAMP], rotation=0)
The data looks like this:
TIMESTAMP;ACT_TIME_AERATEUR_1_F1;ACT_TIME_AERATEUR_1_F3;ACT_TIME_AERATEUR_1_F5;ACT_TIME_AERATEUR_1_F6;ACT_TIME_AERATEUR_1_F7
2015-07-31 23:00:00;90;90;90;90;90
2015-07-31 23:10:00;0;0;0;0;0
2015-07-31 23:20:00;0;0;0;0;0
2015-07-31 23:30:00;0;0;0;0;0
2015-07-31 23:40:00;0;0;0;0;0
I am trying to resample it so that, for every 30-minute interval of TIMESTAMP, I get the mean of each of the columns ACT_TIME_AERATEUR_1_F1, ACT_TIME_AERATEUR_1_F3, ACT_TIME_AERATEUR_1_F5, ACT_TIME_AERATEUR_1_F6 and ACT_TIME_AERATEUR_1_F7.
This is what I tried:
df1.index = pd.to_datetime(df1.index)
print(df1.resample('30min').mean())
But I get a strange result:
ACT_TIME_AERATEUR_1_F1 ACT_TIME_AERATEUR_1_F3 \
1970-01-01 40.588235 40.588235
ACT_TIME_AERATEUR_1_F5 ACT_TIME_AERATEUR_1_F6 \
1970-01-01 40.588235 40.588235
ACT_TIME_AERATEUR_1_F7
1970-01-01 40.588235
There is no 1970-01-01 date anywhere in my data.
Any idea where the 1970 date comes from, and how to fix it?
Upvotes: 2
Views: 582
Reputation: 29711
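Your dataframe still has its default integer index, and that is what pd.to_datetime(df1.index) converts. Integers are interpreted as nanoseconds since the Unix epoch, so index values 145 to 161 all become timestamps a fraction of a microsecond after 1970-01-01 00:00:00. Every row then falls into the same 30-minute bin, which is why you get a single 1970-01-01 row. You need to set TIMESTAMP as the index instead: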
In [2]: df1 = df1.set_index('TIMESTAMP')
In [3]: df1.resample('30min').mean()
Out[3]:
ACT_TIME_AERATEUR_1_F1 ACT_TIME_AERATEUR_1_F3 \
TIMESTAMP
2015-07-31 23:00:00 30 30
2015-07-31 23:30:00 0 0
ACT_TIME_AERATEUR_1_F5 ACT_TIME_AERATEUR_1_F6 \
TIMESTAMP
2015-07-31 23:00:00 30 30
2015-07-31 23:30:00 0 0
ACT_TIME_AERATEUR_1_F7
TIMESTAMP
2015-07-31 23:00:00 30
2015-07-31 23:30:00 0
Upvotes: 2