Reputation: 212
I have a data frame that contains multiple records over time, specifically one every 4 minutes. I want to plot the time series so that each day shows its multiple temperature values. However, the plot draws every value individually rather than grouping them by day, as I want.
import pandas as pd
df = pd.read_csv("my_file.csv")
print(df.head())
Output:
                     Temperature
Date/Time
2015-07-01 00:00:47        25.21
2015-07-01 00:01:48        25.23
2015-07-01 00:02:48        25.33
2015-07-01 00:03:47        25.22
2015-07-01 00:04:48        25.32
When I plot with seaborn I get this:
import matplotlib.pyplot as plt
import seaborn as sns
df = df.reset_index()
sns.relplot(x="Date/Time", y="Temperature", data=df, kind="line")
plt.show()
This is not what I want to plot; I want something like this example:
I believe that I have to resample the data, but resampling gives me the average of each period, so I end up with one single value instead of multiple values per day.
df = df.resample("H").mean()
print(df.head())
Output:
                     Temperature
Date/Time
2015-07-01 00:00:00    25.264167
2015-07-01 01:00:00    25.267167
2015-07-01 02:00:00    25.272000
2015-07-01 03:00:00    25.290167
2015-07-01 04:00:00    25.307333
Not what I need. Can you help me?
Upvotes: 0
Views: 967
Reputation: 40697
There must be a better way to bin the timestamps, but I'm drawing a blank right now.
Here is one way to do it: create a new column in which you drop part of the date/time information, so that all rows that fall within the same timeframe share the same value.
For example, if you want to bin by hour:
df['Binned time'] = pd.to_datetime(df.index.strftime('%Y-%m-%d %H:00:00'))
Or by day:
df['Binned time'] = pd.to_datetime(df.index.strftime('%Y-%m-%d 00:00:00'))
Then use lineplot, which aggregates the rows that share the same x value:
sns.lineplot(data=df, x='Binned time', y='Temperature')
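As a possible alternative to the strftime round trip, here is a minimal sketch that assumes the index is already a DatetimeIndex and truncates it directly with `DatetimeIndex.floor`. The data is synthetic (it is not the asker's file), and `plot_binned` is a hypothetical helper name used only for illustration. With seaborn's defaults, `lineplot` draws the mean of each bin and a confidence band around it.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the question's data (an assumption):
# one reading per minute over three days, indexed by "Date/Time".
idx = pd.date_range("2015-07-01", periods=3 * 24 * 60, freq="min", name="Date/Time")
rng = np.random.default_rng(0)
df = pd.DataFrame({"Temperature": 25 + rng.normal(0, 0.1, len(idx))}, index=idx)

# Floor every timestamp to midnight so all rows of a day share one x value.
binned = df.copy()
binned["Binned time"] = binned.index.floor("D")


def plot_binned(frame):
    """Plot the per-bin mean with seaborn's default confidence band."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    sns.lineplot(data=frame, x="Binned time", y="Temperature")
    plt.show()
```

Calling `plot_binned(binned)` should give one point per day with a shaded band showing the spread of that day's readings. Passing a different frequency string to `floor` changes the bin width in the same way the strftime pattern does above.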
Upvotes: 1