Reputation: 485
I have datasets
that record 'X' value for the 20 years in Dataframe format. X records the data with 3 hrs average and sample of data are given below.
Time_stamp X
1992-01-01 03:00:00 10.2
1992-01-01 06:00:00 10.4
1992-01-01 09:00:00 11.8
1992-01-01 12:00:00 12.0
1992-01-01 15:00:00 10.4
1992-01-01 18:00:00 9.4
1992-01-01 21:00:00 10.4
1992-01-02 00:00:00 13.6
1992-01-02 03:00:00 13.2
1992-01-02 06:00:00 11.8
1992-01-02 09:00:00 12.0
1992-01-02 12:00:00 12.8
1992-01-02 15:00:00 12.6
1992-01-02 18:00:00 11.0
1992-01-02 21:00:00 12.2
1992-01-03 00:00:00 13.8
1992-01-03 03:00:00 14.0
1992-01-03 06:00:00 13.4
1992-01-03 09:00:00 14.2
1992-01-03 12:00:00 16.2
1992-01-03 15:00:00 13.2
1992-01-03 18:00:00 13.4
1992-01-03 21:00:00 13.8
1992-01-04 00:00:00 14.8
1992-01-04 03:00:00 13.8
1992-01-04 06:00:00 7.6
1992-01-04 09:00:00 5.8
1992-01-04 12:00:00 4.4
1992-01-04 15:00:00 5.6
1992-01-04 18:00:00 6.0
1992-01-04 21:00:00 7.0
1992-01-05 00:00:00 6.8
1992-01-05 03:00:00 3.4
1992-01-05 06:00:00 5.8
1992-01-05 09:00:00 10.6
1992-01-05 12:00:00 9.2
1992-01-05 15:00:00 10.6
1992-01-05 18:00:00 9.8
1992-01-05 21:00:00 11.2
1992-01-06 00:00:00 12.0
1992-01-06 03:00:00 10.2
1992-01-06 06:00:00 9.0
1992-01-06 09:00:00 9.0
1992-01-06 12:00:00 8.6
1992-01-06 15:00:00 8.4
1992-01-06 18:00:00 8.2
1992-01-06 21:00:00 8.8
1992-01-07 00:00:00 10.0
1992-01-07 03:00:00 9.6
1992-01-07 06:00:00 8.0
1992-01-07 09:00:00 9.6
1992-01-07 12:00:00 10.8
1992-01-07 15:00:00 10.2
1992-01-07 18:00:00 9.8
1992-01-07 21:00:00 10.2
1992-01-08 00:00:00 9.4
1992-01-08 03:00:00 11.4
1992-01-08 06:00:00 12.6
1992-01-08 09:00:00 12.8
1992-01-08 12:00:00 10.4
1992-01-08 15:00:00 11.2
1992-01-08 18:00:00 9.0
1992-01-08 21:00:00 10.2
1992-01-09 00:00:00 8.2
I would like to create seperate dataframe that calcute and records yearly mean, weekly mean and daily mean of the given datasets. I am new to python and just started working with time series data. I found some question related to this here at stackoverflow but did not find appropriate answer related to this aand did not find any idea how to start with. Any help on this ? I wrote this code so far,
import pandas as pd
import numpy as np
datasets['date_minus_time'] = df["Time_stamp"].apply( lambda df :
datetime.datetime(year=datasets.year, month=datasets.month,
day=datasets.day))
datasets.set_index(df["date_minus_time"],inplace=True)
df['count'].resample('D', how='sum')
df['count'].resample('W', how='sum')
df['count'].resample('M', how='sum')
But not getting a clue how to include that data records every 3 hrs. and what should to next for my desired result.
Upvotes: 0
Views: 52
Reputation: 862591
Use to_datetime
for datetimes in column for improve performance and then DataFrame.resample
with parameter on
for specify datetime column:
df['Time_stamp'] = pd.to_datetime(df['Time_stamp'])
df_daily = df.resample('D', on='Time_stamp').mean()
df_monthly = df.resample('M', on='Time_stamp').mean()
df_weekly = df.resample('W', on='Time_stamp').mean()
Upvotes: 1
Reputation: 4215
You may use:
df['Time_stamp'] = pd.to_datetime(df['Time_stamp'], format='%Y-%m-%d %H:%M:%S')
df.set_index('Time_stamp',inplace=True)
df_monthly = df.resample('M').mean()
df_monthly
outputs:
X
Time_stamp
1992-01-31 10.403125
For daily mean use: df_daily = df.resample('D').mean()
which outputs:
X
Time_stamp
1992-01-01 10.657143
1992-01-02 12.400000
1992-01-03 14.000000
1992-01-04 8.125000
1992-01-05 8.425000
1992-01-06 9.275000
1992-01-07 9.775000
1992-01-08 10.875000
1992-01-09 8.200000
Upvotes: 0