user2100039

Reputation: 1366

Pandas Shift Date Time Columns Back One Hour

I have a DataFrame (df1) that starts and ends as shown below. I'm trying to shift the date and time in columns "0" and "1" back one hour, so that the timestamps start at hour == 0 rather than hour == 1.

data starts (df1) -

          0    1         2         3         4    5         6       7  
0  20160101  100  7.977169  109404.0  20160101  100  4.028678   814.0   
1  20160101  200  8.420204  128546.0  20160101  200  4.673662  2152.0   
2  20160101  300  9.515370  165931.0  20160101  300  8.019863  8100.0   

data ends (df1) -

             0     1         2        3         4     5         6      7  
8780  20161231  2100  4.198906  11371.0  20161231  2100  0.995571  131.0   
8781  20161231  2200  4.787433  19083.0  20161231  2200  1.029809    NaN   
8782  20161231  2300  3.987506   9354.0  20161231  2300  0.900942    NaN   
8783  20170101     0  3.284947   1815.0  20170101     0  0.899262    NaN   

I need the date and time shifted back one hour, so that each timestamp marks the beginning of the hour rather than the end -

          0    1         2         3         4    5         6       7  
0  20160101  000  7.977169  109404.0  20160101  100  4.028678   814.0   
1  20160101  100  8.420204  128546.0  20160101  200  4.673662  2152.0   
2  20160101  200  9.515370  165931.0  20160101  300  8.019863  8100.0    

and ends like this with the date and time below -

             0     1         2        3         4     5         6      7  
8780  20161231  2000  4.198906  11371.0  20161231  2100  0.995571  131.0   
8781  20161231  2100  4.787433  19083.0  20161231  2200  1.029809    NaN   
8782  20161231  2200  3.987506   9354.0  20161231  2300  0.900942    NaN   
8783  20161231  2300  3.284947   1815.0  20170101     0  0.899262    NaN  

I have no real idea how to accomplish this or how to research it. Thank you.

Upvotes: 5

Views: 4028

Answers (3)

Umar.H

Reputation: 23099

It would be better to build a proper datetime object and then subtract one hour from it, which also handles any rollover into the previous day. We can then use dt.strftime to re-create your string (object) columns.

s = pd.to_datetime(
    df[0].astype(str) + df[1].astype(str).str.zfill(4), format="%Y%m%d%H%M"
)

0      2016-01-01 01:00:00
1      2016-01-01 02:00:00
2      2016-01-01 03:00:00
8780   2016-12-31 21:00:00
8781   2016-12-31 22:00:00
8782   2016-12-31 23:00:00
8783   2017-01-01 00:00:00
dtype: datetime64[ns]

df[1] = (s - pd.DateOffset(hours=1)).dt.strftime("%H%M").str.lstrip("0").str.zfill(3)
df[0] = (s - pd.DateOffset(hours=1)).dt.strftime("%Y%m%d")

print(df)

             0     1         2         3         4     5         6       7
0     20160101   000  7.977169  109404.0  20160101   100  4.028678   814.0
1     20160101   100  8.420204  128546.0  20160101   200  4.673662  2152.0
2     20160101   200  9.515370  165931.0  20160101   300  8.019863  8100.0
8780  20161231  2000  4.198906   11371.0  20161231  2100  0.995571   131.0
8781  20161231  2100  4.787433   19083.0  20161231  2200  1.029809     NaN
8782  20161231  2200  3.987506    9354.0  20161231  2300  0.900942     NaN
8783  20161231  2300  3.284947    1815.0  20170101     0  0.899262     NaN
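
The datetime approach can also be wrapped in a small helper. This is only a sketch: the function name shift_back_one_hour and the four-row sample frame are made up for illustration, and it keeps the time as a four-character HHMM string instead of stripping the leading zero.

```python
import pandas as pd


def shift_back_one_hour(df, date_col, time_col):
    """Return a copy of df with the date/time pair moved back one hour.

    Hypothetical helper: assumes date_col holds YYYYMMDD values and
    time_col holds HMM/HHMM values (as ints or strings).
    """
    out = df.copy()
    # Build a timestamp from the date and the zero-padded HHMM time
    stamp = pd.to_datetime(
        out[date_col].astype(str) + out[time_col].astype(str).str.zfill(4),
        format="%Y%m%d%H%M",
    )
    # Subtracting a Timedelta rolls the day, month and year back when needed
    shifted = stamp - pd.Timedelta(hours=1)
    out[date_col] = shifted.dt.strftime("%Y%m%d")
    out[time_col] = shifted.dt.strftime("%H%M")
    return out


# Made-up sample mirroring the first and last rows of the question
sample = pd.DataFrame({
    0: [20160101, 20160101, 20161231, 20170101],
    1: [100, 200, 2300, 0],
})
result = shift_back_one_hour(sample, 0, 1)
print(result)
```

The last row is the interesting one: 20170101 at hour 0 becomes 20161231 at 2300, which plain integer subtraction on the time column would not give you.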

Upvotes: 4

Shubham Sharma

Reputation: 71687

Use DataFrame.shift to shift columns 0 and 1 down one row, then use Series.bfill on column 0 of df2 and .fillna on column 1 of df2 to fill the missing values left in the first row. Finally, use DataFrame.join to join df2 with the remaining columns of df1:

df2 = df1[['0', '1']].shift()
df2['0'] = df2['0'].bfill()
df2['1'] = df2['1'].fillna('000')
df2 = df2.join(df1.loc[:, '2':])

# print(df2)
             0     1         2         3         4     5         6       7
0     20160101   000  7.977169  109404.0  20160101   100  4.028678   814.0
1     20160101   100  8.420204  128546.0  20160101   200  4.673662  2152.0
2     20160101   200  9.515370  165931.0  20160101   300  8.019863  8100.0
...
8780  20161231  2000  4.198906   11371.0  20161231  2100  0.995571   131.0
8781  20161231  2100  4.787433   19083.0  20161231  2200  1.029809     NaN
8782  20161231  2200  3.987506    9354.0  20161231  2300  0.900942     NaN
8783  20161231  2300  3.284947    1815.0  20170101     0  0.899262     NaN
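
One caveat worth checking before relying on shift: it gives the right answer only if every row is exactly one hour after the previous one. The sketch below is one possible way to test that assumption; the sample frame is made up and deliberately skips the 400 row.

```python
import pandas as pd

# Made-up sample with string columns, as in the answer above;
# the 400 row is missing on purpose
df1 = pd.DataFrame({
    "0": ["20160101", "20160101", "20160101", "20160101"],
    "1": ["100", "200", "300", "500"],
})

# Combine date and zero-padded time into real timestamps
stamp = pd.to_datetime(df1["0"] + df1["1"].str.zfill(4), format="%Y%m%d%H%M")

# Gaps between consecutive rows (the first row has no predecessor)
gaps = stamp.diff().dropna()

# True only when every gap is exactly one hour
is_hourly = bool((gaps == pd.Timedelta(hours=1)).all())
print(is_hourly)  # False, because of the jump from 300 to 500
```

If the check fails, shifting rows would label the 500 row as 300, whereas subtracting one hour from a real timestamp would label it 400.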

Upvotes: 2

Vijeth Rai

Reputation: 320

You can do the subtraction directly in pandas (assuming the values in your DataFrame are numeric, not strings).

Here is an example of how it can be done:

import pandas as pd 

df = pd.DataFrame()

df['time'] = [0,100,500,2100,2300,0]    #creating dataframe 

df['time'] = df['time']-100             #This is what you want to do

Now every value in the column has had 100 subtracted from it.

There is one special case: when the original time is 0, the subtraction gives -100. You can correct that like this:

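# Replace every -100 (originally midnight) with 2300 in one vectorized step;
# df.loc avoids the chained assignment that df['time'].iloc[i] = ... relies on
df.loc[df['time'] == -100, 'time'] = 2300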
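
Note that a row that was at hour 0 also belongs to the previous day once it is moved back, so the date column needs adjusting as well. A minimal sketch of that, assuming integer YYYYMMDD dates and whole-hour HHMM times (the column names date and time and the sample values are made up):

```python
import pandas as pd

# Made-up sample: integer dates (YYYYMMDD) and whole-hour times (HHMM)
df = pd.DataFrame({
    "date": [20160101, 20161231, 20170101],
    "time": [100, 2300, 0],
})

# Remember which rows were at midnight before changing anything
was_midnight = df["time"] == 0

# Subtract one hour; rows that were at midnight wrap around to 2300
df["time"] = (df["time"] - 100).where(~was_midnight, 2300)

# For those same rows, step the date back one calendar day
prev_day = pd.to_datetime(df["date"].astype(str), format="%Y%m%d") - pd.Timedelta(days=1)
df["date"] = df["date"].where(~was_midnight, prev_day.dt.strftime("%Y%m%d").astype(int))

print(df)
```

The date still goes through pd.to_datetime here, because stepping back a day across a month or year boundary is not something integer arithmetic on YYYYMMDD handles.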

Upvotes: -1
