Reputation: 413
I have a CSV file with a datetime column. I use pandas.read_csv(file, index_col="Date", parse_dates=True) to read it. The datetime column has a 30-minute frequency, so the first time of a given date is 00:30:00, but the last time is not what I want:
As you can see, time 00:00:00 at the end of a given date (here 2015-12-01) is interpreted as belonging to the next day.
I couldn't find a way to resolve this. In this example, I want:
2015-12-02 00:00:00
to be interpreted as 2015-12-01 24:00:00
or something that refers to the correct date.
Does anyone know how to do this in pandas?
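For reference, here is a minimal sketch that reproduces the setup. The inline CSV text and the `Value` column are assumptions standing in for the real file; only the `Date` column name and the `read_csv` arguments come from the description above.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file (30-minute resolution).
csv_text = """Date,Value
2015-12-01 23:30:00,1.0
2015-12-02 00:00:00,2.0
2015-12-02 00:30:00,3.0
"""

df = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

# The midnight reading closes 2015-12-01, but its calendar date is 2015-12-02.
dates = df.index.normalize()
print(dates)
```

The second row is the one in question: it ends the 2015-12-01 interval, yet its date comes out as 2015-12-02.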
Edit: So what I want is that when I get the date for the time 00:00:00, it gives me the previous day's date (so it treats the time as 23:59:59). I want this:
2015-12-01 23:00:00 Tuesday 2015-12-01
2015-12-01 23:30:00 Tuesday 2015-12-01
2015-12-02 00:00:00 Wednesday 2015-12-02
2015-12-02 00:30:00 Wednesday 2015-12-02
to become this:
2015-12-01 23:00:00 Tuesday 2015-12-01
2015-12-01 23:30:00 Tuesday 2015-12-01
2015-12-01 23:59:59 Tuesday 2015-12-01
2015-12-02 00:30:00 Wednesday 2015-12-02
Upvotes: 1
Views: 1806
Reputation: 120439
(A bit late, but) you can use dt.normalize to find the values to modify, subtract one second from them, and then update the other columns from the DateTime column.
Input data:
>>> df
DateTime DayOfWeek Date
0 2015-12-01 23:00:00 Tuesday 2015-12-01
1 2015-12-01 23:30:00 Tuesday 2015-12-01
2 2015-12-02 00:00:00 Wednesday 2015-12-02
3 2015-12-02 00:30:00 Wednesday 2015-12-02
>>> df.dtypes
DateTime datetime64[ns]
DayOfWeek object
Date datetime64[ns]
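import pandas as pd

# Select rows whose DateTime is exactly midnight and move them back one second
new = df.loc[df['DateTime'].eq(df['DateTime'].dt.normalize()), ['DateTime']] \
        .sub(pd.DateOffset(seconds=1))
# Recompute the dependent columns from the shifted timestamps
new = new.assign(DayOfWeek=new['DateTime'].dt.day_name(),
                 Date=new['DateTime'].dt.normalize())
df.update(new)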
Output result:
>>> df
DateTime DayOfWeek Date
0 2015-12-01 23:00:00 Tuesday 2015-12-01
1 2015-12-01 23:30:00 Tuesday 2015-12-01
2 2015-12-01 23:59:59 Tuesday 2015-12-01
3 2015-12-02 00:30:00 Wednesday 2015-12-02
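Since the question reads the file with index_col="Date", the timestamps may live in the index rather than in a column. In that case a possible variant (a sketch, not part of the answer above) is to leave the timestamps untouched and only derive the date columns from a shifted copy of the index. The `Value` column and the sample range here are made up for illustration.

```python
import pandas as pd

# Hypothetical frame with a 30-minute DatetimeIndex, as in the question.
idx = pd.date_range("2015-12-01 23:00", periods=4, freq="30min")
df = pd.DataFrame({"Value": range(4)}, index=idx)

# Shift by one second so exact midnight falls into the previous day,
# then drop the time part. The index itself is not modified.
logical_date = (df.index - pd.Timedelta(seconds=1)).normalize()

df["Date"] = logical_date
df["DayOfWeek"] = logical_date.day_name()
print(df)
```

This keeps 2015-12-02 00:00:00 as the timestamp but labels it Tuesday 2015-12-01, which avoids introducing 23:59:59 values into a regular 30-minute series.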
Upvotes: 1
Reputation: 261015
It is actually quite simple if you can use strings: where the time is '00:00:00', subtract one day, convert to a string, and replace '00:00:00' with '24:00:00'.
import datetime
import pandas as pd

s = pd.Series(['2015-12-01 23:00:00', '2015-12-01 00:00:00'])
s = pd.to_datetime(s)
# Keep non-midnight values; render midnight as 24:00:00 of the previous day
s.where(s.dt.time != datetime.time(0),
        ((s - pd.to_timedelta('1day'))
         .dt.strftime('%Y-%m-%d %H:%M:%S')
         .str.replace('00:00:00', '24:00:00')
        )
       )
Output:
0 2015-12-01 23:00:00
1 2015-11-30 24:00:00
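One caveat with the string form: as far as I know, pd.to_datetime will not parse an hour of 24, so the values need converting back before any further datetime work. A minimal sketch of such a round trip follows; `parse_24h` is a hypothetical helper name, and it assumes the exact '%Y-%m-%d %H:%M:%S' layout used above.

```python
import pandas as pd


def parse_24h(strings: pd.Series) -> pd.Series:
    """Parse 'YYYY-MM-DD HH:MM:SS' strings, treating 24:00:00 as end of day."""
    strings = strings.astype(str)
    is_24 = strings.str.endswith(" 24:00:00")
    # Swap the unparseable hour for 00:00:00 on the same date.
    cleaned = strings.str.replace(" 24:00:00", " 00:00:00", regex=False)
    parsed = pd.to_datetime(cleaned, format="%Y-%m-%d %H:%M:%S")
    # Rows that said 24:00:00 are really the start of the following day.
    return parsed + pd.to_timedelta(is_24.astype(int), unit="D")


restored = parse_24h(pd.Series(["2015-12-01 23:00:00", "2015-11-30 24:00:00"]))
print(restored)
```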
Or, for your edit:
df['col1'] = pd.to_datetime(df['col1'])
df['col1'] = df['col1'].where(df['col1'].dt.time != datetime.time(0),
                              df['col1'] - pd.to_timedelta('1s'))
df['col2'] = df['col1'].dt.day_name()
df['col3'] = df['col1'].dt.date
Output:
col1 col2 col3
0 2015-12-01 23:00:00 Tuesday 2015-12-01
1 2015-12-01 23:30:00 Tuesday 2015-12-01
2 2015-12-01 23:59:59 Tuesday 2015-12-01
3 2015-12-02 00:30:00 Wednesday 2015-12-02
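A small note on the last column: .dt.date gives plain Python date objects (object dtype), while .dt.normalize() keeps a datetime64 column, which is usually more convenient for grouping or resampling. A quick sketch of the difference, using a made-up two-row series and a midnight test based on normalize() instead of datetime.time(0):

```python
import pandas as pd

col1 = pd.Series(pd.to_datetime(["2015-12-01 23:30:00", "2015-12-02 00:00:00"]))

# Pull exact-midnight stamps back by one second, as in the snippet above.
is_midnight = col1 == col1.dt.normalize()
col1 = col1.where(~is_midnight, col1 - pd.Timedelta(seconds=1))

as_objects = col1.dt.date            # Python datetime.date values, object dtype
as_datetimes = col1.dt.normalize()   # midnight timestamps, datetime64 dtype

print(as_objects.dtype, as_datetimes.dtype)
```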
Upvotes: 2