MDS
MDS

Reputation: 53

Removing decimal seconds from time format data in DataFrame

I've got two large df's from two xlsx spreadsheets and would like to merge them 'on' time stamp ['Time'] data.

The problem is that one data set has recorded time to decimal seconds and the other to whole seconds. This means that the two data set ['Time'] keys never match up...

df1     Time                                 df2     Time
0       00:07:53.755000                      0       00:07:53
1       00:07:54.096000                      1       00:07:54
2       00:07:55.097000                      2       00:07:55
3       00:07:56.099000                      3       00:07:56
4       00:07:57.002000                      4       00:07:57
5       00:07:58.012000                      5       00:07:58

I've tried modifying time formats in Excel but it always retains the millisecond value because of its fractional 24 hour thing. I need to remove the decimal seconds from df1 (or just get them to match somehow!) to allow matching with row data from df2 and am hoping there is a much simpler way to do this in python?

Thanks to any and all advice!

Upvotes: 4

Views: 12523

Answers (3)

jpp
jpp

Reputation: 164773

In df1, you can just set microseconds to 0:

df1['Time'] = pd.to_datetime(df1['Time']).apply(lambda x: x.replace(microsecond=0))

Then perform your merge as normal.

Upvotes: 8

noslenkwah
noslenkwah

Reputation: 1744

I would use pandas strftime

df1['Time'] = pd.to_datetime(df1['Time']).dt.strftime('%H:%M:%S')
df2['Time'] = pd.to_datetime(df2['Time']).dt.strftime('%H:%M:%S')

merged = pd.merge(df1, df2, on='Time')

Upvotes: 3

YOLO
YOLO

Reputation: 21739

This is one way to do this.

# convert time to str and remove millisecond
df1['Time'] = df1['Time'].astype(str)
df1['Time'] = df1['Time'].str.replace('\..*','')

# Just to be sure there are no error because of timeformat
df2['Time'] = df2['Time'].astype(str)

# now we can join
df3 = pd.merge(df1, df2, on='Time')

Upvotes: 2

Related Questions