geeb.24

Reputation: 557

Pandas Delete Part of String

>>> df
                       Time
    5/10/2017 (135) 01:05:03
    5/11/2017 (136) 04:05:06

Given an input date such as this in a DataFrame, how would I delete the Julian Date, (135) and (136), and remove the whitespace in the middle, so that the output looks like:

>>> df
                       Time
    5/10/2017 01:05:03
    5/11/2017 04:05:06

I've tried:

df['Time'].replace('(135)','', regex=True, inplace=True)

which outputs:

>>> df
                    Time
0  5/10/2017 () 01:05:03

I was wondering what I'm doing wrong here.

Upvotes: 0

Views: 2251

Answers (1)

jezrael

Reputation: 862611

You can use str.replace with a regular expression.

Parentheses are special characters in a regex, so they must be escaped with \. Then \d+ matches the digits between them, and \s* matches zero or more whitespace characters after the closing ).


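# regex=True is required in recent pandas, where str.replace defaults to literal matching
df['Time'] = df['Time'].str.replace(r"\(\d+\)\s*", '', regex=True)
print(df)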
                 Time
0  5/10/2017 01:05:03
1  5/11/2017 04:05:06
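
For reference, here is a self-contained sketch of the same replacement. It assumes a recent pandas, where Series.str.replace needs regex=True to treat the pattern as a regular expression. The frame is rebuilt from the question's sample data, and the name cleaned is only illustrative.

```python
import pandas as pd

# Rebuild the sample data from the question.
df = pd.DataFrame({"Time": ["5/10/2017 (135) 01:05:03",
                            "5/11/2017 (136) 04:05:06"]})

# \( and \) match literal parentheses, \d+ matches the digits between them,
# and \s* matches the whitespace that follows the closing parenthesis.
cleaned = df["Time"].str.replace(r"\(\d+\)\s*", "", regex=True)
print(cleaned.tolist())
```

The space before the opening parenthesis is left in place, so exactly one space remains between the date and the time.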

If you also need to convert the column to datetime:

df['Time'] = pd.to_datetime(df['Time'].str.replace(r"\(\d+\)\s*", '', regex=True))
print(df)
                 Time
0 2017-05-10 01:05:03
1 2017-05-11 04:05:06
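
If the cleaned strings always follow the month/day/year layout shown in the sample (an assumption based on the two rows in the question), passing an explicit format avoids any ambiguity about which number is the month. A minimal sketch, with s and parsed as illustrative names:

```python
import pandas as pd

# Already-cleaned strings, as produced by the replacement step.
s = pd.Series(["5/10/2017 01:05:03", "5/11/2017 04:05:06"])

# Assumed layout: month/day/year followed by a 24-hour time.
parsed = pd.to_datetime(s, format="%m/%d/%Y %H:%M:%S")
print(parsed)
```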

EDIT:

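Your attempt is missing the escaping backslashes, so (135) is treated as a capture group that matches only 135, which is why the empty parentheses are left behind. You can also use [0-9]+ instead of \d+:

# assign the result back instead of using inplace=True on a selected column
df['Time'] = df['Time'].replace(r'\([0-9]+\)\s*', '', regex=True)
print(df)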
                 Time
0  5/10/2017 01:05:03
1  5/11/2017 04:05:06
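
To see why the unescaped pattern leaves () behind, the two patterns can be compared on a single string with Python's re module. This is only an illustration of the regex behaviour, not of the pandas call itself. The variable names are made up for the example.

```python
import re

text = "5/10/2017 (135) 01:05:03"

# Unescaped parentheses form a capture group, so this pattern matches only "135"
# and the literal "(" and ")" stay in the string.
unescaped = re.sub("(135)", "", text)

# Escaped parentheses match the literal characters; \s* removes the trailing space.
escaped = re.sub(r"\(\d+\)\s*", "", text)

print(repr(unescaped))
print(repr(escaped))
```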

Upvotes: 3
