William

Reputation: 81

Extracting date from string and saving in new pandas DataFrame columns

Background:

I have a pandas DataFrame containing a tweet column and a weather column. The columns currently look like this:

[screenshot of the DataFrame columns; image not available]

Objective:
I am trying to extract the datestamp from the weather column (e.g., the datestamp for row index 0 is '(2020-07-14)') and save it in a new date column so that I can filter on it, for example to keep only the latest date.

I know how to convert a string column to a datestamp when the value looks something like '20140512'. However, I have no idea how to identify a datestamp in the current format and extract it into a new column.

Any advice would be greatly appreciated.

Upvotes: 3

Views: 1226

Answers (1)

Derek Eden

Reputation: 4618

You could do something like this, assuming the date is in the weather column and always has the same formatting (note the raw string, so the backslashes reach the regex engine unchanged):

df['date'] = pd.to_datetime(df['weather'].str.extract(r'\((\d{4}-\d{2}-\d{2})\)')[0])

or

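import re
# Note: re.search returns None when a row has no date, so .group(1) raises AttributeError for such rows.
df['date'] = pd.to_datetime(df['weather'].apply(lambda x: re.search(r'\((\d{4}-\d{2}-\d{2})\)', x).group(1)))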
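To tie this back to the stated goal of filtering to the latest date, here is a minimal, self-contained sketch. The sample rows and the `latest` variable are hypothetical, made up for illustration since the original data was only shown as a screenshot; it also assumes the date always appears as `(YYYY-MM-DD)`.

```python
import pandas as pd

# Hypothetical sample data; the real weather strings may look different.
df = pd.DataFrame({
    "tweet": ["first", "second", "third"],
    "weather": [
        "Sunny, 24C (2020-07-14)",
        "Rain, 18C (2020-07-15)",
        "no date here",
    ],
})

# Raw string so that \( and \d are passed to the regex engine unchanged.
pattern = r"\((\d{4}-\d{2}-\d{2})\)"

# expand=False returns a Series for a single capture group.
# Rows without a match become NaN, which to_datetime turns into NaT.
df["date"] = pd.to_datetime(df["weather"].str.extract(pattern, expand=False))

# Keep only the rows carrying the most recent date (max() skips NaT).
latest = df[df["date"] == df["date"].max()]
print(latest)
```

Unlike the `re.search(...).group(1)` variant, `str.extract` does not raise on rows that lack a date, which may be preferable if the formatting is not perfectly consistent.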

Upvotes: 1
