Kushagra
Kushagra

Reputation: 61

Extract Datetime information from a string in a DataFrame column

So I have the Edition Column which contains data in unevenly pattern, as some have ',' followed by the date and some have ',-' pattern.

df.head()

17                Paperback,– 1 Nov 2016
18    Mass Market Paperback,– 1 Jan 1991
19                      Paperback,– 2016
20               Hardcover,– 24 Nov 2018
21        Paperback,– Import, 4 Oct 2018

How can I extract the date to a separate column. I tried using str.split() but can't find specific pattern to extract.Is there any method I could do it?

Upvotes: 1

Views: 86

Answers (2)

Suhas Mucherla
Suhas Mucherla

Reputation: 1413

Try using dateutil

from dateutil.parser import parse
 
df['Dt']=[parse(i, fuzzy_with_tokens=True)[0] for i in df['column']]

Upvotes: 2

Ferris
Ferris

Reputation: 5601

obj = df['Edition']
obj.str.split('((?:\d+\s+\w+\s+)?\d{4}$)', expand=True)

or

obj.str.split('[,–]+').str[0]
obj.str.split('[,–]+').str[-1] # date

Upvotes: 3

Related Questions