Reputation: 41
I am trying to take a pandas DataFrame, parse a column of date strings, and add a new column holding each date in a simple mm/dd/yyyy format.
Here are the libraries and the data:
import pandas as pd
import datetime
from dateutil.parser import parse
df = pd.DataFrame([['row1', 'Tue Jun 16 19:05:44 UTC 2020', 'record1'], ['row2', 'Tue Jun 16 17:10:02 UTC 2020', 'record2'], ['row3', 'Fri Jun 12 17:52:37 UTC 2020', 'record3']], columns=["row", "checkin", "record"])
Piecing together bits from other answers here, I wrote this line to parse the dates and add the new column:
df['NewDate'] = df.apply(lambda row: datetime.date.strftime(parse(df['checkin']), "%m/%d/%Y"), axis = 1)
But when I run it I get the error below. Can anyone suggest a fix or an easier way to do this? It seems like it should be simpler and more Pythonic than what I have.
TypeError: ('Parser must be a string or character stream, not Series', 'occurred at index 0')
Thanks for any help you can offer.
Upvotes: 2
Views: 2236
Reputation: 650
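Inside the lambda, df['checkin'] is still the whole column (a Series), which is why parse complains. Just change df['checkin'] to row['checkin'], which is the string for the current row, as below: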
df['NewDate'] = df.apply(lambda row: datetime.date.strftime(parse(row['checkin']), "%m/%d/%Y"), axis = 1)
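Since only one column is needed, a slightly shorter variant is to call apply on that column instead of on the whole frame. This is just a sketch using the sample data from the question; the lambda argument name s is my own choice:

```python
import pandas as pd
from dateutil.parser import parse

df = pd.DataFrame(
    [['row1', 'Tue Jun 16 19:05:44 UTC 2020', 'record1'],
     ['row2', 'Tue Jun 16 17:10:02 UTC 2020', 'record2'],
     ['row3', 'Fri Jun 12 17:52:37 UTC 2020', 'record3']],
    columns=["row", "checkin", "record"])

# Series.apply hands the lambda one string at a time, so parse() gets
# a str rather than a Series, and no axis=1 is needed.
df['NewDate'] = df['checkin'].apply(lambda s: parse(s).strftime("%m/%d/%Y"))

print(df['NewDate'].tolist())
```

This still calls the parser once per row in Python, so it will probably be slow on large frames.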
Upvotes: 0
Reputation: 3676
You can do this without apply, using pd.to_datetime and the .dt accessor:
df['newDate'] = pd.to_datetime(df.checkin).dt.strftime("%m/%d/%Y")
    row                       checkin   record     newDate
0  row1  Tue Jun 16 19:05:44 UTC 2020  record1  06/16/2020
1  row2  Tue Jun 16 17:10:02 UTC 2020  record2  06/16/2020
2  row3  Fri Jun 12 17:52:37 UTC 2020  record3  06/12/2020
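If every value follows the same layout, you could also pass an explicit format so pandas does not have to guess it. A minimal sketch, assuming the timezone is always the literal text "UTC" as in the sample data (the format string is my assumption, not something stated in the question):

```python
import pandas as pd

checkin = pd.Series([
    'Tue Jun 16 19:05:44 UTC 2020',
    'Tue Jun 16 17:10:02 UTC 2020',
    'Fri Jun 12 17:52:37 UTC 2020',
])

# "UTC" is matched as literal text here, so the result is a naive
# datetime column; a value with any other timezone text would raise.
parsed = pd.to_datetime(checkin, format="%a %b %d %H:%M:%S UTC %Y")

# Format the parsed datetimes back to mm/dd/yyyy strings.
new_date = parsed.dt.strftime("%m/%d/%Y")
print(new_date.tolist())
```

Note that the new column holds strings, not datetimes; keep the parsed column around if you want to sort or filter by date later.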
Upvotes: 2