Reputation: 327
I have the following for loop to convert all values in a column into a datetime format, with errors='coerce' to deal with any that don't fit into the datetime format:
for x in datecols:
    df[x] = pd.to_datetime(df[x], errors='coerce')
However, to take advantage of list comprehensions, I'd like to convert this loop into one, but I'm not getting anywhere.
I've tried the following:
[x for x in datecols pd.to_datetime(df[x],errprs='coerce')]
However, it doesn't work.
Thanks!
Upvotes: 0
Views: 215
Reputation: 862591
I think your first solution (the for loop) is better and simpler here than a list comprehension.
Alternatively, use DataFrame.apply (either of the following two lines works):
df[datecols] = df[datecols].apply(pd.to_datetime,errors='coerce')
df[datecols] = df[datecols].apply(lambda x: pd.to_datetime(x,errors='coerce'))
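For reference, here is a minimal runnable sketch of the DataFrame.apply approach. The sample data is hypothetical and chosen so that each column has a single consistent format plus one invalid value; the column names simply mirror the ones used in this thread.

```python
import pandas as pd

# Hypothetical sample data: "2020-02-30" and "not a date" are deliberately invalid.
df = pd.DataFrame({
    "Date_1": ["2020-05-01", "2020-06-02", "2020-02-30"],
    "Date_2": ["1999-02-01", "2000-01-01", "not a date"],
    "col1": list("abc"),
})
datecols = ["Date_1", "Date_2"]

# apply calls pd.to_datetime once per selected column;
# errors="coerce" is forwarded to it as a keyword argument,
# so unparseable values become NaT instead of raising.
df[datecols] = df[datecols].apply(pd.to_datetime, errors="coerce")

print(df)
```

Columns not listed in datecols (here col1) are left untouched.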
A solution with a list comprehension is also possible: the columns are extracted with DataFrame.pop, joined together with concat, and, to keep the original column order (if necessary), DataFrame.reindex is used:
df = pd.DataFrame({'Date_1':['2020-05-01','2020-06-02','2020-02-30'],
                   'Date_2':['1999-02-01','2000','2005-10-52'],
                   'col1':list('abc')})
print (df)
       Date_1      Date_2 col1
0  2020-05-01  1999-02-01    a
1  2020-06-02        2000    b
2  2020-02-30  2005-10-52    c
datecols = ['Date_1','Date_2']
cols = df.columns
df1 = pd.concat([pd.to_datetime(df.pop(x),errors='coerce') for x in datecols], axis=1)
df = df.join(df1).reindex(cols, axis=1)
print (df)
      Date_1     Date_2 col1
0 2020-05-01 1999-02-01    a
1 2020-06-02 2000-01-01    b
2        NaT        NaT    c
Upvotes: 3
Reputation: 26886
If you really want a list comprehension, you have to accept the fact that you will get a list, so no list comprehension is going to be truly equivalent to your original loop.
Similar code, though, may look like this:
[pd.to_datetime(df[x], errors='coerce') for x in datecols]
but it really looks like you may want to use a different approach, like the one suggested in @jezrael's answer.
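To illustrate the point, the sketch below shows that the list comprehension only builds a list of converted Series and leaves the dataframe unchanged. The second half, which writes the results back with a dict comprehension unpacked into DataFrame.assign, is one possible workaround and not something proposed in the question; the sample data is hypothetical.

```python
import pandas as pd

# Hypothetical sample data; "2020-02-30" is deliberately invalid.
df = pd.DataFrame({
    "Date_1": ["2020-05-01", "2020-06-02", "2020-02-30"],
    "col1": list("abc"),
})
datecols = ["Date_1"]

# The comprehension returns a plain Python list of Series; df is not modified.
converted = [pd.to_datetime(df[x], errors="coerce") for x in datecols]
print(type(converted))

# One way to write the results back: build {column name: converted Series}
# and unpack it into assign, which returns a new dataframe.
df = df.assign(**{x: pd.to_datetime(df[x], errors="coerce") for x in datecols})
print(df.dtypes)
```

This keeps the comprehension syntax, but the plain for loop from the question is arguably still easier to read.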
Upvotes: 0
Reputation: 18367
The problem here is that a list comprehension cannot easily reassign the converted values to the columns of the dataframe. You can re-create the dataframe from its results, but the list comprehension by itself won't do the trick:
df = pd.DataFrame({'Date_1':['2020-05-01','2020-06-02','AAA'],
'Date_2':['19990201','20000101','20051012']})
Original dataframe:
       Date_1    Date_2
0  2020-05-01  19990201
1  2020-06-02  20000101
2         AAA  20051012
Proposed solution:
pd.DataFrame([pd.to_datetime(df[x],errors='coerce',infer_datetime_format=True) for x in df]).T
Output:
      Date_1     Date_2
0 2020-05-01 1999-02-01
1 2020-06-02 2000-01-01
2        NaT 2005-10-12
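As a variation on this idea, a dict comprehension avoids the transpose, and passing an explicit format per column avoids relying on format inference (infer_datetime_format is deprecated as of pandas 2.0). This is only a sketch: the formats mapping is a hypothetical helper introduced for illustration and assumes the format of each column is known in advance.

```python
import pandas as pd

df = pd.DataFrame({
    "Date_1": ["2020-05-01", "2020-06-02", "AAA"],
    "Date_2": ["19990201", "20000101", "20051012"],
})

# Hypothetical mapping from column name to its expected strftime format.
formats = {"Date_1": "%Y-%m-%d", "Date_2": "%Y%m%d"}

# Build a new dataframe column by column; values that do not match
# the given format (such as "AAA") are coerced to NaT.
result = pd.DataFrame({
    col: pd.to_datetime(df[col], format=formats[col], errors="coerce")
    for col in df
})

print(result)
```

Because the dict keys become the column names, the result keeps the original column order without any reindexing.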
Upvotes: 0