Phil Collins

Reputation: 327

Python - How to turn this for loop into a list comprehension

I have the following for loop to convert all values in a column into a datetime format, with errors='coerce' to deal with any that don't fit into the datetime format:

for x in datecols:
    df[x] = pd.to_datetime(df[x],errors='coerce')

However, to take advantage of list comprehensions I'd like to convert it, but I'm not getting anywhere.

I've tried the following:

[x for x in datecols pd.to_datetime(df[x],errprs='coerce')]

However, it doesn't work.

Thanks!

Upvotes: 0

Views: 215

Answers (3)

jezrael

Reputation: 862591

I think your first solution, the for loop, is better and simpler here than a list comprehension.

Or use DataFrame.apply:

df[datecols] = df[datecols].apply(pd.to_datetime,errors='coerce')

df[datecols] = df[datecols].apply(lambda x: pd.to_datetime(x,errors='coerce'))
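As a quick check, the sketch below compares the original loop with the DataFrame.apply form on a small made-up frame. The sample values and the names looped, applied and same are assumptions for illustration only; the date strings are kept in one consistent format so the parsing does not depend on format inference.

```python
import pandas as pd

# Hypothetical sample data: one unparseable value per date column.
df = pd.DataFrame({
    "Date_1": ["2020-05-01", "2020-06-02", "not a date"],
    "Date_2": ["1999-02-01", "2000-03-04", "2005-10-52"],
    "col1": list("abc"),
})
datecols = ["Date_1", "Date_2"]

# The loop from the question, run on a copy.
looped = df.copy()
for x in datecols:
    looped[x] = pd.to_datetime(looped[x], errors="coerce")

# The apply form, run on another copy.
applied = df.copy()
applied[datecols] = applied[datecols].apply(pd.to_datetime, errors="coerce")

# DataFrame.equals treats NaT values in the same positions as equal.
same = looped.equals(applied)
print(same)
```

Both versions convert only the columns listed in datecols and leave col1 untouched.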

A solution with a list comprehension is possible: each column is extracted with DataFrame.pop, the converted columns are joined together with concat, and DataFrame.reindex restores the original column order (if necessary):

df = pd.DataFrame({'Date_1':['2020-05-01','2020-06-02','2020-02-30'],
                   'Date_2':['1999-02-01','2000','2005-10-52'],
                   'col1':list('abc')})

print (df)
       Date_1      Date_2 col1
0  2020-05-01  1999-02-01    a
1  2020-06-02        2000    b
2  2020-02-30  2005-10-52    c

datecols = ['Date_1','Date_2']
cols = df.columns
df1 = pd.concat([pd.to_datetime(df.pop(x),errors='coerce') for x in datecols], axis=1)

df = df.join(df1).reindex(cols, axis=1)
print (df)
      Date_1     Date_2 col1
0 2020-05-01 1999-02-01    a
1 2020-06-02 2000-01-01    b
2        NaT        NaT    c

Upvotes: 3

norok2

Reputation: 26886

If you really want a list comprehension, you have to accept the fact that you will get a list, so there is no list comprehension code that is going to be really equivalent to your original loop.

Similar code, though, might look like this:

[pd.to_datetime(df[x], errors='coerce') for x in datecols]

but it really looks like you may want to use a different approach, like the one suggested in @jezrael's answer.
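To illustrate the point, here is a minimal sketch with made-up data. The list comprehension only builds a list of converted Series and leaves df unchanged. If a comprehension is still wanted, one option (not from the answers above, and assuming the column names are plain strings) is a dict comprehension passed to DataFrame.assign, which returns a new frame with the converted columns. The names as_list and converted are hypothetical.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "Date_1": ["2020-05-01", "2020-06-02", "not a date"],
    "col1": list("abc"),
})
datecols = ["Date_1"]

# A list comprehension produces a list of Series; df itself is not modified.
as_list = [pd.to_datetime(df[x], errors="coerce") for x in datecols]

# A dict comprehension unpacked into assign returns a new DataFrame
# in which the listed columns are replaced by their converted versions.
converted = df.assign(
    **{x: pd.to_datetime(df[x], errors="coerce") for x in datecols}
)
print(converted.dtypes)
```

Because assign replaces existing columns in place, the column order of converted matches that of df.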

Upvotes: 0

Celius Stingher

Reputation: 18367

The problem here is that a list comprehension cannot easily assign the converted values back to the columns of the dataframe. You can re-create the dataframe from its result, but the list comprehension alone won't do the trick:

df = pd.DataFrame({'Date_1':['2020-05-01','2020-06-02','AAA'],
                   'Date_2':['19990201','20000101','20051012']})

Original dataframe:

       Date_1    Date_2
0  2020-05-01  19990201
1  2020-06-02  20000101
2         AAA  20051012

Proposed solution:

pd.DataFrame([pd.to_datetime(df[x],errors='coerce',infer_datetime_format=True) for x in df]).T

Output:

      Date_1     Date_2
0 2020-05-01 1999-02-01
1 2020-06-02 2000-01-01
2        NaT 2005-10-12

Upvotes: 0
