royalewithcheese

Reputation: 502

How to remove the timestamp from a column with a mix of dates and strings using pandas/Python?

Suppose I have a column col1 in a DataFrame df like this:

col1                 |
2018-01-30 00:00:00  |
Yes                  |
                     |
nan                  |
None                 |

I want the following output for col1, with the timestamp removed and the rest of the content left intact:

col1                 |
2018-01-30           |
Yes                  |
                     |
nan                  |
None                 |

I used the following line of code,

df['col2'] = pd.to_datetime(df['col1'], errors='coerce')

But this does not solve the problem, because every value that is not a date is coerced to NaT.

I get the following output,

col2      |
2018-01-30|
NaT       |
NaT       |
NaT       |
NaT       |

I am looking for a pandas solution, ideally a function that I could reuse in Python. Any help is greatly appreciated!

Upvotes: 2

Views: 242

Answers (1)

jezrael

Reputation: 862481

One idea is to convert the datetimes to strings and then fill the resulting missing values with the original values:

df['col1'] = (pd.to_datetime(df['col1'], errors='coerce').dt.strftime('%Y-%m-%d')
                .fillna(df['col1']))

Alternatively, convert the datetimes to Python date objects:

df['col1'] = pd.to_datetime(df['col1'], errors='coerce').dt.date.fillna(df['col1'])

print(df)
         col1
0  2018-01-30
1         Yes
2            
3         NaN
4        None
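
Since the question asks for something reusable, the first approach could be wrapped in a small helper. This is only a sketch: the name strip_time and its fmt parameter are made up for illustration, and it assumes the column holds date-like strings or datetime objects mixed with other text. Numeric values would need separate handling, because pd.to_datetime may interpret them as epoch offsets.

```python
import pandas as pd


def strip_time(series: pd.Series, fmt: str = "%Y-%m-%d") -> pd.Series:
    """Format parseable datetimes with `fmt`; leave all other values unchanged.

    Hypothetical helper name, not part of pandas.
    """
    # Anything that cannot be parsed (text, empty strings, NaN, None) becomes NaT.
    parsed = pd.to_datetime(series, errors="coerce")
    # Formatting NaT gives a missing value, which fillna replaces
    # with the original entry at the same index.
    return parsed.dt.strftime(fmt).fillna(series)


# Sample data mirroring the column from the question.
df = pd.DataFrame({"col1": ["2018-01-30 00:00:00", "Yes", "", float("nan"), None]})
df["col1"] = strip_time(df["col1"])
print(df)
```

The same function can be applied to several columns in a loop, for example `for c in cols: df[c] = strip_time(df[c])`, where cols is your own list of column names.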

Upvotes: 2
