Trimming strings (date, time) in pandas dataframe

Question

I am kind of new to Python and pandas.

I have a rather large dataset (~500,000 rows). The first column contains the date and time in this form:

                      created_at
0 Sun Jul 26 04:06:58 +0000 2020
1 Sun Jul 26 04:08:22 +0000 2020
2 Sun Jul 26 04:24:10 +0000 2020
3 Sun Jul 26 04:27:10 +0000 2020

As a first step I would like to trim that to only the month and day to get a result like this:

created_at
0 Jul 26
1 Jul 26
2 Jul 26
3 Jul 26

Ideally I would like to have it like this in the end:

created_at
0 07_26
1 07_26
2 07_26
3 07_26

Can anyone help me with some efficient methods to do that? I would really appreciate any help!

jezrael · Accepted Answer

Pass the column name to parse_dates in read_csv so the column is read as datetimes, then use Series.dt.strftime to produce the custom format:

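import pandas as pd

df = pd.read_csv('file', parse_dates=['created_at'])

# The two formats are alternatives: strftime returns strings, so the
# .dt accessor no longer works on the column after the first call.
# Run only one of them.
# for the first format ("Jul 26"):
# df['created_at'] = df['created_at'].dt.strftime("%b %d")
# for the second format ("07_26"):
df['created_at'] = df['created_at'].dt.strftime('%m_%d')
print(df)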
  created_at
0      07_26
1      07_26
2      07_26
3      07_26
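
If the data is already loaded, or if automatic date parsing turns out to be slow on ~500,000 rows, a variant worth trying is pd.to_datetime with an explicit format. The following is only a sketch: the format string is an assumption inferred from the four sample rows in the question, and the column names month_day and month_day_num are hypothetical, chosen here so that both results can be kept side by side.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "created_at": [
        "Sun Jul 26 04:06:58 +0000 2020",
        "Sun Jul 26 04:08:22 +0000 2020",
        "Sun Jul 26 04:24:10 +0000 2020",
        "Sun Jul 26 04:27:10 +0000 2020",
    ]
})

# Assumed format, inferred from the sample rows:
# weekday, month name, day, time, UTC offset, year
parsed = pd.to_datetime(df["created_at"], format="%a %b %d %H:%M:%S %z %Y")

# Format from the parsed datetimes each time, so both outputs are available.
# Note: %b gives the abbreviated month name in the current locale.
df["month_day"] = parsed.dt.strftime("%b %d")      # hypothetical column name
df["month_day_num"] = parsed.dt.strftime("%m_%d")  # hypothetical column name

print(df[["month_day", "month_day_num"]])
```

Keeping the parsed datetimes in a separate variable avoids the problem of calling .dt on a column that has already been converted to strings.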
