Reputation: 1034
How can I convert in pandas
a date format that looks something like this:
2018-08-27 00:00:00.000
2018-08-26 00:00:00.000
2018-08-24 00:00:00.000
2018-08-24 00:00:00.000
2018-08-24 00:00:00.000
2018-08-24 00:00:00.000
2018-08-23 00:00:00.000
2018-08-23 00:00:00.000
2018-08-20 00:00:00.000
2018-08-20 00:00:00.000
to an integer format counting the days since first of January 2010?
Upvotes: 2
Views: 4788
Reputation: 11917
You can simply apply sub on the pandas Timestamp column like this as mentioned by jezrael in his answer which is very direct.
If you want to do the same serially one by one you can do it like this with the help of map
base_date = pd.Timestamp('2010-01-01 00:00:00')
df['days'] = df['date'].map(lambda date : (pd.Timestamp(date) - base_date).days )
Upvotes: 1
Reputation: 862731
Subtract date from column by Series.sub
and convert timedeltas to days by Series.dt.days
:
df['days'] = pd.to_datetime(df['date']).sub(pd.Timestamp('2010-01-01')).dt.days
print (df)
date days
0 2018-08-27 00:00:00.000 3160
1 2018-08-26 00:00:00.000 3159
2 2018-08-24 00:00:00.000 3157
3 2018-08-24 00:00:00.000 3157
4 2018-08-24 00:00:00.000 3157
5 2018-08-24 00:00:00.000 3157
6 2018-08-23 00:00:00.000 3156
7 2018-08-23 00:00:00.000 3156
8 2018-08-20 00:00:00.000 3153
9 2018-08-20 00:00:00.000 3153
Upvotes: 3