Reputation: 13
My dataframe looks like this:
df[['reported_date', 'current_date']].head()
reported_date current_date
0 2016-01-15 13:58:21 2016-01-18 00:00:00
1 2016-01-14 10:51:24 2016-01-18 00:00:00
2 2016-01-15 15:17:35 2016-01-18 00:00:00
3 2016-01-17 17:07:10 2016-01-18 00:00:00
4 2016-01-17 17:08:23 2016-01-18 00:00:00
I can apply date subtraction directly like:
df[['reported_date', 'current_date']].head().apply(lambda x: x[1]-x[0], axis=1)
but when I tried to apply date_range to get the interval between the days, I got the following error:
df[['reported_date', 'current_date']].head().apply(lambda x: pd.date_range(x[0], x[1], freq='B'), axis=1)
"ValueError: Length of values does not match length of index"
So what's the right way to apply date_range() to two columns of datetime?
Thank you in advance.
jian
Upvotes: 1
Views: 1701
pd.date_range doesn't return an interval. It returns a sequence (a DatetimeIndex, really) of all datetimes between start and end at the given frequency.
Since the start is reported_date here and varies from row to row, while the end is current_date and is fixed, you get sequences of different lengths, which don't fit nicely into a single (new) column.
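To see the mismatch concretely, here is a minimal sketch that rebuilds the first two rows from the question (the frame construction and the name lengths are my own, for illustration only) and checks how many entries date_range produces for each row:

```python
import pandas as pd

# First two rows from the question, rebuilt by hand for illustration.
df = pd.DataFrame({
    "reported_date": pd.to_datetime([
        "2016-01-15 13:58:21",
        "2016-01-14 10:51:24",
    ]),
    "current_date": pd.to_datetime(["2016-01-18", "2016-01-18"]),
})

# Call date_range once per row and record how many business-day
# timestamps each call returns.
lengths = [
    len(pd.date_range(start, end, freq="B"))
    for start, end in zip(df["reported_date"], df["current_date"])
]
print(lengths)
```

Each row yields a DatetimeIndex of a different length, which is why apply cannot pack the results into one regular column.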
The subtraction you used before gives you the interval between the dates, so there is no reason to use pd.date_range: x[1] - x[0] does exactly what you want.
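As a rough sketch, the subtraction can also be done on whole columns without apply. The column names elapsed and business_days are hypothetical, and the second part rests on an assumption: that freq='B' in the question means a count of business days is wanted. In that case numpy's busday_count is one option (it counts weekdays from the start date up to, but not including, the end date):

```python
import numpy as np
import pandas as pd

# Sample rows from the question, rebuilt by hand for illustration.
df = pd.DataFrame({
    "reported_date": pd.to_datetime([
        "2016-01-15 13:58:21",
        "2016-01-14 10:51:24",
        "2016-01-15 15:17:35",
        "2016-01-17 17:07:10",
        "2016-01-17 17:08:23",
    ]),
    "current_date": pd.to_datetime(["2016-01-18"] * 5),
})

# Column-wise subtraction: one Timedelta per row, no apply needed.
df["elapsed"] = df["current_date"] - df["reported_date"]

# Assumption: a business-day count is what freq='B' was meant to give.
# busday_count works on day-precision dates, so the time of day is dropped.
start = df["reported_date"].to_numpy().astype("datetime64[D]")
end = df["current_date"].to_numpy().astype("datetime64[D]")
df["business_days"] = np.busday_count(start, end)

print(df[["elapsed", "business_days"]])
```

Both results have exactly one value per row, so they fit into a new column without the length error.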
Upvotes: 3