Reputation: 25
Problem Statement: (Multiple Linear regression) A digital media company (Netflix, etc.) had launched a show. Initially, the show got a good response, but then witnessed a decline in viewership. The company wants to figure out what went wrong.
I want to create an extra column, media['Day'], which keeps a count of the total number of days the show has been running. Suppose the first day of the show is 1st March 2017, i.e. 2017-03-01.
The code I have written is as follows.
media['Date'] = pd.to_datetime(media['Date'])
#deriving "days since the show started"
from datetime import date
d0 = date(2017, 2, 28)
d1 = media.Date #media is a dataframe variable
delta = d1 - d0
media['Day'] = delta
The error which I get is:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in
3 d0 = date(2017, 2, 28)
4 d1 = media.Date #media is a dataframe variable
----> 5 delta = d1 - d0
6 media['Day'] = delta
c:\DEV\work\lib\site-packages\pandas\core\ops\__init__.py in wrapper(left, right)
990 # test_dt64_series_add_intlike, which the index dispatching handles
991 # specifically.
--> 992 result = dispatch_to_index_op(op, left, right, pd.DatetimeIndex)
993 return construct_result(
994 left, result, index=left.index, name=res_name, dtype=result.dtype
c:\DEV\work\lib\site-packages\pandas\core\ops\__init__.py in dispatch_to_index_op(op, left, right, index_class)
628 left_idx = left_idx._shallow_copy(freq=None)
629 try:
--> 630 result = op(left_idx, right)
631 except NullFrequencyError:
632 # DatetimeIndex and TimedeltaIndex with freq == None raise ValueError
TypeError: unsupported operand type(s) for -: 'DatetimeIndex' and 'datetime.date'
I can see that the data types do not match:
d0 is of type datetime.date, and
d1 is of type pandas.core.series.Series.
Can anyone help me convert / parse the value of d0 so that it has the same type as the values in d1?
Upvotes: 0
Views: 580
Reputation: 28233
You need to convert the datetime.date to a pandas datetime before you can compute the interval. To do this, wrap d0 in pd.to_datetime.
The following should work, giving the delta as a timedelta series. If you want just the integer number of days, use the .dt accessor on the resulting timedelta series.
delta = d1 - pd.to_datetime(d0)
# or
delta = (d1 - pd.to_datetime(d0)).dt.days
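For completeness, here is a minimal end-to-end sketch of the same idea. The three sample dates are made up for illustration (the real media frame comes from your dataset). It assumes, as in your code, that d0 is the day before the premiere so that the first day of the show counts as day 1.

```python
from datetime import date

import pandas as pd

# Hypothetical sample data standing in for the real `media` DataFrame.
media = pd.DataFrame({"Date": ["2017-03-01", "2017-03-02", "2017-03-10"]})
media["Date"] = pd.to_datetime(media["Date"])

# Day before the premiere, converted from datetime.date to a pandas Timestamp.
d0 = pd.to_datetime(date(2017, 2, 28))

# datetime64 column minus Timestamp gives a timedelta64 column;
# .dt.days extracts the whole-day count as integers.
media["Day"] = (media["Date"] - d0).dt.days

print(media)
```

With these sample dates the Day column comes out as 1, 2 and 10. Keeping only the integer days is probably what you want if the column is going to be used as a regression feature.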
Upvotes: 3