Reputation: 477
I have been wrapping my mind around this for the last two days, and I am on the verge of giving up.
I have a column named Renewed_subscription, in datetime format. I need to get how many days are between today and the date when a person has to renew his subscription.
from datetime import datetime
dataset['DaysUntilSub'] = dataset.apply(lambda x: (pd.Timestamp.today().strftime('%Y-%m-%d') - str(dataset['Renewed_subscription'])).days, axis=1)
It throws the error:
TypeError Traceback (most recent call last)
C:\Report.py in ()
--> 166 dataset['DaysUntilSub'] = dataset.apply(lambda x: (pd.Timestamp.today().strftime('%Y-%m-%d') - str(dataset['Renewed_subscription'])).days, axis=1)
168 #dataset['DaysUntilSub'] = dataset.apply(lambda x: (pd.Timestamp.today().strftime('%Y-%m-%d') - str(dataset['Renewed_subscription'])).days, axis=1)
C:\Users\katep\Anaconda3\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
4150 if reduce is None:
4151 reduce = True
-> 4152 return self._apply_standard(f, axis, reduce=reduce)
4153 else:
4154 return self._apply_broadcast(f, axis)
C:\Users\katep\Anaconda3\lib\site-packages\pandas\core\frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
4246 try:
4247 for i, v in enumerate(series_gen):
-> 4248 results[i] = func(v)
4249 keys.append(v.name)
4250 except Exception as e:
C:\Report.py in (x)
TypeError: ("unsupported operand type(s) for -: 'str' and 'str'", 'occurred at index 42')
I would be very grateful if you could point out my error, because I cannot proceed with the analysis until it is fixed!
Upvotes: 0
Views: 3761
Reputation: 477
The solution that I adopted in the end, thanks to @titipat, was:
from datetime import datetime
dataset['DaysUntilSub'] = dataset['Renewed_subscription'].map(
lambda x: (x - datetime.today()).days)
Upvotes: 0
Reputation: 5389
As mentioned above, you don't want to cast the datetime to a string before subtracting. You can subtract directly with the - operator and read the number of days from the result.
from datetime import datetime
import pandas as pd
# create example dataframe
df = pd.DataFrame([datetime(1985, 4, 10),
datetime(2010, 4, 10),
datetime(2015, 4, 10),
datetime(2017, 4, 10)], columns=['Renewed_subscription'])
# subtraction with today
df['DaysUntilSub'] = df['Renewed_subscription'].map(lambda x: (datetime.today() - x).days)
The resulting dataframe:
Renewed_subscription DaysUntilSub
0 1985-04-10 11695
1 2010-04-10 2564
2 2015-04-10 738
3 2017-04-10 7
And the same solution without a lambda:
def days_from_today(date):
return (datetime.today() - date).days
df['DaysUntilSub'] = df['Renewed_subscription'].map(days_from_today)
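Two details are worth checking before reusing this. First, datetime.today() - x counts days since the date, while x - datetime.today() counts days until it, which is what the question asks for. Second, datetime.today() carries the current time of day, so a renewal at midnight tomorrow can come out as 0 days. A minimal sketch of the difference, where days_until is a hypothetical helper and the fixed now value stands in for datetime.today():

```python
from datetime import datetime

def days_until(date, now):
    """Whole calendar days from `now` until `date`, ignoring time of day."""
    return (date.date() - now.date()).days

# Fixed stand-in for datetime.today(), so the example is reproducible.
now = datetime(2017, 4, 17, 15, 30)
# A renewal date at midnight on the following day.
renewal = datetime(2017, 4, 18)

# Plain subtraction keeps the time of day: only 8.5 hours remain, so .days is 0.
raw = (renewal - now).days
# Comparing calendar dates instead gives 1.
calendar = days_until(renewal, now)
```

Which of the two counts is right depends on how the report defines "days until renewal".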
Upvotes: 1
Reputation: 2007
Your issue is that you are trying to subtract a string from another string.
'string' - 'other string'
TypeError Traceback (most recent call last)
<ipython-input-1-e61d76792339> in <module>()
----> 1 'string' - 'other string'
TypeError: unsupported operand type(s) for -: 'str' and 'str'
You need to subtract the datetime objects from each other and then take the number of days from the resulting timedelta object. For example, something like this:
import datetime
a = datetime.datetime.now()
dataset['DaysUntilSub'] = dataset['Renewed_subscription'].apply(lambda x: (x - a).days)
This assumes that your 'Renewed_subscription' column already holds datetime objects. Note that Series.apply does not take an axis argument. The resulting column in the dataframe will hold an integer value for the number of days.
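If the column turns out to hold text rather than datetime values, it can be converted first. A minimal sketch, assuming the dates are stored as ISO-style strings (the sample values below are made up for illustration):

```python
import pandas as pd

# Hypothetical data: dates stored as text, with one unparseable entry.
dataset = pd.DataFrame(
    {'Renewed_subscription': ['2017-04-10', 'not a date', '2017-05-01']}
)

# errors='coerce' turns unparseable values into NaT instead of raising.
dataset['Renewed_subscription'] = pd.to_datetime(
    dataset['Renewed_subscription'], errors='coerce'
)

# The column now has a datetime dtype, so date arithmetic works on it.
print(dataset['Renewed_subscription'].dtype)
```

Rows that became NaT will produce missing values in any later subtraction, so it may be worth inspecting them before computing the day counts.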
Upvotes: 0
Reputation: 165
I think the problem is that you are trying to subtract a string from a string.
The code pd.Timestamp.today().strftime('%Y-%m-%d') produces a string such as '2016-04-30', and str(dataset['Renewed_subscription']) converts the whole column to a single string. The minus operator is not defined for strings. I would recommend the following instead: pd.Timestamp.today() - dataset['Renewed_subscription']
This will give you timedelta values. You can then get the number of days from the days attribute (for a pandas Series, through the .dt.days accessor). For example:
>>> import datetime
>>> a = datetime.datetime(2012, 9, 16, 0, 0)
>>> b = datetime.datetime.today()
>>> v = b-a
>>> v.days
1674
>>>
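The same idea applied to the whole column at once, without apply or a lambda, might look like the sketch below. The sample dates and the fixed today value are assumptions made so the result is reproducible; in real code today would come from pd.Timestamp.today().normalize(). The subtraction is written as renewal date minus today so that the result is days until renewal; swap the operands for days since.

```python
import pandas as pd

# Hypothetical example data.
dataset = pd.DataFrame({
    'Renewed_subscription': pd.to_datetime(
        ['2017-04-10', '2017-05-01', '2017-04-20']
    )
})

# Fixed reference date for the example; in real code use
# pd.Timestamp.today().normalize() to get today at midnight.
today = pd.Timestamp('2017-04-17')

# Series minus Timestamp yields a timedelta Series;
# .dt.days extracts the whole-day component as integers.
dataset['DaysUntilSub'] = (dataset['Renewed_subscription'] - today).dt.days
```

Negative values mark subscriptions whose renewal date has already passed.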
I hope this helps. Thanks
Upvotes: 0