eponkratova

Reputation: 477

Subtract two datetimes to get the number of days (Python)

I have been wrapping my mind around this for the last two days, and I am on the verge of giving up.

I have a column named Renewed_subscription that holds datetime values. I need the number of days between today and the date when a person has to renew their subscription.

from datetime import datetime
dataset['DaysUntilSub'] = dataset.apply(lambda x: (pd.Timestamp.today().strftime('%Y-%m-%d') - str(dataset['Renewed_subscription'])).days, axis=1)

It throws the error:

TypeError Traceback (most recent call last)

C:\Report.py in ()

--> 166 dataset['DaysUntilSub'] = dataset.apply(lambda x: (pd.Timestamp.today().strftime('%Y-%m-%d') - str(dataset['Renewed_subscription'])).days, axis=1)

168 #dataset['DaysUntilSub'] = dataset.apply(lambda x: (pd.Timestamp.today().strftime('%Y-%m-%d') - str(dataset['Renewed_subscription'])).days, axis=1)

C:\Users\katep\Anaconda3\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)

4150 if reduce is None:

4151 reduce = True

-> 4152 return self._apply_standard(f, axis, reduce=reduce)

4153 else:

4154 return self._apply_broadcast(f, axis)

C:\Users\katep\Anaconda3\lib\site-packages\pandas\core\frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)

4246 try:

4247 for i, v in enumerate(series_gen):

-> 4248 results[i] = func(v)

4249 keys.append(v.name)

4250 except Exception as e:

C:\Report.py in (x)

TypeError: ("unsupported operand type(s) for -: 'str' and 'str'", 'occurred at index 42')

I would be very grateful if you could point out my error, because I cannot proceed with the analysis until it is fixed!

Upvotes: 0

Views: 3761

Answers (4)

eponkratova

Reputation: 477

The solution that I adopted in the end, thanks to @titipata, was:

from datetime import datetime

dataset['DaysUntilSub'] = dataset['Renewed_subscription'].map(
    lambda x: (x - datetime.today()).days)
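
One detail worth checking with this approach: datetime.today() carries the current time of day, and .days floors a partial-day difference. The following is a minimal sketch of that effect, not part of the adopted solution. The sample date and the fixed `now` value are assumptions chosen so the result is reproducible.

```python
from datetime import datetime

import pandas as pd

# Hypothetical data: a single renewal date stored at midnight.
dataset = pd.DataFrame({"Renewed_subscription": [datetime(2017, 4, 20)]})

# Fixed stand-in for datetime.today() (an assumption for reproducibility).
now = datetime(2017, 4, 17, 15, 30)
# Midnight of the same day, so only whole calendar days are counted.
today = datetime(now.year, now.month, now.day)

# Difference is 2 days 8.5 hours, and .days keeps only the whole days.
with_time = dataset["Renewed_subscription"].map(lambda x: (x - now).days)

# Measured from midnight, the same renewal is 3 calendar days away.
calendar_days = dataset["Renewed_subscription"].map(lambda x: (x - today).days)

print(with_time.iloc[0], calendar_days.iloc[0])
```

Whether the floored or the calendar-day count is the right one depends on how the report defines "days until renewal".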

Upvotes: 0

titipata

Reputation: 5389

As mentioned above, you don't want to cast the datetime to a string before subtracting. You can subtract directly with the - operator and read the number of days from the result.

from datetime import datetime
import pandas as pd

# create example dataframe
df = pd.DataFrame([datetime(1985, 4, 10), 
                   datetime(2010, 4, 10), 
                   datetime(2015, 4, 10), 
                   datetime(2017, 4, 10)], columns=['Renewed_subscription'])
# subtraction with today
df['DaysUntilSub'] = df['Renewed_subscription'].map(lambda x: (datetime.today() - x).days)

The dataframe output:

  Renewed_subscription  DaysUntilSub
0           1985-04-10         11695
1           2010-04-10          2564
2           2015-04-10           738
3           2017-04-10             7

And the same solution without a lambda:

def days_from_today(date):
    return (datetime.today() - date).days

df['DaysUntilSub'] = df['Renewed_subscription'].map(days_from_today)
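
A note on direction: datetime.today() - x counts the days elapsed since a date, so past dates come out positive, as in the output above. For the days remaining until a future renewal, the operands are swapped. This is a small sketch of both directions. The sample dates and the fixed `reference` date are assumptions used to keep the numbers reproducible.

```python
from datetime import datetime

import pandas as pd

# Hypothetical data: one date in the past and one in the future.
df = pd.DataFrame(
    {"Renewed_subscription": [datetime(2017, 4, 10), datetime(2017, 5, 1)]}
)

# Fixed stand-in for datetime.today() (an assumption for reproducibility).
reference = datetime(2017, 4, 17)

# Days elapsed since each date: positive for past dates, negative for future ones.
days_since = df["Renewed_subscription"].map(lambda x: (reference - x).days)

# Days remaining until each date: the same numbers with the sign flipped.
days_until = df["Renewed_subscription"].map(lambda x: (x - reference).days)

print(days_since.tolist(), days_until.tolist())
```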

Upvotes: 1

it's-yer-boy-chet

Reputation: 2007

Your issue is that you are trying to subtract a string from another string.

'string' - 'other string'

TypeError                                 Traceback (most recent call last)
<ipython-input-1-e61d76792339> in <module>()
----> 1 'string' - 'other string'

TypeError: unsupported operand type(s) for -: 'str' and 'str'

You need to subtract the datetime objects from each other, then read the number of days from the resulting timedelta object. For example, something like this:

import datetime
a = datetime.datetime.now()
dataset['DaysUntilSub'] = dataset['Renewed_subscription'].apply(lambda x: (x - a).days)

This assumes that your 'Renewed_subscription' column already holds datetime objects. The resulting column in the dataframe will hold an integer number of days.
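
If the column turns out to hold strings rather than datetimes, it can be converted first. Here is a minimal sketch using pd.to_datetime. The sample values are hypothetical, and errors="coerce" is one possible choice that turns unparseable entries into NaT instead of raising.

```python
import pandas as pd

# Hypothetical data: dates stored as strings, plus one bad entry.
dataset = pd.DataFrame(
    {"Renewed_subscription": ["2017-04-20", "2017-05-01", "not a date"]}
)

# Convert the column to datetime64; values that cannot be parsed become NaT.
dataset["Renewed_subscription"] = pd.to_datetime(
    dataset["Renewed_subscription"], errors="coerce"
)

# Check that the column is now a datetime dtype before subtracting.
is_datetime = pd.api.types.is_datetime64_any_dtype(dataset["Renewed_subscription"])
print(is_datetime, dataset["Renewed_subscription"].isna().sum())
```

Rows that become NaT will give a missing value after subtraction, so they may need to be dropped or filled, depending on the analysis.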

Upvotes: 0

A Saxena

Reputation: 165

I think the problem is that you are trying to subtract a string from a string.

The code pd.Timestamp.today().strftime('%Y-%m-%d') produces a string such as '2017-04-17', and str(dataset['Renewed_subscription']) converts the whole column to a single string. The minus operator is not defined for strings. I would recommend the following: pd.Timestamp.today() - dataset['Renewed_subscription']

This will give you timedelta values. You can then convert them to days: a single timedelta has a days attribute, and a pandas Series of timedeltas exposes the same value through .dt.days. For example:

>>> import datetime
>>> a = datetime.datetime(2012, 9, 16, 0, 0)
>>> b = datetime.datetime.today()
>>> v = b-a
>>> v.days
1674
>>>
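
Applied to the whole column, the same idea might look like the sketch below, which avoids apply and lambda entirely. The sample dates and the fixed `today` value are assumptions for reproducibility. In real use, `today` would likely be pd.Timestamp.today().normalize().

```python
import pandas as pd

# Hypothetical data: renewal dates already stored as datetime64.
dataset = pd.DataFrame(
    {"Renewed_subscription": pd.to_datetime(["2017-04-20", "2017-05-01"])}
)

# Fixed stand-in for today's date (an assumption for reproducibility).
today = pd.Timestamp("2017-04-17")

# Subtracting a Timestamp from a datetime column gives a timedelta Series.
remaining = dataset["Renewed_subscription"] - today

# .dt.days extracts the whole-day component of each timedelta as an integer.
dataset["DaysUntilSub"] = remaining.dt.days

print(dataset["DaysUntilSub"].tolist())
```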

I hope this helps. Thanks

Upvotes: 0
