Jenny Jing Yu
Jenny Jing Yu

Reputation: 195

Convert the month difference into integer

The dataset with datetime columns as below, I converted to the period as month and calculated the difference. I would like to convert the output as an integer, but failed to do it.

ID    StartDate    CurrentDate 
a     2019-03-05   2019-06-18
b     2019-02-05   2019-07-20
c     2019-01-23   2019-07-25

df['start_month_year'] = df['StartDate'].dt.to_period('M')
df['current_month_year'] = df['CurrentDate'].dt.to_period('M')
df['tenure'] = (df['current_month_year'] - df['start_month_year'])
df['tenure'].astype(int)

Upvotes: 1

Views: 933

Answers (2)

Scott Boston
Scott Boston

Reputation: 153460

Let's have some fun with pandas.tseries.offsets:

Using your code, you can get the number of offsets, n, by using the apply to retreive this attribute from the pandas.tseries.offsets object like this:

df['start_month_year'] = df['StartDate'].dt.to_period('M')
df['current_month_year'] = df['CurrentDate'].dt.to_period('M')
df['tenure'] = (df['current_month_year'] - df['start_month_year'])
df['tenure'].apply(lambda x: x.n)

Output:

0    3
1    5
2    6
Name: tenure, dtype: int64

However, I like @WeNYoBen's method better.

Upvotes: 2

BENY
BENY

Reputation: 323226

Here is one way

df.CurrentDate.dt.year*12+df.CurrentDate.dt.month-(df.StartDate.dt.year*12+df.StartDate.dt.month)
Out[512]: 
0    3
1    5
2    6
dtype: int64

To fix your code adding astype before the result became pandas.tseries.offsets.MonthEnd

(df['CurrentDate'].dt.to_period('M').astype(int)-df['StartDate'].dt.to_period('M').astype(int))

Upvotes: 2

Related Questions