Slartibartfast
Slartibartfast

Reputation: 1190

How many days between series data

I have the following Series:

df

Which produces

Date
2001-01-03    True
2002-07-24    True
2002-07-29    True
2008-09-30    True
2008-10-13    True
2008-10-28    True
2008-11-13    True
2008-11-21    True
2008-11-24    True
2008-12-16    True
2009-03-10    True
2009-03-23    True
Name: pct_day, dtype: bool

How can i find out how many days are between values which are true, excluding weekends?

Upvotes: 1

Views: 1081

Answers (4)

oskarseweryn
oskarseweryn

Reputation: 41

To calculate how many days are between two days in column 'Date', use np.busday_count in the loop:

import pandas as pd
import numpy as np
for index, row in df.iterrows():    
    if index>0:
        print(np.busday_count(dayA, row['Date']))
    dayA=row['Date']

For days:

2001-01-03
2002-07-24
2002-07-29
2008-09-30
2008-10-13
2008-10-28
2008-11-13
2008-11-21
2008-11-24
2008-12-16
2009-03-10
2009-03-23

Output would be:

405
3
1611
9
11
12
6
1
16
60
9

Upvotes: 1

mgrollins
mgrollins

Reputation: 651

You can use the to_frame() method on your index to turn the index into a column, then call diff() on that column

df2 = df.index.to_frame()
df2['diff'] = df2[df.0 == True]

Upvotes: 1

Philip Ciunkiewicz
Philip Ciunkiewicz

Reputation: 2791

You can do it like this:

(creating sample DataFrame just for example's sake)

>>> import pandas.util.testing as tm

>>> df = tm.makeTimeDataFrame(freq='M', nper=5)
>>> print(df)

                   A         B         C         D
2000-01-31  1.051346  1.722165 -0.659687  1.026716
2000-02-29  0.352166  1.699898  1.469741 -0.138593
2000-03-31 -0.202217 -0.470095  0.169060 -0.241817
2000-04-30  0.446261  1.518129  2.263510  1.800027
2000-05-31 -0.088365  1.923264  1.763859  0.348480

The diff method will calculate the datetime.timedelta between the two dates, with the zeroth index of course being NaT since there was nothing before it.

>>> df['time_delta'] = df.index.to_series().diff()
>>> print(df)

                   A         B         C         D time_delta
2000-01-31  1.051346  1.722165 -0.659687  1.026716        NaT
2000-02-29  0.352166  1.699898  1.469741 -0.138593    29 days
2000-03-31 -0.202217 -0.470095  0.169060 -0.241817    31 days
2000-04-30  0.446261  1.518129  2.263510  1.800027    30 days
2000-05-31 -0.088365  1.923264  1.763859  0.348480    31 days

Then if you want to get the number of days as a float instead of as a datetime object, you can use the Series.dt accessor:

>>> days = df.time_delta.dt.days
>>> print(days)

2000-01-31     NaN
2000-02-29    29.0
2000-03-31    31.0
2000-04-30    30.0
2000-05-31    31.0
Freq: M, dtype: float64

Upvotes: 2

Rahul P
Rahul P

Reputation: 2663

This seems to work:

import pandas as pd

df = pd.DataFrame({'Date' : pd.date_range(start='2/1/2018', end='2/08/2018', freq='D'),
            'Label': 'True'})

df['DayOfWeek'] = df['Date'].dt.day_name()

df = df[(df.DayOfWeek != 'Saturday') & (df.DayOfWeek != 'Sunday') & (df.Label == 'True')]

df['Diff'] = df['Date'].diff()

Upvotes: 1

Related Questions