Reputation: 391
This is my dataframe.
df1=pd.DataFrame({'user':['A','A','A','A','B','B','B'], 'status':
[1,0,1,0,1,0,1],'bindate':['2019-1-20','','2019-1-27','','2019-1-
2','','2019-1-25'],'unbindate':['','2019-1-25','','2019-1-30','','2019-
1-20','']})
It looks like this
user stutas bindate unbindate
0 A 1 2019-01-20 Nat
1 A 0 Nat 2019-01-25
2 A 1 2019-01-27 Nat
3 A 0 Nat 2019-01-30
4 B 1 2019-01-02 Nat
5 B 0 Nat 2019-01-20
6 B 1 2019-01-25 Nat
I want to use Pandas to come up with a new dataframe that looks like below
user bindate unbindate
0 A 2019-01-20 2019-01-25
1 A 2019-01-27 2019-01-30
2 B 2019-01-02 2019-01-20
3 B 2019-01-25 None
Our server captures the date when user becomes the member as bindate, and captures the date when user exists the member as unbindate. If it is a member, status=1, else status=0
I want to mash the records to create a date range to show the user as a member. Notice user A binds and unbinds twice, so A is not the member in the end. User B binds twice, unbinds once, so B is the member in the end, hence I leave the unbindate as None. I tried to use .loc to grab the data, no luck. Is there a better way to achieve this? Thank you.
Upvotes: 1
Views: 59
Reputation: 148870
If your dataframe is exactly as you have shown with interleaved bind and unbind, shift
could be enough:
df1['unbindate'] = df1.unbindate.shift(-1)
df1 = df1.loc[df1.status != 0].reset_index(drop=True).drop(columns='status')
It gives as expected:
user bindate unbindate
0 A 2019-1-20 2019-1-25
1 A 2019-1-27 2019-1-30
2 B 2019-1-02 2019-1-20
3 B 2019-1-25 NaN
Upvotes: 0
Reputation: 323226
Not quit efficient but work groupby
sorted
the value by isnull
df=df.mask(df=='Nat')
df.groupby('user').apply(lambda x : x.apply(lambda y : sorted(y,key=pd.isnull))).\
dropna(subset=['bindate','unbindate'],thresh=1)
Out[64]:
user stutas bindate unbindate
0 A 1 2019-01-20 2019-01-25
1 A 0 2019-01-27 2019-01-30
4 B 1 2019-01-02 2019-01-20
5 B 0 2019-01-25 NaN
Upvotes: 1