Reputation: 97
What is the easiest way to convert the following data frame, sorted by start in ascending order:
start end
0 100 500
1 400 700
2 450 580
3 750 910
4 920 940
5 1000 1200
6 1100 1300
into
start end
0 100 700
1 750 910
2 920 940
3 1000 1300
Rows 0:3 and 5:7 were merged because they overlap or one row is contained in another, so each of those groups collapses to a single start and end.
Upvotes: 0
Views: 207
Reputation: 260420
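Use a custom group with shift and cumsum to identify the overlapping intervals, then keep the min start and max end of each group. Compare each start with the running maximum of the earlier ends (cummax), not just the previous row's end, so that an interval nested inside an earlier one cannot split a group. Taking 'first'/'last' instead of min/max would return 580 rather than 700 for the first group, because row 2 ends before row 1 does:
group = df['start'].gt(df['end'].cummax().shift()).cumsum()
out = df.groupby(group).agg({'start': 'min', 'end': 'max'})
output:
start end
0 100 700
1 750 910
2 920 940
3 1000 1300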
intermediate group:
0 0
1 0
2 0
3 1
4 2
5 3
6 3
dtype: int64
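For anyone who wants to run this end to end, here is a minimal self-contained sketch of the grouping idea. The helper name `merge_overlapping` and the index label `group` are made up for illustration and are not part of pandas. It assumes numeric `start`/`end` columns, sorts by `start` first, and treats intervals that merely touch (a start equal to an earlier end) as overlapping.

```python
import pandas as pd


def merge_overlapping(df: pd.DataFrame) -> pd.DataFrame:
    """Merge overlapping [start, end] rows (hypothetical helper, not a pandas API)."""
    # Sort by start so each row only needs to be compared with the rows before it.
    df = df.sort_values("start", ignore_index=True)
    # Furthest end seen among all previous rows (NaN for the first row).
    prev_max_end = df["end"].cummax().shift()
    # A new group begins wherever a start lies strictly beyond that value;
    # the comparison with NaN is False, so the first row lands in group 0.
    group = df["start"].gt(prev_max_end).cumsum()
    return (
        df.groupby(group.rename("group"))
        .agg(start=("start", "min"), end=("end", "max"))
        .reset_index(drop=True)
    )


# Data from the question.
df = pd.DataFrame({
    "start": [100, 400, 450, 750, 920, 1000, 1100],
    "end": [500, 700, 580, 910, 940, 1200, 1300],
})
out = merge_overlapping(df)
print(out)
```

On the question's data this returns the four merged rows asked for (100-700, 750-910, 920-940, 1000-1300). The running maximum matters for inputs like (100, 700), (200, 300), (400, 500): comparing only with the previous row's end would start a new group at 400, even though 400-500 still lies inside 100-700.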
Upvotes: 1