Reputation: 137
I have data as follows:
  Gene  Distance
0    A        18
1    B        16
2    C        58
3    D        45
4    E        34
Consecutive genes with a distance less than 50 should be combined (in a list), as follows:
1 A,B
2 C,D,E
The loop should break between B and C, because the distance there is more than 50. How can I create such breaks in a loop, so that a new list is started at each break?
Upvotes: 1
Views: 44
Reputation: 323326
You can do this with groupby (it should be faster than a loop):
df.Gene.groupby(df.Distance.gt(50).cumsum()).apply(list).str.join(',')
Out[347]:
Distance
0 A,B
1 C,D,E
Name: Gene, dtype: object
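How the key works: df.Distance.gt(50) is True only on rows whose distance exceeds 50, and cumsum() turns that into a running counter (0, 0, 1, 1, 1 for the sample data), so each row above the threshold starts a new group. For comparison, here is a rough sketch of the same grouping as a plain Python loop. The function name group_genes and the threshold parameter are made up for illustration and are not part of pandas.

```python
def group_genes(genes, distances, threshold=50):
    """Join consecutive genes, opening a new group when a distance exceeds threshold.

    Hypothetical helper: mirrors grouping by Distance.gt(threshold).cumsum().
    """
    groups = []
    for gene, distance in zip(genes, distances):
        # Open a new group on the very first row, or whenever this row's
        # distance is strictly greater than the threshold.
        if not groups or distance > threshold:
            groups.append([])
        groups[-1].append(gene)
    # Join each group into a comma-separated string, as in the pandas answer.
    return [",".join(group) for group in groups]


result = group_genes(["A", "B", "C", "D", "E"], [18, 16, 58, 45, 34])
print(result)  # ['A,B', 'C,D,E']
```

The loop makes the break condition explicit, while the groupby version avoids Python-level iteration and is likely to scale better on large frames.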
Upvotes: 1