Reputation: 63
import pandas as pd

df = pd.DataFrame({'index': [1, 2, 3, 69, 70, 71, 458, 459, 460],
                   'Animals': ['alligator', 'bee', 'falcon', 'lion',
                               'monkey', 'parrot', 'shark', 'whale', 'zebra']})
df
Here is my DataFrame. I would like to group by the "index" column so that the first three animals, which have index 1, 2, 3, go into one group, the animals with index 69, 70, 71 go into the next group, and so on.
In the end I would like to have 3 groups made of consecutive values in the index column.
The good thing is that within a group the numbers in the index column always increase by exactly 1, and between groups the gap is always larger than 1.
I believe there should be some function for that.
Upvotes: 1
Views: 621
Reputation: 2924
Solutions for these invariably compare each row to the previous row and take the cumulative sum of the Trues.
So perhaps something like this:
df['index'].diff().ne(1).cumsum()
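To see why that one-liner yields group labels, here is a minimal sketch that splits it into its three steps, using the DataFrame from the question. The intermediate names `step`, `starts` and `group_id` are only illustrative and do not appear in the original posts.

```python
import pandas as pd

df = pd.DataFrame({
    'index': [1, 2, 3, 69, 70, 71, 458, 459, 460],
    'Animals': ['alligator', 'bee', 'falcon', 'lion', 'monkey',
                'parrot', 'shark', 'whale', 'zebra'],
})

# Difference to the previous row; the first row has no predecessor, so it is NaN.
step = df['index'].diff()

# True wherever the difference is not exactly 1, i.e. where a new run starts.
# NaN != 1 evaluates to True, so the first row also starts a run.
starts = step.ne(1)

# Cumulative sum of the True values gives one label per run of consecutive values.
group_id = starts.cumsum()

print(group_id.tolist())  # [1, 1, 1, 2, 2, 2, 3, 3, 3]
```

The resulting Series can be passed straight to `df.groupby(...)`, since it has the same row index as `df`.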
Upvotes: 1
Reputation:
IIUC, you want to group rows depending on whether the difference from the previous row is 1 or not, right? You can use diff + ne + cumsum to create the group labels, groupby these labels, and use agg to create lists:
out = df.groupby(df['index'].diff().ne(1).cumsum()).agg(list).reset_index(drop=True)
Output:
             index                   Animals
0        [1, 2, 3]  [alligator, bee, falcon]
1     [69, 70, 71]    [lion, monkey, parrot]
2  [458, 459, 460]     [shark, whale, zebra]
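If you would rather keep each group as its own sub-DataFrame instead of aggregating into lists, the same labels can be iterated over. This is only a sketch under that assumption; `group_id` and `groups` are illustrative names, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    'index': [1, 2, 3, 69, 70, 71, 458, 459, 460],
    'Animals': ['alligator', 'bee', 'falcon', 'lion', 'monkey',
                'parrot', 'shark', 'whale', 'zebra'],
})

# Same labelling as before: a new label starts wherever the step is not 1.
group_id = df['index'].diff().ne(1).cumsum()

# Iterating a groupby yields (label, sub-DataFrame) pairs in label order.
groups = {label: part for label, part in df.groupby(group_id)}

for label, part in groups.items():
    print(label, part['Animals'].tolist())
```

Each value in `groups` is a regular DataFrame holding the rows of one consecutive run, so further per-group processing can be done on it directly.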
Upvotes: 1