Reputation: 49
I want to create a list of data frames from a bigger data frame, based on a column value. The column "ID" can repeat, for example 1,2,3,1,2,3,4,5,1,2.
I want to build the list by extracting rows until the ID starts over at 1. In this case the list should contain 3 data frames, with IDs 1,2,3, then 1,2,3,4,5, and then 1,2.
Can this be done without using a for loop?
Upvotes: 2
Views: 82
Reputation: 51155
This is a common idiom in numpy: np.where(np.diff(s) != 1).
You can leverage this and np.split to accomplish what you want:
import numpy as np

s = df.ID.values
# positions where the next ID is not the current ID + 1, i.e. where a run ends
idx, *_ = np.where(np.diff(s) != 1)
np.split(s, idx + 1)
[array([1, 2, 3], dtype=int64),
array([1, 2, 3, 4, 5], dtype=int64),
array([1, 2], dtype=int64)]
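Note that the output above splits only the ID values, not the rows of the frame. A minimal sketch of applying the same break points to the data frame itself follows; the example frame and its `val` column are made up for illustration, and it assumes IDs increase by exactly 1 within a run:

```python
import numpy as np
import pandas as pd

# Example frame; the `val` column is hypothetical, added only for illustration.
df = pd.DataFrame({"ID": [1, 2, 3, 1, 2, 3, 4, 5, 1, 2], "val": range(10)})

s = df["ID"].to_numpy()
# Index of the last row of each run: the next ID is not the current ID + 1.
idx, = np.where(np.diff(s) != 1)
# Row positions at which each run starts and stops.
starts = np.r_[0, idx + 1]
stops = np.r_[idx + 1, len(df)]
# One positional slice per run.
frames = [df.iloc[a:b] for a, b in zip(starts, stops)]
```

The comprehension iterates once per run rather than once per row, so the row-level work stays vectorized.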
Upvotes: 0
Reputation: 59274
No need for loops.
>>> list(zip(*df.groupby(df.ID.diff().ne(1).cumsum())))[1]
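To see why this works, the one-liner can be unpacked into its steps. This is a sketch on a made-up example frame, again assuming IDs increase by exactly 1 within a run; the names `new_run` and `run_id` are introduced here for illustration:

```python
import pandas as pd

# Example frame built from the IDs in the question.
df = pd.DataFrame({"ID": [1, 2, 3, 1, 2, 3, 4, 5, 1, 2]})

# True wherever the ID is not the previous ID + 1.
# The first row has a NaN diff, so it also counts as a run start.
new_run = df["ID"].diff().ne(1)
# A running count of run starts labels every row with its run number.
run_id = new_run.cumsum()
# One sub-frame per label, in order of appearance.
frames = [group for _, group in df.groupby(run_id)]
```

If IDs can skip values within a run, the `ne(1)` test would split too eagerly; a different break condition would be needed in that case.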
Upvotes: 4