Mike

Reputation: 49

Extract data from pandas data frame

I want to create a list of data frames from a larger data frame, based on the values in one column. The values in the column "ID" repeat in increasing runs, for example 1,2,3,1,2,3,4,5,1,2.

I want to split the rows into a new data frame each time the ID starts over at 1. In this case the list should contain 3 data frames, with IDs 1,2,3; 1,2,3,4,5; and 1,2.

Can this be done without using a for loop?

Upvotes: 2

Views: 82

Answers (2)

user3483203

Reputation: 51155

A common NumPy idiom for finding where a run of consecutive values ends is np.where(np.diff(s) != 1).

You can combine this with np.split to split the ID values into runs:

import numpy as np

s = df.ID.values
# indices where the next ID is not the previous ID + 1 (the end of each run)
idx, *_ = np.where(np.diff(s) != 1)
np.split(s, idx + 1)

[array([1, 2, 3], dtype=int64),
 array([1, 2, 3, 4, 5], dtype=int64),
 array([1, 2], dtype=int64)]
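
The output above contains only the ID values, not data frames. One way to carry the same break points over to the rows is to slice the frame by position with df.iloc. This is a minimal sketch, not part of the original answer: the sample data and the "value" column are made up for illustration, and it uses a comprehension over the run boundaries (not over individual rows).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the "value" column is invented for illustration.
df = pd.DataFrame({
    "ID": [1, 2, 3, 1, 2, 3, 4, 5, 1, 2],
    "value": list("abcdefghij"),
})

s = df["ID"].to_numpy()
# Positions where the next ID is not the previous ID + 1, i.e. where a run ends.
idx = np.flatnonzero(np.diff(s) != 1)
# Row boundaries of each run, including the start and end of the frame.
bounds = np.concatenate(([0], idx + 1, [len(df)]))
# One positional slice of the frame per run.
frames = [df.iloc[start:stop] for start, stop in zip(bounds[:-1], bounds[1:])]
```

Each element of frames keeps all columns and the original index labels of its rows.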

Upvotes: 0

rafaelc

Reputation: 59274

No need for loops.

>>> list(zip(*df.groupby(df.ID.diff().ne(1).cumsum())))[1]
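
The one-liner can be unpacked into steps to show how the grouping key is built. This is a sketch on made-up sample data; the names starts, run_id and frames are only for illustration.

```python
import pandas as pd

# Hypothetical sample data matching the IDs in the question.
df = pd.DataFrame({"ID": [1, 2, 3, 1, 2, 3, 4, 5, 1, 2]})

# diff() is NaN for the first row and 1 inside an increasing run,
# so ne(1) is True exactly where a new run starts.
starts = df["ID"].diff().ne(1)
# The cumulative sum of the start flags gives one label per run.
run_id = starts.cumsum()
# Iterating a groupby yields (label, sub-frame) pairs; keep only the sub-frames.
frames = [group for _, group in df.groupby(run_id)]
```

The zip(*...) form in the one-liner returns the same sub-frames as a tuple rather than a list. Note that this key starts a new group wherever the ID does not increase by exactly 1; if groups should start only where the ID equals 1, df["ID"].eq(1).cumsum() could be used as the key instead.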

Upvotes: 4
