user15227672

Reputation: 51

Pandas get consecutive rows

I would like to split a pandas dataframe into multiple dataframes.

My dataframe has a column named 'C' in which some rows contain a 0. I want to use the rows with 0's as delimiters, so that each resulting dataframe holds a run of consecutive rows with non-zero values in column 'C'. I don't think groupby alone is the way to go, since it groups rows that have the same value, which isn't what I'm trying to achieve, unless perhaps I use it together with .diff()? I'm not sure. I feel I've tried many things by now, but Python and Pandas aren't my strongest tools, so I'm not sure what the possibilities are.

I have, however, managed to get it working using my own logic of for-loops and if-statements that goes through all the frames manually, but it's a slow process and I hope to improve it.

Before:

A C
1 4
2 5
3 0
4 5
5 4
6 5

After:

A C
1 4
2 5
A C
4 5
5 4
6 5

I am using Python, in case that isn't clear. Any help is appreciated. Thanks a lot.

Upvotes: 4

Views: 265

Answers (1)

jezrael

Reputation: 862611

You can create a list of DataFrames:

# boolean mask: True where C equals 0 (the delimiter rows)
m = df['C'].eq(0)
# drop the 0 rows, then group by the cumulative sum of the mask,
# which increases by 1 at every delimiter row
dfs = [x for _, x in df[~m].groupby(m.cumsum())]
print(dfs)
[   A  C
0  1  4
1  2  5,    A  C
3  4  5
4  5  4
5  6  5]

print(dfs[1])
   A  C
3  4  5
4  5  4
5  6  5
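
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same mask-and-cumsum idea. The sample frame is rebuilt from the question's "Before" table, and the names group_id and part are only illustrative, not part of the original snippet.

```python
import pandas as pd

# Sample data reconstructed from the question's "Before" table
df = pd.DataFrame({"A": [1, 2, 3, 4, 5, 6], "C": [4, 5, 0, 5, 4, 5]})

# True on the delimiter rows (C == 0)
m = df["C"].eq(0)

# Running count of delimiters seen so far:
# all rows between two delimiters share the same id
group_id = m.cumsum()

# Drop the delimiter rows, then collect one DataFrame per group id
dfs = [part for _, part in df[~m].groupby(group_id[~m])]

for part in dfs:
    print(part, end="\n\n")
```

Because groupby only yields keys that actually occur, a leading 0 or several 0's in a row should not produce empty frames. Each piece keeps its original index labels; call .reset_index(drop=True) on a piece if a fresh 0-based index is preferred.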

Upvotes: 5
