Reputation: 51
I would like to split a pandas DataFrame into multiple DataFrames.
My DataFrame has a column named 'C' in which some rows contain a 0. I want to use the rows with 0s as delimiters, so that each resulting DataFrame holds a run of consecutive rows with nonzero values in column 'C'. I don't think groupby alone is the way to go, since it groups rows that share the same value, which isn't what I'm trying to achieve, unless perhaps I combine it with .diff()? I'm not sure. I've tried many things, but Python and pandas aren't my strongest tools, so I don't know what the possibilities are.
I have, however, managed to get it working with my own logic of for-loops and if-statements going through all the frames manually, but it's a slow process and I hope to improve it.
Before:
A | C |
---|---|
1 | 4 |
2 | 5 |
3 | 0 |
4 | 5 |
5 | 4 |
6 | 5 |
After:
A | C |
---|---|
1 | 4 |
2 | 5 |
A | C |
---|---|
4 | 5 |
5 | 4 |
6 | 5 |
I am using Python, in case that isn't clear. Any help is appreciated. Thanks a lot.
Upvotes: 4
Views: 265
Reputation: 862611
You can create a list of DataFrames:
    # Boolean mask: True where C equals 0
    m = df['C'].eq(0)
    # Drop the 0 rows, then group the remaining rows by the cumulative sum of the mask
    dfs = [x for _, x in df[~m].groupby(m.cumsum())]
    print(dfs)
    [   A  C
    0  1  4
    1  2  5,    A  C
    3  4  5
    4  5  4
    5  6  5]
    print(dfs[1])
       A  C
    3  4  5
    4  5  4
    5  6  5
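To see why the cumulative sum of the mask works as a group key, here is a minimal self-contained sketch with the intermediate values spelled out. It assumes the DataFrame matches the "Before" table in the question. The names `group_id` and `part` are illustrative only, and indexing the key with `group_id[~m]` is an optional, more explicit variant of the one-liner above, not a required change.

```python
import pandas as pd

# Sample frame rebuilt from the question's "Before" table (an assumption).
df = pd.DataFrame({"A": [1, 2, 3, 4, 5, 6], "C": [4, 5, 0, 5, 4, 5]})

# True on delimiter rows, False elsewhere.
m = df["C"].eq(0)

# Each True bumps the running total by 1, so every row after a
# delimiter gets a new group id.
group_id = m.cumsum()

# Keep only non-delimiter rows and group them by their id.
# The key is restricted to the same rows here to make the alignment explicit.
dfs = [part for _, part in df[~m].groupby(group_id[~m])]

print(m.tolist())         # [False, False, True, False, False, False]
print(group_id.tolist())  # [0, 0, 1, 1, 1, 1]
print([part["A"].tolist() for part in dfs])  # [[1, 2], [4, 5, 6]]
```

Because the delimiter rows are filtered out before grouping, several 0 rows in a row should not produce empty DataFrames: groupby only yields groups that still contain rows.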
Upvotes: 5