Reputation: 441
I have a DataFrame with a column of integer values (in my case 0 and 1). The index is time. I need a list of where the "areas" (consecutive runs) of ones start and end. I could do that with diff followed by a loop.
Example:
import pandas as pd
df = pd.DataFrame(index = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
df['test'] = [0, 1, 1, 1, 0, 0, 1, 1, 1, 0]
methodOfLooking = ((2,4),(7,9)) # something like this should be the result
Any ideas of an efficient way to get the result?
Upvotes: 1
Views: 270
Reputation: 31171
You can use diff and zip to get the start and end indexes:
ix = df.test.diff().fillna(0)
In [74]: list(zip(df.index[ix == 1], df.index[ix == -1] - 1))  # list() is needed on Python 3, where zip is lazy
Out[74]: [(2, 4), (7, 9)]
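Note that the diff approach above relies on two things that hold in the example: every run of ones is surrounded by zeros, and the index is made of consecutive integers (so that subtracting 1 gives the previous label). If a run can touch the first or last row, or the index is a real time index, a variant along these lines may be safer. This is only a sketch: find_runs is a made-up helper name, and it assumes the column contains nothing but 0 and 1.

```python
import numpy as np
import pandas as pd


def find_runs(series):
    """Return (start_label, end_label) pairs for each run of ones.

    Hypothetical helper; assumes the series holds only 0 and 1.
    """
    values = series.to_numpy().astype(int)
    # Pad with a zero on both sides so that runs touching the first
    # or last row still produce a rising and a falling edge.
    padded = np.concatenate(([0], values, [0]))
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)      # position of the first 1 in each run
    ends = np.flatnonzero(edges == -1) - 1   # position of the last 1 in each run
    labels = series.index
    # Look labels up by position, so any index type (including datetimes) works.
    return [(labels[s], labels[e]) for s, e in zip(starts, ends)]


df = pd.DataFrame({'test': [0, 1, 1, 1, 0, 0, 1, 1, 1, 0]},
                  index=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
runs = find_runs(df['test'])
```

For the frame from the question, runs holds the pairs (2, 4) and (7, 9). Because the end label is looked up by position instead of computed as "label minus 1", the same function should also work when the index is a DatetimeIndex.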
Upvotes: 2