Reputation: 1467
I have a pandas DataFrame, and I want to select rows which values start and end with a specific value. For example, in dataFrame df
, I want to select rows that column state
start and end with 1
. That is row 2 5 8 10
. and output two dataframes.
import pandas as pd
data = [['a1',0,'low'],
['a1',0,'low'],
['a1',1,'high'],
['a1',1,'low'],
['a1',1,'low'],
['a1',1,'high'],
['a1',0,'low'],
['a1',0,'low'],
['a2',1,'high'],
['a2',1,'low'],
['a2',1,'low'],
['a2',0,'low'],
['a2',0,'low']]
df = pd.DataFrame(data,columns=['id','state','type'])
df
out :
id state type
0 a1 0 low
1 a1 0 low
2 a1 1 high
3 a1 1 low
4 a1 1 low
5 a1 1 high
6 a1 0 low
7 a1 0 low
8 a2 1 high
9 a2 1 low
10 a2 1 low
11 a2 0 low
12 a2 0 low
Finally, I want two dataframes, like this:
df1
id state type code
2 a1 1 high start
8 a2 1 high start
df2
id state type code
5 a1 1 high end
10 a2 1 low end
Upvotes: 2
Views: 1984
Reputation: 164663
You can use a Boolean mask to select the rows you require:
m1 = df['state'].diff() == 1
m2 = df['state'].shift(-1).diff() == -1
res = df[m1 | m2]
print(res)
id state type
2 a1 1 high
5 a1 1 high
8 a2 1 high
10 a2 1 low
You can use a list comprehension to split into 2 dataframes:
df1, df2 = [res.iloc[i::2] for i in range(int(len(res.index)/2))]
print(df1, df2, sep='\n\n')
id state type
2 a1 1 high
8 a2 1 high
id state type
5 a1 1 high
10 a2 1 low
Upvotes: 3