running man
running man

Reputation: 1467

How to select rows which values start and end with a specific value in pandas?

I have a pandas DataFrame, and I want to select rows which values start and end with a specific value. For example, in dataFrame df, I want to select rows that column state start and end with 1. That is row 2 5 8 10. and output two dataframes.

import pandas as pd

data = [['a1',0,'low'],
        ['a1',0,'low'],
        ['a1',1,'high'],
        ['a1',1,'low'],
        ['a1',1,'low'],
        ['a1',1,'high'],
        ['a1',0,'low'],
        ['a1',0,'low'],
        ['a2',1,'high'],
        ['a2',1,'low'],
        ['a2',1,'low'],
        ['a2',0,'low'],
        ['a2',0,'low']]

df = pd.DataFrame(data,columns=['id','state','type'])
df

out :

    id  state   type
0   a1     0    low
1   a1     0    low
2   a1     1    high
3   a1     1    low
4   a1     1    low
5   a1     1    high
6   a1     0    low
7   a1     0    low
8   a2     1    high
9   a2     1    low
10  a2     1    low
11  a2     0    low
12  a2     0    low

Finally, I want two dataframes, like this:

df1

    id  state   type  code
2   a1     1    high  start
8   a2     1    high  start

df2

    id  state   type  code
5   a1     1    high  end
10  a2     1    low   end

Upvotes: 2

Views: 1984

Answers (1)

jpp
jpp

Reputation: 164663

You can use a Boolean mask to select the rows you require:

m1 = df['state'].diff() == 1
m2 = df['state'].shift(-1).diff() == -1

res  = df[m1 | m2]

print(res)

    id  state  type
2   a1      1  high
5   a1      1  high
8   a2      1  high
10  a2      1   low

You can use a list comprehension to split into 2 dataframes:

df1, df2 = [res.iloc[i::2] for i in range(int(len(res.index)/2))]

print(df1, df2, sep='\n\n')

   id  state  type
2  a1      1  high
8  a2      1  high

    id  state  type
5   a1      1  high
10  a2      1   low

Upvotes: 3

Related Questions