SecretAgent
SecretAgent

Reputation: 107

Accessing Rows Before and After rows of interest in Pandas Dataframe

I have a dataframe with some values and i need to check if some conditions meet for the row before and row after of a certain set of rows and if so create a new column with true / false. How do i achieve this?

Dataframe

  ------------------------------------------------------------------
  name       age          school      age2      salary       company
  ------------------------------------------------------------------
  Tina                                 24        2500          GHJ 
  -----------------------------------------------------------------                         
  Tina       24            York
  -----------------------------------------------------------------        
  Tina                                 24        2600         GHJ
  -----------------------------------------------------------------
  Sally                                24        2700         OPS
  -----------------------------------------------------------------
  Sally                                null      2700         OPS
  -----------------------------------------------------------------
  Sally      23           leeds                 
  -----------------------------------------------------------------
  Sally                                 23       2700         OPS
  ----------------------------------------------------------------- 

I want to generate this datarame

  --------------------------------------------------------------------
  name       age     school      age2      salary       company  flag
  --------------------------------------------------------------------
  Tina                            24        2500             GHJ 
  ---------------------------------------------------------------------                        
  Tina       24            York                                    FALSE
  --------------------------------------------------------------------        
  Tina                             24        2600         GHJ
  -----------------------------------------------------------------
  Sally                            24        2700         OPS
  -----------------------------------------------------------------
  Sally                           null      2700          OPS
  -----------------------------------------------------------------
  Sally      23           leeds                                    TRUE 
  -----------------------------------------------------------------
  Sally                            23       2700           OPS
  -----------------------------------------------------------------      

a. The logic behind this is for each of the rows where age is not null we check if the immediate row above where age2 is equal to current age. If so we set the flag to false.

b. If not then we check if the immediate row below == age2. So if age == age2 for immediate row below -> flag is TRUE.

Note that we are looking for the next and previous non null value for age2 when i mean immediate. If age2 was null then we go up / down further to look at the value set for age2.

Upvotes: 1

Views: 2012

Answers (1)

theletz
theletz

Reputation: 1805

df.loc[:,'flag'] = (~(df.loc[:, 'age'] is None) & (df.loc[:, 'age']==df.loc[:, 'age2'].shift(-1))) & (df.loc[:, 'age'] != df.loc[:, 'age2'].shift(1))
df.loc[pd.isnull(df.loc[:, 'age']), 'flag'] = None

Upvotes: 1

Related Questions