Check if next value in row is the same based on conditions python?

Question

I have a dataframe like this:

country  question    year   value
   1      a1          2017     Y
   1      a1          2018     Y
   1      a1          2019     N
   1      a2          2017     N
   1      a2          2018     N
   1      a2          2019     Y
   2      a1          2017     Y
   2      a1          2018     Nan
   2      a1          2019     Y
   2      a2          2017     Y
   2      a2          2018     N
   2      a2          2019     Y
   3      a1          2017     Y
   3      a1          2018     N
   3      a1          2019     Y
   3      a2          2017     Y
   3      a2          2018     Y
   3      a2          2019     Y

For each country and question, I want to flag whether each year's value matches the value from the previous year. I have tried using shift, but it doesn't give me what I want. This is how far I've gotten:

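 country = all_data['country']
 question = all_data['question']
 value = all_data['value']

 # stop one row early so that i + 1 stays in range
 for i in range(len(country) - 1):
     # Python uses `and`, not `&&`
     if country[i] == country[i+1] and question[i] == question[i+1]:
         pass  # unfinished: this is where I got stuck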

Desired output:

country  question    year   value      match
   1      a1          2017     Y        
   1      a1          2018     Y         T
   1      a1          2019     N         F
   1      a2          2017     N         
   1      a2          2018     N         T
   1      a2          2019     Y         F
   2      a1          2017     Y           
   2      a1          2018     Nan       F
   2      a1          2019     Y         F
   2      a2          2017     Y      
   2      a2          2018     N         F
   2      a2          2019     Y         F
   3      a1          2017     Y         
   3      a1          2018     N         F
   3      a1          2019     Y         F
   3      a2          2017     Y 
   3      a2          2018     Y         T
   3      a2          2019     Y         T

Thank you for any suggestions!!

BENY · Accepted Answer

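Use groupby with shift to get the previous value within each country/question group, compare it to the current value with eq, then use duplicated to find the first row of each group and mask it with an empty string: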

df['new'] = (
    df.groupby(['country', 'question'])['value'].shift()
      .eq(df['value'])
      .mask(~df[['country', 'question']].duplicated(), '')
)
Out[435]: 
0          
1      True
2     False
3          
4      True
5     False
6          
7     False
8     False
9          
10    False
11    False
12         
13    False
14    False
15         
16     True
17     True
Name: value, dtype: object
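
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch of the same groupby/shift/mask idea. It makes a few assumptions that are not in the original post: the "Nan" entry is treated as a real missing value (np.nan), the rows are already sorted by year within each group, and the names prev, same, is_first, match and match_label are made up for illustration. The cast to object before mask is an extra precaution rather than part of the answer above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; "Nan" is assumed to be a real missing value.
df = pd.DataFrame({
    "country":  [1] * 6 + [2] * 6 + [3] * 6,
    "question": (["a1"] * 3 + ["a2"] * 3) * 3,
    "year":     [2017, 2018, 2019] * 6,
    "value":    ["Y", "Y", "N",  "N", "N", "Y",
                 "Y", np.nan, "Y",  "Y", "N", "Y",
                 "Y", "N", "Y",  "Y", "Y", "Y"],
})

keys = ["country", "question"]

# Previous row's value within each (country, question) group.
# Assumes rows are already ordered by year inside each group.
prev = df.groupby(keys)["value"].shift()

# True where the current value equals the previous one.
# A missing value never compares equal, so those rows come out False.
same = prev.eq(df["value"])

# The first row of each group has no previous year to compare against.
is_first = ~df.duplicated(subset=keys)

# Blank out the first row of each group (cast to object so bools and '' can coexist).
df["match"] = same.astype(object).mask(is_first, "")

# Optional: the literal T / F labels shown in the desired output.
df["match_label"] = np.where(is_first, "", np.where(same, "T", "F"))

print(df)
```

If the frame is not guaranteed to be in year order, sorting with df.sort_values(keys + ["year"]) before the groupby should keep "previous row" and "previous year" in agreement.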
