Edward
Edward

Reputation: 573

How to check value change in column

in my dataframe have three columns columns value ,ID and distance . i want to check in ID column when its changes from 2 to any other value count rows and record first value and last value when 2 changes to other value and save and also save corresponding value of column distance when change from 2 to other in ID column.

df=pd.DataFrame({'value':[3,4,7,8,11,20,15,20,15,16],'ID':[2,2,8,8,8,2,2,2,5,5],'distance':[0,0,1,0,0,0,0,0,0,0]})

print(df)
   value  ID  distance
0      3   2         0
1      4   2         0
2      7   8         1
3      8   8         0
4     11   8         0
5     20   2         0
6     15   2         0
7     20   2         0
8     15   5         0
9     16   5         0

required results:

df_out=pd.DataFrame({'rows_Count':[3,2],'value_first':[7,15],'value_last':[11,16],'distance_first':[1,0]})
print(df_out)


  rows_Count  value_first  value_last  distance_first
0           3            7          11               1
1           2           15          16               0

Upvotes: 3

Views: 440

Answers (1)

jezrael
jezrael

Reputation: 863731

Use:

#compare by 2
m = df['ID'].eq(2)

#filter out data before first 2 (in sample data not, in real data possible)
df = df[m.cumsum().ne(0)]

#create unique groups for non 2 groups, add misisng values by reindex
s = m.ne(m.shift()).cumsum()[~m].reindex(df.index)

#aggregate with helper s Series
df1 = df.groupby(s).agg({'ID':'size', 'value':['first','last'], 'distance':'first'})
#flatten MultiIndex
df1.columns = df1.columns.map('_'.join)
df1 = df1.reset_index(drop=True)
print (df1)
   ID_size  value_first  value_last  distance_first
0        3            7          11               1
1        2           15          16               0

Verify in changed data (not only 2 first group):

df=pd.DataFrame({'value':[3,4,7,8,11,20,15,20,15,16],
                 'ID':[1,7,8,8,8,2,2,2,5,5],
                 'distance':[0,0,1,0,0,0,0,0,0,0]})

print(df)
   value  ID  distance
0      3   1         0 <- changed ID
1      4   7         0 <- changed ID
2      7   8         1
3      8   8         0
4     11   8         0
5     20   2         0
6     15   2         0
7     20   2         0
8     15   5         0
9     16   5         0

#compare by 2
m = df['ID'].eq(2)

#filter out data before first 2 (in sample data not, in real data possible)
df = df[m.cumsum().ne(0)]

#create unique groups for non 2 groups, add misisng values by reindex
s = m.ne(m.shift()).cumsum()[~m].reindex(df.index)

#aggregate with helper s Series
df1 = df.groupby(s).agg({'ID':'size', 'value':['first','last'], 'distance':'first'})
#flatten MultiIndex
df1.columns = df1.columns.map('_'.join)
df1 = df1.reset_index(drop=True)
print (df1)
   ID_size  value_first  value_last  distance_first
0        2           15          16               0

Upvotes: 3

Related Questions