Consecutive differences within adjacent columns for each group

Question

I have file with few columns. For simplicity I am showing first three groups (A1, A2 and A3) below:

Column1 Column2 Column3 
A1      45      50      
A1      70      90      
A1      100     150
A2      500     510     
A2      550     600
A3      1000    1100

I want to get the consecutive difference from last two columns for each group separately. And the average of these consecutive differences per group.

The expected output would be:

Column1 Column2 Column3 Column4        Column5
A1      45      50      70-50 = 20
A1      70      90      100-90 = 10    15  
A1      100     150
A2      500     510     550-510 = 40
A2      550     600
A3      1000    1100

Here, I am saving the data into dataframe and trying to get the expected output.

My code looks like:

df[output]= ([((float(df['Column3']) - float(df['Column2'].shift(-1)))*-1)/float(len(report_map1[i]))])

I am wondering if their is any simpler way for getting the expected result?

jezrael · Accepted Answer

Use DataFrameGroupBy.shift with subtract column Column3:

df['Column4'] =  df.groupby('Column1')['Column2'].shift(-1) - df['Column3']
#if necessary convert to floats
#df['Column4'] =  (df.groupby('Column1')['Column2'].shift(-1).astype(float) - 
#                  df['Column3'].astype(float))
print (df)
  Column1  Column2  Column3  Column4
0      A1       45       50     20.0
1      A1       70       90     10.0
2      A1      100      150      NaN
3      A2      500      510     40.0
4      A2      550      600      NaN
5      A3     1000     1100      NaN

Consecutive differences within adjacent columns for each group

Answers (1)

Related Questions