newtothis

Reputation: 525

Creating a dataframe column using successive differences in other columns

I have a dataframe like so:

   Index  LtoR MSR  LtoR CSR     x
0    0.0         0         0     0
1    1.0       0.5         0   0.5
2    2.0         1        15  1.15
3    3.0         1        31  1.31
4    4.0       1.5         0   1.5
5    5.0       1.5        16  1.66
6    6.0         2        24  2.24


I want to create a new column holding half the difference between two x values that are two rows apart. So I want something like:

   Index  LtoR MSR  LtoR CSR     x  Width L
0    0.0         0         0     0  (x2 - x0)/2 = 1.15/2 = 0.575
1    1.0       0.5         0   0.5  (x3 - x1)/2 = 0.81/2 = 0.405
2    2.0         1        15  1.15
3    3.0         1        31  1.31
4    4.0       1.5         0   1.5
5    5.0       1.5        16  1.66
6    6.0         2        24  2.24

I tried the following code:

for i in range(0,10):
    df['New Col'] = ((np.float64(df['x'][i+2]) - np.float64(df['x'][i]))/2)

But this returns an error about copying values versus viewing values. What should I be doing then?

Upvotes: 0

Views: 34

Answers (1)

SeaBean

Reputation: 23217

You can use .shift(-2) to line each row up with the x value two rows below it, then apply the formula to the whole column at once, with no loop:

df['Width L'] = (df['x'].shift(-2) - df['x']) / 2 

Result:

print(df)

   Index  LtoR MSR  LtoR CSR     x  Width L
0    0.0       0.0         0  0.00    0.575
1    1.0       0.5         0  0.50    0.405
2    2.0       1.0        15  1.15    0.175
3    3.0       1.0        31  1.31    0.175
4    4.0       1.5         0  1.50    0.370
5    5.0       1.5        16  1.66      NaN
6    6.0       2.0        24  2.24      NaN
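
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample frame from the question and cross-checks the shifted result against an explicit loop. The frame construction and the names `expected` and `matches` are only for illustration; the real data presumably comes from elsewhere.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question (only the columns shown there).
df = pd.DataFrame({
    "Index": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "LtoR MSR": [0, 0.5, 1, 1, 1.5, 1.5, 2],
    "LtoR CSR": [0, 0, 15, 31, 0, 16, 24],
    "x": [0, 0.5, 1.15, 1.31, 1.5, 1.66, 2.24],
})

# shift(-2) moves every value up two rows, so row i is paired with x[i + 2].
df["Width L"] = (df["x"].shift(-2) - df["x"]) / 2

# Cross-check against a positional loop over the rows that have a partner.
x = df["x"].to_numpy()
expected = [(x[i + 2] - x[i]) / 2 for i in range(len(x) - 2)]
matches = np.allclose(df["Width L"].iloc[:-2], expected)
```

The loop in the question assigns a single scalar to the entire column on every pass, so even without the error it would leave every row holding the last computed value. The shifted version computes all rows in one step.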

If you do not want NaN in the last 2 rows and would rather have them take the original value of x, you can additionally use .fillna(), as follows:

df['Width L'] = ((df['x'].shift(-2) - df['x']) / 2).fillna(df['x'])

Result:

print(df)

   Index  LtoR MSR  LtoR CSR     x  Width L
0    0.0       0.0         0  0.00    0.575
1    1.0       0.5         0  0.50    0.405
2    2.0       1.0        15  1.15    0.175
3    3.0       1.0        31  1.31    0.175
4    4.0       1.5         0  1.50    0.370
5    5.0       1.5        16  1.66    1.660
6    6.0       2.0        24  2.24    2.240

Or, if you want to treat an unavailable (row+2)-th entry as 0, you can use the fill_value parameter of .shift() to set a default value of 0, as follows:

df['Width L'] = (df['x'].shift(-2, fill_value=0) - df['x']) / 2

Result:

print(df)

   Index  LtoR MSR  LtoR CSR     x  Width L
0    0.0       0.0         0  0.00    0.575
1    1.0       0.5         0  0.50    0.405
2    2.0       1.0        15  1.15    0.175
3    3.0       1.0        31  1.31    0.175
4    4.0       1.5         0  1.50    0.370
5    5.0       1.5        16  1.66   -0.830
6    6.0       2.0        24  2.24   -1.120
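
The three variants above differ only in how the last two rows are filled. As a rough sketch, they could be wrapped in one helper. The function name `half_step_difference` and its `tail` argument are hypothetical, not part of pandas.

```python
import pandas as pd

def half_step_difference(x: pd.Series, tail: str = "nan") -> pd.Series:
    """Return (x[i+2] - x[i]) / 2 with a selectable rule for the last two rows.

    tail="nan"  leaves the last two rows as NaN
    tail="x"    fills them with the row's own x value
    tail="zero" treats the missing x[i+2] as 0
    """
    if tail == "zero":
        return (x.shift(-2, fill_value=0) - x) / 2
    width = (x.shift(-2) - x) / 2
    if tail == "x":
        return width.fillna(x)
    if tail == "nan":
        return width
    raise ValueError(f"unknown tail rule: {tail!r}")

# Same x values as in the question.
x = pd.Series([0, 0.5, 1.15, 1.31, 1.5, 1.66, 2.24], name="x")
as_nan = half_step_difference(x)
as_x = half_step_difference(x, tail="x")
as_zero = half_step_difference(x, tail="zero")
```

With a dataframe, this would be used as df['Width L'] = half_step_difference(df['x'], tail='x').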

Upvotes: 1
