Calculate new columns based on other columns' values in a Python pandas DataFrame

Question

I want to create a new column based on the values of other columns in a pandas DataFrame. My data describes a truck that moves back and forth between a loading location and a dumping location. For each road segment, I want to calculate the distance from the current segment to the last segment of the road. An example of the data is shown below:

State      | segment length | 
-----------------------------
Loaded     |    20          |
Loaded     |    10          |
Loaded     |    10          |
Empty      |    15          |
Empty      |    10          |
Empty      |    10          |
Loaded     |    30          |
Loaded     |    20          |
Loaded     |    10          |

The end of a road is the record where State changes, so I want to calculate each segment's distance from the end of its road. The final DataFrame should be:

State   | segment length | Distance to end
--------------------------------------------
Loaded  |       20       |     40
Loaded  |       10       |     20
Loaded  |       10       |     10
Empty   |       15       |     35
Empty   |       10       |     20
Empty   |       10       |     10
Loaded  |       30       |     60
Loaded  |       20       |     30
Loaded  |       10       |     10

Can anyone help? Thank you in advance

jezrael · Accepted Answer

Use GroupBy.cumsum on the DataFrame after reversing its row order with DataFrame.iloc, and group by a custom Series that labels each consecutive run of the same State, built with shift and cumsum:

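# Label each consecutive run of equal State values (1, 2, 3, ...)
g = df['State'].ne(df['State'].shift()).cumsum()
# Reverse the rows and take the cumulative sum within each run; the assignment realigns on the index
df['Distance to end'] = df.iloc[::-1].groupby(g)['segment length'].cumsum()
print(df)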
    State  segment length  Distance to end
0  Loaded              20               40
1  Loaded              10               20
2  Loaded              10               10
3   Empty              15               35
4   Empty              10               20
5   Empty              10               10
6  Loaded              30               60
7  Loaded              20               30
8  Loaded              10               10
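
To run this end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption that rebuilds the sample data from the question; the two computation lines are the same as in the answer.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed construction).
df = pd.DataFrame({
    "State": ["Loaded"] * 3 + ["Empty"] * 3 + ["Loaded"] * 3,
    "segment length": [20, 10, 10, 15, 10, 10, 30, 20, 10],
})

# A new group id starts whenever State differs from the previous row.
g = df["State"].ne(df["State"].shift()).cumsum()

# Reverse the row order, then cumulative-sum within each run.
# The result keeps the original index labels, so the assignment
# puts every value back on its original row.
df["Distance to end"] = df.iloc[::-1].groupby(g)["segment length"].cumsum()

print(df)
```

Because the reversed cumulative sum keeps the original index labels, there is no need to reverse the result again before assigning it.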

Detail:

print(g)
0    1
1    1
2    1
3    2
4    2
5    2
6    3
7    3
8    3
Name: State, dtype: int32
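
The grouping Series is needed because grouping by State directly would put both Loaded runs into one group. As a rough illustration of how the labels come about, the sketch below splits the one-liner into its three steps on a shortened, made-up Series; the names prev and changed are only for illustration.

```python
import pandas as pd

# Shortened, made-up example data (not the full table from the question).
state = pd.Series(["Loaded", "Loaded", "Empty", "Empty", "Loaded"], name="State")

prev = state.shift()       # value from the previous row; NaN for the first row
changed = state.ne(prev)   # True wherever a new run of equal values starts
g = changed.cumsum()       # running count of run starts, used as the run id

print(pd.DataFrame({"State": state, "prev": prev, "changed": changed, "g": g}))
```

The first row always counts as a change because its previous value is NaN, which is why the labels start at 1.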
