RitaM
RitaM

Reputation: 143

Pandas: Sum Based on two columns (columns don`t match in same row)

I´m trying to find a way to have a new column (Duration_Week_N+1) based on other column (Device_Week)

I have the following pandas DataFrame.

    Device  Duration   DeviceVSweek     DeviceVSweek_N+1
      452     17461       452_1             452_2  
      452      2962       452_2             452_3 
      452      5370       452_3             452_4

I'm unable to find a way where I can assume Duration_Week_N+1 is based on DeviceVSweek

This is the expected result

Device  Duration   DeviceVSweek     DeviceVSweek_N+1    Duration_Week_N+1
  452     17461       452_1              452_2               2962
  452      2962       452_2              452_3               5370
  452      5370       452_3              452_4                0

I tried using loc which obviously didn't work as it looks for a line match between DeviceVSweek and DeviceVSweek_N+1 (does not exist)

Any suggestion?

Upvotes: 0

Views: 156

Answers (1)

not_speshal
not_speshal

Reputation: 23146

You're looking for shift:

df["Duration_Week_N+1"] = df["Duration"].shift(-1).fillna(0)
>>> df
   Device  Duration DeviceVSweek DeviceVSweek_N+1  Duration_Week_N+1
0     452     17461        452_1            452_2             2962.0
1     452      2962        452_2            452_3             5370.0
2     452      5370        452_3            452_4                  0

Edit: If your df has missing (non-consecutive) weeks such that shift won't work, try:

df["Duration_Week_N+1"] = df["DeviceVSweek_N+1"].map(dict(zip(df["DeviceVSweek"], df["Duration"])))

dict(zip) creates a dictionary with keys as DeviceVSweet and values as Duration. map then maps each row of "DeviceVSweet_N+1" using the created dictionary.

Upvotes: 1

Related Questions