Reputation:

back calculation in excel columns in python

I have a Data set which looks like this:

ID    Date       Input1    Input2   1.Eff   2.Eff    Qty     Time   
3   1/2/2019        A        A      32.08   76.64     5      200
3   1/3/2019        A        A      55.95   41.18     10     100
3   1/4/2019        A        A      56.61    50        5     300
3   1/4/2019        A        B      56.61   35.67     10     300

In the output field i want two columns new_Eff and new_time which will be calculated from the above data, the logic to calculate New_time and New_eff is: 1, If for ID and Date combination if there is one line item then 1.Eff will be equal to new_eff and new_time will be equal to time. So for row 1 and Row2 output will be

   ID     Date       Input1    Input2   1.Eff   2.Eff    Qty     Time    new_time    new_EFf
    3   1/2/2019        A        A      32.08   76.64     5      200      32.08       200
    3   1/3/2019        A        A      55.95   41.18     10     100      55.95       100

In row3 and row4 date is not changing so new_Eff will be equal to Day-1 2.Eff and time will be equal to Time/Qty, so row 3 output will be:

ID    Date       Input1    Input2   1.Eff   2.Eff    Qty     Time    new_time    new_EFf 
3   1/4/2019        A        A      56.61   50       5      300     300/5=60     41.48 (on 3rd 2.Eff is 41.48)

And in row 4 new_time will be total time-60=300-60=240 and new_eff will be new_time/Qty=240/5=48

ID    Date       Input1    Input2   1.Eff   2.Eff    Qty     Time    new_time    new_EFf 
3   1/4/2019        A        B      56.61   35.67     10     300      240          48

so output table will look like this:

  ID      Date       Input1    Input2   1.Eff   2.Eff    Qty     Time    new_time    new_EFf
    3   1/2/2019        A        A      32.08   76.64     5      200      32.08       200
    3   1/3/2019        A        A      55.95   41.18     10     100      55.95       100
    3   1/4/2019        A        A      56.61   50        5      300        60        41.48
    3   1/4/2019        A        B      56.61   35.67     10     300       240          48

Can anyone please help me how to perform these back calculations when there are multiple rows for same date..

Thanks in advance

Upvotes: 1

Answers (2)

Quang Hoang

Reputation: 150785

Here's my approach with np.select:

first = df['Date'] == df['Date'].shift(-1)
second = first.shift(fill_value=False)

df['new_time'] = np.select((first, second),
                           (df['Time']/df['Qty'], df['Time']-60),
                           df['Time'])

df['new_Eff'] = np.select((first, second),
                          (df['2.Eff'].shift(), df['new_time']/df['Qty']),
                          df['1.Eff']
                         )

Output:

   ID      Date Input1 Input2  1.Eff  2.Eff  Qty  Time  new_time  new_Eff
0   3  1/2/2019      A      A  32.08  76.64    5   200     200.0    32.08
1   3  1/3/2019      A      A  55.95  41.18   10   100     100.0    55.95
2   3  1/4/2019      A      A  56.61  50.00    5   300      60.0    41.18
3   3  1/4/2019      A      B  56.61  35.67   10   300     240.0    24.00

Note: From your question And in row 4 new_time will be total time-60=300-60=240 and new_eff will be new_time/Qty=240/5=48 Should Qty in row4 is 10 and new_Eff is 24 as showed in my answer?

Upvotes: 0

Dev Khadka

Reputation: 5461

you can do it using groupby function like below


df = pd.DataFrame([[3, '1/2/2019', 'A', 'A', 32.08, 76.64, 5, 200], [3, '1/3/2019', 'A', 'A', 55.95, 41.18, 10, 100], [3, '1/4/2019', 'A', 'A', 56.61, 50.0, 5, 300], [3, '1/4/2019', 'A', 'B', 56.61, 35.67, 10, 300]], columns=('ID', 'Date', 'Input1', 'Input2', '1.Eff', '2.Eff', 'Qty', 'Time'))

def calc(g):

    if len(g)>1:
        fst = g.iloc[[0]]
        other = g.iloc[1:]
        fst["new_time"] = fst["Time"]/fst["Qty"]
        fst["new_EFf"] = fst["2.Eff"]

        other["new_time"] = other["Time"]-60
        other["new_EFf"] = other["new_time"].values/fst["Qty"].values
        g = pd.concat([fst,other], axis=0)

    else:
        g = g.copy()
        g["new_time"] = g["Time"]
        g["new_EFf"] = g["1.Eff"]

    return g

df.groupby(["ID", "Date"]).apply(calc).reset_index(drop=True)

Result

 ID      Date Input1 Input2  1.Eff  2.Eff  Qty  Time  new_time  new_EFf
0   3  1/2/2019      A      A  32.08  76.64    5   200     200.0    32.08
1   3  1/3/2019      A      A  55.95  41.18   10   100     100.0    55.95
2   3  1/4/2019      A      A  56.61  50.00    5   300      60.0    50.00
3   3  1/4/2019      A      B  56.61  35.67   10   300     240.0    48.00

Upvotes: 1

back calculation in excel columns in python

Answers (2)

Related Questions