Reputation:
I have a Data set which looks like this:
ID Date Input1 Input2 1.Eff 2.Eff Qty Time
3 1/2/2019 A A 32.08 76.64 5 200
3 1/3/2019 A A 55.95 41.18 10 100
3 1/4/2019 A A 56.61 50 5 300
3 1/4/2019 A B 56.61 35.67 10 300
In the output field i want two columns new_Eff and new_time which will be calculated from the above data, the logic to calculate New_time and New_eff is: 1, If for ID and Date combination if there is one line item then 1.Eff will be equal to new_eff and new_time will be equal to time. So for row 1 and Row2 output will be
ID Date Input1 Input2 1.Eff 2.Eff Qty Time new_time new_EFf
3 1/2/2019 A A 32.08 76.64 5 200 32.08 200
3 1/3/2019 A A 55.95 41.18 10 100 55.95 100
In row3 and row4 date is not changing so new_Eff will be equal to Day-1 2.Eff and time will be equal to Time/Qty, so row 3 output will be:
ID Date Input1 Input2 1.Eff 2.Eff Qty Time new_time new_EFf
3 1/4/2019 A A 56.61 50 5 300 300/5=60 41.48 (on 3rd 2.Eff is 41.48)
And in row 4 new_time will be total time-60=300-60=240 and new_eff will be new_time/Qty=240/5=48
ID Date Input1 Input2 1.Eff 2.Eff Qty Time new_time new_EFf
3 1/4/2019 A B 56.61 35.67 10 300 240 48
so output table will look like this:
ID Date Input1 Input2 1.Eff 2.Eff Qty Time new_time new_EFf
3 1/2/2019 A A 32.08 76.64 5 200 32.08 200
3 1/3/2019 A A 55.95 41.18 10 100 55.95 100
3 1/4/2019 A A 56.61 50 5 300 60 41.48
3 1/4/2019 A B 56.61 35.67 10 300 240 48
Can anyone please help me how to perform these back calculations when there are multiple rows for same date..
Thanks in advance
Upvotes: 1
Views: 73
Reputation: 150785
Here's my approach with np.select
:
first = df['Date'] == df['Date'].shift(-1)
second = first.shift(fill_value=False)
df['new_time'] = np.select((first, second),
(df['Time']/df['Qty'], df['Time']-60),
df['Time'])
df['new_Eff'] = np.select((first, second),
(df['2.Eff'].shift(), df['new_time']/df['Qty']),
df['1.Eff']
)
Output:
ID Date Input1 Input2 1.Eff 2.Eff Qty Time new_time new_Eff
0 3 1/2/2019 A A 32.08 76.64 5 200 200.0 32.08
1 3 1/3/2019 A A 55.95 41.18 10 100 100.0 55.95
2 3 1/4/2019 A A 56.61 50.00 5 300 60.0 41.18
3 3 1/4/2019 A B 56.61 35.67 10 300 240.0 24.00
Note: From your question And in row 4 new_time will be total time-60=300-60=240
and new_eff
will be new_time/Qty=240/5=48
Should Qty
in row4
is 10
and new_Eff
is 24
as showed in my answer?
Upvotes: 0
Reputation: 5461
you can do it using groupby function like below
df = pd.DataFrame([[3, '1/2/2019', 'A', 'A', 32.08, 76.64, 5, 200], [3, '1/3/2019', 'A', 'A', 55.95, 41.18, 10, 100], [3, '1/4/2019', 'A', 'A', 56.61, 50.0, 5, 300], [3, '1/4/2019', 'A', 'B', 56.61, 35.67, 10, 300]], columns=('ID', 'Date', 'Input1', 'Input2', '1.Eff', '2.Eff', 'Qty', 'Time'))
def calc(g):
if len(g)>1:
fst = g.iloc[[0]]
other = g.iloc[1:]
fst["new_time"] = fst["Time"]/fst["Qty"]
fst["new_EFf"] = fst["2.Eff"]
other["new_time"] = other["Time"]-60
other["new_EFf"] = other["new_time"].values/fst["Qty"].values
g = pd.concat([fst,other], axis=0)
else:
g = g.copy()
g["new_time"] = g["Time"]
g["new_EFf"] = g["1.Eff"]
return g
df.groupby(["ID", "Date"]).apply(calc).reset_index(drop=True)
Result
ID Date Input1 Input2 1.Eff 2.Eff Qty Time new_time new_EFf
0 3 1/2/2019 A A 32.08 76.64 5 200 200.0 32.08
1 3 1/3/2019 A A 55.95 41.18 10 100 100.0 55.95
2 3 1/4/2019 A A 56.61 50.00 5 300 60.0 50.00
3 3 1/4/2019 A B 56.61 35.67 10 300 240.0 48.00
Upvotes: 1