Reputation: 1571
Having a dataframe similar to:
Country Column1 Product Week Val
UK S1 A 2019-36 10
UK S1 A 2019-37 20
UK S1 A 2019-38 30
UK S1 B 2019-36 30
UK S1 B 2019-37 30
UK S1 B 2019-38 30
DE S1 A 2019-39 100
DE S1 A 2019-40 100
DE S1 A 2019-41 100
DE S1 B 2019-36 10
DE S1 B 2019-37 15
DE S1 B 2019-38 10
How does one go about saying: If Product = "B" then take the VAL from product "A" where all other columns are the same (Country, Column1 & Week) and add 50% of that Val to the current value?
E.g. The first "B" would have a value of 35:
30 + (50%*10)
The second one 40:
30 + (50%*20)
and the third one 45:
30 + (50%*30)
Upvotes: 0
Views: 68
Reputation: 2225
What about playing with indices?
Suppose you have your data in a pandas.DataFrame
called data
data = data.set_index(["Country", "Column1", "Week", "Product"], drop=False)
df1 = data[data.Product == "A"].set_index(["Country", "Column1", "Week"], drop=False)
df2 = data[data.Product == "B"].set_index(["Country", "Column1", "Week"], drop=False)
df2.Val += df1.Val * .5 # so that rows with all else the same would add
df2 = df2.set_index(["Country", "Column1", "Week", "Product"])
data.update(df2)
data["Index"] = range(len(data.Val))
data = data.set_index("Index")
data.index.name = None
I think the advantage of this approach is that it fully meets your demand and achieve the result in place. It produces
Country Column1 Product Week Val
0 UK S1 A 2019-36 10.0
1 UK S1 A 2019-37 20.0
2 UK S1 A 2019-38 30.0
3 UK S1 B 2019-36 35.0
4 UK S1 B 2019-37 40.0
5 UK S1 B 2019-38 45.0
6 DE S1 A 2019-39 100.0
7 DE S1 A 2019-40 100.0
8 DE S1 A 2019-41 100.0
9 DE S1 B 2019-36 10.0
10 DE S1 B 2019-37 15.0
11 DE S1 B 2019-38 10.0
Upvotes: 0
Reputation: 6091
Using pd.Groupby
on 'Country','Column1','Week' conditional on Column2 twice seems to work:
B = df[df['Column2']=='B'].groupby(['Country','Column1','Week']).sum()
A = df[df['Column2']=='A'].groupby(['Country','Column1','Week']).sum()
0.5*A + B
Val
Country Column1 Week
DE S1 2019-36 NaN
2019-37 NaN
2019-38 NaN
2019-39 NaN
2019-40 NaN
2019-41 NaN
UK S1 2019-36 35.0
2019-37 40.0
2019-38 45.0
This only works if there's unique values per choice of Country, Column1, Week
Upvotes: 1