Reputation: 438
I am currently working on a problem and now got stuck to implement one of it's steps. I made up the following example which is a simple scenario in order to explore possibilities. The length of the text might be scary but it was only my attempt to better explain it.
This simpler scenario works like the following: I have an look-up table where I store the values which contains the optimal parameters combination for each situation (df1). Afterwards, I perform a simulation to check update the parameters of my look-up table for a given situation, storing everything on df2, then, I proceed by extracting the row which returns the highest outcome. Now, what I want is to get the values maximum values extracted for the parameters 1 and 2, and subsequently, update the df1 for that specific situation. So please, consider:
import pandas as pd
import numpy as np
import random
np.random.seed(1)
a = np.linspace(0,10,11)
b = np.tile(0,11)
c = np.tile(0,11)
df1 = pd.DataFrame([a,b,c]).T
df1.columns = ['Situation','Parameter 1','Parameter 2']
d = np.tile(8,11)
e = abs(np.random.randn(11))
f = abs(np.random.randn(11))
g = abs(100*np.random.randn(11))
df2 = pd.DataFrame([d,e,f,g]).T
df2.columns = ['Situation', 'Parameter 1', 'Parameter 2', 'Outcome']
x = df2.loc[df2['Outcome'].idxmax()]
At first, my df1 looks like:
Situation Parameter 1 Parameter 2
0 0.0 0.0 0.0
1 1.0 0.0 0.0
2 2.0 0.0 0.0
3 3.0 0.0 0.0
4 4.0 0.0 0.0
5 5.0 0.0 0.0
6 6.0 0.0 0.0
7 7.0 0.0 0.0
8 8.0 0.0 0.0
9 9.0 0.0 0.0
10 10.0 0.0 0.0
For the maximum of df2, I will get a series like this:
Situation 8.000000
Parameter 1 2.301539
Parameter 2 0.172428
Outcome 93.576943
What I want is to df1 to look like this:
Situation Parameter 1 Parameter 2
0 0.0 0.0 0.0
1 1.0 0.0 0.0
2 2.0 0.0 0.0
3 3.0 0.0 0.0
4 4.0 0.0 0.0
5 5.0 0.0 0.0
6 6.0 0.0 0.0
7 7.0 0.0 0.0
8 8.0 2.301539 0.172428
9 9.0 0.0 0.0
10 10.0 0.0 0.0
Hope that I managed to be succinct and precise albeit the length of the explanation. I would really appreciate your help on this one!
Upvotes: 0
Views: 64
Reputation: 12345
You could replace a row by
Not sure if 3. and 4. are actually needed, but with Pandas it's still a one-liner.
df1 = (
df1[df1.Situation != x.Situation]
.append(x.drop(labels=['Outcome']), ignore_index=True)
.sort_values('Situation')
.reset_index(drop=True)
)
print(df1)
Situation Parameter 1 Parameter 2
0 0.0 0.000000 0.000000
1 1.0 0.000000 0.000000
2 2.0 0.000000 0.000000
3 3.0 0.000000 0.000000
4 4.0 0.000000 0.000000
5 5.0 0.000000 0.000000
6 6.0 0.000000 0.000000
7 7.0 0.000000 0.000000
8 8.0 2.301539 0.172428
9 9.0 0.000000 0.000000
10 10.0 0.000000 0.000000
Upvotes: 1
Reputation: 817
It looks to me that your Situation
column should be an index, I would do the following:
df1 = df1.set_index('Situation')
df1.loc[df2.iloc[0], :] = df2.iloc[1:]
This will do what you want considering the rows order in df2
is exactly as you described. Your other option, would be to turn df2
into a DF like df1
:
df2 = pd.DataFrame(df2).T.set_index('Situation')
Then you can assign by the indexes:
df1.loc[df2.index[0], :] = df2.iloc[0, :]
Upvotes: 1