Reputation: 147
I have a DataFrame like the one below and want to turn it into the result df shown further down, using the logic below with the pandas 'apply' method.
As far as I know, the 'apply' method returns a new Series and does not modify the original df in place.
id a b
-------
a 1 4
b 2 5
c 6 2
if df['a'] > df['b']:
    df['a'] = df['b']
else:
    df['b'] = df['a']
result df :
id a b
-------
a 4 4
b 5 5
c 6 6
Upvotes: 0
Views: 376
Reputation: 1107
Try:
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 6], 'b': [4, 5, 2]})
df['a'] = df.max(axis=1)
df['b'] = df['a']
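The frame in the question also has a string 'id' column, which the snippet above leaves out. A minimal sketch of the same idea with that column present, assuming the frame is built as shown (the construction and the name row_max are illustrative, not from the question), restricts the row-wise max to the two numeric columns:

```python
import pandas as pd

# Assumed reconstruction of the question's frame, including the 'id' column.
df = pd.DataFrame({"id": ["a", "b", "c"], "a": [1, 2, 6], "b": [4, 5, 2]})

# Take the row-wise max over the numeric columns only, so 'id' is not compared.
row_max = df[["a", "b"]].max(axis=1)

# Write the same values into both columns.
df["a"] = row_max
df["b"] = row_max
```

Selecting the columns explicitly avoids comparing strings with integers.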
Upvotes: 0
Reputation: 2753
Like the others, I'm not totally sure what you're trying to do. I'm going to ASSUME you want to set both the "a" and "b" values in each row to the higher of the two values in that row. If that assumption is correct, here's how it can be done with ".apply()".
First, most "clean" uses of ".apply()" (bearing in mind that ".apply()" is generally discouraged) rely on a function that takes the row passed to it by ".apply()" and returns the same object, modified as needed. With your dataframe in mind, here is a function that produces the desired output, followed by its application to the dataframe using ".apply()".
# Create the function to be used within .apply()
def comparer(row):
    if row["a"] > row["b"]:
        row["b"] = row["a"]
    elif row["b"] > row["a"]:
        row["a"] = row["b"]
    return row
# Use .apply() to run the function against each row. Assigning the result back
# to "df" replaces it with the new, modified dataframe.
df = df.apply(comparer, axis=1)
Most people, if not everyone, seem to rail against ".apply()" usage, however. I'd probably heed their wisdom :)
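On the question's point about in-place behaviour: the pandas documentation notes that functions which mutate the object passed to ".apply()" are not supported, so a more cautious variant builds a new row instead. The sketch below assumes the two-column frame from the earlier answer; row_max_both is a hypothetical helper name, not something from the thread.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 6], "b": [4, 5, 2]})

def row_max_both(row):
    # Hypothetical helper: return a new row rather than mutating the one passed in.
    highest = max(row["a"], row["b"])
    return pd.Series({"a": highest, "b": highest})

# .apply() returns a new DataFrame; df itself only changes if you reassign it.
result = df.apply(row_max_both, axis=1)
```

Here "result" holds the modified values while "df" keeps the original ones, which is why the answer above assigns the output back to "df".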
Upvotes: 1
Reputation: 323376
I am not sure what you need, since the expected output differs from your condition. Here I can only fix your code:
for x, y in df.iterrows():
    if y['a'] > y['b']:
        df.loc[x, 'a'] = df.loc[x, 'b']
    else:
        df.loc[x, 'b'] = df.loc[x, 'a']
df
Out[40]:
id a b
0 a 1 1
1 b 2 2
2 c 2 2
If I understand your problem correctly (run this on the original df, before the loop above modifies it):
df.assign(**dict.fromkeys(['a','b'],np.where(df.a>df.b,df.a,df.b)))
Out[43]:
id a b
0 a 4 4
1 b 5 5
2 c 6 6
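Because the question's if/else condition and its expected result disagree, it may help to see both readings side by side. This is a sketch only, assuming the frame is reconstructed as below; the names larger, smaller, as_max and as_min are illustrative.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's frame.
df = pd.DataFrame({"id": ["a", "b", "c"], "a": [1, 2, 6], "b": [4, 5, 2]})

# Reading 1: keep the larger value per row (matches the question's expected result).
larger = np.where(df["a"] > df["b"], df["a"], df["b"])
as_max = df.assign(a=larger, b=larger)

# Reading 2: keep the smaller value per row (matches the question's literal if/else).
smaller = np.where(df["a"] > df["b"], df["b"], df["a"])
as_min = df.assign(a=smaller, b=smaller)
```

Both calls to assign return new frames and leave "df" untouched, so the two readings can be compared from the same starting data.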
Upvotes: 2