Ken Kim

Reputation: 147

Using pandas apply to modify a DataFrame in place

I have a DataFrame like the one below and want to turn it into the result df shown further down, applying the logic below with the 'apply' method in pandas.
As far as I know, the 'apply' method returns a new Series rather than modifying the original df in place.

id a b
-------
a  1 4
b  2 5
c  6 2

if df['a'] > df['b'] :
    df['a'] = df['b']
else :
    df['b'] = df['a']

Expected result df:

id a b
-------
a  4 4
b  5 5
c  6 6

Upvotes: 0

Views: 376

Answers (3)

gregV

Reputation: 1107

Try:

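import pandas as pd

df = pd.DataFrame({'a': [1, 2, 6], 'b': [4, 5, 2]})

# Row-wise maximum of the two numeric columns, written back to both
df['a'] = df[['a', 'b']].max(axis=1)
df['b'] = df['a']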

Upvotes: 0

0xhughes

Reputation: 2753

Like the others, I'm not totally sure what you're trying to do, so I'm going to ASSUME you mean to set both "a" and "b" in each row to the larger of the two values in that row. If that assumption is correct, here's how it can be done with ".apply()".

First, most "clean" uses of ".apply()" (bearing in mind that ".apply()" is rarely the recommended approach) pass a function that takes the row handed to it by ".apply()" and returns that same object, modified as needed. With your dataframe in mind, here is a function that produces the desired output, followed by its application to the dataframe using ".apply()".

# Create the function to be used within .apply()
def comparer(row):
    if row["a"] > row["b"]:
        row["b"] = row["a"]
    elif row["b"] > row["a"]:
        row["a"] = row["b"]
    return row

# Use .apply() to run our function against each row (axis=1).
# .apply() returns a new object, so assign it back to "df" to keep the changes.
df = df.apply(comparer, axis=1)

Most people, if not everyone, seem to rail against ".apply()" usage, however. I'd probably heed their wisdom :)
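
As a minimal sketch of the "returns a new object" point, still assuming the goal is the row-wise maximum: the function below builds a fresh Series instead of changing the row it receives, so the original dataframe only changes if the result is written back. The names "row_max", "result" and "updated" are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"id": ["a", "b", "c"], "a": [1, 2, 6], "b": [4, 5, 2]})

def row_max(row):
    # Build and return a new Series instead of mutating the row passed in
    larger = max(row["a"], row["b"])
    return pd.Series({"a": larger, "b": larger})

# apply returns a new DataFrame; df itself is not changed by this call
result = df[["a", "b"]].apply(row_max, axis=1)

# Write the result into a copy to get the updated frame, "id" included
updated = df.copy()
updated[["a", "b"]] = result
```

Assigning to "df" directly instead of to a copy would give the in-place effect the question asks about.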

Upvotes: 1

BENY

Reputation: 323376

I am not sure what you need, since the expected output is different from your condition. Here I can only fix your code:

for x,y in df.iterrows():
    if y['a'] > y['b']:
        df.loc[x,'a'] = df.loc[x,'b']
    else:
        df.loc[x,'b'] = df.loc[x,'a']

df
Out[40]: 
  id  a  b
0  a  1  1
1  b  2  2
2  c  2  2

If I understand your problem correctly:

import numpy as np

df.assign(**dict.fromkeys(['a', 'b'], np.where(df.a > df.b, df.a, df.b)))
Out[43]: 
  id  a  b
0  a  4  4
1  b  5  5
2  c  6  6
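
For comparison, here is a self-contained sketch of the vectorized route covering both readings of the question. It assumes the 'id' column should be left alone; np.maximum gives the expected output, and np.minimum gives what the if/else condition literally describes. The variable names are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"id": ["a", "b", "c"], "a": [1, 2, 6], "b": [4, 5, 2]})

# Element-wise maximum of the two columns (matches the expected output)
larger = np.maximum(df["a"], df["b"])
as_max = df.assign(a=larger, b=larger)

# Element-wise minimum (what the if/else in the question literally does)
smaller = np.minimum(df["a"], df["b"])
as_min = df.assign(a=smaller, b=smaller)
```

assign returns a new DataFrame each time, so df keeps its original values unless the result is assigned back to it.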

Upvotes: 2
