Bruno

Reputation: 140

How to modify the value of a pandas dataframe "cell" from within a function?

Why does this work in Python 3:

for i in range(0, len(df.index) ):
    df.loc[i,["Processed"]] =  "YES"

and why won't this work:

def mylargeprocess(SomeData,Processed):
    Processed = "YES"

for i in range(0, len(df.index) ):
    mylargeprocess(df.loc[i,["SomeData"]],df.loc[i,["Processed"]])

I am pretty sure it has something to do with strings being immutable, but I would still like to understand the difference between these two pieces of code.

Thanks,

Upvotes: 0

Views: 554

Answers (1)

jpp

Reputation: 164613

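pd.DataFrame.loc is used both for setting and for accessing values. In the first example, you are setting values. In the second example, you are only accessing data: df.loc[i, ["Processed"]] returns a pd.Series object, which you pass to the function. Inside the function, the assignment Processed = "YES" only rebinds the local variable Processed to a new string; nothing is written back to the dataframe. This comes from how names are bound in Python rather than from strings being immutable.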

You can debug what's happening yourself by using print:

import pandas as pd

df = pd.DataFrame([['this', 'is'], ['a', 'test']],
                  columns=['col1', 'col2'])

def process(df_in):
    df_in = 'hello'
    print(df_in)  # prints 'hello' once per loop iteration (twice here); df itself is never modified

for i in range(len(df.index)):
    process(df.loc[i, ['col2']])
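
If the goal is to update the dataframe from inside a function, one option is to pass the dataframe and the row label and set the value through .loc inside the function. The sketch below is only an illustration under assumed data: the sample values and the function names rebind_only and mark_processed are hypothetical, not taken from the question.

```python
import pandas as pd

# Hypothetical sample data mirroring the column names in the question.
df = pd.DataFrame({"SomeData": [1, 2, 3],
                   "Processed": ["NO", "NO", "NO"]})

def rebind_only(processed):
    # Rebinds the local name only; the caller's dataframe is untouched.
    processed = "YES"

def mark_processed(frame, row_label):
    # Sets the value through .loc on the dataframe object itself.
    frame.loc[row_label, "Processed"] = "YES"

# Accessing a value and passing it in changes nothing in df.
rebind_only(df.loc[0, "Processed"])
unchanged = df["Processed"].tolist()

# Passing the dataframe and the row label lets the function set the cell.
for i in df.index:
    mark_processed(df, i)

print(unchanged)
print(df["Processed"].tolist())
```

Another option is to have the function return the new value and do the assignment at the call site, for example df.loc[i, "Processed"] = some_function(df.loc[i, "SomeData"]), where some_function is again a hypothetical name.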

Upvotes: 1
