dataframe argument being changed by a function. How to avoid it being mutated?

Question

I know that the pandas dataframe is mutable.

I am passing a dataframe to a function and I do not want the original dataframe to be changed, but it is. I thought that as long as I reassigned the dataframe variable and avoided using .dropna(inplace=True) and .reset_index(inplace=True), it would be OK, but it is not.
What workaround for .dropna() and .reset_index() is there to keep my original dataframe from being mutated?

Thank you.

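import random
import numpy as np
import pandas as pd

def makeChoice():
    # Randomly return True or False
    return bool(random.getrandbits(1))

def makeChange(row, choice):
    # Keep the value of 'b' when choice is True, otherwise return NaN
    if choice:
        result = row['b']
    else:
        result = np.nan
    return result

def testFunc3(workingDF):
    # Column assignment writes into the dataframe object passed by the caller
    workingDF['b'] = workingDF.apply(lambda row: makeChange(row, makeChoice()), axis=1)
    # dropna() and reset_index() return new dataframes; only the local name is rebound
    workingDF = workingDF.dropna()
    workingDF = workingDF.reset_index(drop=True)
    return workingDF

a = pd.DataFrame({'a':[1,2], 'b':[3,4]})
print('a - original:')
print(a)
b = testFunc3(a)
print('b after testFunc3():')
print(b)
print('a after testFunc3():')
print(a)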

This gives the following output:

a - original:
   a  b
0  1  3
1  2  4
b after testFunc3():
   a    b
0  1  3.0
a after testFunc3():
   a    b
0  1  3.0
1  2  NaN

jcaliz · Accepted Answer

If you don't want to change the methods used inside the function, you can pass a copy of the dataframe to it:

b = testFunc3(a.copy())
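
For context, the mutation in the question comes from the column assignment workingDF['b'] = ..., which writes into the object the caller passed in; .dropna() and .reset_index() without inplace=True return new dataframes and leave the original alone. If you would rather protect the caller from inside the function, here is a minimal sketch of two options. The function names and the deterministic "keep even values" rule are hypothetical stand-ins for testFunc3 and its random choice, used only so the result is reproducible.

```python
import numpy as np
import pandas as pd

def drop_odd_b(df):
    # Hypothetical stand-in for testFunc3.
    # assign() returns a new dataframe, so the caller's object is not written to.
    out = df.assign(b=df['b'].where(df['b'] % 2 == 0, np.nan))
    return out.dropna().reset_index(drop=True)

def drop_odd_b_copy(df):
    # Alternative: copy first, after which in-place column assignment
    # only affects the local copy.
    df = df.copy()
    df['b'] = df['b'].where(df['b'] % 2 == 0, np.nan)
    return df.dropna().reset_index(drop=True)

a = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
b = drop_odd_b(a)
c = drop_odd_b_copy(a)

print(a)  # still has both rows, with 'b' unchanged
print(b)  # only the row where 'b' is even
```

Copying at the call site, as in the line above, leaves the function untouched; copying or using assign() inside the function means callers do not have to remember to do it.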
