Reputation: 7525
I know that the pandas dataframe is mutable.
I am passing a dataframe to a function and I do not want the original dataframe to be changed, but it is.
I thought that as long as I reassigned the dataframe variable and avoided inplace=True in .dropna() and .reset_index(), the original would be left alone, but it is not.
How can I use .dropna() and .reset_index() inside the function without mutating my original dataframe?
Thank you.
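import random
import numpy as np
import pandas as pd

def makeChoice():
    # Return True or False with equal probability.
    return bool(random.getrandbits(1))

def makeChange(row, choice):
    # Keep the value in column 'b' if choice is True, otherwise return NaN.
    if choice:
        result = row['b']
    else:
        result = np.nan
    return result

def testFunc3(workingDF):
    # This column assignment writes into the dataframe passed by the caller.
    workingDF['b'] = workingDF.apply(lambda row: makeChange(row, makeChoice()), axis=1)
    workingDF = workingDF.dropna()
    workingDF = workingDF.reset_index(drop=True)
    return workingDF

a = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
print('a - original:')
print(a)
b = testFunc3(a)
print('b after testFunc3():')
print(b)
print('a after testFunc3():')
print(a)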
This gives the following output:
a - original:
   a  b
0  1  3
1  2  4
b after testFunc3():
   a    b
0  1  3.0
a after testFunc3():
   a    b
0  1  3.0
1  2  NaN
Upvotes: 2
Views: 521
Reputation: 4021
If you don't want to change the code inside the function, you can pass a copy of the dataframe to it:
b = testFunc3(a.copy())
Upvotes: 4