Christophe Vetter

Reputation: 3

Python: assignment of DataFrame to second variable, altering the second DataFrame alters the first

I have no idea how this is happening; maybe you can help me. I'm assigning a DataFrame to a second variable and then rescaling the second DataFrame so that every column has mean 0 and variance 1. After doing that, the first DataFrame is altered in the same way! How is this happening? I tried the same assignment and then set the second variable to 0, to check whether the problem is that both variables point to the same data, but that does not alter the first DataFrame. Here is my code:

import numpy as np
import pandas as pd

firstDF = pd.DataFrame([[1,2],[3,4]])
firstDF.columns = ['firstColumn', 'secondColumn']

secondDF=firstDF

print(firstDF)
print(secondDF)
for i in secondDF.columns:
    secondDF[i]=(secondDF[i]-np.mean(secondDF[i]))/np.std(secondDF[i])
print(firstDF)
print(secondDF)

The output is:

   firstColumn  secondColumn
0            1             2
1            3             4
   firstColumn  secondColumn
0            1             2
1            3             4
   firstColumn  secondColumn
0         -1.0          -1.0
1          1.0           1.0
   firstColumn  secondColumn
0         -1.0          -1.0
1          1.0           1.0

My whole understanding of coding is crumbling! Please help!

Upvotes: 0

Views: 68

Answers (2)

Raj

Reputation: 86

I would suggest that you do this:

firstDF=pd.DataFrame([[1,2],[3,4]])
secondDF=firstDF.copy()

This will create a copy, as Mayank suggested. But I am not sure what you want to do. Do you want to create a copy or not? Could you please provide a little more clarification?
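For illustration, here is a minimal sketch of the question's rescaling loop applied to a copy. It assumes the population standard deviation (`ddof=0`), which matches the -1.0/1.0 values in the question's output; the loop variable `col` is just an illustrative name.

```python
import numpy as np
import pandas as pd

firstDF = pd.DataFrame([[1, 2], [3, 4]], columns=["firstColumn", "secondColumn"])

# .copy() returns a new DataFrame with its own data (deep copy by default).
secondDF = firstDF.copy()

# Rescale each column of the copy to mean 0 and variance 1.
# ddof=0 is an assumption chosen to match the output shown in the question.
for col in secondDF.columns:
    secondDF[col] = (secondDF[col] - secondDF[col].mean()) / secondDF[col].std(ddof=0)

print(firstDF)   # still holds the original values 1, 2, 3, 4
print(secondDF)  # holds the rescaled values
```

With this change, only secondDF should be rescaled.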

Upvotes: 1

Mayank Porwal

Reputation: 34086

Do this:

secondDF=firstDF.copy()

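With plain assignment (secondDF=firstDF), both names refer to the same DataFrame object, so changes made through one name are visible through the other. Calling .copy() creates an independent copy of firstDF, so firstDF remains intact.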
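As for why setting the second variable to 0 did not change firstDF: that rebinds the name to a new object instead of modifying the shared DataFrame. The sketch below tries to show the difference; the names `alias`, `independent` and the helper variables are made up for illustration and are not from the question.

```python
import pandas as pd

firstDF = pd.DataFrame([[1, 2], [3, 4]], columns=["firstColumn", "secondColumn"])

# Plain assignment: a second name for the same object, no data is copied.
alias = firstDF
same_object = alias is firstDF

# In-place change through the alias: the one shared DataFrame is modified.
alias["firstColumn"] = 0
changed_via_alias = firstDF["firstColumn"].tolist()

# Rebinding: the name now refers to the integer 0; firstDF is untouched.
alias = 0
still_a_frame = isinstance(firstDF, pd.DataFrame)

# An explicit copy is a separate object, so changing it leaves firstDF alone.
independent = firstDF.copy()
independent["secondColumn"] = 99
original_second = firstDF["secondColumn"].tolist()

print(same_object, changed_via_alias, still_a_frame, original_second)
```

Column assignment such as secondDF[i] = ... falls into the in-place case, which is why the loop in the question changed both names.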

Upvotes: 1
