Reputation: 3
I have no idea how this is happening; maybe you can help me. I'm assigning a DataFrame to a second variable and then rescaling the second DataFrame so that every column has mean 0 and variance 1. After doing that, the first DataFrame is altered in the same way! How is this happening? I tried doing the same assignment and then setting the second variable to 0, to see whether the problem is that both variables point to the same data, but this does not alter the first DataFrame. Here is my code:
import numpy as np
import pandas as pd
firstDF = pd.DataFrame([[1,2],[3,4]])
firstDF.columns = ['firstColumn', 'secondColumn']
secondDF=firstDF
print(firstDF)
print(secondDF)
for i in secondDF.columns:
    secondDF[i]=(secondDF[i]-np.mean(secondDF[i]))/np.std(secondDF[i])
print(firstDF)
print(secondDF)
The output is:
   firstColumn  secondColumn
0            1             2
1            3             4
   firstColumn  secondColumn
0            1             2
1            3             4
   firstColumn  secondColumn
0         -1.0          -1.0
1          1.0           1.0
   firstColumn  secondColumn
0         -1.0          -1.0
1          1.0           1.0
My whole understanding of coding is crumbling! Please help!
Upvotes: 0
Views: 68
Reputation: 86
I would suggest that you do this:
firstDF=pd.DataFrame([[1,2],[3,4]])
secondDF=firstDF.copy()
This will create a copy, as Mayank suggested. But I am not sure what you want to do. Do you want to create a copy or not? Could you please provide a little more clarification?
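For what it's worth, here is a minimal sketch of why the experiment with 0 in the question behaved differently from the rescaling loop. As far as I can tell, `secondDF = firstDF` only makes both names refer to the same object, so assigning to a column through either name changes that one object, whereas `secondDF = 0` just rebinds the name. The variable names come from the question; the helper names `same_object`, `first_sees_change` and `first_still_dataframe` are made up for illustration.

```python
import pandas as pd

firstDF = pd.DataFrame([[1, 2], [3, 4]], columns=['firstColumn', 'secondColumn'])

# Plain assignment does not copy: both names refer to the same object.
secondDF = firstDF
same_object = secondDF is firstDF

# Column assignment mutates that shared object, so firstDF shows the change too.
secondDF['firstColumn'] = [10, 30]
first_sees_change = firstDF['firstColumn'].tolist() == [10, 30]

# Rebinding the name only changes what secondDF refers to; firstDF is untouched.
secondDF = 0
first_still_dataframe = isinstance(firstDF, pd.DataFrame)

print(same_object, first_sees_change, first_still_dataframe)
```

So the loop in the question mutates the shared DataFrame, while setting the variable to 0 never touches it.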
Upvotes: 1
Reputation: 34086
Do this:
secondDF=firstDF.copy()
This will create a copy of your firstDF, and firstDF will remain intact.
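Roughly, the question's rescaling loop with the copy in place could look like the sketch below. It assumes the population standard deviation (`ddof=0`), which is what `np.std` uses by default and what the output in the question suggests; the loop variable `col` is just my naming.

```python
import numpy as np
import pandas as pd

firstDF = pd.DataFrame([[1, 2], [3, 4]], columns=['firstColumn', 'secondColumn'])

# copy() returns a new DataFrame with its own data (deep copy by default).
secondDF = firstDF.copy()

# Standardize each column of the copy; ddof=0 is an assumption matching np.std's default.
for col in secondDF.columns:
    secondDF[col] = (secondDF[col] - secondDF[col].mean()) / secondDF[col].std(ddof=0)

print(firstDF)   # still holds the original values
print(secondDF)  # holds the rescaled values
```

If you really do want both names to share the same data, keep the plain assignment instead.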
Upvotes: 1