Updating a dataframe 2 affects the dataframe 1 that was the origin of dataframe 2

Question

I am appending a reversed dataframe to its original dataframe. It seemed to be working, but then I realized this: df1 had the ids (1,1,1,2,2) at first. I applied df2['id'] = df2['id'].apply(lambda x: x + id_amount) to df2, not to df1, yet after I append df2 to df1, df1's ids have changed too. How can that be? Why does df1 take df2's values in the id column?

import pandas as pd
df1 = pd.DataFrame({'x':[1,1,2,9,9], 'y':[1,2,2,100,101],'id':[1,1,1,2,2]})
df2= df1[::-1]   #df2 as reverse of df1
print(df1)
id_amount=df2['id'].nunique()
df2['id'] = df2['id'].apply(lambda x: x + id_amount) 
df2=df2.sort_values(by=['id'])
df1=df1.append(df2)
df1 = df1.reset_index(drop=True)
print(df1)

After the append, df1 becomes:

x   y     id 
1   1      3     
1   2      3     
2   2      3     
9  100     4     
9  101     4     
2   2      3     
1   2      3     
1   1      3     
9  101     4     
9  100     4

should become:

x   y     id 
1   1      1     
1   2      1     
2   2      1     
9  100     2     
9  101     2     
2   2      3     
1   2      3     
1   1      3     
9  101     4     
9  100     4

ALollz · Accepted Answer

You could do this with concat and sort_index, then re-sort at the end.

import pandas as pd
df1 = pd.DataFrame({'x':[1,1,2,9,9], 'y':[1,2,2,100,101],'id':[1,1,1,2,2]})

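# assign() returns a new frame, so df1 itself is never modified
shifted = df1.assign(id=df1.id + df1.id.max()).sort_index(ascending=False)

# kind='mergesort' is a stable sort, so the reversed row order within each id is kept
df1 = (pd.concat([df1, shifted], ignore_index=True)
         .sort_values('id', kind='mergesort')
         .reset_index(drop=True))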

   x    y  id
0  1    1   1
1  1    2   1
2  2    2   1
3  9  100   2
4  9  101   2
5  2    2   3
6  1    2   3
7  1    1   3
8  9  101   4
9  9  100   4
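
As for why df1 changed in the question's code: a likely explanation, assuming an older pandas version without copy-on-write, is that the slice df1[::-1] can share its underlying data with df1, so the assignment to df2['id'] may write through to df1. A minimal sketch that keeps the question's step-by-step structure but takes an explicit copy (the variable name result is only illustrative):

```python
import pandas as pd

df1 = pd.DataFrame({'x': [1, 1, 2, 9, 9],
                    'y': [1, 2, 2, 100, 101],
                    'id': [1, 1, 1, 2, 2]})

# .copy() gives df2 its own data, so edits to df2 cannot reach df1
df2 = df1[::-1].copy()

id_amount = df1['id'].nunique()
df2['id'] = df2['id'] + id_amount

# mergesort is stable, so the reversed order within each id is preserved
df2 = df2.sort_values(by='id', kind='mergesort')

# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
result = pd.concat([df1, df2], ignore_index=True)
print(result)
```

With the copy in place, df1 keeps its ids (1,1,1,2,2) and only the appended rows carry the shifted ids.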
