Brooks

Reputation: 167

Pandas Dataframe to Dataframe Assignment Not Aligning and Producing NaN

I am trying to assign the values of one pandas DataFrame to another. The result is not what I expected, and I'm not sure why. I have a workaround, but I don't understand why it is necessary or whether it is the preferred approach.

I set up my data like this:

import pandas as pd

d1 = {'col1': [1,2,3,4,5], 'col2': ['a','ERROR','ERROR','ERROR', 'e']}
df1 = pd.DataFrame(data=d1)
d2 = {'col3': ['b','c','d']}
df2 = pd.DataFrame(data=d2)
bad = (df1['col2'] == 'ERROR')  # boolean mask of the rows to overwrite

This is what I tried (but it does not work as I expected):

df1.loc[bad,'col2'] = df2.loc[:,'col3']
print(df1)

   col1 col2
0     1    a
1     2    c
2     3    d
3     4  NaN
4     5    e

However, if I change the code to the following, then it does work:

df1.loc[bad,'col2'] = df2.loc[:,'col3'].values
print(df1)

   col1 col2
0     1    a
1     2    b
2     3    c
3     4    d
4     5    e

Upvotes: 1

Views: 298

Answers (1)

meW

Reputation: 3967

Explaining @coldspeed's comment.

Try this:

df1.loc[bad, 'col2'] 

which gives you

1    ERROR
2    ERROR
3    ERROR
Name: col2, dtype: object

As you can see, the selected rows have index labels 1, 2 and 3. Now check the index of df2:

    col3
0   b
1   c
2   d
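
To make the comparison concrete, here is a small sketch that rebuilds the question's data and compares the two sets of labels. The variable names target_labels, source_labels, shared and missing are only illustrative.

```python
import pandas as pd

# Same data as in the question
df1 = pd.DataFrame({'col1': [1, 2, 3, 4, 5],
                    'col2': ['a', 'ERROR', 'ERROR', 'ERROR', 'e']})
df2 = pd.DataFrame({'col3': ['b', 'c', 'd']})
bad = df1['col2'] == 'ERROR'

target_labels = df1.loc[bad, 'col2'].index   # labels of the rows being assigned to
source_labels = df2.index                    # labels carried by the right-hand side

shared = target_labels.intersection(source_labels)   # labels present on both sides
missing = target_labels.difference(source_labels)    # target labels with no source value

print(target_labels.tolist())  # [1, 2, 3]
print(source_labels.tolist())  # [0, 1, 2]
print(sorted(shared.tolist()))
print(sorted(missing.tolist()))
```

Only the shared labels can receive a value from df2; any target label that is missing from df2 has nothing to align with.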

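So when you assign with df1.loc[bad,'col2'] = df2.loc[:,'col3'], pandas aligns the right-hand Series with the target rows by index label, not by position. Labels 1 and 2 exist in df2, so those rows get 'c' and 'd'. Label 3 does not exist in df2, so that row becomes NaN, and 'b' (label 0 in df2) is never used. When you use .values, the right-hand side is a NumPy array, as you can verify with type(df2.col3.values). An array has no index, so the values are assigned by position. A Python list from df2.col3.tolist() behaves the same way. Both are acceptable.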
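
The sketch below puts the two behaviours side by side on copies of the question's data, and adds one further option that is not in the question: keeping a Series on the right-hand side but relabelling it with the target rows' index. It assumes a pandas version that has Series.to_numpy() (on older versions .values plays the same role); the names aligned, positional and relabeled are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2, 3, 4, 5],
                    'col2': ['a', 'ERROR', 'ERROR', 'ERROR', 'e']})
df2 = pd.DataFrame({'col3': ['b', 'c', 'd']})
bad = df1['col2'] == 'ERROR'

# 1) Series on the right-hand side: assignment is aligned on index labels,
#    so label 3 (absent from df2) ends up as NaN.
aligned = df1.copy()
aligned.loc[bad, 'col2'] = df2['col3']

# 2) Plain array on the right-hand side: no index, so assignment is positional.
#    The array length must equal the number of selected rows.
positional = df1.copy()
positional.loc[bad, 'col2'] = df2['col3'].to_numpy()

# 3) Illustrative alternative: keep a Series, but give it the target labels
#    so that label alignment lines up the way we want.
relabeled = df1.copy()
source = pd.Series(df2['col3'].to_numpy(), index=df1.loc[bad].index)
relabeled.loc[bad, 'col2'] = source

print(aligned)
print(positional)
print(relabeled)
```

Options 2 and 3 both rely on df2 having exactly as many rows as there are True entries in bad, in the intended order.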

Upvotes: 2
