Reputation: 93
I have two dataframes that share some columns.
I'm trying to:
1) Merge the two dataframes, i.e. add the columns of df2 that are not in df1:
diff = df2[df2.columns.difference(df1.columns)]
merged = pd.merge(df1, diff, how='outer', sort=False, on='ID')
Up to here, everything works as expected.
2) Now, replace the NaN values with the values from df2:
merged = merged[~merged.index.duplicated(keep='first')]
merged.fillna(value=df2)
And it is here that I get:
pandas.core.indexes.base.InvalidIndexError
I don't have any duplicates, and I can't find any information as to what can cause this.
Upvotes: 2
Views: 2841
Reputation: 93
The solution to this problem is to use a different method, combine_first(). This way, each row with missing data is filled with data from the other dataframe, as described in the pandas documentation section "Merging together values within Series or DataFrame columns".
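For anyone landing here later, a minimal sketch of what that looks like. The data below is made up for illustration, and it assumes both frames are indexed by ID (set_index("ID") is my assumption, since the question doesn't show the actual frames):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: df1 has a gap in column "a",
# df2 has the same column fully populated plus an extra column "b".
df1 = pd.DataFrame({"ID": [1, 2, 3], "a": [1.0, np.nan, 3.0]}).set_index("ID")
df2 = pd.DataFrame(
    {"ID": [1, 2, 3], "a": [10.0, 20.0, 30.0], "b": ["x", "y", "z"]}
).set_index("ID")

# combine_first keeps df1's values where they exist, fills df1's NaN
# from df2, and adds any rows/columns that only df2 has.
combined = df1.combine_first(df2)
print(combined)
```

Because combine_first aligns on the index and takes the union of the columns, it covers both the "add the missing columns" step and the "fill the NaN" step in one call, so the separate merge and fillna may not be needed at all.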
Upvotes: 3
Reputation: 16966
If the number of rows changes because of the merge, fillna can sometimes raise an error. Try the following:
merged.fillna(df2.groupby(level=0).transform("mean"))
Upvotes: 0