Reputation: 1147
I want to add the data from reference to data, so I use
data[reference.columns] = reference
but it only creates the columns with no values. How can I add the values?
Upvotes: 2
Views: 679
Reputation: 59519
Your two DataFrames are indexed differently, so when you do data[reference.columns] = reference it tries to align the new columns on indices. Since the indices of reference are not in data (or only align for index=0), it adds the columns but fills the values with NaN.
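One way to confirm that alignment is the cause is to check which index labels the two frames share. The frames below are made-up stand-ins for data and reference, not the asker's actual data; this is only a sketch of the check.

```python
import pandas as pd

# Hypothetical frames: data uses the default index 0-3, reference uses 3-6
data = pd.DataFrame({'id': [1, 2, 3, 4]})
reference = pd.DataFrame({'id2': [1, 2, 3, 4]}, index=[3, 4, 5, 6])

# Index labels present in both frames; only these rows can receive values
shared = data.index.intersection(reference.index)
print(shared.tolist())

# reindex shows what reference looks like once aligned to data's index;
# rows whose label is missing from reference come back as NaN
aligned = reference.reindex(data.index)
print(aligned)
```

If shared is empty or much shorter than data, the NaN columns come from index alignment rather than from missing data.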
It looks like you want to add multiple static columns to data with the values from reference. You can just assign these:
for col in reference.columns:
    # take the first value of each reference column and broadcast it to every row
    data[col] = reference[col].values[0]
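As a self-contained sketch of that loop, assume reference holds a single row of constants whose index label does not appear in data. The column names source and year and their values are invented for illustration.

```python
import pandas as pd

data = pd.DataFrame({'id': [1, 2, 3, 4],
                     'val1': ['A', 'B', 'C', 'D']})

# Hypothetical one-row reference; its index (10) does not exist in data
reference = pd.DataFrame({'source': ['lab'], 'year': [2020]}, index=[10])

for col in reference.columns:
    # A scalar on the right-hand side is broadcast to all rows,
    # so no index alignment takes place
    data[col] = reference[col].iloc[0]

print(data)
```

Because a scalar is assigned rather than a Series or DataFrame, pandas has nothing to align and every row gets the value.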
Here's an illustration of the issue.
import pandas as pd

data = pd.DataFrame({'id': [1, 2, 3, 4],
                     'val1': ['A', 'B', 'C', 'D']})
reference = pd.DataFrame({'id2': [1, 2, 3, 4],
                          'val2': ['A', 'B', 'C', 'D']})
These have the same indices ranging from 0-3.
data[reference.columns] = reference
Outputs:
   id val1  id2 val2
0   1    A    1    A
1   2    B    2    B
2   3    C    3    C
3   4    D    4    D
But, if these DataFrames have different indices (that only partially overlap):
data = pd.DataFrame({'id': [1, 2, 3, 4],
                     'val1': ['A', 'B', 'C', 'D']})
reference = pd.DataFrame({'id2': [1, 2, 3, 4],
                          'val2': ['A', 'B', 'C', 'D']})
reference.index = [3, 4, 5, 6]
data[reference.columns] = reference
Outputs:
   id val1  id2 val2
0   1    A  NaN  NaN
1   2    B  NaN  NaN
2   3    C  NaN  NaN
3   4    D  1.0    A
Only row 3 gets values, because 3 is the only index value the two DataFrames share.
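If the rows of reference are instead meant to line up with the rows of data by position (an assumption; the question does not say so), one option is to assign the underlying arrays, which carry no index and therefore skip alignment. This sketch requires both frames to have the same number of rows.

```python
import pandas as pd

data = pd.DataFrame({'id': [1, 2, 3, 4],
                     'val1': ['A', 'B', 'C', 'D']})
reference = pd.DataFrame({'id2': [1, 2, 3, 4],
                          'val2': ['A', 'B', 'C', 'D']})
reference.index = [3, 4, 5, 6]  # deliberately mismatched index

for col in reference.columns:
    # to_numpy() drops the index, so values are placed row by row
    data[col] = reference[col].to_numpy()

print(data)
```

Assigning column by column keeps each column's dtype, so id2 stays an integer column instead of being converted to float by NaN filling.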
Upvotes: 2