Reputation: 101
So I have this df1:
ID State
1 AA
2 AA
3 ZF
3 CJ
and df2:
ID AA ZF CJ etc
1 9 8 77
2 7 6 5
3 8 88 6
I have to create a new column in df1 bringing the values in df2 like this:
ID State Value
1 AA 9
2 AA 7
3 ZF 88
3 CJ 6
I've been trying for 2 hours now and I can't seem to find a way to refer to the column names on df2 based on the values of df1['State']. Even if I could think of a way to do that, the value is filtered by ID too... tricky stuff. Any help?
Thank you in advance
Upvotes: 2
Views: 856
Reputation: 8768
Here is an option using loc
df1['value'] = df2.set_index('ID').stack().loc[(pd.MultiIndex.from_frame(df1))].to_numpy()
or
idx,cols = pd.factorize(df['State'])
df2.set_index('ID').reindex(df['ID']).reindex(cols,axis=1).to_numpy()[range(len(idx)),idx]
Upvotes: 1
Reputation: 2061
Since you want to map the columns of the second DataFrame to a row in the first DataFrame, you need to first Transpose the second DataFrame, I also suggest removing the 'ID' column for ease:
df2.drop('ID', axis = 1, inplace = True)
df2 = df2.T
df2.columns = ['State', 'Value1', 'Value2', 'Value3']
final_df = pd.merge(df1, df2, on = 'State', how = 'left')
Upvotes: 0
Reputation: 35636
Let's try something like:
import pandas as pd
df1 = pd.DataFrame({'ID': {0: 1, 1: 2, 2: 3, 3: 3},
'State': {0: 'AA', 1: 'AA',
2: 'ZF', 3: 'CJ'}})
df2 = pd.DataFrame({'ID': {0: 1, 1: 2, 2: 3},
'AA': {0: 9, 1: 7, 2: 8},
'ZF': {0: 8, 1: 6, 2: 88},
'CJ': {0: 77, 1: 5, 2: 6}})
merged = df1.merge(
df2.set_index('ID')
.stack()
.reset_index()
.rename(columns={'level_1': 'State', 0: 'Value'}),
on=['ID', 'State']
)
print(merged.to_string(index=False))
merged
:
ID State Value
1 AA 9
2 AA 7
3 ZF 88
3 CJ 6
Uses stack to get each value in df2
into its own row:
print(df2.set_index('ID')
.stack()
.reset_index()
.rename(columns={'level_1': 'State', 0: 'Value'}))
Output:
ID State Value
0 1 AA 9
1 1 ZF 8
2 1 CJ 77
3 2 AA 7
4 2 ZF 6
5 2 CJ 5
6 3 AA 8
7 3 ZF 88
8 3 CJ 6
Then this easily merges with df1
Upvotes: 1