Reputation: 101

How to get a value from a column in another dataframe (df2), based on ID and on a value that points to the name of the column in df2. Python/Pandas

So I have this df1:

ID    State
1      AA
2      AA
3      ZF
3      CJ

and df2:

ID    AA    ZF  CJ  etc
1     9      8  77
2     7      6   5
3     8     88   6

I have to create a new column in df1 bringing the values in df2 like this:

ID    State    Value
1      AA       9
2      AA       7
3      ZF       88
3      CJ       6

I've been trying for 2 hours now and I can't seem to find a way to refer to the column names on df2 based on the values of df1['State']. Even if I could think of a way to do that, the value is filtered by ID too... tricky stuff. Any help?

Thank you in advance

Upvotes: 2

Answers (3)

rhug123

Reputation: 8768

Here is an option using loc

df1['value'] = df2.set_index('ID').stack().loc[(pd.MultiIndex.from_frame(df1))].to_numpy()

idx,cols = pd.factorize(df['State'])
df2.set_index('ID').reindex(df['ID']).reindex(cols,axis=1).to_numpy()[range(len(idx)),idx]

Upvotes: 1

Shivam Roy

Reputation: 2061

Since you want to map the columns of the second DataFrame to a row in the first DataFrame, you need to first Transpose the second DataFrame, I also suggest removing the 'ID' column for ease:

df2.drop('ID', axis = 1, inplace = True)
df2 = df2.T
df2.columns = ['State', 'Value1', 'Value2', 'Value3']

final_df = pd.merge(df1, df2, on = 'State', how = 'left')

Upvotes: 0

Henry Ecker

Reputation: 35636

Let's try something like:

import pandas as pd

df1 = pd.DataFrame({'ID': {0: 1, 1: 2, 2: 3, 3: 3},
                    'State': {0: 'AA', 1: 'AA',
                              2: 'ZF', 3: 'CJ'}})
df2 = pd.DataFrame({'ID': {0: 1, 1: 2, 2: 3},
                    'AA': {0: 9, 1: 7, 2: 8},
                    'ZF': {0: 8, 1: 6, 2: 88},
                    'CJ': {0: 77, 1: 5, 2: 6}})

merged = df1.merge(
    df2.set_index('ID')
        .stack()
        .reset_index()
        .rename(columns={'level_1': 'State', 0: 'Value'}),
    on=['ID', 'State']
)

print(merged.to_string(index=False))

merged:

ID State  Value
 1    AA      9
 2    AA      7
 3    ZF     88
 3    CJ      6

Uses stack to get each value in df2 into its own row:

print(df2.set_index('ID')
        .stack()
        .reset_index()
        .rename(columns={'level_1': 'State', 0: 'Value'}))

Output:

   ID State  Value
0   1    AA      9
1   1    ZF      8
2   1    CJ     77
3   2    AA      7
4   2    ZF      6
5   2    CJ      5
6   3    AA      8
7   3    ZF     88
8   3    CJ      6

Then this easily merges with df1

Upvotes: 1

How to get a value from a column in another dataframe (df2), based on ID and on a value that points to the name of the column in df2. Python/Pandas

Answers (3)

Related Questions