Plentyoftime
Plentyoftime

Reputation: 27

pandas returning column name as value if column name matches value of another data frame

I've been stuck on this for a few weeks now....

df1:

2 1/1/2021 1/2/2021 1/3/2021
Name
a door nan house
b nan key door
c nan house key
d house key nan

df2:

2 key door house
Name
a nan nan nan
b nan nan nan
c nan nan nan
d nan nan nan

desired output=

df2:

2 key door house
Name
a nan 1/1/2021 1/3/2021
b 1/2/2021 1/3/2021 nan
c 1/3/2021 nan 1/2/2021
d 1/2/2021 nan 1/1/2021

Upvotes: 0

Views: 602

Answers (3)

Andreas
Andreas

Reputation: 9197

You can combine melt and pivot:

df1.reset_index().melt('Name').dropna().pivot('Name', 'value', 2)

value      door     house       key
Name                               
a      1/1/2021  1/3/2021       NaN
b      1/3/2021       NaN  1/2/2021
c           NaN  1/2/2021  1/3/2021
d           NaN  1/1/2021  1/2/2021

Upvotes: 0

keepAlive
keepAlive

Reputation: 6655

You may actually not even need df2. What about

>>> df1.stack().reset_index().set_index(['Name', 0])[2].unstack()
0         door     house       key
Name                              
a     1/1/2021  1/3/2021       NaN
b     1/3/2021       NaN  1/2/2021
c          NaN  1/2/2021  1/3/2021
d          NaN  1/1/2021  1/2/2021

Tested thanks to @Henry's copiable data

Upvotes: 2

Henry Ecker
Henry Ecker

Reputation: 35626

Try with stack + pivot_table with aggfunc='first' to get the first match

df2 = (
    df1.stack()
        .reset_index()
        .pivot_table(index='Name', columns=0, values=2, aggfunc='first')
        .rename_axis(None, axis=1)
)
          door     house       key
Name                              
a     1/1/2021  1/3/2021       NaN
b     1/3/2021       NaN  1/2/2021
c          NaN  1/2/2021  1/3/2021
d          NaN  1/1/2021  1/2/2021

Optional reindex from df2:

df2 = (
    df1.stack()
        .reset_index()
        .pivot_table(index='Name', columns=0, values=2, aggfunc='first')
        .reindex(index=df2.index, columns=df2.columns)
)
2          key      door     house
Name                              
a          NaN  1/1/2021  1/3/2021
b     1/2/2021  1/3/2021       NaN
c     1/3/2021       NaN  1/2/2021
d     1/2/2021       NaN  1/1/2021

DataFrames Used:

import pandas as pd
from numpy import nan

df1 = pd.DataFrame({
    '1/1/2021': {'a': 'door', 'b': nan, 'c': nan, 'd': 'house'},
    '1/2/2021': {'a': nan, 'b': 'key', 'c': 'house',
                 'd': 'key'},
    '1/3/2021': {'a': 'house', 'b': 'door', 'c': 'key',
                 'd': nan}
}).rename_axis(index='Name', columns=2)

df2 = pd.DataFrame({
    'key': {'a': nan, 'b': nan, 'c': nan, 'd': nan},
    'door': {'a': nan, 'b': nan, 'c': nan, 'd': nan},
    'house': {'a': nan, 'b': nan, 'c': nan, 'd': nan}
}).rename_axis(index='Name', columns=2)

Upvotes: 3

Related Questions