Duoran
Duoran

Reputation: 327

pandas.update on Multiindex using single Index

I want to fill up a DataFrame with a MultiIndex using values from a DataFrame with a simple Index, where the simple Index matches the MultiIndex'es first column. It should not touch anything where there isn't a match. I tried multiindexed.update(singleindexed, overwrite=False), but that didn't work.

import numpy as np
import pandas as pd

products = ['shoes']*5 + ['socks']*5 + ['boots']*5
years = [2005, 2006, 2007, 2008, 2009]*3
values = [11, np.NaN, 12, np.NaN, 7, 23, 8, 40, 5, np.NaN, np.NaN, 53, np.NaN, 88, 34]

df = pd.DataFrame({'product': products, 'year': years, 'value': values})
df = df.set_index(['product', 'year'])

df1 = pd.DataFrame({'products': ['shoes', 'socks'], 'value': [555555, 666666]})
df1 = df1.set_index('products')

The MultiIndex-ed data looks like:

               value
product  year    
shoes    2005    11
         2006    NaN
         2007    12
         2008    NaN
         2009    7
socks    2005    23
         2006    8
         2007    40
         2008    5
         2009    NaN
boots    2005    NaN
         2006    53
         2007    NaN
         2008    88
         2009    34

The single-Index-ed data looks like

product
shoes      555555
socks      666666

The result should look like

               value
product  year    
shoes    2005    11
         2006    555555
         2007    12
         2008    555555
         2009    7
boots    2005    NaN
         2006    53
         2007    NaN
         2008    88
         2009    34

What am I missing here? Is update even the right approach, or should I be looking at apply?

Edit Added the data to generate this example, and added an important omission: I forgot to mention that non-matching values should not be touched, hence I cannot use fillna(). df.update(df1, overwrite=False) would have seemed like a perfect and simple fit.

Upvotes: 0

Views: 155

Answers (1)

BENY
BENY

Reputation: 323396

We can do reindex and slice assign

df.loc[df.value.isnull(),'value']=df1.reindex(df.index.get_level_values(0))[df.value.isnull().values].values
df
                 value
product year          
shoes   2005      11.0
        2006  555555.0
        2007      12.0
        2008  555555.0
        2009       7.0
socks   2005      23.0
        2006       8.0
        2007      40.0
        2008       5.0
        2009  666666.0

Upvotes: 1

Related Questions