rahlf23

Reputation: 9019

Create new column in a MultiIndex dataframe and fillna

Let's say I have the following multi-indexed dataframe, generated with the following code:

import pandas as pd, numpy as np

names = ['Name1','Name2','Name3','Name4']
values = ['x1','x2','x3','x4']
categories = ['y1','y2','y3']

x1 = pd.Series([0, 0, 0], index=categories)

index = pd.MultiIndex.from_product([names, values])
placeholders = np.zeros((len(names)*len(values), len(categories)))

df = pd.DataFrame(placeholders, index=index, columns=categories)

for i in names:
    for j in values:
        df.loc[i,j] = x1

           y1   y2   y3
Name1 x1  0.0  0.0  0.0
      x2  0.0  0.0  0.0
      x3  0.0  0.0  0.0
      x4  0.0  0.0  0.0
Name2 x1  0.0  0.0  0.0
      x2  0.0  0.0  0.0
      x3  0.0  0.0  0.0
      x4  0.0  0.0  0.0
Name3 x1  0.0  0.0  0.0
      x2  0.0  0.0  0.0
      x3  0.0  0.0  0.0
      x4  0.0  0.0  0.0
Name4 x1  0.0  0.0  0.0
      x2  0.0  0.0  0.0
      x3  0.0  0.0  0.0
      x4  0.0  0.0  0.0

How can I use a dictionary such as the following to fill column y3 in the x1 row of each Name, and set y3 to np.nan in every other row, including the x1 row of any Name (e.g. Name2 or Name4) that is not a key in the dictionary?

{'Name1': 54, 'Name3': 50}

Expected output (the 0s in y3 could be np.nan):

           y1   y2   y3
Name1 x1  0.0  0.0   54
      x2  0.0  0.0  0.0
      x3  0.0  0.0  0.0
      x4  0.0  0.0  0.0
Name2 x1  0.0  0.0  0.0
      x2  0.0  0.0  0.0
      x3  0.0  0.0  0.0
      x4  0.0  0.0  0.0
Name3 x1  0.0  0.0   50
      x2  0.0  0.0  0.0
      x3  0.0  0.0  0.0
      x4  0.0  0.0  0.0
Name4 x1  0.0  0.0  0.0
      x2  0.0  0.0  0.0
      x3  0.0  0.0  0.0
      x4  0.0  0.0  0.0

Upvotes: 0

Views: 47

Answers (3)

CT Zhu

Reputation: 54400

The idiomatic way is probably to use DataFrame.update (d below is the dictionary from the question):

In [31]: df2 = pd.DataFrame({(k,'x1'): {'y3': v} for k, v in d.items()}).T

In [32]: df2
Out[32]: 
          y3
Name1 x1  54
Name3 x1  50

In [33]: df.update(df2)

In [34]: df
Out[34]: 
           y1   y2    y3
Name1 x1  0.0  0.0  54.0
      x2  0.0  0.0   0.0
      x3  0.0  0.0   0.0
      x4  0.0  0.0   0.0
Name2 x1  0.0  0.0   0.0
      x2  0.0  0.0   0.0
      x3  0.0  0.0   0.0
      x4  0.0  0.0   0.0
Name3 x1  0.0  0.0  50.0
      x2  0.0  0.0   0.0
      x3  0.0  0.0   0.0
      x4  0.0  0.0   0.0
Name4 x1  0.0  0.0   0.0
      x2  0.0  0.0   0.0
      x3  0.0  0.0   0.0
      x4  0.0  0.0   0.0
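
The output above keeps 0.0 in the rows the dictionary does not cover. If np.nan is wanted there instead, as the question asks, one possible variation is to blank out y3 before calling update. This is only a sketch under that assumption; the frame is rebuilt with a constant 0.0 so the snippet runs on its own, and the names mirror the question.

```python
import numpy as np
import pandas as pd

names = ['Name1', 'Name2', 'Name3', 'Name4']
values = ['x1', 'x2', 'x3', 'x4']
categories = ['y1', 'y2', 'y3']

index = pd.MultiIndex.from_product([names, values])
df = pd.DataFrame(0.0, index=index, columns=categories)

d = {'Name1': 54, 'Name3': 50}

# Blank out y3 first, so rows the dictionary does not cover stay NaN.
df['y3'] = np.nan

# One row per dictionary key, placed at second-level label 'x1'.
df2 = pd.DataFrame({(k, 'x1'): {'y3': v} for k, v in d.items()}).T

# update() aligns on index and columns and only copies non-NaN values.
df.update(df2)

print(df)
```

Because update only writes non-NaN values from df2, the y1 and y2 columns are left untouched.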

Upvotes: 1

BENY

Reputation: 323396

You can build a DataFrame that holds the new values and then pass it to update:

d={'Name1': 54, 'Name3': 50}
# Pair every key with 'x1'; len(d) avoids hard-coding the number of keys.
updatedf=pd.DataFrame(data=list(d.values()),columns=['y3'],index=pd.MultiIndex.from_arrays([list(d.keys()),['x1']*len(d)]))
df.update(updatedf)
df
Out[229]: 
           y1   y2    y3
Name1 x1  0.0  0.0  54.0
      x2  0.0  0.0   0.0
      x3  0.0  0.0   0.0
      x4  0.0  0.0   0.0
Name2 x1  0.0  0.0   0.0
      x2  0.0  0.0   0.0
      x3  0.0  0.0   0.0
      x4  0.0  0.0   0.0
Name3 x1  0.0  0.0  50.0
      x2  0.0  0.0   0.0
      x3  0.0  0.0   0.0
      x4  0.0  0.0   0.0
Name4 x1  0.0  0.0   0.0
      x2  0.0  0.0   0.0
      x3  0.0  0.0   0.0
      x4  0.0  0.0   0.0

Upvotes: 1

YOLO

Reputation: 21759

There are many ways to do this. One simple way is to use the .loc indexer:

d = {'Name1': 54, 'Name3': 50}

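for name, value in d.items():
    # Select row and column in a single .loc call; chained indexing may assign to a copy.
    df.loc[(name, 'x1'), 'y3'] = value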
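
If the loop is not needed, the same result with np.nan everywhere else can probably be had by mapping the dictionary over the first index level. This is a rough sketch rather than a definitive answer; level0, level1 and mapped are just illustrative names, and the frame is rebuilt here so the snippet is self-contained.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product(
    [['Name1', 'Name2', 'Name3', 'Name4'], ['x1', 'x2', 'x3', 'x4']]
)
df = pd.DataFrame(0.0, index=index, columns=['y1', 'y2', 'y3'])

d = {'Name1': 54, 'Name3': 50}

level0 = df.index.get_level_values(0)  # Name1, Name1, ..., Name4
level1 = df.index.get_level_values(1)  # x1, x2, x3, x4, x1, ...

# Look up each row's name in the dictionary; names that are not keys become NaN.
mapped = np.asarray(level0.map(d), dtype=float)

# Keep the looked-up value only on the x1 rows, NaN on all other rows.
df['y3'] = np.where(level1 == 'x1', mapped, np.nan)

print(df)
```

This overwrites the whole y3 column in one assignment, so any Name missing from the dictionary ends up as NaN in its x1 row as well.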

Upvotes: 0
