EJ Kang
EJ Kang

Reputation: 495

Python Dataframe: combining/replacing multi-index columns with single index columns

I have multi-index column in my df. In my df, all the values are 1 or 0 that represent boolean. My task is to replace values with '1' with value from my another df_test dataframe. See below.

In [221]: df
Out[221]:
first        bar                 baz
second       one       two       one       two
0            0         1         0         0
1            1         0         1         1
2            0         0         0         1
3            0         0         0         0
4            1         1         1         1
..............(continues)

My df_test has regular columns (not multi index) and values that should go in to df.

In [222]: df_test
Out[222]:
        amount
0            38
1            2179   
2            191     
3            4     
4            19823    
..............(continues)

The index of two dataframe matches and my output should be:

In [223]: df
Out[223]:
first        bar                 baz
second       one       two       one       two
0            0         38        0         0
1            2179      0         2179      2179      
2            0         0         0         191     
3            0         0         0         0
4            19823     19823     19823     19823    
..............(continues)

Notice that my df can have no '1' value like index = 3, or all '1' value like index = 4. If there are efficient way to set my dataframe

Upvotes: 0

Views: 261

Answers (2)

cs95
cs95

Reputation: 402892

To get your result, you can use broadcasted multiplication -

v = df.values * df_test.amount.values[:, None]

v
array([[    0,    38,     0,     0],
       [ 2179,     0,  2179,  2179],
       [    0,     0,     0,   191],
       [    0,     0,     0,     0],
       [19823, 19823, 19823, 19823]])

To get back your original dataframe, just call the DataFrame constructor -

df = pd.DataFrame(v, columns=df.columns, index=df.index)
df

first     bar           baz       
second    one    two    one    two
0           0     38      0      0
1        2179      0   2179   2179
2           0      0      0    191
3           0      0      0      0
4       19823  19823  19823  19823

Setup borrowed with thanks from piRSquared's answer.

Upvotes: 2

piRSquared
piRSquared

Reputation: 294508

You want to use pd.DataFrame.mask and use the column amount as the substitute. However, you need to provide the axis=0 parameter to tell Pandas to align on the index.

df.mask(df.eq(1), df_test.amount, axis=0)

first     bar           baz       
second    one    two    one    two
0           0     38      0      0
1        2179      0   2179   2179
2           0      0      0    191
3           0      0      0      0
4       19823  19823  19823  19823

Setup

df = pd.DataFrame(
    [[0, 1, 0, 0],
     [1, 0, 1, 1],
     [0, 0, 0, 1],
     [0, 0, 0, 0],
     [1, 1, 1, 1]],
    columns=pd.MultiIndex.from_product(
        [['bar', 'baz'], ['one', 'two']],
        names=['first', 'second']
    )
)

df_test = pd.DataFrame(dict(amount=[38, 2179, 191, 4, 19823]))

Upvotes: 2

Related Questions