klados
klados

Reputation: 787

python pandas dataframe multiply columns matching index or row name

I have two dataframes,

df1:

hash  a  b  c
ABC   1  2  3
def   5  3  4
Xyz   3  2 -1

df2:

hash  v
Xyz   3
def   5

I want to make

df:
hash  a  b  c
ABC   1  2  3 (= as is, because no matching 'ABC' in df2)
def  25 15 20 (= 5*5 3*5 4*5)
Xyz   9  6 -3 (= 3*3 2*3 -1*3)

as like above,

I want to make a dataframe with values of multiplying df1 and df2 according to their index (or first column name) matched. As df2 only has one column (v), all df1's columns except for the first one (index) should be affected.

Is there any neat Pythonic and Panda's way to achieve it?

df1.set_index(['hash']).mul(df2.set_index(['hash'])) or similar things seem not work..

Upvotes: 2

Views: 1949

Answers (3)

BeRT2me
BeRT2me

Reputation: 13242

One Method:

# We'll make this for convenience
cols = ['a', 'b', 'c']

# Merge the DataFrames, keeping everything from df
df = df1.merge(df2, 'left').fillna(1)

# We'll make the v column integers again since it's been filled.
df.v = df.v.astype(int)

# Broadcast the multiplication across axis 0
df[cols] = df[cols].mul(df.v, axis=0)

# Drop the no-longer needed column:
df = df.drop('v', axis=1)

print(df)

Output:

  hash   a   b   c
0  ABC   1   2   3
1  def  25  15  20
2  Xyz   9   6  -3

Alternative Method:

# Set indices
df1 = df1.set_index('hash')
df2 = df2.set_index('hash')

# Apply multiplication and fill values
df = (df1.mul(df2.v, axis=0)
        .fillna(df1)
        .astype(int)
        .reset_index())

# Output:

  hash   a   b   c
0  ABC   1   2   3
1  Xyz   9   6  -3
2  def  25  15  20

Upvotes: 2

Dani Mesejo
Dani Mesejo

Reputation: 61910

One approach:

df1 = df1.set_index("hash")
df2 = df2.set_index("hash")["v"]

res = df1.mul(df2, axis=0).combine_first(df1)
print(res)

Output

         a     b     c
hash                  
ABC    1.0   2.0   3.0
Xyz    9.0   6.0  -3.0
def   25.0  15.0  20.0

Upvotes: 2

Sala
Sala

Reputation: 480

The function you are looking for is actually multiply.

Here's how I have done it:

>>> df
  hash  a  b
0  ABC  1  2
1  DEF  5  3
2  XYZ  3 -1

>>> df2
  hash  v
0  XYZ  4
1  ABC  8

df = df.merge(df2, on='hash', how='left').fillna(1)
>>> df
  hash  a  b    v
0  ABC  1  2  8.0
1  DEF  5  3  1.0
2  XYZ  3 -1  4.0


df[['a','b']] = df[['a','b']].multiply(df['v'], axis='index')

>>>df
  hash     a     b    v
0  ABC   8.0  16.0  8.0
1  DEF   5.0   3.0  1.0
2  XYZ  12.0  -4.0  4.0


You can actually drop v at the end if you don't need it.

Upvotes: 0

Related Questions