itthrill
itthrill

Reputation: 1376

Multiplying two data frames in pandas

I have two data frames as shown below df1 and df2. I want to create a third dataframe i.e. df as shown below. What would be the appropriate way?

df1={'id':['a','b','c'],
   'val':[1,2,3]}

df1=pd.DataFrame(df)
df1
 id  val
0  a    1
1  b    2
2  c    3


df2={'yr':['2010','2011','2012'],
   'val':[4,5,6]}

df2=pd.DataFrame(df2)
df2
    yr  val
0  2010    4
1  2011    5
2  2012    6

df={'id':['a','b','c'],
   'val':[1,2,3],
    '2010':[4,8,12],
    '2011':[5,10,15],
    '2012':[6,12,18]}


df=pd.DataFrame(df)
df

  id  val  2010  2011  2012
0  a    1     4     5     6
1  b    2     8    10    12
2  c    3    12    15    18

I can basically convert df1 and df2 as 1 by n matrices and get n by n result and assign it back to the df1. But is there any easy pandas way?

Upvotes: 0

Views: 228

Answers (3)

Vitalizzare
Vitalizzare

Reputation: 7230

TL;DR

We can do it in one line like this:

df1.join(df1.val.apply(lambda x: x * df2.set_index('yr').val))

or like this:

df1.join(df1.set_index('id') @ df2.set_index('yr').T, on='id')

Done.

The long story

Let's see what's going on here.

To find the output of multiplication of each df1.val by values in df2.val we use apply:

df1['val'].apply(lambda x: x * df2.val)

The function inside will obtain df1.vals one by one and multiply each by df2.val element-wise (see broadcasting for details if needed). As far as df2.val is a pandas sequence, the output is a data frame with indexes df1.val.index and columns df2.val.index. By df2.set_index('yr') we force years to be indexes before multiplication so they will become column names in the output.

DataFrame.join is joining frames index-on-index by default. So due to identical indexes of df1 and the multiplication output, we can apply df1.join( <the output of multiplication> ) as is.

At the end we get the desired matrix with indexes df1.index and columns id, val, *df2['yr'].

The second variant with @ operator is actually the same. The main difference is that we multiply 2-dimentional frames instead of series. These are the vertical and horizontal vectors, respectively. So the matrix multiplication will produce a frame with indexes df1.id and columns df2.yr and element-wise multiplication as values. At the end we connect df1 with the output on identical id column and index respectively.

Upvotes: 1

Bruhlickd
Bruhlickd

Reputation: 73

Your question is a bit vague. But I suppose you want to do something like that:

df = pd.concat([df1, df2], axis=1)

Upvotes: 0

This works for me:

df2 = df2.T
new_df = pd.DataFrame(np.outer(df1['val'],df2.iloc[1:]))
df = pd.concat([df1, new_df], axis=1)
df.columns = ['id', 'val', '2010', '2011', '2012']
df

The output I get:

    id  val 2010    2011    2012
0   a   1      4       5       6
1   b   2      8      10      12
2   c   3     12      15      18

Upvotes: 0

Related Questions