Reputation: 679
I need to multiply every element of one column by every element of a different column of the same dataframe. My original data set looks something like this:
origin sum sum2
a. 2 1
b. 4 2
c. 6 3
The result I'm expecting is something similar to:
origin dest result (sum * sum2)
a. a. 2
a. b. 4
a. c. 6
b. a. 4
b. b. 8
b. c. 12
c. a. 6
c. b. 12
c. c. 18
The script I'm writing is the following, but I can't get the results I need:
x = 0
numerator = []
for index1, row1 in df.iterrows():
    constant = row1
    numerator.append([])
    for index2, row2 in df.iterrows():
        result = row2*constant
        numerator[x].append(result)
    x = x + 1
Upvotes: 3
Views: 1232
Reputation: 863741
You can use:
numpy.outer for the multiplication
numpy.ravel for flattening
MultiIndex.from_product for a new index built from column origin
the DataFrame constructor
reset_index for turning the MultiIndex levels into columns
mux = pd.MultiIndex.from_product([df.origin, df.origin], names=['origin','dest'])
data = np.outer(df['sum'], df['sum2']).ravel()
df = pd.DataFrame(data, index=mux, columns=['result']).reset_index()
print (df)
origin dest result
0 a. a. 2
1 a. b. 4
2 a. c. 6
3 b. a. 4
4 b. b. 8
5 b. c. 12
6 c. a. 6
7 c. b. 12
8 c. c. 18
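For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach, assuming the sample data from the question. The helper name cross_multiply is hypothetical (it is not part of pandas), and it returns a new frame instead of overwriting df.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({'origin': ['a.', 'b.', 'c.'],
                   'sum': [2, 4, 6],
                   'sum2': [1, 2, 3]})

def cross_multiply(frame):
    # Hypothetical helper: every (origin, dest) pair, in row-major order
    mux = pd.MultiIndex.from_product([frame['origin'], frame['origin']],
                                     names=['origin', 'dest'])
    # Outer product: entry [i, j] is sum[i] * sum2[j]; ravel flattens row by row
    data = np.outer(frame['sum'], frame['sum2']).ravel()
    return pd.DataFrame(data, index=mux, columns=['result']).reset_index()

out = cross_multiply(df)
print(out)
```

The order of from_product matches the row-major order of ravel, which is why the labels and the products line up.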
Upvotes: 3
Reputation: 453
import pandas as pd
import itertools
# Make data example
df = pd.DataFrame()
df['origin']=['a.','b.','c.']
df['sum'] = [2,4,6]
df['sum2'] = [1,2,3]
# Record sum and sum2 for a. b. c.
df_dict = df.set_index('origin').to_dict()
df_final = pd.DataFrame()
for x, y in itertools.product(df['origin'], df['origin']):
    df_final = pd.concat([df_final, pd.DataFrame([x, y, df_dict['sum'][x]*df_dict['sum2'][y]]).T], axis=0)
df_final.columns = ['origin', 'dest', 'result (sum * sum2)']
Result
origin dest result (sum * sum2)
0 a. a. 2
0 a. b. 4
0 a. c. 6
0 b. a. 4
0 b. b. 8
0 b. c. 12
0 c. a. 6
0 c. b. 12
0 c. c. 18
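The repeated 0 in the index comes from concatenating one-row frames, each of which carries its own index 0. A variant sketch (not the code above, just one possible rewrite under the same sample data): collect the rows in a list and build the frame once, which gives a 0 to 8 index and avoids calling pd.concat inside the loop. The name lookup is only illustrative.

```python
import itertools
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({'origin': ['a.', 'b.', 'c.'],
                   'sum': [2, 4, 6],
                   'sum2': [1, 2, 3]})

# {'sum': {'a.': 2, ...}, 'sum2': {'a.': 1, ...}}
lookup = df.set_index('origin').to_dict()

# One tuple per (origin, dest) pair, then a single DataFrame construction
rows = [(x, y, lookup['sum'][x] * lookup['sum2'][y])
        for x, y in itertools.product(df['origin'], df['origin'])]
df_final = pd.DataFrame(rows, columns=['origin', 'dest', 'result (sum * sum2)'])
print(df_final)
```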
Upvotes: 0
Reputation:
You can use np.outer for the multiplication.
np.outer(df['sum'], df['sum2'])
Out:
array([[ 2,  4,  6],
       [ 4,  8, 12],
       [ 6, 12, 18]])
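To spell out what np.outer computes here: entry [i, j] is sum[i] * sum2[j], the same thing you get by broadcasting a column vector against a row vector. A small sketch with plain arrays holding the question's values (the names a and b are just for illustration):

```python
import numpy as np

a = np.array([2, 4, 6])   # values of df['sum']
b = np.array([1, 2, 3])   # values of df['sum2']

outer = np.outer(a, b)
# a[:, None] has shape (3, 1), so multiplying by b (shape (3,)) broadcasts to (3, 3)
broadcast = a[:, None] * b

print(np.array_equal(outer, broadcast))
```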
This can be converted to a Series with labels as follows:
pd.DataFrame(np.outer(df['sum'], df['sum2']),
index=df['origin'],
columns=df['origin']).rename_axis('dest', axis=1).stack()
Out:
origin dest
a. a. 2
b. 4
c. 6
b. a. 4
b. 8
c. 12
c. a. 6
b. 12
c. 18
dtype: int64
To get a DataFrame with a default index instead, chain to_frame and reset_index:
(pd.DataFrame(np.outer(df['sum'], df['sum2']),
index=df['origin'],
columns=df['origin']).rename_axis('dest', axis=1).stack()
.to_frame('result').reset_index())
Out:
origin dest result
0 a. a. 2
1 a. b. 4
2 a. c. 6
3 b. a. 4
4 b. b. 8
5 b. c. 12
6 c. a. 6
7 c. b. 12
8 c. c. 18
Upvotes: 3