Reputation: 69
I have two DataFrames of 20 rows and 4 columns. The names and value types of the columns are the same.
One of the columns is the title
, the other 3 are values.
df1
title col1 col2 col3
apple a d g
pear b e h
grape c f i
df2
title col1 col2 col3
carrot q t w
pumpkin r u x
sprouts s v y
Now I would like to create 3 separate tables/lists subtracting each value of df1.col1 - df2.col1
| df1.col2 - df2.col2
| df1.col3 - df2.col3
. For df1.col1 - df2.col1
I expect an output that looks something among the lines of:
df1.title df2.title score
apple carrot (a - q)
apple pumpkin (a - r)
apple sprouts (a - s)
pear carrot (b - t)
pear pumpkin (b - u)
pear sprouts (b - v)
grape carrot (c - w)
grape pumpkin (c - x)
grape sprouts (c - y)
I tried to create a for loop using the following code:
for i in df1.iterrows():
score_col1 = df1.col1[[i]] - df2.col2[[j]]
score_col2 = df1.col2[[i]] - df2.col2[[j]]
score_col3 = df1.col3[[i]] - df2.col3[[j]]
score_total = score_col1 + score_col2 + score_col3
i = i + 1
In return, I received an output for score_col1
looking like this:
df1.title df2.title score
apple carrot (a - q)
pear carrot (b - t)
grape carrot (c - w)
Can someone help me to obtain the expected output?
Upvotes: 1
Views: 275
Reputation: 231
a1 = ['apple','pear', 'banana']
b1 = [56,32,23]
c1 = [12,34,90]
d1 = [87,65,23]
a2 = ['carrot','pumpkin','sprouts']
b2 = [16,12,93]
c2 = [12,32,70]
d2 = [81,55,21]
df1 = pd.DataFrame({'title':a1, 'col1':b1, 'col2':c1, 'col3':d1})
df2 = pd.DataFrame({'title':a2, 'col1':b2, 'col2':c2, 'col3':d2})
res_df = pd.DataFrame([])
cols = ['col1','col2','col3']
for c in cols:
res_df = pd.DataFrame([])
for i,j in df1.iterrows():
for k,l in df2.iterrows():
res_df = res_df.append(pd.DataFrame({'title_df1':j.title, 'title_df2':l.title, 'score':j[str(c)] - l[str(c)]},index=[0]), ignore_index=True)
print(res_df)
Upvotes: 1