Adding row values to a dataframe based on matching column labels

Question

I try to get my head around this problem. I have three dataframes and I would like to merge (concatenate?) two of these dataframes based on values inside a third one. Here are the dataframes:

df1:

index,fields,a1,a2,a3,a4,a5
2018-06-01,price,1.1,2.1,3.1,4.1,5.1
2018-06-01,amount,15,25,35,45,55
2018-06-02,price,1.2,2.2,3.2,4.2,5.2
2018-06-02,amount,16,26,36,46,56
2018-06-03,price,1.3,2.3,3.3,4.3,5.3
2018-06-03,amount,17,27,37,47,57

df2:

index,fields,b1,b2,b3
2018-06-01,clients,1,2,3
2018-06-02,clients,1,2,3
2018-06-03,clients,1,2,3

Columns in df1 and df2 are different but their relationship is in df3.

df3:

index,product1,product2
0,a1,b1
1,a2,b1
2,a3,b2
3,a4,b2
4,a5,b3

I would like to merge the data in df1 and df2 but keep the same columns as in d1 (as b1, b2, b3 are referenced with a1, a2, a3, a4 and a5). Here is df4, the desired dataframe I want.

df4:

 index,fields,a1,a2,a3,a4,a5
 2018-06-01,price,1.1,2.1,3.1,4.1,5.1
 2018-06-01,amount,15,25,35,45,55
 2018-06-01,clients,1,1,2,2,3
 2018-06-02,price,1.2,2.2,3.2,4.2,5.2
 2018-06-02,amount,16,26,36,46,56
 2018-06-02,clients,4,4,5,5,6
 2018-06-03,price,1.3,2.3,3.3,4.3,5.3
 2018-06-03,amount,17,27,37,47,57
 2018-06-03,clients,7,7,8,8,9

many thanks in advance,

koPytok · Accepted Answer

Unpivot df2 using df.melt:

df2_melt = df2.melt(["index", "fields"], var_name="product2")

Drop redundant column index from reference table df3 and pd.merge it with melted df2:

merged = pd.merge(df2_melt, df3.drop("index", axis=1), on="product2")\
    .drop("product2", axis=1)

Do pd.pivot_table from merge result:

new_rows = pd.pivot_table(merged, index=["index", "fields"],
                          columns="product1", values="value")\
    .reset_index()

Add new rows to df1 with pd.concat, sort rows and reset index:

pd.concat([df1, new_rows]).sort_values("index").reset_index(drop=True)

Result:

product1    index       fields  a1      a2      a3      a4      a5
0           2018-06-01  price   1.1     2.1     3.1     4.1     5.1
1           2018-06-01  amount  15.0    25.0    35.0    45.0    55.0
2           2018-06-01  clients 1.0     1.0     2.0     2.0     3.0
3           2018-06-02  price   1.2     2.2     3.2     4.2     5.2
4           2018-06-02  amount  16.0    26.0    36.0    46.0    56.0
5           2018-06-02  clients 1.0     1.0     2.0     2.0     3.0
6           2018-06-03  price   1.3     2.3     3.3     4.3     5.3
7           2018-06-03  amount  17.0    27.0    37.0    47.0    57.0
8           2018-06-03  clients 1.0     1.0     2.0     2.0     3.0

Adding row values to a dataframe based on matching column labels

Answers (2)

Related Questions