Use one dataframe rows to connect the columns of two different dataframes (Pandas)

Question

I have a dataframe named "table":

UNICO       |   RES    |        
Responsabile|   -      | 
Product     |  Prodotto| 
Brand       |  Brand   |

The column names of table dataframe corresponds to 2 dataframe. The first one, "Unico", which is empty, is something like this:

Responsabile | Product  | Brand
NaN          | NaN      | NaN

The second one, "RES" is something like this:

Prodotto | Brand
X        | AA
Y        | BB

I want to use the rows of the dataframe table to connect the values of the columns of the dataframe UNICO to the values of the columns of the dataframe RES and populate the dataframe UNICO. My expected outcome is:

UNICO:

Responsabile | Product  | Brand
NaN          |     X    | AA
NaN          |     Y    | BB

Diego Mourino · Accepted Answer

To solve this situation, there is two main paths:

Path one: use an outer join, and handle the output to create the desired output. The code to perform that is below:

import pandas as pd
import numpy as np

unico = pd.DataFrame({'Resposabile':[np.nan], 'Product':[np.nan], 'Brand':[np.nan]})
res = pd.DataFrame({'Prodotto':['x', 'y'], 'Brand':['AA', 'BB']})

output = pd.merge(left=unico,
                  right=res,
                  how='outer',
                  left_index=True,
                  right_index=True)

output = output[['Resposabile', 'Prodotto', 'Brand_y']]
output.columns = ['Resposabile', 'Prodotto', 'Brand']

This solution implies an outer join. Bechas the column Brand appears in both DataFrames, at the end of the day the one retaines is the one with non-nulls values.

Path two (the easiest one): using the pd.concat() function. To do that, the syntaxis is in the code below:

output = pd.concat([unico.Resposabile, res], axis=1)

Use one dataframe rows to connect the columns of two different dataframes (Pandas)

Answers (1)

Related Questions