Mukesh Kumar Mishra
Mukesh Kumar Mishra

Reputation: 161

Match Data of 2 Excel Sheets where Column Names are Different

I have 2 Excel Sheets both put into a Pandas Dataframe.

Ex:

1.csv
EmployeeID | DOB        | Age
A          | 12/11/1978 | 2
B          | 12/09/1988 | 8
C          | 12/02/1985 | 20
D          | 11/11/1990 | 25 

2.csv
MemberIdentifier| DateofBirth| YOW
A               | 12/11/1978 | 2
B               | 12/09/1988 | 8
C               | 12/02/1985 | 20

result.csv
EmployeeID | MemberIdentifier | DOB        |DateofBirth| Exp | YOW
A          |       A          | 12/11/1978 | 12/11/1978| 2   | 2
B          |       B          | 12/09/1988 | 12/09/1988| 8   | 8
C          |       C          | 12/02/1985 | 12/02/1985| 20  | 20
D          |       NA         | NA         | NA        | NA  | NA

Now how do I get a result which compares both the sheets based on the data not on the basis of column names because column names can be anything.

Any hint hint or idea of how to start will be great!!

Upvotes: 3

Views: 1298

Answers (1)

jezrael
jezrael

Reputation: 862681

You need parameters left_on and right_on in merge with left join:

df = pd.merge(df1, df2, left_on='EmployeeID', right_on='MemberIdentifier', how='left')
print (df)
  EmployeeID         DOB  Age MemberIdentifier DateofBirth   YOW
0          A  12/11/1978    2                A  12/11/1978   2.0
1          B  12/09/1988    8                B  12/09/1988   8.0
2          C  12/02/1985   20                C  12/02/1985  20.0
3          D  11/11/1990   25              NaN         NaN   NaN

If need match columns in two DataFrames use:

d = {}
for col2 in df2.columns:
    for col1 in df1.columns:
        cond = df2[col2].isin(df1[col1]).all()
        if cond:
            d[col2] = col1

print (d)
{'MemberIdentifier': 'EmployeeID', 'DateofBirth': 'DOB', 'YOW': 'Age'}

Last you can rename columns by dict:

df2 = df2.rename(columns=d)
print (df2)
  EmployeeID         DOB  Age
0          A  12/11/1978    2
1          B  12/09/1988    8
2          C  12/02/1985   20

Upvotes: 3

Related Questions