Compare two csv files and add columns that are not common in both of them

Question

I have to CSV files Book1 and Book2. The columns in Book1 are A, B, C, D, E and in Book2 are A, B, E, H. I want to modify Book2 in such a way that it contains only those column names that are common with Book1 plus whatever additional is there in Book1. The files are :

Book1

A           B   C   D   E
10.12.0.1   a   35  0   11
10.12.0.1   b   35  1   10
107.77.87   a   35  0   101

Book2:

A          B    E   H
9.81.2     b    10  w
10.15.32   b    100 w
11.16.5    b    101 w

After modification the final Book2 will be: Book2_final:

A          B    C   D   E
9.81.2     b            10
10.15.32   b            100
11.16.5    b            101

I tried the following:

import pandas
a= open('input_test.txt','r')
csv1 = pandas.read_csv('Book2.csv',dtype='unicode')
inserted_cols = a.read().split(',')
csv1[inserted_cols].to_csv('Book2_test.csv',index=False)

where the file 'input_test.txt' contains the columns of Book1 in the same order sepaated by commas. But I get error:

KeyError: "['C' 'D' 'E\n'] not in index"

Can't figure out what's wrong.New to Python.

piRSquared · Accepted Answer

Use reindex on book2's columns with book1's columns. You'll have to transpose first, then transpose back.

book2.T.reindex(book1.columns).T

Compare two csv files and add columns that are not common in both of them

Answers (1)

Related Questions