Sachin

Reputation: 1704

Compare two CSV files and output the differences?

I am comparing two CSV files, but the resulting update.csv is the same as new.csv.

import csv

with open('old.csv', 'r') as t1:
    old_csv = t1.readlines()

with open('new.csv', 'r') as t2:
    new_csv = t2.readlines()

with open('update.csv', 'w') as out_file:
    line_in_new = 0
    line_in_old = 0
    while line_in_new < len(new_csv) and line_in_old < len(old_csv):
        if old_csv[line_in_old] != new_csv[line_in_new]:
            # Lines differ: write the line from new.csv
            out_file.write(new_csv[line_in_new])
        else:
            # Lines match: advance in old.csv
            line_in_old += 1
        line_in_new += 1

I want the output to match the sample below.

Sample:

Input:

old.csv

a,b,c
1,2,3
4,5,6
8,9,9

new.csv

a,b,c
1,2,3
5,6,7
8,9,7

Output:

update.csv

4,5,6,deleted
5,6,7,new added
8,9,9,change

Please help me write only the differences to update.csv.

Upvotes: 0

Views: 522

Answers (1)

Ashish Acharya

Reputation: 3399

A solution using pandas:

import pandas as pd

df1 = pd.read_csv('old.csv')
df2 = pd.read_csv('new.csv')

df1['flag'] = 'old'
df2['flag'] = 'new'

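# Stack both files; the flag column records which file each row came from
df = pd.concat([df1, df2])

# Drop every row that appears in both files (compared on all columns except flag)
dups_dropped = df.drop_duplicates(subset=df.columns.difference(['flag']).tolist(), keep=False)
dups_dropped.to_csv('update.csv', index=False)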

Input:

old.csv

a,b,c
1,2,3
4,5,6

new.csv

a,b,c
1,2,3
5,6,7

Output:

update.csv

a,b,c,flag
4,5,6,old
5,6,7,new
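
The old/new flags only say which file a row came from. To get the deleted / new added / change labels from the question's sample, something like the following might work. It is a minimal sketch using only the standard library, and it rests on an assumption the question does not state: that the first column (a) identifies a row, so 8,9,9 and 8,9,7 count as the same row with changed values. The names label_differences and read_rows are made up for illustration, and the CSV text is inlined here instead of being read from old.csv and new.csv.

```python
import csv
import io


def label_differences(old_rows, new_rows, key_index=0):
    """Return rows labelled 'deleted', 'change' or 'new added'.

    Assumption (not stated in the question): the value in column
    `key_index` identifies a row, so two rows with the same key but
    different remaining values count as a change.
    """
    old_by_key = {row[key_index]: row for row in old_rows}
    new_by_key = {row[key_index]: row for row in new_rows}

    labelled = []
    # Walk the old rows first: a missing key is a deletion,
    # a matching key with different values is a change.
    for key, row in old_by_key.items():
        if key not in new_by_key:
            labelled.append(row + ["deleted"])
        elif new_by_key[key] != row:
            labelled.append(row + ["change"])
    # Keys that appear only in the new rows were added.
    for key, row in new_by_key.items():
        if key not in old_by_key:
            labelled.append(row + ["new added"])
    return labelled


def read_rows(text):
    """Parse CSV text, skipping the header line and any blank lines."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    return rows[1:]


# Sample data from the question, inlined instead of read from files.
old_text = "a,b,c\n1,2,3\n4,5,6\n8,9,9\n"
new_text = "a,b,c\n1,2,3\n5,6,7\n8,9,7\n"

result = label_differences(read_rows(old_text), read_rows(new_text))
for row in result:
    print(",".join(row))
```

With the question's sample this prints 4,5,6,deleted, then 8,9,9,change, then 5,6,7,new added. The rows match the requested update.csv, but the order differs: deletions and changes come first, followed by additions. If no single column identifies a row, a change cannot be told apart from a delete plus an add, and the old/new flags above are about as much as can be said.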

Upvotes: 3
