Reputation: 1693
I am trying to compare columns from 2 csv files.
In file1.csv:
aaa,1
abc,2
bcd,2
adc,3
In file2.csv:
aaa,1
abc,0
bcd,2
adc,4
I expected to get the results "Not Equ".
If the first columns are the same but the second columns are difference.
I tried the codes below but not success:
import csv
file1 = 'C:/Users/Desktop/file1.csv'
file2 = 'C:/Users/Desktop/file2.csv'
reader1 = csv.reader(open(file1))
reader2 = csv.reader(open(file2))
for row1 in reader1:
text1 = row1[0].split(',')
test1sentence = text1[0]
test1class = text1[1]
for row2 in reader2:
text2 = row2[0].split(',')
test2sentence = text2[0]
test2class = text2[1]
if test1sentence == test2sentence:
if test1class != test2class:
print "Not Equ"
Any suggestion?
Upvotes: 0
Views: 3543
Reputation: 136141
You're basically comparing to dictionaries (or maps), that matches keys to values.
The right way to do this is to compare the keysets. If they're equal, for each key, compare the value in both files.
#!/usr/bin/python
def file_to_dict(filename):
lines = open(filename).read().split()
return dict([line.split(',') for line in lines])
dict1, dict2 = file_to_dict('file1.csv'), file_to_dict('file2.csv')
print "Keys are equal? ", set(dict1.keys())==set(dict2.keys())
print "Values are equal? ", all([dict1[key]==dict2[key] for key in dict1])
Upvotes: 2
Reputation: 25197
You can read the files into two dicts and compare like this:
dict1 = dict(row for row in reader1 if len(row) == 2)
dict2 = dict(row for row in reader2 if len(row) == 2)
if sorted(dict1.keys()) == sorted(dict2.keys()):
if dict1 != dict2:
print "Not Equ"
dict(reader1)
turns the first column into keys of the dict and the second column into values.
The generator expression (row for row in reader1 if len(row) == 2)
filters out rows that do not have exactly two columns.
To compare the keys, they must be sorted. Comparing whole dicts (dict1 != dict2
) will compare sorted key-value pairs. Because we compared the keys first, we know that any difference between the whole dicts is due to values.
dict1 = dict(row for row in reader1 if len(row) == 2)
dict2 = dict(row for row in reader2 if len(row) == 2)
for key, val in dict1.iteritems():
try:
if dict2[key] != val:
print "Not Equ"
except KeyError:
pass
Upvotes: 1
Reputation: 2236
You can parse it into dictionaries and make compare, like this:
>>> d1={'aaa':1, 'abc':2, 'bcd':2, 'adc':3}
>>> d2={'aaa':1, 'abc':0, 'bcd':2, 'adc':4}
>>> d1==d2
False
>>> d2={'aaa':1, 'abc':2, 'bcd':2, 'adc':3}
>>> d1==d2
True
Upvotes: 0