Reputation: 143
using python, I want to compare two csv files but only compare row2 of the first csv against row0 of the second csv, but print out in a new csv file only the lines where there are no matches for the compared rows.
Example....
currentstudents.csv contains the following information
Susan,Smith,[email protected],8
John,Doe,[email protected],9
Cool,Guy,[email protected],3
Test,User,[email protected],5
previousstudents.csv contains the following information
[email protected]
[email protected]
[email protected]
After comparing the two csv files, a new csv called NewStudents.csv should be written with the following information:
Cool,Guy,[email protected],3
Here is what I have, but this fails to produce what I need....The below code will work, if I omit all data except the email address in the original currentstudents.csv file, but then I dont end up with the needed data in the final csv file.
def newusers():
for line in fileinput.input(r'C:\work\currentstudents.csv', inplace=1):
print(line.lower(), end='')
with open(r'C:\work\previousstudents.csv', 'r') as t1, open(r'C:\work\currentstudents.csv', 'r') as t2:
fileone = t1.readlines()
filetwo = t2.readlines()
with open(r'C:\work\NewStudents.csv', 'w') as outFile:
for (line[0]) in filetwo:
if (line[0]) not in fileone:
outFile.write(line)
Thanks in advance!
Upvotes: 0
Views: 764
Reputation: 62403
emails
(multiple times) with csv
.pd.read_csv
pandas.DataFrame.merge
[all_students._merge == 'left_only']
.
.iloc[:, :-2]
selects all rows, and all but last two columns.import pandas as pd
# read the two csv files
cs = pd.read_csv('currentstudents.csv', header=None)
ps = pd.read_csv('previousstudents.csv', header=None)
# merge the data
all_students = cs.merge(ps, left_on=2, right_on=0, how='left', indicator=True)
# select only data from left_only
new_students = all_students.iloc[:, :-2][all_students._merge == 'left_only']
# save the data without the index or header
new_students.to_csv('NewStudents.csv', header=False, index=False)
# NewStudents.csv
Cool,Guy,[email protected],3
Upvotes: 0
Reputation: 195428
This script writes NewStudents.csv
:
import csv
with open('sample.csv', newline='') as csvfile1, \
open('sample2.csv', newline='') as csvfile2, \
open('NewStudents.csv', 'w', newline='') as csvfile3:
reader1 = csv.reader(csvfile1)
reader2 = csv.reader(csvfile2)
csvwriter = csv.writer(csvfile3)
emails = set(row[0] for row in reader2)
for row in reader1:
if row[2] not in emails:
csvwriter.writerow(row)
The content of NewStudents.csv
:
Cool,Guy,[email protected],3
Upvotes: 1