Reputation: 3
I am currently trying to compare two CSV files to check if IP addresses in the first column of file1.csv are in a row in file2.csv using Python 3.6. If the address is in file2, I need the second column value of that row copied to a new file that is identical to file 1. The two file setups look like this:
file 1:
XX.XXX.XXX.1,Test1
XX.XXX.XXX.2,Test2
XX.XXX.XXX.3,Test3
XX.XXX.XXX.4,Test4
XX.XXX.XXX.5,Test5
XX.XXX.XXX.6,Test6
XX.XXX.XXX.7,Test7
XX.XXX.XXX.8,Test8
and so on
file 2:
XX.XXX.XXX.6, Name6
XX.XXX.XXX.7, Name7
XX.XXX.XXX.8, Name8
I need the result.csv file to look like this:
XX.XXX.XXX.1,Test1, Not found
XX.XXX.XXX.2,Test2, Not found
XX.XXX.XXX.3,Test3, Not found
XX.XXX.XXX.4,Test4, Not found
XX.XXX.XXX.5,Test5, Not found
XX.XXX.XXX.6,Test6,Name6
XX.XXX.XXX.7,Test7,Name7
XX.XXX.XXX.8,Test8,Name8
The code I have so far is as follows:
import csv
f1 = open('file1.csv', 'r')
f2 = open('file2.csv', 'r')
f3 = open('results.csv', 'w')
c1 = csv.reader(f1)
c2 = csv.reader(f2)
c3 = csv.writer(f3)
file2 = list(c2)
for file1_row in c1:
row = 1
found = False
for file2_row in file2:
results_row = file1_row
x = file2_row[3]
if file1_row[1] == file2_row[1]:
results_row.append('Found. Name: ' + x)
found = True
break
row += 1
if not found:
results_row.append('Not found in File1')
c3.writerow(results_row)
f1.close()
f2.close()
f3.close()
Right now this code is checking for identical rows not values. Meaning it will not match anything as it requires both the IP column and adjacent column to be equal on both files, in addition it matches both files' row 1, row 2, row3, and so on, but I need it to search through one to find the matches in the other, not compare rows by index.
Upvotes: 0
Views: 3144
Reputation: 123473
Here's a non-pandas solution (assuming you're using Python 3.x):
import csv
present = {}
with open('file2.csv', 'r', newline='') as file2:
reader = csv.reader(file2, skipinitialspace=True)
for ip, name in reader:
present[ip] = name
with open('file1.csv', 'r', newline='') as file1, \
open('results.csv', 'w', newline='') as results:
reader = csv.reader(file1, skipinitialspace=True)
writer = csv.writer(results)
for ip, name in reader:
writer.writerow([ip, name, present.get(ip, ' Not found')])
File Results.csv
:
XX.XXX.XXX.1,Test1, Not found
XX.XXX.XXX.2,Test2, Not found
XX.XXX.XXX.3,Test3, Not found
XX.XXX.XXX.4,Test4, Not found
XX.XXX.XXX.5,Test5, Not found
XX.XXX.XXX.6,Test6,Name6
XX.XXX.XXX.7,Test7,Name7
XX.XXX.XXX.8,Test8,Name8
Upvotes: 0
Reputation: 85482
A pandas solution:
import pandas as pd
df1 = pd.read_csv('file_1.csv', names=['a', 'b'])
df2 = pd.read_csv('file_2.csv', names=['a', 'b'])
merged = pd.merge(df1, df2, on='a', how='outer')
merged.to_csv('results.csv', header=False, index=False, na_rep='Not found')
Content of results.csv
:
XX.XXX.XXX.1,Test1,Not found
XX.XXX.XXX.2,Test2,Not found
XX.XXX.XXX.3,Test3,Not found
XX.XXX.XXX.4,Test4,Not found
XX.XXX.XXX.5,Test5,Not found
XX.XXX.XXX.6,Test6, Name6
XX.XXX.XXX.7,Test7, Name7
XX.XXX.XXX.8,Test8, Name8
Upvotes: 1
Reputation: 19806
A close solution to what you have tried would be as follows:
with open('result.csv', 'w') as out:
with open('file1.csv', 'r') as f1, open('file2.csv', 'r') as f2:
f2_lines = [line for line in f2.readlines() if len(line) > 1]
f1_lines = [line for line in f1.readlines() if len(line) > 1]
for line in f1_lines:
val = 'Not found'
b = [line.split(',')[0].strip() in item for item in f2_lines]
if any(b):
val = f2_lines[b.index(True)].split(',')[1].strip()
out.write('{}, {}\n'.format(line.strip(), val))
Output:
XX.XXX.XXX.1,Test1, Not found
XX.XXX.XXX.2,Test2, Not found
XX.XXX.XXX.3,Test3, Not found
XX.XXX.XXX.4,Test4, Not found
XX.XXX.XXX.5,Test5, Not found
XX.XXX.XXX.6,Test6, Name6
XX.XXX.XXX.7,Test7, Name7
XX.XXX.XXX.8,Test8, Name8
Upvotes: 0
Reputation: 546
I moved the position of results_row and changed indent after row+=1
import csv
f1 = open('file1.csv', 'r')
f2 = open('file2.csv', 'r')
f3 = open('results.csv', 'w')
c1 = csv.reader(f1)
c2 = csv.reader(f2)
c3 = csv.writer(f3)
file2 = list(c2)
for file1_row in c1:
row = 1
found = False
results_row = file1_row #Moved out from nested loop
for file2_row in file2:
x = file2_row[1]
if file1_row[0] == file2_row[0]:
results_row.append(x)
found = True
break
row += 1
if not found:
results_row.append('Not found')
c3.writerow(results_row)
f1.close()
f2.close()
f3.close()
Upvotes: 0