Reputation: 35
Hi Stack Overflow users,
I'm quite new to Python and have already made a lot of progress from all questions and answer. Many thanks to everyone contributing!
I have been trying to solve an issue for the last couple of days, but despite much research I still haven't been able to figure it out.
Say I have two CSV files containing the following data:
'File1.csv':
sku; title
01001; Product1
01002; Product2
01003; Product3
and 'File2.csv':
sku; color
01001; blue
01003; green
01005; red
What I would like my python program to do, is to add the missing data (color) for all SKU's in File1.csv. The output should be as follows:
'output1.csv':
sku; title; color
01001; Product1; blue
01002; Product2;
01003; Product3; green
The Python code I wrote:
import csv
f1 = open('file1.csv', 'r', newline='', encoding='UTF-8')
f2 = open('file2.csv', 'r', newline='', encoding='UTF-8')
f1_reader = csv.reader(f1, delimiter=';')
f2_reader = csv.reader(f2, delimiter=';')
i = 0
j = 0
for row in f1_reader:
for line in f2_reader:
if str(row[0]) == str(line[0]):
print(str(i), str(j), 'Success', str(row[0]), str(line[0]), str(line[1]))
# Use values to copy the color value to file1.csv
j = j + 1
i = i + 1
Unfortunately this code doesn't work. The only output I get is as follows:
1 1 Success sku sku color
I guess my logic might be a bit off here, but I expected more 'Successes'. Once Again, I am just a beginner in Python, so I would love to learn what is wrong with my code!
Any advice?
UPDATE
I deleted the newline='' and the if-statement to see what the problem might be:
for row in f1_reader:
for line in f2_reader:
print('Success', str(row[0]), str(line[0]), str(line[1]))
It turns out I get the following output:
Success sku sku color
Success sku 01001 blue
Success sku 01003 green
Success sku 01005 red
IndexError: list index out of range
The f1_reader doesn't iterate over all of its lines, while the f2_reader does. Is there any way to make my code iterate over both my files, checking for identical row[0]'s and line[0]'s?
SOLUTION
The following code works for me:
import csv
f1 = open('file1.csv', 'r')
f1_reader = csv.reader(f1, delimiter=';')
for row in f1_reader:
with open('file2.csv', 'r') as f2:
f2_reader = csv.reader(f2, delimiter=';')
for line in f2_reader:
try:
if str(row[0]) == str(line[0]):
print('Success', str(row[0]), str(line[0]), str(line[1]))
except IndexError:
continue
Returns:
Success sku sku color
Success 01001 01001 blue
Success 01003 01003 green
Success 01005 01005 red
Upvotes: 2
Views: 55
Reputation: 1585
The issue you have is that once you have read through f2 once, you are at the end of the file. Therefore it will only compare f2 with the first loop. You need a way to read f2 from the start of the file when you start the loop again. I have edited you code below so that is loops as you except it to:
import csv
f1 = open('file1.csv', 'r')
f1_reader = csv.reader(f1, delimiter=';')
i = 0
j = 0
for row in f1_reader:
f2 = open('file2.csv', 'r')
f2_reader = csv.reader(f2, delimiter=';')
for line in f2_reader:
if row[0] == line[0]:
print(str(i), str(j), 'Success', str(row[0]), str(line[0]), str(line[1]))
# Use values to copy the color value to file1.csv
j = j + 1
f2.close
i = i + 1
Returns
0 0 Success sku sku color
1 5 Success 01001 01001 blue
3 14 Success 01003 01003 green
Upvotes: 2