Reputation: 868
I have two '.csv' files in the below format: First File :
Roll_num Class_Name
1 ABC
2 DEF
5 PQR
27 UVW
Second File :
Roll_num Marks Grade
1 75 A
2 60 C
27 68 B
61 45 E
Now i want to add a column in the second file appending a column 'Class_Name' from First File. The data in both the files has duplicates in it and is not sorted.
I have written the following code that writes our required data from 2 files into a new file.
import csv
path="xyz"
file_read=open(path + "ClassName.CSV", "r")
reader_ClassName = csv.reader(file_read)
read_all_data=open(path + "Student.CSV", "r")
reader_Student =csv.reader(read_all_data)
write_all_data=open( path +"Student_Classname.CSV", "w")
for line_Student in reader_Student:
Roll_Student=line_Student[0]
for line_ClassName in reader_ClassName:
Roll_ClassName=line_ClassName[0]
ClassName=line_ClassName[1]
if(Roll_ClassName == Roll_Student):
string= Roll_Student +","+ClassName +"\n"
print string
write_all_data.write(string)
break
Output Expected :
Roll_num Marks Grade Class_Name
1 75 A ABC
2 60 C DEF
27 68 B UVW
61 45 E LMN
Output our code generates:
Roll_num Marks Grade Class_Name
1 75 A ABC
2 60 C DEF
There is some issue in reading the Third line from Second inner 'for' loop. We have hundreds of thousands of records in both the files.
Upvotes: 0
Views: 78
Reputation: 1850
I have named first CSV file as hash.csv
and second CSV file as data.csv
. Below script will help you.
import re
# Building up a hash with roll_num and class_name from hash.csv
chash = dict([ re.split('\s+', x.strip()) for x in open('hash.csv').readlines()][1:])
# Building a list of students record from data.csv
data = [ re.split('\s+', x.strip()) for x in open('data.csv').readlines() ][1:]
# Iterating through each data
for x in data:
if x[0] in chash:
x.append(chash[x[0]])
print('{0:<5} {1:<5} {2:<5} {3:<5}'.format(*x))
else:
print('{0:<5} {1:<5} {2:<5}'.format(*x))
Output:
1 75 A ABC
2 60 C DEF
27 68 B UVW
61 45 E
Upvotes: 0
Reputation: 160
I suggest to avoid loop in loop
by reading the whole ClassName.csv file and put into an dictionary first. I suggest the idea below
mydict = {}
for each_line in ClassName_csvfile:
rollnum = get_roll_num()
classname = get_class_name()
mydict[rollnum]=classname
for each_line in Student_csv_file:
rollnum = get_roll_num()
mark = get_mark()
grade = get_grade()
Classname = ''
if mydict.has_key(rollnum):
Classname = mydict[rollnum]
writetofile(rollnum, mark, grade, Classname)
Update: you can use if rollnum in mydict:
instead of mydict.has_key(rollnum)
if you are using Python 2.3+
. I am using python 2.7 and both works
P/s: Sorry for not commenting as it requires me 50 reputations
Upvotes: 1