Reputation: 906
I have this .csv file ...
id,first_name,last_name,email,date,opt-in
1,Jimmy,Reyes,[email protected],12/29/2016,FALSE
2,Doris,Wood,[email protected],04/22/2016,
3,Steven,Miller,[email protected],07/31/2016,FALSE
4,Earl,Parker,[email protected],01-08-17,FALSE
5,Barbara,Cruz,[email protected],12/30/2016,FALSE
I want to read the above shown csv file, transform data, and finally write the data in another text file, which should look like this ....
1,<tab>"first_name"="Jimmy","last_name"="Reyes","email"="[email protected]","date"="12/29/2016","opt-in"="FALSE"
2,<tab>"first_name"="Doris","last_name"="Wood","email"="[email protected]","date"="04/22/2016,,"opt-in"="0"
Also, If the opt-in value is empty, its should print "0".
Here is my code so far ....
import csv
import time
# Do the reading
with open('my-scripts/mock.csv', 'r') as f1:
#next(f1, None) # skip the headers
reader = csv.reader(f1)
new_rows_list = []
for row in reader:
if row[5] == '':
new_row = [row[0],'\t',row[1], row[2], row[3], row[4], '0']
new_rows_list.append(new_row)
else:
new_row = [row[0],'\t',row[1], row[2], row[3], row[4], row[5]]
new_rows_list.append(new_row)
f1.close() # <---IMPORTANT
# Do the writing
newfilename = 'my-scripts/ftp_745198_'+str(int(time.time()))
with open(newfilename, 'w', newline='') as f2:
writer = csv.writer(f2, quoting=csv.QUOTE_NONNUMERIC)
writer.writerows(new_rows_list)
f2.close()
The above code is generating this output, which is not what I exactly want ... I am unable to figure out how to print column names in each row as shown above in the desired output ...!
"id"," ","first_name","last_name","email","date","opt-in"
"1"," ","Jimmy","Reyes","[email protected]","12/29/2016","FALSE"
"2"," ","Doris","Wood","[email protected]","04/22/2016","0"
"3"," ","Steven","Miller","[email protected]","07/31/2016","FALSE"
"4"," ","Earl","Parker","[email protected]","01-08-17","FALSE"
"5"," ","Barbara","Cruz","[email protected]","12/30/2016","FALSE"
New CSV
id,first_name,last_name,email,date,opt-in,unique_code
1,Jimmy,Reyes,[email protected],12/29/2016,FALSE,ER45DH
2,Doris,Wood,[email protected],04/22/2016,,MU34T3
3,Steven,Miller,[email protected],07/31/2016,FALSE,G34FGH
4,Earl,Parker,[email protected],01-08-17,FALSE,ASY67J
5,Barbara,Cruz,[email protected],12/30/2016,FALSE,NHG67P
New expected output
ER45DH<tab>"id"="1","first_name"="Jimmy","last_name"="Reyes","email"="[email protected]","date"="12/29/2016","opt-in"="FALSE"
MU34T3<tab>"id"="2","first_name"="Doris","last_name"="Wood","email"="[email protected]","date"="04/22/2016,"opt-in"="0"
I will really appreciate any help/ideas/pointers.
Thanks
Upvotes: 1
Views: 2195
Reputation: 1174
Initially extract the header as a new list.
Then append header with each row elements as a string.
Write it to the file.
Please try this code,
import csv
with open('newfilename.csv', 'w') as f2:
with open('mycsvfile.csv', mode='r') as infile:
reader = csv.reader(infile)
for i,rows in enumerate(reader):
if i == 0:
header = rows
else:
if rows[5] == '':
rows[5] = 0;
pat = rows[0]+'\t'+'''"%s=%%s",'''*(len(header)-1)+'\n'
print pat
f2.write(pat % tuple(header[1:]) % tuple(rows[1:]))
f2.close()
Output:
1 "first_name=Jimmy","last_name=Reyes","[email protected]","date=12/29/2016","opt-in=FALSE",
2 "first_name=Doris","last_name=Wood","[email protected]","date=04/22/2016","opt-in=0",
3 "first_name=Steven","last_name=Miller","[email protected]","date=07/31/2016","opt-in=FALSE",
4 "first_name=Earl","last_name=Parker","[email protected]","date=01-08-17","opt-in=FALSE",
5 "first_name=Barbara","last_name=Cruz","[email protected]","date=12/30/2016","opt-in=FALSE",
Please let me know in terms of any queries.
Upvotes: 1
Reputation: 823
Firstly, save the header into a variable. For example:
for i,row in enumerate(reader):
if i == 0:
header = row
else:
new_row = [row[0],'\t'] + ['%s=%s' % (header[j],row[j]) for j in range(1,6)]
....
...
Secondly, code such as [row[1], row[2], row[3], row[4], row[5]]
can be simplified into [row[i] for i in range(1,6)] (generator)
Thridly, format is a good tool:
print('"%s"="%s"'% (header[1],row[1]))
will output "first_name"="Jimmy"
Use this knowledge and consider how to make it work.
Upvotes: 1
Reputation: 2567
You could keep the header in a list, then use the the list (like first_name, etc.) to match the elements in the followed lines (like Jimmy, etc.) to generate the output you want ("first_name"="Jimmy").
Upvotes: 1