Reputation: 13
I'm rather new to programming and am trying to reduce the time it takes to write my data to a file. I found that the writing part is the main issue.
The following is part of my code for a machine learning program:
filename = "data.csv"
f = open(filename, "w")
headers = "row,open\n"
f.write(headers)
for i in range(0, 55970):
    score = rf.predict(edit[i].reshape(1, -1))
    score = str(score).replace('[', '').replace(']', '')
    f.write(str(i) + "," + score + "\n")
f.close()
I understand that I should write the data only after I have all of it, but I am not sure how to go about that, given that I only know f.write(). Do I make a function for my prediction that returns the score, then create a list to store all the scores and write that out? (If that is possible.)
[Edit]
import csv
score = rf.predict(edit)
with open('data.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['row', 'open'])
    for i in range(55970):
        writer.writerow([i, str(score[i])])
^ Added based on the new suggestion. I found that I should just do the predict first and then write the rows, which improved the time taken significantly!
Thank you for your help!
Upvotes: 0
Views: 154
Reputation: 336
The csv module is a better tool for this. More specifically, writerows() is what you are looking for.
https://docs.python.org/3/library/csv.html#csv.csvwriter.writerows
Here is an example from the docs:
import csv
with open('some.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(someiterable)
Applied to your code:
import csv
with open('data.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(['row_id', 'open_flag'])
    for i in range(55970):
        score = str(rf.predict(edit[i].reshape(1, -1)))
        # str.replace returns a new string, so assign the result back
        score = score.replace('[', '').replace(']', '')
        writer.writerow([i, score])
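The loop above still calls predict once per row and writes each row separately. If your model can score all rows in one call, you can likely collect the scores first and hand them to writerows() in one go. Here is a minimal sketch of that idea. StubModel, its averaging logic, the sample rows, and the write_scores helper are all made up for illustration so the snippet runs on its own; in your program rf and edit would be your real model and data.

```python
import csv


class StubModel:
    """Hypothetical stand-in for the fitted model `rf` (an assumption, not the real model)."""

    def predict(self, rows):
        # Return one score per input row, the way a batch predict call would.
        return [sum(row) / len(row) for row in rows]


def write_scores(path, scores):
    """Write a header line plus one (row index, score) line per score."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["row", "open"])
        # enumerate yields (index, score) pairs; writerows consumes the whole iterable.
        writer.writerows(enumerate(scores))


rf = StubModel()
edit = [[1.0, 3.0], [2.0, 4.0], [0.0, 1.0]]  # made-up feature rows

scores = rf.predict(edit)  # one predict call for all rows
write_scores("data.csv", scores)
```

Since writerows() accepts any iterable of rows, enumerate(scores) removes the need for a manual index loop and for the string cleanup with replace().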
Upvotes: 1