Reputation: 105
I am trying to read from MySQL and write the result into a txt file. As you can see, I use Apache's Commons IO. The result set contains tweets, and every SQL query below returns nearly 725 rows to be written into the txt file. My problem is the writing speed: it is very slow (2-3 KB per second). Am I missing something here?
Statement stmt2 = connection.createStatement();
for (int week = 0; week < hashTag.length / 15; week++) {
    File container = new File("C:\\Users\\COMP\\Desktop\\threeMonthsSplitTxt\\weeklyBinsTwitter\\week" + week + "-" + hashTag[week] + ".txt");
    for (int hash = 0; hash < 15; hash++) {
        ResultSet results = stmt2.executeQuery("select tweetContent"
                + " from threemonthswithhashtag"
                + " where hashTag = '" + hashTag[hashCount] + "'"
                + " and tweetCreatedTime between '" + firstDate[hashCount] + "'"
                + " and '" + lastDate[hashCount] + "';");
        while (results.next()) {
            tweetContent = results.getString("tweetContent");
            try {
                FileUtils.write(container, newLine, "UTF8", true);
                FileUtils.write(container, tweetContent, "UTF8", true);
            } catch (IOException e) { e.getMessage(); }
        }
        hashCount++;
    }
}
Upvotes: 1
Views: 1905
Reputation: 140543
You are using an API that will create/open/close a file (handle) for each write operation.
And you are surprised that this doesn't give you optimal performance?!
That utility method might be convenient, but heck, instead of going
loop:
    try:
        open file; write to file; close file
        open file; write to file; close file
Consider doing something along the lines of
open file
loop:
    try:
        write to open file
        write to open file
close file
instead. Of course, that means you will have to write more code, which makes things a bit more complicated. But sometimes one has to balance "super-easy to read" code against code that performs well enough.
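The open-once pattern above could look roughly like this in plain Java. This is only a sketch under some assumptions: the class name `OpenOnceWriter` and the method `appendAll` are made up for illustration, the `Iterable<String>` stands in for the rows you would pull from your `ResultSet` loop, and it uses `java.nio.file.Files` instead of Commons IO so that it has no dependencies.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical helper, not part of Commons IO or the asker's code.
final class OpenOnceWriter {

    // Appends every tweet to `target`, each preceded by a line separator,
    // while opening and closing the file exactly once.
    static void appendAll(Path target, Iterable<String> tweets) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(
                target, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            for (String tweet : tweets) {
                out.newLine();     // same order as the question: separator first
                out.write(tweet);  // buffered, so no file open/close per row
            }
        } // try-with-resources flushes and closes the writer here
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("week0-", ".txt");
        appendAll(target, Arrays.asList("first tweet", "second tweet"));
        System.out.println(Files.readAllLines(target, StandardCharsets.UTF_8));
    }
}
```

In the question's code, the `try` block would wrap the inner `for (hash ...)` loop, so that each weekly file is opened once for all 15 queries.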
Probably the rework that needs the fewest changes would go like this:
StringBuilder toWrite = ...
loop:
    try:
        toWrite.append(...)
        toWrite.append(...)
and then, after the loop, you use FileUtils.write() once to write the whole content (that you collected in memory) to the file system in one single shot.
That should keep the overall complexity of your new code at a reasonable level; but help with better end-to-end performance.
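As a rough sketch of that collect-then-write idea, assuming the tweets for one weekly file fit comfortably in memory: the names `CollectThenWrite`, `collect` and `writeOnce` are hypothetical, the `Iterable<String>` again stands in for the `ResultSet` loop, and the final write uses `java.nio.file.Files.write` so the example runs without Commons IO. In your code that last step would be the single `FileUtils.write()` call instead.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of Commons IO or the asker's code.
final class CollectThenWrite {

    // Collects all tweets in memory, each preceded by a newline character.
    static String collect(Iterable<String> tweets) {
        StringBuilder toWrite = new StringBuilder();
        for (String tweet : tweets) {
            toWrite.append('\n').append(tweet);
        }
        return toWrite.toString();
    }

    // Writes the collected text with one call; creates or truncates the file.
    static void writeOnce(Path target, String content) throws IOException {
        Files.write(target, content.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("week0-", ".txt");
        writeOnce(target, collect(Arrays.asList("first tweet", "second tweet")));
        System.out.println(Files.size(target) + " bytes written to " + target);
    }
}
```

Note that one `StringBuilder` per weekly file is enough: create it where `container` is created, append inside both inner loops, and write once after the 15 queries are done.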
Upvotes: 4