Reputation: 856
I've written the following simple test code that creates 10 000 empty .txt files in a subdirectory.
#include <cstdio>    // printf
#include <ctime>     // clock, CLOCKS_PER_SEC
#include <fstream>
#include <iostream>
#include <string>

void CreateFiles()
{
    int i = 1;
    while (i <= 10000) {
        int filename = i;
        std::string string_i = std::to_string(i);
        std::string file_dir = ".\\results\\" + string_i + ".txt";
        std::ofstream outfile(file_dir);  // creates the file; closed at end of scope
        i++;
    }
}

int main()
{
    clock_t tStart1 = clock();
    CreateFiles();
    printf("\nHow long it took to make files: %.2fs\n", (double)(clock() - tStart1) / CLOCKS_PER_SEC);
    std::cin.get();
    return 0;
}
Everything works fine. All 10 000 .txt files are created within ~3.55 seconds (on my PC).
Question 1: Ignoring the conversion from int to std::string etc., is there anything I could optimize here so that the program creates the files faster? I specifically mean the std::ofstream outfile usage; perhaps using something else would be noticeably faster?
Regardless, ~3.55 seconds is satisfying compared to the following:
I have modified the function so that it now also fills the .txt files with some integer data (the value of i) and some constant text:
void CreateFiles()
{
    int i = 1;
    while (i <= 10000) {
        int filename = i;
        std::string string_i = std::to_string(i);
        std::string file_dir = ".\\results\\" + string_i + ".txt";
        std::ofstream outfile(file_dir);
        // Here is the part where I am filling the .txt with some data
        outfile << i << " some " << i << " constant " << i << " text " << i << " . . . "
                << i << " --more text-- " << i << " --even more-- " << i;
        i++;
    }
}
And now everything (creating the .txt files and filling them with a little data) executes within... ~37 seconds. That's a huge difference. And that's only 10 000 files.
Question 2: Is there anything I can optimize here? Perhaps there exists some alternative that would fill the .txt files more quickly. Or perhaps I have forgotten about something very obvious that slows down the entire process?
Or perhaps I am exaggerating a little and ~37 seconds is normal and already optimized?
Thanks for sharing your insights!
Upvotes: 1
Views: 392
Reputation: 4165
The speed of file creation is hardware dependent: the faster the drive, the faster you can create the files.
I saw this when I ran your code on an ARM processor (Snapdragon 636, on a mobile phone using Termux); mobile phones have flash memory that is very fast for I/O. It ran in under 3 seconds most of the time and occasionally took 5 seconds. This variation is expected, as the drive also has to handle reads and writes from other processes. You reported that it took ~37 seconds on your hardware. Hence you can safely conclude that I/O speed depends significantly on the hardware.
Nonetheless, I tried to optimize your code using two different approaches:
1. Using the C counterpart for I/O.
2. Using C++ but writing the whole line as one chunk.
I ran the benchmark on my phone 50 times, and here are the average results:
C was fastest, taking 2.73928 seconds on average to write your text to 10 000 files using fprintf.
C++ writing the complete line in one go took 2.7899 seconds. I used sprintf to format the complete line into a char[] and then wrote it with the << operator on the ofstream.
Normal C++ (your code) took 2.8752 seconds.
This behaviour is expected: writing in chunks is faster. Read this answer as to why. C was the fastest of the three.
Note that the difference here is not that significant, but on hardware with slow I/O it can become significant.
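To make "writing in chunks" concrete, here is a minimal sketch that builds the whole line in memory and hands it to the stream in a single write() call. The helper names MakeLine and WriteWholeFile are my own and are not part of the benchmark; since std::ofstream already buffers internally, any gain probably comes from fewer formatting calls, and how much it helps will depend on your hardware.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: build the full line for file number i in memory first.
std::string MakeLine(int i)
{
    const std::string n = std::to_string(i);
    return n + " some " + n + " constant " + n + " text " + n + " . . . "
         + n + " --more text-- " + n + " --even more-- " + n;
}

// Hypothetical helper: write the prepared text with a single write() call.
// Returns false if the file could not be opened or the write failed.
bool WriteWholeFile(const std::string& path, const std::string& text)
{
    std::ofstream out(path, std::ios::binary);
    if (!out) {
        return false;
    }
    out.write(text.data(), static_cast<std::streamsize>(text.size()));
    out.close();  // flushes the buffer; sets failbit if flushing or closing fails
    return !out.fail();
}
```

Inside the loop this would be used as WriteWholeFile("./results/" + std::to_string(i) + ".txt", MakeLine(i)).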
Here is the code I used for the benchmark. You can test it yourself, but make sure to replace the std::system arguments with your own commands (they are different on Windows).
#include <cstdio>    // fopen, fprintf, snprintf
#include <cstdlib>   // std::system
#include <ctime>     // clock, CLOCKS_PER_SEC
#include <fstream>
#include <iostream>
#include <string>

// Baseline: the code from the question, formatting with several << calls.
void CreateFiles()
{
    int i = 1;
    while (i <= 10000) {
        // int filename = i;
        std::string string_i = std::to_string(i);
        std::string file_dir = "./results/" + string_i + ".txt";
        std::ofstream outfile(file_dir);
        // Here is the part where I am filling the .txt with some data
        outfile << i << " some " << i << " constant " << i << " text " << i << " . . . "
                << i << " --more text-- " << i << " --even more-- " << i;
        i++;
    }
}

// C++ stream, but the line is formatted into a buffer first and written in one go.
void CreateFilesOneGo()
{
    int i = 1;
    while (i <= 10000) {
        std::string string_i = std::to_string(i);
        std::string file_dir = "./results3/" + string_i + ".txt";
        char buffer[256];
        // snprintf instead of sprintf so the buffer cannot overflow
        snprintf(buffer, sizeof buffer, "%d some %d constant %d text %d . . . %d --more text-- %d --even more-- %d", i, i, i, i, i, i, i);
        std::ofstream outfile(file_dir);
        outfile << buffer;
        i++;
    }
}

// C stdio version.
void CreateFilesFast()
{
    int i = 1;
    while (i <= 10000) {
        // int filename = i;
        std::string string_i = std::to_string(i);
        std::string file_dir = "./results2/" + string_i + ".txt";
        FILE *f = fopen(file_dir.c_str(), "w");
        if (f != nullptr) {  // skip the file if it could not be opened
            fprintf(f, "%d some %d constant %d text %d . . . %d --more text-- %d --even more-- %d", i, i, i, i, i, i, i);
            fclose(f);
        }
        i++;
    }
}

int main()
{
    double normal = 0, one_go = 0, c = 0;
    for (int u = 0; u < 50; u++) {
        std::system("mkdir results results2 results3");
        clock_t tStart1 = clock();
        CreateFiles();
        //printf("\nNormal : How long it took to make files: %.2fs\n", (double)(clock() - tStart1)/CLOCKS_PER_SEC);
        normal += (double)(clock() - tStart1) / CLOCKS_PER_SEC;
        tStart1 = clock();
        CreateFilesFast();
        //printf("\nIn C : How long it took to make files: %.2fs\n", (double)(clock() - tStart1)/CLOCKS_PER_SEC);
        c += (double)(clock() - tStart1) / CLOCKS_PER_SEC;
        tStart1 = clock();
        CreateFilesOneGo();
        //printf("\nOne Go : How long it took to make files: %.2fs\n", (double)(clock() - tStart1)/CLOCKS_PER_SEC);
        one_go += (double)(clock() - tStart1) / CLOCKS_PER_SEC;
        std::system("rm -rf results results2 results3");
        std::cout << "Completed " << u + 1 << "\n";
    }
    std::cout << "C on average took : " << c / 50 << "\n";
    std::cout << "Normal on average took : " << normal / 50 << "\n";
    std::cout << "One Go C++ took : " << one_go / 50 << "\n";
    return 0;
}
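If you would rather not adapt the mkdir and rm -rf commands for each platform, one possible alternative is std::filesystem. This is only a sketch and assumes a C++17 compiler and standard library (older toolchains may need an extra filesystem library at link time); MakeDirs and RemoveDirs are names I made up for illustration and are not part of the benchmark.

```cpp
#include <filesystem>
#include <initializer_list>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: create every listed directory.
// Returns false as soon as one of them cannot be created.
bool MakeDirs(std::initializer_list<const char*> names)
{
    for (const char* name : names) {
        std::error_code ec;
        fs::create_directories(name, ec);  // not an error if it already exists
        if (ec) {
            return false;
        }
    }
    return true;
}

// Hypothetical helper: delete every listed directory with its contents.
void RemoveDirs(std::initializer_list<const char*> names)
{
    for (const char* name : names) {
        std::error_code ec;
        fs::remove_all(name, ec);  // errors are deliberately ignored here
    }
}
```

In main, MakeDirs({"results", "results2", "results3"}) and RemoveDirs({"results", "results2", "results3"}) would stand in for the two std::system calls.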
I compiled with clang 7.0.
If you have any other approach, let me know and I will test that too. If you find a mistake, do let me know and I will correct it as soon as possible.
Upvotes: 4