Reputation: 93
I'm creating a function that searches through a directory, prints out files, and when it runs into a folder, a new thread is created to run through that folder and do the same thing.
It makes sense to me to use recursion then as follows:
pthread_t tid[500];
int i = 0;
void *search(void *dir)
{
struct dirent *dp;
DIR *df;
df = opendir(dir);
char curFile[100];
while ((dp = readdir(df)) != NULL)
{
sprintf(curFile, "%s/%s",dir,dp->d_name);
if(isADirectory(curFile))
{
pthread_create(&tid[i], NULL, &search, &curFile);
i++;
}
else
{
printf("%s\n", curFile);
}
}
pthread_join(&tid[i])
return 0;
}
When I do this, however, the function ends up trying to access directories that don't actually exist. Initially I had pthread_join() directly after pthread_create(), which worked, but I don't know if you can count that as multithreading since each thread waits for its worker thread to exit before doing anything.
Is the recursive aspect of this problem even possible, or is it necessary for a new thread to call a different function other than itself?
Upvotes: 0
Views: 3923
Reputation: 20651
Consider your while loop. Inside it you have:
sprintf(curFile, "%s/%s",dir,dp->d_name);
and
pthread_create(&tid[i], NULL, &search, &curFile);
So you mutate the contents of curFile inside the loop, and you also create a thread to which you pass a pointer to curFile. This is a spectacular race hazard: there is no guarantee that the new thread will see the intended contents of curFile, since the loop may have overwritten the buffer in the meantime. You need to duplicate the string and pass the new thread a copy which won't be mutated by the calling thread. The thread is therefore also going to have to be responsible for deallocating the copy, which means either that the search function does exactly that or that you have a second function.
You have another race condition in using i and tid in all threads. As I suggested in the comment on your question, I think these variables should be local to the function.
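A minimal sketch of both fixes together, assuming a POSIX system: each new thread receives its own strdup'd copy of the path and frees it when done, and the thread handles live in a local array that is joined after the loop. The names is_directory, MAX_CHILDREN, children and nchildren are made up for illustration (is_directory stands in for the question's isADirectory), and the fallback of searching in the current thread when a thread cannot be started is just one possible choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

#define MAX_CHILDREN 500  /* per-directory cap, mirroring the question's tid[500] */

/* Hypothetical stand-in for the question's isADirectory() helper.
   lstat() does not follow symlinks, so linked directories are not entered. */
static int is_directory(const char *path)
{
    struct stat st;
    return lstat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/* Thread entry point. Takes ownership of the heap-allocated path in arg
   and frees it before returning. */
static void *search(void *arg)
{
    assert(arg != NULL);
    char *dir = arg;
    pthread_t children[MAX_CHILDREN];  /* local: each call tracks only its own threads */
    int nchildren = 0;
    struct dirent *dp;
    DIR *df = opendir(dir);

    if (df == NULL) {
        free(dir);
        return NULL;
    }

    while ((dp = readdir(df)) != NULL) {
        char path[PATH_MAX];

        /* Skip "." and ".." so the search does not recurse forever. */
        if (strcmp(dp->d_name, ".") == 0 || strcmp(dp->d_name, "..") == 0)
            continue;

        snprintf(path, sizeof path, "%s/%s", dir, dp->d_name);

        if (is_directory(path)) {
            /* Private copy for the child; this loop never touches it again. */
            char *copy = strdup(path);
            if (copy == NULL)
                continue;  /* out of memory: skip this directory */
            if (nchildren < MAX_CHILDREN &&
                pthread_create(&children[nchildren], NULL, search, copy) == 0) {
                nchildren++;   /* the child thread now owns copy */
            } else {
                search(copy);  /* no thread available: search it here instead */
            }
        } else {
            printf("%s\n", path);
        }
    }
    closedir(df);

    /* Wait for every thread this call started, not just one of them. */
    for (int j = 0; j < nchildren; j++)
        pthread_join(children[j], NULL);

    free(dir);
    return NULL;
}
```

To start a search, pass a heap-allocated path, for example search(strdup("."));, and link with -pthread. Because the joins happen only after the loop has finished, the child threads run concurrently with the parent instead of one at a time.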
In general I suggest that you read up on thread safety and learn about data races before you attempt to use threads. It is usually best to avoid the use of threads unless you really need the extra performance.
Upvotes: 0
Reputation: 974
I haven't dealt with multithreading in a while, but if memory serves, threads share resources. That means (in your example) every new thread you create accesses the same variable "i". If those threads only read "i" there would be no problem whatsoever (every thread keeps reading ... i = 2 wohoo :D).
But issues arise when threads share resources that are both read and written.
i = 2
i++
// there are many threads running this code
// and "i" is shared among them, are you sure i = 3?
This read/write problem on shared resources is solved with thread synchronization. I recommend reading up on it, since it's too big a topic to cover in one answer.
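As a rough illustration of what synchronization looks like with pthreads, here is a sketch in which several threads increment one shared counter under a mutex. The names counter, counter_lock, bump, run_counter_demo, NTHREADS and INCREMENTS are invented for this example and do not come from the question's code.

```c
#include <assert.h>
#include <pthread.h>

#define NTHREADS   8
#define INCREMENTS 100000

static long counter = 0;  /* shared by all threads */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each thread increments the shared counter INCREMENTS times. */
static void *bump(void *unused)
{
    (void)unused;
    for (int n = 0; n < INCREMENTS; n++) {
        /* Only one thread at a time may run the read-modify-write below. */
        pthread_mutex_lock(&counter_lock);
        counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Starts NTHREADS threads, waits for all of them, returns the final count. */
static long run_counter_demo(void)
{
    pthread_t tid[NTHREADS];

    counter = 0;
    for (int t = 0; t < NTHREADS; t++) {
        int rc = pthread_create(&tid[t], NULL, bump, NULL);
        assert(rc == 0);  /* sketch only: real code should handle this error */
        (void)rc;
    }
    for (int t = 0; t < NTHREADS; t++)
        pthread_join(tid[t], NULL);

    return counter;
}
```

With the lock in place, run_counter_demo() returns NTHREADS * INCREMENTS. Without the lock and unlock calls, two threads can read the same old value and both write back old + 1, so increments may be lost.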
P.S. I pointed out variable "i" in your code, but there may be more such resources, since your code doesn't show any attempt at thread synchronization.
Upvotes: 1