Reputation: 305
I want to check whether the IPs in a list are blacklisted, using multiple threads.
So, I have the following code:
pthread_mutex_t input_queue;

void *process(void *data)
{
    unsigned long ip = 0xffffffff;
    char line[20];
    while (!feof(INFILE))
    {
        pthread_mutex_lock(&input_queue); //?required
        if (fgets(line, sizeof(line), INFILE) != NULL)
        {
            if (strlen(line) < 8)
                break;
            if (line[strlen(line) - 1] == '\n')
                line[strlen(line) - 1] = '\0';
            ip = ntohl((unsigned long)inet_addr(line));
        }
        pthread_mutex_unlock(&input_queue);
        blacklist(ip);
    }
    return NULL;
}
//in main()
pthread_mutex_init(&input_queue, NULL);
for (i = 0; i < number_thread; i++)
{
    if (pthread_create(&thread_id[i], NULL, &process, NULL) != 0)
    {
        i--;
        fprintf(stderr, RED "\nError in creating thread\n" NONE);
    }
}
for (i = 0; i < number_thread; i++)
    if (pthread_join(thread_id[i], NULL) != 0)
    {
        fprintf(stderr, RED "\nError in joining thread\n" NONE);
    }
Is pthread_mutex_lock necessary, or is fgets thread-safe? I have the feeling that my code has some issues.
Upvotes: 1
Views: 5173
Reputation: 15278
You don't need those. POSIX guarantees that each FILE object is thread-safe. See http://pubs.opengroup.org/onlinepubs/009695399/functions/flockfile.html:

All functions that reference (FILE *) objects shall behave as if they use flockfile() and funlockfile() internally to obtain ownership of these (FILE *) objects.
Unless blacklist(ip) is computation-intensive, taking a lock for every ~10-byte line will actually make your application much slower than avoiding multithreading altogether.
Upvotes: 4
Reputation: 70372
C99 is not thread-aware, so portability would demand locks to be in place. However, C11 makes thread-safety guarantees on file operations (C11 §7.21.2 ¶7):
Each stream has an associated lock that is used to prevent data races when multiple threads of execution access a stream, and to restrict the interleaving of stream operations performed by multiple threads. Only one thread may hold this lock at a time. The lock is reentrant: a single thread may hold the lock multiple times at a given time.
In terms of implementation, if the file is not very large, you might find it faster to read the entire file at once and then divide the input among the threads. With this approach, though, a large enough file will make the serialized up-front I/O the bottleneck. At that point, I might consider an alternative file representation for the input, such as a binary format, and use asynchronous I/O to read from multiple points in the file in parallel.
Upvotes: 1