rockstone9527
rockstone9527

Reputation: 33

Why can't malloc/free APIs be correctly called in threads created by clone?

Why does some glibc's APIs(such as function malloc(), realloc() or free()) can not be correctly called in threads that are created by syscall clone?

Here is my code only for testing:

int thread_func( void *arg )
{
    void *ptr = malloc( 4096 );

    printf( "tid=%d, ptr=%x\n", gettid(), ptr );
    sleep(1);

    if( ptr )
        free( ptr );

    return 0;
}


int main( int argc, char **argv )
{
    int i, m;
    void *stk;
    int stksz = 1024 * 128;
    int flag = CLONE_VM | CLONE _FILES | CLONE_FS | CLONE_SIGHAND;

    for( i=m=0; i < 100; i++ )
    {
        stk = malloc( stksz );

        if( !stk ) break;

        if( clone( thread_func, stk+stksz, flags, NULL, NULL, NULL, NULL ) != -1 )
            m++;
    }

    printf( "create %d thread\n", m );
    sleep(10);

    return 0;
}

Testing result: thread thread_func or main thread main will be blocked on malloc() or free() function randomly. Or sometimes causes malloc() or free() to crash.

I think may be malloc() and free() need certain TLS data to distinguish every thread.

Does anyone know the reason, and what solution can been used to resolve this problem?

Upvotes: 2

Views: 846

Answers (2)

Florian Weimer
Florian Weimer

Reputation: 33727

Correct, malloc and free need TLS for at least the following things:

  1. The malloc arena attached to the current thread (used for allocation operations).
  2. The errno TLS variable (written to when system calls fail).
  3. The stack protector canary (if enabled and the architecture stores the canary in the TCB).
  4. The malloc thread cache (enabled by default in the upcoming glibc 2.26 release).

All these items need a properly initialized thread control block (TCB), but curiously, until recently and as far as malloc/free was concerned, it almost did not matter if a thread created with clone was shared with another TCB (so that the data is no longer thread-local):

Threads basically never reattach themselves to a different arena, so the arena TLS variable is practically read-only after initialization—and multiple threads can share a single arena. errno can be shared as long as system calls only fail in one of the threads undergoing sharing. The stack protector canary is read-only after process startup, and its value is identical across threads anyway.

But all this is an implementation detail, and things change radically in glibc 2.26 with its malloc thread cache: The cache is read and written without synchronization, so it is very likely that what you are trying to do results in memory corruption.

This is not a material change in glibc 2.26, it is always how things were: calling any glibc function from a thread created with clone is undefined. As John Bollinger pointed out, this mostly worked by accident before, but I can assure you that it has always been completely undefined.

Upvotes: 1

John Bollinger
John Bollinger

Reputation: 181159

I think may be malloc() and free() need certain TLS data to distinguish every thread.

Glibc's malloc() and free() do not rely on TLS. They use mutexes to protect the shared memory-allocation data structures. To reduce contention for those, they employ a strategy of maintaining separate memory-allocation arenas with independent metadata and mutexes. This is documented on their manual page.

After correcting the syntax errors in your code and dummying-out the call to non-existent function gettid() (see comments on the question), I was able to produce segmentation faults, but not blockage. Perhaps you confused the exit delay caused by your program's 10-second sleep with blockage.

In addition to any issue that may have been related to your undisclosed implementation of gettid(), your program contains two semantic errors, each producing undefined behavior:

  1. As I already noted in comments, it passes the wrong child-stack pointer values.*

  2. It uses the wrong printf() directive in thread_func() for printing the pointer. The directive for pointer values is %p; %x is for arguments of type unsigned int.

After I corrected those errors as well, the program consistently ran to completion for me. Revised code:

int thread_func(void *arg) {
    void *ptr = malloc(4096);

    // printf( "tid=%d, ptr=%x\n", gettid(), ptr );
    printf("tid=%d, ptr=%p\n", 1, ptr);
    sleep(1);

    if (ptr) {
        free(ptr);
    }

    return 0;
}

int main(int argc, char **argv) {
    int i, m;
    char *stk;  // Note: char * instead of void * to afford arithmetic
    int stksz = 1024 * 128;
    int flags = CLONE_VM | CLONE_FILES | CLONE_FS | CLONE_SIGHAND;

    for (i = m = 0; i < 100; i++) {
        stk = malloc( stksz );

        if( !stk ) break;

        if (clone(thread_func, stk + stksz - 1, flags, NULL, NULL, NULL, NULL ) != -1) {
            m++;
        }
    }

    printf("create %d thread\n", m);
    sleep(10);

    return 0;
}

Even with that, however, all is not completely well: I see various anomalies in the program output, especially near the beginning.

The bottom line is that, contrary to your assertion, you are not creating any threads, at least not in the sense that the C library recognizes. You are merely creating processes that have behavior similar to threads'. That may be sufficient for some purposes, but you cannot rely on the system to treat such processes identically to threads.

On Linux, bona fide threads that the system and standard library will recognize are POSIX threads, launched via pthread_create(). (I note here that modifying your program to use pthread_create() instead of clone() resolved the output anomalies for me.) You might be able to add flags and arguments to your clone() calls that make the resulting processes enough like the Linux implementation of pthreads to be effectively identical, but whyever would you do such a thing instead of just using real pthreads in the first place?


* The program also performs pointer arithmetic on a void *, which C does not permit. GCC accepts that as an extension, however, and since your code is deeply Linux-specific anyway, I'm letting that slide with only this note.

Upvotes: 2

Related Questions