ssk

Reputation: 9255

How to measure and fix context switching bottlenecks?

I have a multi-threaded socket program. I use the boost threadpool library (http://threadpool.sourceforge.net/) to execute tasks, and I create one TCP client socket per thread in the pool. Whenever I send a large amount of data, say a 500KB message, the throughput drops significantly. I checked my code for:

1) Waits that might cause context switching
2) Locks/mutexes

For example, a 500KB message is divided into multiple lines and I send each line through the socket using ::send( ).

typedef std::list< std::string > LinesListType;
// now send the lines to the server
for ( LinesListType::const_iterator it = linesOut.begin( );
      it!=linesOut.end( );
      ++it )
{
    std::string line = *it;

    // escape a leading '.' (SMTP-style dot-stuffing)
    if ( !line.empty( ) && '.' == line[0] )
    {
        line.insert( 0, "." );
    }

    SendData( line + CRLF );
}

SendData:

void SendData( const std::string& data )
{
    try
    {
        uint32_t bytesToSendNo  = data.length();
        uint32_t totalBytesSent = 0;

        ASSERT( m_socketPtr.get( ) != NULL );

        // Send() may accept fewer bytes than requested, so loop until all are written.
        while ( bytesToSendNo > 0 )
        {
            int32_t ret = m_socketPtr->Send( data.data( ) + totalBytesSent, bytesToSendNo );

            if ( 0 == ret )
            {
                throw SocketSendFailed( );
            }

            bytesToSendNo -= ret;
            totalBytesSent += ret;
        }
    }
    catch ( ... )
    {
        // Swallowing the exception here hides send failures from the caller.
    }
}

Send Method in Client Socket:

int Send( const char* buffer, int length )
{
    int bytes = 0;
    do
    {
        // Restart the call if it was interrupted by a signal.
        bytes = ::send( m_handle, buffer, length, MSG_NOSIGNAL );
    }
    while ( bytes == -1 && errno == EINTR );

    if ( bytes == -1 )
    {
        throw SocketSendFailed( );
    }

    return bytes;
}

Invoking ::select() before sending caused context switches, since ::select() can block. Holding a lock on a shared mutex made the parallel threads wait and switch context, which also hurt performance.

Is there a best practice for avoiding context switches, especially in network programming? I have spent at least a week trying various tools (vmstat, callgrind in Valgrind) with no luck. Are there any tools on Linux that would help measure these bottlenecks?
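For reference, the kernel exposes raw per-process counts of voluntary and involuntary context switches in /proc/self/status (the voluntary_ctxt_switches and nonvoluntary_ctxt_switches fields described in proc(5)). Below is a minimal, illustrative sketch that samples them from inside the process; the helper is not part of my program:

#include <fstream>
#include <iostream>
#include <string>

// Illustrative helper: print the context-switch counters the kernel keeps
// for this process. Per-thread counts live under /proc/self/task/<tid>/status.
void PrintCtxtSwitches( )
{
    std::ifstream status( "/proc/self/status" );
    std::string line;
    while ( std::getline( status, line ) )
    {
        // Matches both voluntary_ctxt_switches and nonvoluntary_ctxt_switches.
        if ( line.find( "ctxt_switches" ) != std::string::npos )
        {
            std::cout << line << '\n';
        }
    }
}

Sampling the counters before and after a send burst and diffing them at least shows how many switches the burst costs, though not where they come from.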

Upvotes: 1

Views: 1480

Answers (1)

Ulrich Eckhardt

Reputation: 17415

In general, and not specific to networking, you need one thread per resource that can be used in parallel. In other words, if you have a single network interface, a single thread is enough to service it. Since you typically don't just receive or send data but also do something with it, that work then consumes a different resource, e.g. the CPU for computation or the I/O channel to the hard disk for storage or retrieval. Such work should be done in a different thread, while the single network thread keeps retrieving messages from the network.
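As a rough sketch of that single-network-thread idea (my own illustration, not code from the question, assuming Linux with epoll; any readiness API such as poll or select would do): one thread blocks in epoll_wait() for all connections and hands completed work to other threads.

#include <sys/epoll.h>
#include <unistd.h>
#include <cstdio>
#include <vector>

// One thread multiplexes all connected sockets instead of dedicating a
// thread (and its context switches) to each connection.
void NetworkLoop( const std::vector<int>& sockets )
{
    int epfd = epoll_create1( 0 );
    if ( epfd == -1 ) { perror( "epoll_create1" ); return; }

    for ( size_t i = 0; i < sockets.size( ); ++i )
    {
        epoll_event ev = { };
        ev.events  = EPOLLIN;          // wake up when this socket has data
        ev.data.fd = sockets[i];
        epoll_ctl( epfd, EPOLL_CTL_ADD, sockets[i], &ev );
    }

    epoll_event events[64];
    for ( ;; )
    {
        int n = epoll_wait( epfd, events, 64, -1 );   // the only blocking point
        for ( int i = 0; i < n; ++i )
        {
            char buf[4096];
            ssize_t got = read( events[i].data.fd, buf, sizeof buf );
            if ( got <= 0 )
                continue;             // closed or would-block: skipped for brevity
            // hand the received bytes to a worker queue / thread pool here
        }
    }
}

The network thread never computes or touches the disk itself; it only shovels bytes and queues work, so it rarely needs to be switched out.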

As a consequence, your approach of creating a thread for each connection seems like a simple way to keep things clean and separate, but it simply doesn't scale, since it involves too much unnecessary context switching. Instead, keep the networking in one place if you can. Also, don't reinvent the wheel. There are tools like ZeroMQ out there that serve several connections, assemble whole messages from fragmented network packets, and only hand you a message once it has been completely received. They do so efficiently, so I'd suggest using such a tool as the base for your communication. In addition, ZeroMQ provides a plethora of language bindings, so you can quickly prototype nodes in a scripting language and switch to C++ for performance later on.
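For illustration only, here is a minimal sketch of what such a node can look like using the plain libzmq C API from C++ (assuming ZeroMQ 3.x or later; the endpoint and reply payload are made up):

#include <zmq.h>
#include <cstdio>
#include <cstring>

// Sketch of a reply node: the library owns the connections, reassembles
// fragmented packets, and hands us whole messages in a single loop.
int main( )
{
    void* ctx  = zmq_ctx_new( );
    void* sock = zmq_socket( ctx, ZMQ_REP );
    zmq_bind( sock, "tcp://*:5555" );              // example endpoint

    for ( ;; )
    {
        zmq_msg_t msg;
        zmq_msg_init( &msg );
        if ( zmq_msg_recv( &msg, sock, 0 ) == -1 )  // blocks until a complete message arrives
            break;

        std::printf( "received %zu bytes\n", zmq_msg_size( &msg ) );
        zmq_msg_close( &msg );

        const char reply[] = "ok";
        zmq_send( sock, reply, std::strlen( reply ), 0 );
    }

    zmq_close( sock );
    zmq_ctx_term( ctx );
    return 0;
}

One loop serves any number of clients, and the framing (message boundaries) is handled by the library rather than by hand-rolled per-line send() calls.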

Lastly, I'm afraid that the library you are using (which does not seem to be part of Boost!) is abandonware, i.e. its development has been discontinued. I'm not sure of that, but looking at the changelog, they claim compatibility with Boost 1.37, which is really old. Make sure that what you are using is worth your time!

Upvotes: 2
