Armen Michaeli

Reputation: 9180

Is there any way to associate a file descriptor with user defined data?

I am writing a client-server application that uses the POSIX poll() function to handle multiple clients concurrently. Each client also has state and other related data, which is stored in a client structure.

My immediate problem is that when poll tells me to do I/O on a socket file descriptor that is (conceptually) associated with a client, I have to match the file descriptor to its client data structure. Currently I do an O(n_clients) lookup (my client data structure stores the descriptor), but I was wondering whether a better alternative exists.

Upvotes: 5

Views: 2674

Answers (4)

Armen Michaeli

Reputation: 9180

Adding to all the other very useful answers, I wanted to make the following information available in the spirit of a knowledge base, hoping it will be useful to others.

On a POSIX-compliant system, the standard (http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_14) specifies the following (emphasis mine):

All functions that open one or more file descriptors shall, unless specified otherwise, atomically allocate the lowest numbered available (that is, not already open in the calling process) file descriptor at the time of each allocation. Where a single function allocates two file descriptors (for example, pipe() or socketpair()), the allocations may be independent and therefore applications should not expect them to have adjacent values or depend on which has the higher value.

This lets a program reserve an array sized for the number of descriptors it wants to support and use each open descriptor directly as the array subscript for its client connection data. In other words, on such systems an open file descriptor works as an index into an array-backed table. Descriptor numbers not only start from the lowest available value, they are also reused: if you close descriptor 10 while descriptors 11 and upwards remain open, the next descriptor you open on a POSIX-compliant system will be 10. This also makes reusing rows in your fd-indexed table very simple.
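To illustrate the slot reuse described above, here is a minimal sketch of an fd-indexed client table. The names MAX_CLIENTS, struct client, client_attach, client_lookup and client_detach are my own placeholders, not part of any API, and the bound of 1024 is just an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#define MAX_CLIENTS 1024   /* assumed upper bound on descriptor values */

struct client {            /* hypothetical per-client record */
    int in_use;            /* nonzero while the descriptor is open */
    int state;             /* placeholder for real client state */
};

static struct client clients[MAX_CLIENTS];

/* Claim the slot for a freshly opened descriptor.
   Returns NULL if fd does not fit in the table. */
struct client *client_attach(int fd)
{
    assert(fd >= 0);       /* a negative fd is a caller bug */
    if (fd >= MAX_CLIENTS)
        return NULL;
    memset(&clients[fd], 0, sizeof clients[fd]);   /* wipe the previous owner's data */
    clients[fd].in_use = 1;
    return &clients[fd];
}

/* O(1) lookup. Returns NULL if fd is out of range or has no client attached. */
struct client *client_lookup(int fd)
{
    if (fd < 0 || fd >= MAX_CLIENTS || !clients[fd].in_use)
        return NULL;
    return &clients[fd];
}

/* Release the slot, then close the descriptor so its number can be reused. */
int client_detach(int fd)
{
    if (client_lookup(fd) == NULL)
        return -1;
    clients[fd].in_use = 0;
    return close(fd);
}
```

Because the slot is wiped in client_attach, a new connection that receives a recycled descriptor number never sees the previous client's state.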

Upvotes: 1

mark4o

Reputation: 60943

If you use poll() or select()/pselect() then you should keep the data yourself, e.g. in a hash table or array as others have mentioned. That is the most portable solution. Some of the alternative interfaces do have ways to associate your own user data. For example, with asynchronous I/O (e.g. aio_read()) you can supply a user value, sigev_value, that is passed to a signal handler or thread upon completion of the asynchronous request. The Linux epoll interface also lets you attach user data to each file descriptor in the set, through the data field of struct epoll_event.
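As a rough, Linux-only sketch of the epoll approach: the data member of struct epoll_event is a union, and its ptr field can carry a pointer to your own structure. The names struct client, watch_client, next_ready_client and drop_client are placeholders I made up for illustration. Error handling, including EINTR, is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

struct client {          /* hypothetical per-client record */
    int fd;
    int state;
};

/* Register c->fd for readability and stash the client pointer in the event. */
int watch_client(int epfd, struct client *c)
{
    assert(c != NULL);
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.ptr = c;     /* epoll_wait() hands this pointer back unchanged */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, c->fd, &ev);
}

/* Wait for one ready descriptor.
   Returns its client, or NULL on timeout or error. */
struct client *next_ready_client(int epfd, int timeout_ms)
{
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);
    return n == 1 ? ev.data.ptr : NULL;
}

/* Stop watching the client and close its descriptor. */
int drop_client(int epfd, struct client *c)
{
    assert(c != NULL);
    epoll_ctl(epfd, EPOLL_CTL_DEL, c->fd, NULL);
    return close(c->fd);
}
```

Since data is a union, storing a pointer in ptr means the fd member is no longer available, which is why the sketch keeps the descriptor inside the client structure.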

Upvotes: 1

Adam Rosenfield

Reputation: 400622

No. If there were, it would have to be tracked by the kernel, and looking up that data would therefore involve a system call. A system call is an order of magnitude more expensive than doing an O(n) lookup in user space.

How many clients are you dealing with at once? Unless it's on the order of hundreds or more, the cost of a lookup is going to be minuscule compared to the cost of doing any sort of I/O.

Instead of using an O(n) lookup, you could also just use an array indexed by the file descriptor, assuming you won't have more than a certain number of descriptors open at once. For example:

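#include <assert.h>
#include <stddef.h>

#define MY_MAX_FD 1024  // Tune this to your needs
void *per_fd_data[MY_MAX_FD];

// Return the user data registered for fd, or NULL if there is none.
void *get_per_fd_data(int fd)
{
    assert(fd >= 0);
    if(fd < MY_MAX_FD)
        return per_fd_data[fd];
    else
    {
        // Look up fd in a dynamic associative array (left as an exercise to the
        // reader). Until that exists, report "no data" instead of falling off
        // the end of a non-void function.
        return NULL;
    }
}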
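
To show where such a table fits in a poll() loop, here is a hedged sketch of one poll-and-dispatch pass. MAX_FD, struct client, client_by_fd, handle_ready and poll_and_dispatch are all hypothetical names of mine, and the handler just counts bytes where a real server would process them.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_FD 1024                /* assumed bound on descriptor values */

struct client {                    /* hypothetical per-client record */
    size_t bytes_read;
};

static struct client *client_by_fd[MAX_FD];

/* Hypothetical handler: drain some input and count it. */
static void handle_ready(int fd, struct client *c, short revents)
{
    if (revents & POLLIN) {
        char buf[256];
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0)
            c->bytes_read += (size_t)n;
    }
}

/* Run one poll() pass and dispatch every flagged descriptor.
   Returns poll()'s result: ready count, 0 on timeout, -1 on error. */
int poll_and_dispatch(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    int ready = poll(fds, nfds, timeout_ms);
    if (ready <= 0)
        return ready;
    for (nfds_t i = 0; i < nfds; i++) {
        if (fds[i].revents == 0)
            continue;
        int fd = fds[i].fd;
        assert(fd >= 0 && fd < MAX_FD);
        struct client *c = client_by_fd[fd];   /* O(1): no search needed */
        if (c != NULL)
            handle_ready(fd, c, fds[i].revents);
    }
    return ready;
}
```

The loop over the pollfd array is still O(n), but that scan is needed with poll() anyway. What goes away is the second search from descriptor to client.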

Upvotes: 4

wildplasser

Reputation: 44250

Cheapest is to just make a fixed-size array of connection structures, with {state, *context, ..., maybe callback functions} per entry, indexed by fd (=O(1)). Memory is cheap, and you can afford a few hundred or thousand file descriptors and table entries.

EDIT: You don't need to make it fixed size. If your pollfd array or fd_set is fixed-size, make the table fixed-size too; otherwise use getdtablesize() or getrlimit() with RLIMIT_NOFILE to get the number of entries to allocate.
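
A minimal sketch of the getrlimit() variant, assuming a table of {state, context} entries. The names struct connection, fd_table_capacity, alloc_fd_table and TABLE_ENTRY_CAP are mine, and the cap is an assumption I added because the soft limit can be unlimited or very large on some systems.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/resource.h>

#define TABLE_ENTRY_CAP 65536   /* assumed ceiling so a huge limit cannot exhaust memory */

struct connection {             /* hypothetical table entry */
    int state;
    void *context;
};

/* Number of entries to allocate, based on the soft RLIMIT_NOFILE limit.
   Falls back to the cap if the limit is unknown, unlimited or above the cap. */
size_t fd_table_capacity(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0 ||
        rl.rlim_cur == RLIM_INFINITY ||
        rl.rlim_cur > TABLE_ENTRY_CAP)
        return TABLE_ENTRY_CAP;
    return (size_t)rl.rlim_cur;
}

/* Allocate a zeroed table and report its size through capacity_out.
   Returns NULL if the allocation fails. */
struct connection *alloc_fd_table(size_t *capacity_out)
{
    assert(capacity_out != NULL);
    size_t n = fd_table_capacity();
    struct connection *table = calloc(n, sizeof *table);
    if (table != NULL)
        *capacity_out = n;
    return table;
}
```

If the cap is hit, descriptors at or above it still need a bounds check before being used as an index.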

Upvotes: 2
