Reputation: 525
One of the Linux kernel drivers I am developing uses network communication in the kernel (sock_create(), sock->ops->bind(), and so on).
The problem is that there will be multiple sockets to receive data from, so I need something that simulates a select() or poll() in kernel space. Since those functions use file descriptors, I cannot use the system calls unless I also use system calls to create the sockets, and that seems unnecessary since I am working in the kernel.
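For context, here is a minimal sketch of how one such in-kernel socket might be created and bound. It assumes a UDP socket, an arbitrary example port, and the sock_create_kern() signature of kernels 4.2 and later (older kernels omit the struct net argument):

    #include <linux/in.h>
    #include <linux/net.h>
    #include <net/net_namespace.h>
    #include <net/sock.h>

    static struct socket *ksock;

    static int make_ksocket(void)
    {
        struct sockaddr_in addr = {
            .sin_family      = AF_INET,
            .sin_addr.s_addr = htonl(INADDR_ANY),
            .sin_port        = htons(5555),    /* arbitrary example port */
        };
        int err;

        err = sock_create_kern(&init_net, AF_INET, SOCK_DGRAM,
                               IPPROTO_UDP, &ksock);
        if (err < 0)
            return err;

        err = kernel_bind(ksock, (struct sockaddr *)&addr, sizeof(addr));
        if (err < 0)
            sock_release(ksock);
        return err;
    }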
So I was thinking of wrapping the default sock->sk_data_ready handler in my own handler (custom_sk_data_ready()), which would unlock a semaphore. Then I can write my own kernel_select() function that tries to take the semaphore, blocking until it becomes available; that way the kernel thread sleeps until the semaphore is unlocked by custom_sk_data_ready(). Once kernel_select() acquires the semaphore, it releases it and calls custom_sk_data_ready() to relock it. So the only additional initialization is to run custom_sk_data_ready() before binding a socket, so that the first call to kernel_select() does not falsely trigger.
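A minimal sketch of that wrapping, using the one-argument sk_data_ready signature of Linux 3.15+ (older kernels also pass a byte count) and names of my own choosing (ready_sem, kernel_select(), hook_socket()):

    #include <linux/semaphore.h>
    #include <net/sock.h>

    static struct semaphore ready_sem;                /* sema_init(&ready_sem, 0) at init */
    static void (*orig_data_ready)(struct sock *sk);  /* saved default handler */

    static void custom_sk_data_ready(struct sock *sk)
    {
        orig_data_ready(sk);   /* preserve the default wakeup behavior */
        up(&ready_sem);        /* "unlock"; safe to call from softirq context */
    }

    /* Sleep until custom_sk_data_ready() releases the semaphore. */
    static int kernel_select(void)
    {
        return down_interruptible(&ready_sem);
    }

    static void hook_socket(struct socket *sock)
    {
        struct sock *sk = sock->sk;

        write_lock_bh(&sk->sk_callback_lock);
        orig_data_ready   = sk->sk_data_ready;
        sk->sk_data_ready = custom_sk_data_ready;
        write_unlock_bh(&sk->sk_callback_lock);
    }

Initializing the semaphore count to 0 (sema_init(&ready_sem, 0)) covers the initialization step above: the first call to kernel_select() blocks rather than falsely triggering.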
I see one possible problem: if multiple receives occur, then multiple calls to custom_sk_data_ready() will try to unlock the semaphore. So, to avoid losing those calls and to track which sock is being used, there will have to be a table or list of pointers to the sockets in use, and custom_sk_data_ready() will have to flag in that table/list which socket it was passed.
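The bookkeeping could look something like this (the ready_entry structure and mark_ready() helper are hypothetical; the point is that the flagging must not sleep, since sk_data_ready runs in softirq context):

    #include <linux/list.h>
    #include <linux/spinlock.h>
    #include <net/sock.h>

    struct ready_entry {             /* hypothetical per-socket record */
        struct list_head  node;
        struct sock      *sk;
        bool              ready;     /* set by custom_sk_data_ready() */
    };

    static LIST_HEAD(ready_list);
    static DEFINE_SPINLOCK(ready_list_lock);

    static void mark_ready(struct sock *sk)
    {
        struct ready_entry *e;
        unsigned long flags;

        /* Must not sleep: called from the data-ready callback. */
        spin_lock_irqsave(&ready_list_lock, flags);
        list_for_each_entry(e, &ready_list, node) {
            if (e->sk == sk)
                e->ready = true;     /* kernel_select() scans for these */
        }
        spin_unlock_irqrestore(&ready_list_lock, flags);
    }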
Is this method sound? Or should I just struggle with the user/kernel space issue when using the standard system calls?
Initial Finding:
All callback functions in the sock structure are called in an interrupt context, which means they cannot sleep. To allow the main kernel thread to sleep on a list of ready sockets, mutexes are used, but custom_sk_data_ready() must act like a spinlock on those mutexes (calling mutex_trylock() repeatedly). (Note that mutex_trylock() is itself documented as not usable from interrupt context, so a semaphore or spinlock may be the safer primitive here.) This also means that any dynamic allocation must use the GFP_ATOMIC flag.
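As an illustration, any allocation made inside the non-sleeping callback has to look roughly like this (the ready_event record and note_ready() helper are hypothetical):

    #include <linux/slab.h>
    #include <net/sock.h>

    struct ready_event {             /* hypothetical per-event record */
        struct sock *sk;
    };

    static void note_ready(struct sock *sk)
    {
        struct ready_event *ev;

        /* Interrupt (softirq) context: GFP_KERNEL would be a bug here,
         * and GFP_ATOMIC allocations can fail, so check the result. */
        ev = kmalloc(sizeof(*ev), GFP_ATOMIC);
        if (!ev)
            return;
        ev->sk = sk;
        /* ... hand ev off to the sleeping reader ... */
    }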
Additional possibility:
For every open socket, replace the socket's sk_data_ready() with a custom one (custom_sk_data_ready()) and create a worker (struct work_struct) and a work queue (struct workqueue_struct). A common process_msg() function will be used for each worker. Create a kernel-module-level global list where each element holds a pointer to its socket and contains the worker structure. When data is ready on a socket, custom_sk_data_ready() will execute, find the list element with the matching socket, and call queue_work() with the list element's work queue and worker. The process_msg() function will then be called, and it can find the matching list element either through the contents of its struct work_struct * parameter (an address) or by using the container_of() macro to get the address of the list structure that holds the worker structure.
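A sketch of that arrangement, under the same caveats as before (sock_entry, process_msg(), and msg_wq are names I have invented; list locking is omitted for brevity):

    #include <linux/list.h>
    #include <linux/printk.h>
    #include <linux/workqueue.h>
    #include <net/sock.h>

    struct sock_entry {                 /* one element of the global list */
        struct list_head   node;
        struct socket     *sock;
        struct work_struct work;
    };

    static LIST_HEAD(sock_list);
    static struct workqueue_struct *msg_wq;  /* alloc_workqueue("msg_wq", 0, 0) at init */

    static void process_msg(struct work_struct *work)
    {
        struct sock_entry *e = container_of(work, struct sock_entry, work);

        /* Process context: kernel_recvmsg(e->sock, ...) may block here
         * and GFP_KERNEL allocations are allowed. */
        pr_info("data ready on socket %p\n", e->sock);
    }

    static void custom_sk_data_ready(struct sock *sk)
    {
        struct sock_entry *e;

        /* Softirq context: just find the entry and defer the real work. */
        list_for_each_entry(e, &sock_list, node) {
            if (e->sock->sk == sk) {
                queue_work(msg_wq, &e->work);
                break;
            }
        }
    }

    /* Per-socket setup: INIT_WORK(&e->work, process_msg), add e to
     * sock_list, then swap sk->sk_data_ready for custom_sk_data_ready(). */

One convenient property of this scheme: queue_work() does nothing and returns false if the work item is already pending, so back-to-back data-ready events on one socket are coalesced into a single process_msg() run (which should therefore drain the socket completely).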
Which technique is the most sound?
Upvotes: 12
Views: 3374
Reputation: 2720
Your second idea sounds more like it will work.
The Ceph code looks like it does something similar; see net/ceph/messenger.c.
Upvotes: 3