1 file descriptor for multiple threads, shows multiple open files on lsof

Question

I have a program with ~200 threads active at a time. When I open a fd, I know it is shared between the threads.

In /proc/[pid]/fd I can see really only 1 fd, but when looking at all the open files, using lsof I can see the file is opened for each thread. (e.g. same file shows 200 times, with same pid, and different tid)

What is the reason for that?

Also, I need to have different threads writing to the same file (at different locations). Is it thread safe to use this 1 fd? (It does not seem safe to me, but if the file is already opened once per thread, as lsof shows, it could be.)

Andrew Henle · Accepted Answer

lsof lists the file for each "thread" because, owing to the underlying OS design, Linux threads aren't true threads.

The first threads on Linux were "LinuxThreads":

In the Linux operating system, LinuxThreads was a partial implementation of POSIX Threads. It has since been superseded by the Native POSIX Thread Library (NPTL). The main developer of LinuxThreads was Xavier Leroy.

LinuxThreads had a number of problems, mainly owing to the implementation, which used the clone system call to create a new process sharing the parent's address space. For example, threads had distinct process identifiers, causing problems for signal handling; LinuxThreads used the signals SIGUSR1 and SIGUSR2 for inter-thread coordination, meaning these signals could not be used by programs.

To improve the situation, two competing projects were started to develop a replacement; NGPT (Next Generation POSIX Threads) and NPTL. NPTL won out and is today shipped with the vast majority of Linux systems.

LinuxThreads were replaced by NPTL - Native POSIX Thread Library. But there is still a fundamental lack of actual, full kernel-level threads:

Design

NPTL uses a similar approach to LinuxThreads, in that the primary abstraction known by the kernel is still a process, and new threads are created with the clone() system call (called from the NPTL library).

Most of the time, the fact that Linux lacks full kernel-level threads isn't apparent.

And it really doesn't matter how the OS handles concurrent processing.

But that's why lsof lists the file as open by multiple "processes". Because it is. It's just that those "processes" share the same address space, along with a lot of other resources.

Note that one of the "shared resources" is the current offset of an open file descriptor - if you change the offset in one thread, you change it for all threads in the process.
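A minimal sketch of that shared offset, assuming a POSIX system with pthreads. The function names and the temporary file path are made up for illustration. One thread calls lseek() on the descriptor, and the original thread then reads the offset back:

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: runs in a second thread and moves the offset
 * of the descriptor it was handed. */
static void *seek_in_thread(void *arg)
{
    int fd = *(int *)arg;
    lseek(fd, 100, SEEK_SET);
    return NULL;
}

/* Returns the offset the calling thread observes after another thread
 * called lseek() on the same descriptor, or -1 on error. */
off_t offset_after_other_thread_seeks(void)
{
    char path[] = "/tmp/shared_offset_XXXXXX"; /* illustrative path */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the file disappears once the fd is closed */

    /* A freshly created file starts at offset 0. */
    off_t before = lseek(fd, 0, SEEK_CUR);
    assert(before == 0);

    pthread_t t;
    if (pthread_create(&t, NULL, seek_in_thread, &fd) != 0) {
        close(fd);
        return -1;
    }
    pthread_join(t, NULL);

    /* Same descriptor, same offset: the other thread's seek is visible here. */
    off_t after = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return after;
}
```

Calling offset_after_other_thread_seeks() should return 100 even though the calling thread never moved the offset itself. That is why interleaving lseek() and write() from several threads on one descriptor is a race.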

If you need to write to a file open via one file descriptor from multiple threads, you can use the pwrite() function to atomically write to an arbitrary location in the file, without regard to the descriptor's current offset:

#include <unistd.h>

ssize_t pwrite(int fildes, const void *buf, size_t nbyte,
       off_t offset);
...

The pwrite() function shall be equivalent to write(), except that it writes into a given position and does not change the file offset (regardless of whether O_APPEND is set). The first three arguments to pwrite() are the same as write() with the addition of a fourth argument offset for the desired position inside the file. An attempt to perform a pwrite() on a file that is incapable of seeking shall result in an error.
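As a rough sketch of how this could look for the question's use case (several threads, one descriptor, different locations): the thread count, slot size, struct, and function names below are assumptions chosen for illustration, not anything from the question. Each thread fills its own fixed-size region with pwrite(), looping in case of a short write:

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define NTHREADS 4   /* illustrative; the question has ~200 threads */
#define SLOT_SIZE 8  /* hypothetical size of each thread's region */

struct slot_job {
    int fd;     /* the one shared descriptor */
    int index;  /* which region this thread owns */
};

/* Thread body: fill this thread's region with a letter ('A' + index). */
static void *write_slot(void *arg)
{
    const struct slot_job *job = arg;
    char buf[SLOT_SIZE];
    memset(buf, 'A' + job->index, sizeof buf);

    size_t done = 0;
    while (done < sizeof buf) { /* pwrite() may write fewer bytes than asked */
        ssize_t n = pwrite(job->fd, buf + done, sizeof buf - done,
                           (off_t)job->index * SLOT_SIZE + (off_t)done);
        if (n < 0)
            return (void *)-1;
        done += (size_t)n;
    }
    return NULL;
}

/* Writes NTHREADS regions concurrently through one fd, then reads the
 * file back into out. Returns 0 on success, -1 on any failure. */
int write_slots_concurrently(char *out, size_t out_len)
{
    assert(out_len >= NTHREADS * SLOT_SIZE);

    char path[] = "/tmp/pwrite_slots_XXXXXX"; /* illustrative path */
    int fd = mkstemp(path); /* opened read/write, without O_APPEND */
    if (fd < 0)
        return -1;
    unlink(path);

    pthread_t threads[NTHREADS];
    struct slot_job jobs[NTHREADS];
    int started = 0, failed = 0;

    for (int i = 0; i < NTHREADS; i++) {
        jobs[i].fd = fd;
        jobs[i].index = i;
        if (pthread_create(&threads[i], NULL, write_slot, &jobs[i]) != 0) {
            failed = 1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++) {
        void *ret;
        pthread_join(threads[i], &ret);
        if (ret != NULL)
            failed = 1;
    }

    if (!failed && pread(fd, out, NTHREADS * SLOT_SIZE, 0) != NTHREADS * SLOT_SIZE)
        failed = 1;
    close(fd);
    return failed ? -1 : 0;
}
```

Because no thread touches the shared offset, no locking is needed as long as the regions don't overlap. With the values above, the file should end up as eight 'A's, eight 'B's, eight 'C's, and eight 'D's, regardless of the order in which the threads run.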

Note that on Linux, if you open the file with O_APPEND, pwrite() is broken:

BUGS

POSIX requires that opening a file with the O_APPEND flag should have no effect on the location at which pwrite() writes data. However, on Linux, if a file is opened with O_APPEND, pwrite() appends data to the end of the file, regardless of the value of offset.
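If the descriptor is handed to you by other code, one way to guard against this is to check its status flags before relying on pwrite() offsets. A small sketch, with hypothetical function names and an illustrative temporary path, using fcntl() with F_GETFL:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Returns 1 if O_APPEND is set on fd, 0 if it is not, -1 if fcntl() fails. */
int fd_has_append(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return (flags & O_APPEND) ? 1 : 0;
}

/* Demo: reports the flag on a fresh temporary file, then again after
 * O_APPEND has been switched on with F_SETFL. Returns 0 on success. */
int append_flag_demo(int *before, int *after)
{
    assert(before != NULL && after != NULL);

    char path[] = "/tmp/append_check_XXXXXX"; /* illustrative path */
    int fd = mkstemp(path); /* mkstemp() does not set O_APPEND */
    if (fd < 0)
        return -1;
    unlink(path);

    *before = fd_has_append(fd);

    int flags = fcntl(fd, F_GETFL);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_APPEND) == -1) {
        close(fd);
        return -1;
    }
    *after = fd_has_append(fd);

    close(fd);
    return 0;
}
```

If fd_has_append() returns 1, positional writes through that descriptor will land at the end of the file on Linux, so either clear the flag or open the file without O_APPEND in the first place.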
