Anand C U

Reputation: 915

Is there a way to check what part of my code leaves file handles open

Is there a way to trace a Python process to find out where a file is being opened? lsof shows far too many open files for my running process, but I'm not sure where they are being opened.

ls /proc/$pid/fd/ | wc -l

I suspect one of the libraries I'm using is not closing its files properly. Is there a way to isolate exactly which line of my Python code opens the files?

In my code I use third-party libraries to process thousands of media files, and because the files are left open I get the error

OSError: [Errno 24] Too many open files

after running for a few minutes. I know that raising the open-file limit is an option, but that would only postpone the error.

Upvotes: 4

Views: 902

Answers (3)

Pi Marillion

Reputation: 4674

Seeing open file handles is easy on Linux:

import os
open_file_handles = os.listdir('/proc/self/fd')
print('open file handles: ' + ', '.join(map(str, open_file_handles)))

You can also use the following on any Unix-like OS (e.g. Linux, macOS). It will not run on Windows, because the resource module is not available there:

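import errno, os, resource
open_file_handles = []
# Probe every descriptor number below the soft limit on open files.
soft_limit = resource.getrlimit(resource.RLIMIT_NOFILE)[0]
for fd in range(soft_limit):
    try:
        os.fstat(fd)
    except OSError as e:
        if e.errno == errno.EBADF:
            continue  # this number is not an open descriptor
        # any other error still means the descriptor exists
    open_file_handles.append(fd)
print('open file handles: ' + ', '.join(map(str, open_file_handles)))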

Note: The loop only probes descriptor numbers below the soft limit, which is enough if you are actually (occasionally) running out of file handles. The default soft limit is usually small (often 256 on macOS and 1024 on Linux), but the scan might take a long time if the limit (set by the OS/user policy) is something huge like a billion.

Note also: There will almost always be at least three file handles open for STDIN, STDOUT, and STDERR respectively.
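
To see what each descriptor points at, rather than just its number, the /proc listing above can be extended by resolving each entry with os.readlink. This is only a minimal sketch, assuming Linux with /proc mounted; list_open_fds is a hypothetical helper name introduced here, not part of any library.

```python
import os

def list_open_fds(fd_dir="/proc/self/fd"):
    """Return {fd number: link target} for the current process (Linux only)."""
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            # Each entry is a symlink to the file, socket or pipe behind the fd.
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor os.listdir used for the scan is already closed.
            continue
    return targets

if __name__ == "__main__":
    for fd, target in sorted(list_open_fds().items()):
        print(fd, target)
```

Running this periodically and comparing the targets between runs may show which files keep accumulating, which in turn hints at the library responsible.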

Upvotes: 0

tkrennwa

Reputation: 535

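The easiest way to trace the open calls is to use an audit hook in Python (sys.addaudithook, available since Python 3.8). Note that this method only traces opens that go through Python's own open calls, not system calls made directly by native code such as C extensions.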

Let fdmod.py be a module file with a single function foo:

def foo():
    return open("/dev/zero", mode="r")

Now the main code in file fd_trace.py, which traces all open calls and imports fdmod, is defined as follows:

import sys
import inspect
import fdmod

def open_audit_hook(name, *args):
    if name == "open":
        print(name, *args, "was called:")
        caller = inspect.currentframe()
        while caller := caller.f_back:
            print(f"\tFunction {caller.f_code.co_name} "
                  f"in {caller.f_code.co_filename}:"
                  f"{caller.f_lineno}"
            )
sys.addaudithook(open_audit_hook)

# main code
fdmod.foo()
with open("/dev/null", "w") as dev_null:
    dev_null.write("hi")
fdmod.foo()

When we run fd_trace.py, it prints the call stack each time some component calls open:

% python3 fd_trace.py
open ('/dev/zero', 'r', 524288) was called:
        Function foo in /home/tkrennwa/fdmod.py:2
        Function <module> in fd_trace.py:17
open ('/dev/null', 'w', 524865) was called:
        Function <module> in fd_trace.py:18
open ('/dev/zero', 'r', 524288) was called:
        Function foo in /home/tkrennwa/fdmod.py:2
        Function <module> in fd_trace.py:20

See sys.addaudithook and inspect.currentframe for details.
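
If thousands of files are being opened, printing a full stack for every call gets noisy. One possible variation on the hook above is to count opens per call site and print a summary. This is a minimal sketch, assuming Python 3.8+; open_sites, count_open_sites and report_open_sites are hypothetical names introduced here for illustration, and only the immediate caller of open is recorded.

```python
import collections
import inspect
import sys

# Maps (filename, line number, function name) -> number of open events seen there.
open_sites = collections.Counter()

def count_open_sites(event, args):
    """Audit hook: tally each "open" event by the frame that triggered it."""
    if event != "open":
        return
    hook_frame = inspect.currentframe()
    caller = hook_frame.f_back if hook_frame is not None else None
    if caller is None:
        return  # no Python frame behind this open
    code = caller.f_code
    open_sites[(code.co_filename, caller.f_lineno, code.co_name)] += 1

# Audit hooks cannot be removed once installed.
sys.addaudithook(count_open_sites)

def report_open_sites(limit=10):
    """Print the call sites that opened the most files so far."""
    for (filename, lineno, func), count in open_sites.most_common(limit):
        print(f"{count:6d}  {filename}:{lineno} in {func}")
```

Calling report_open_sites() shortly before the process hits the limit should list the busiest call sites. Keep in mind that this counts opens, not leaks: a site that opens and correctly closes many files will also rank high, so the result still needs to be compared with what is left in /proc/$pid/fd.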

Upvotes: 8

Dave Costa

Reputation: 48111

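You might get useful information using strace, for example strace -f -e trace=openat,close -p $pid to attach to the running process. This shows the system calls made by the process, including the ones that open files (on most current Linux systems open() is implemented via the openat system call). It will not directly show you where in the Python code those calls occur, but you may be able to deduce that from the file names and the surrounding calls.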

Upvotes: 1
