Kodiologist

Reputation: 3495

Show whether a Python module is loaded from bytecode

I'm trying to debug Hy's use of bytecode. In particular, each time a module is imported, I want to see the path it was actually imported from, whether source or bytecode. Under the hood, Hy manages modules with importlib. It doesn't explicitly read or write bytecode; that's taken care of by importlib.machinery.SourceFileLoader. So it looks like what I want to do is monkey-patch Python's importing system to print the import path each time an import happens. How can I do that? I should be able to figure out how to do it for Hy once I understand how to do it for Python.

Upvotes: 1

Views: 390

Answers (2)

Bastian Venthur

Reputation: 16640

The easiest way, which does not involve any coding, is to start Python with two(!) verbose flags:

python -vv myscript.py

You'll get a lot of output, including every import statement and every file Python tries to access in order to load each module. In this example I have a simple Python script that does import json:

lots of output!
[...]
# trying /tmp/json.cpython-310-x86_64-linux-gnu.so
# trying /tmp/json.abi3.so
# trying /tmp/json.so
# trying /tmp/json.py
# trying /tmp/json.pyc
# /usr/lib/python3.10/json/__pycache__/__init__.cpython-310.pyc matches /usr/lib/python3.10/json/__init__.py
# code object from '/usr/lib/python3.10/json/__pycache__/__init__.cpython-310.pyc'
[...]
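Since the verbose output is long, it can help to filter it down to the lines that say which file was finally used. Here is a minimal sketch, assuming the "matches" and "code object from" messages look like the ones in the output above; the helper name verbose_import_lines is made up for illustration and is not part of Python:

```python
import subprocess
import sys

# Substrings of the verbose lines that name the file actually used;
# the "# trying ..." probes are deliberately left out.
MARKERS = (" matches ", "code object from")


def verbose_import_lines(statement, cwd=None):
    """Run `statement` in a child interpreter with -vv and return the
    verbose lines that contain one of MARKERS (hypothetical helper)."""
    result = subprocess.run(
        [sys.executable, "-vv", "-c", statement],
        capture_output=True,
        text=True,
        errors="replace",
        cwd=cwd,
    )
    # The verbose import messages are written to stderr, not stdout.
    return [
        line.rstrip()
        for line in result.stderr.splitlines()
        if line.startswith("#") and any(marker in line for marker in MARKERS)
    ]


if __name__ == "__main__":
    for line in verbose_import_lines("import json"):
        print(line)
```

A "code object from" line that names a .pyc file means the module came from bytecode; one that names a .py file means it was compiled from source.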

Alternatively, though it is more complex, you could hook the import statement itself. The import statement calls the built-in __import__ function, so you can replace builtins.__import__ with a wrapper of your own. That way you can print where each imported module was found.
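A rough sketch of the __import__ approach could look like the following. The names trace_imports and report are made up for illustration. Note two assumptions: the wrapper only reports absolute imports of modules that were not already in sys.modules, and spec.cached is only the location where a .pyc would live, so it does not prove the bytecode was actually used. Also, importlib.import_module() does not go through builtins.__import__, so imports made that way are not seen.

```python
import builtins
import contextlib
import sys


@contextlib.contextmanager
def trace_imports(report=print):
    """Temporarily replace builtins.__import__ with a reporting wrapper
    (hypothetical helper, not part of Python)."""
    original = builtins.__import__

    def tracing_import(name, globals=None, locals=None, fromlist=(), level=0):
        already_loaded = name in sys.modules
        module = original(name, globals, locals, fromlist, level)
        # Only report first-time absolute imports; relative imports
        # (level > 0) would need their names resolved first.
        if level == 0 and not already_loaded:
            spec = getattr(sys.modules.get(name), "__spec__", None)
            if spec is not None:
                report(f"{name}: origin={spec.origin} cached={spec.cached}")
        return module

    builtins.__import__ = tracing_import
    try:
        yield
    finally:
        # Always put the real __import__ back.
        builtins.__import__ = original


if __name__ == "__main__":
    with trace_imports():
        import colorsys  # reported only if not imported earlier
```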

Upvotes: 2

Phoenix

Reputation: 982

Seems like a good option would be to dynamically patch importlib.machinery.SourceFileLoader(fullname, path) and importlib.machinery.SourcelessFileLoader(fullname, path) so that each one prints, or writes to a variable, (a) the calling method and (b) the arguments passed to it.
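One way to patch the loaders without breaking imports is to wrap their get_data() method, which the loaders call with the path of each file they read. This is only a sketch: log_loader_reads and report are names invented here, and it assumes plain CPython loaders (a project like Hy that installs its own loader may need the same wrapper applied to that class instead). A stale .pyc is still read before Python falls back to the source, so for a given module the last line reported is the one that counts.

```python
import contextlib
import importlib.machinery


@contextlib.contextmanager
def log_loader_reads(report=print):
    """Temporarily wrap get_data() on the two file loaders so that every
    successful file read is reported (hypothetical helper)."""
    loaders = (
        importlib.machinery.SourceFileLoader,
        importlib.machinery.SourcelessFileLoader,
    )
    # Remember whether each class defines get_data itself or inherits it.
    saved = {cls: vars(cls).get("get_data") for cls in loaders}

    def make_wrapper(cls, original):
        def get_data(self, path):
            data = original(self, path)  # raises OSError if the file is missing
            kind = "bytecode" if str(path).endswith(".pyc") else "source"
            report(f"{cls.__name__} read {kind}: {path}")
            return data
        return get_data

    for cls in loaders:
        cls.get_data = make_wrapper(cls, cls.get_data)
    try:
        yield
    finally:
        for cls, own in saved.items():
            if own is None:
                del cls.get_data  # fall back to the inherited method
            else:
                cls.get_data = own


if __name__ == "__main__":
    with log_loader_reads():
        import colorsys  # nothing is printed if it was imported earlier
```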

If all you need is:

I want to see the path it was actually imported from, whether source or bytecode

and you don't need the import to "work properly", perhaps you can use a modified version of something like this. For example, I quickly modified their sample code to get the following. I have not tested it, so it may not work exactly, but it should get you on the right track:

import importlib.machinery

# custom class to be the mock return value
class MockSourceLoader:

    # mock SourceFileLoader: always reports that the module was loaded
    # from source, along with its name and path
    @staticmethod
    def SourceFileLoader(fullname, path):
        return {"load type": "source", "fullname": fullname, "path": path}

def check_how_imported(monkeypatch):

    # Any arguments may be passed and mock_get() will always return our
    # mocked object
    def mock_get(*args, **kwargs):
        return MockSourceLoader.SourceFileLoader(*args, **kwargs)

    # apply the monkeypatch (monkeypatch is pytest's fixture); the
    # attribute name must be passed as a string
    monkeypatch.setattr(importlib.machinery, "SourceFileLoader", mock_get)

You would, of course, provide a similar mock for SourcelessFileLoader to cover sourceless (bytecode-only) loading.


Upvotes: 0
