Anaphory

Reputation: 6412

mypy does not like aliased Cython types

I am trying to speed up a PEP 484 typed Python script using Cython. I want to maintain some semantics and readability.

Before, I had

Flags = int

def difference(f1: Flags, f2: Flags):
    return bin(f1 ^ f2).count("1")

Now this function gets called quite often and is a natural candidate for slight refactoring and compiling to C using Cython, but I would not want to lose the information that f1 and f2 are collections of flags. So, naturally, I tried

import cython

Flags = cython.int

def difference(f1: Flags, f2: Flags):
    return bin(f1 ^ f2).count("1")

Now, mypy fails this, complaining

flags.py:5: error: Variable "flags.Flags" is not valid as a type
flags.py:5: note: See https://mypy.readthedocs.io/en/latest/common_issues.html#variables-vs-type-aliases
flags.py:6: error: Unsupported left operand type for ^ (Flags?)

whereas without that type alias

import cython

def difference(f1: cython.int, f2: cython.int):
    return bin(f1 ^ f2).count("1")

the module checks just fine (apart from the missing library stub for cython).

What is going on here? Isn't the point of a type alias exactly that there should be no difference in behaviour down the line?

Upvotes: 4

Views: 1202

Answers (1)

Michael0x2a

Reputation: 64188

The problem you're running into here is that since there are no type hints associated with cython, it's unfortunately ambiguous what exactly the expression cython.int is supposed to mean -- and therefore ambiguous what Flags = cython.int is supposed to mean.

In particular, it could be the case that cython.int is supposed to be a value, not a type. In that case, Flags = cython.int would be just a regular variable assignment, instead of a type alias.

While mypy could theoretically try analyzing the rest of your program to resolve this ambiguity, that would be somewhat expensive to do. So instead, it somewhat arbitrarily decides that cython.int must be a value (e.g. a constant) which in turn causes your difference function to fail to type check.

However, if you use the cython.int type directly in a type signature, we get no such ambiguity: in that context, that expression is most likely meant to be a type of some sort, so mypy decides to interpret the expression the other way.
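The difference between a variable assignment and a type alias can be reproduced without Cython. The following is only a rough sketch of an analogous situation, not the exact code path mypy takes for a module without stubs: `mystery` is a hypothetical stand-in for something like cython.int, about which the type checker knows nothing beyond Any.

```python
from typing import Any

# Hypothetical stand-in for an attribute of a module without type hints:
# all the type checker knows is that its type is Any.
mystery: Any = int

# Plain assignment from a variable: mypy infers another variable here,
# not a type alias, even though the runtime value happens to be a class.
Flags = mystery


def difference(f1: Flags, f2: Flags):  # mypy is expected to reject Flags here
    # Count the bits that differ between the two flag sets.
    return bin(f1 ^ f2).count("1")


print(difference(0b1010, 0b0110))
```

The script runs fine, because Python does not care at runtime what the annotation evaluates to; only the type checker has to decide whether Flags names a type or a value.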


So, how do you work around this? Well, there are several things you could try, which I'll list in roughly decreasing order of effort (and increasing order of hackiness).

  1. Submit a pull request to mypy implementing support for PEP 613. This PEP is intended to let users resolve this ambiguity themselves by stating explicitly whether something is supposed to be a type alias or not.

    This PEP has been accepted; the only reason mypy doesn't support it yet is that nobody has gotten around to implementing it.

  2. Ask the Cython maintainers if they'd be ok with shipping stub files for cython by turning their package into a PEP 561 compliant package -- a package that comes bundled with type hints.

    It seems Cython is already bundling some type hints in a limited way, and making them available for external use may theoretically be as simple as testing to make sure they're still up-to-date and adding a py.typed file to the Cython package.

    More context about type hints in Cython can be found here and here.

    Mypy is also planning, over the next few months, to overhaul how imports are handled so that you can optionally use any bundled type hints even if the package isn't PEP 561 compliant -- you could also wait for that to happen.

  3. Create your own stubs package for cython. This package can be incomplete and define only int and a few other things you need. For example, you could create a "stubs/cython.pyi" file that looks like this:

    from typing import Any
    
    # Defining these two functions will tell mypy that this stub file
    # is incomplete and to not complain if you try importing things other
    # than 'int'. 
    def __getattr__(name: str) -> Any: ...
    def __setattr__(name: str, value: Any) -> None: ...
    
    class _int:
        # Define relevant method stubs here
        ...
    
    # Expose the class under the name your code refers to (cython.int).
    int = _int
    

    Then, point mypy at this stub file in addition to your usual code. Mypy will then understand that it should use this stub file to serve as type hints for the cython module. This means that when you do cython.int, mypy will see that it's the class you defined up above and so will have enough information to know that Flags = cython.int is likely a type alias.

  4. Redefine what Flags is assigned when performing type checking only. You can do this via the typing.TYPE_CHECKING variable:

    from typing import TYPE_CHECKING
    import cython
    
    # The TYPE_CHECKING variable is always False at runtime, but is treated
    # as being always True for the purposes of type checking
    if TYPE_CHECKING:
        # Hopefully this is a good enough approximation of cython.int?
        Flags = int
    else:
        Flags = cython.int
    
    def difference(f1: Flags, f2: Flags):
        return bin(f1 ^ f2).count("1")
    

    One caveat to this approach is that I'm not sure to what degree Cython supports these sorts of PEP 484 tricks and whether it'll recognize that Flags is meant to be a type alias if it's wrapped in an if statement like this.

  5. Instead of making Flags a type alias for cython.int, make it a subclass:

    import cython
    
    class Flags(cython.int): pass
    
    def foo(a: Flags, b: Flags) -> Flags:
        return a ^ b
    

    Now, you're using cython.int back in a context where it's reasonable to assume it's a type, and mypy ends up not reporting an error.

    Granted, this does change the semantics of your program and may also make Cython unhappy -- I'm not really familiar with how Cython works, but I suspect you're not really meant to subclass cython.int.

Upvotes: 8
