Leon Cruz

Reputation: 375

Type hints for dataclass defined inside a class with generic types

I know that the title is very confusing, so let me take the Binary Search Tree as an example:

Using ordinary class definition

# This code passed mypy test
from typing import Generic, TypeVar

T = TypeVar('T')
class BST(Generic[T]):
    class Node:        
        def __init__(
            self,
            val: T,
            left: 'BST.Node',
            right: 'BST.Node'
        ) -> None:
            self.val = val
            self.left = left
            self.right = right

The code above passes mypy.

Using dataclass

However, when I tried to use a dataclass to simplify the definition of Node, mypy rejected the code.

# This code failed to pass mypy test
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar('T')
class BST(Generic[T]):
    @dataclass
    class Node:
        val: T
        left: 'BST.Node'
        right: 'BST.Node'

mypy gave me this error message: (test_typing.py:8 is the line val: T)

test_typing.py:8: error: Type variable "test_typing.T" is unbound
test_typing.py:8: note: (Hint: Use "Generic[T]" or "Protocol[T]" base class to bind "T" inside a class)
test_typing.py:8: note: (Hint: Use "T" in function signature to bind "T" inside a function)

Pinpoint the problem

# This code passed mypy test, suggesting the problem is the reference to `T` in the dataclass definition
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar('T')
class BST(Generic[T]):
    @dataclass
    class Node:
        val: int # chose `int` just for testing
        left: 'BST.Node'
        right: 'BST.Node'

The code above again passes mypy, so I think the problem is the reference to T in the dataclass definition. Does anyone know how to fix this so that I can meet my original goal?

Upvotes: 6

Views: 6753

Answers (2)

MisterMiyagi

Reputation: 51979

Nested classes cannot implicitly use a TypeVar from their containing class: the nested class must itself be Generic, with its own unbound TypeVar.

from dataclasses import dataclass
from typing import Generic, TypeVar

BT = TypeVar('BT')
NT = TypeVar('NT')

class BST(Generic[BT]):
    root: 'BST.Node[BT]'  # root node is of same type as search tree

    @dataclass
    class Node(Generic[NT]):  # generic node may be of any type
        val: NT
        left: 'BST.Node[NT]'
        right: 'BST.Node[NT]'

This makes the nested class well-defined when referred to outside of its containing class. The underlying issue is that the nested class exists separately from the outer specialisation: inference only knows BST.Node or BST.Node[T], not BST[T].Node.
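
The separation between the nested class and the outer specialisation can also be seen at runtime. The following is a minimal sketch, not part of the original answer: the Optional children with None defaults are an assumption added only so that a node can be constructed, and the names same_class and leaf are illustrative.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

BT = TypeVar('BT')
NT = TypeVar('NT')

class BST(Generic[BT]):
    @dataclass
    class Node(Generic[NT]):
        val: NT
        # Assumption: children are optional so that leaves can exist
        left: Optional['BST.Node[NT]'] = None
        right: Optional['BST.Node[NT]'] = None

# Subscripting the outer class does not create a distinct nested class:
# attribute access on BST[int] is forwarded to the plain BST class.
same_class = BST[int].Node is BST[str].Node is BST.Node

# The node is parameterised on its own, independently of BST[...]
leaf = BST.Node[int](val=1)
```

Here BST[int].Node and BST[str].Node resolve to the very same class object, which is consistent with the point that there is no such thing as BST[T].Node for a type checker to specialise.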


Since nesting does not provide any functional advantages, it is usually simpler to define separate classes that reuse the same TypeVar:

T = TypeVar('T')

class BST(Generic[T]):
    root: 'Node[T]'

@dataclass
class Node(Generic[T]):
    val: T
    left: 'Node[T]'
    right: 'Node[T]'
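
As a usage illustration of the flat layout, here is a hedged sketch that builds a small tree and walks it. It departs from the snippet above in a few assumed details: the children are Optional with None defaults, Node is defined before BST, BST gets a simple __init__, and the in_order helper is hypothetical.

```python
from dataclasses import dataclass
from typing import Generic, List, Optional, TypeVar

T = TypeVar('T')

@dataclass
class Node(Generic[T]):
    val: T
    # Assumption: optional children so that leaves can be constructed
    left: Optional['Node[T]'] = None
    right: Optional['Node[T]'] = None

class BST(Generic[T]):
    def __init__(self) -> None:
        self.root: Optional[Node[T]] = None

def in_order(node: Optional[Node[T]]) -> List[T]:
    """Hypothetical helper: collect values left, node, right."""
    if node is None:
        return []
    return in_order(node.left) + [node.val] + in_order(node.right)

tree: BST[int] = BST()
tree.root = Node(2, Node(1), Node(3))
values = in_order(tree.root)
```

Because Node and BST are each generic over T at module level, the annotation on root ties the node type to the tree type without any nesting.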

Upvotes: 4

alex_noname

Reputation: 32233

Let's start with what is written in PEP 484 about scoping rules for type variables:

A generic class nested in another generic class cannot use same type variables. The scope of the type variables of the outer class doesn't cover the inner one:

T = TypeVar('T')
S = TypeVar('S')

class Outer(Generic[T]):
   class Bad(Iterable[T]):       # Error
       ...
   class AlsoBad:
       x = None  # type: List[T] # Also an error

   class Inner(Iterable[S]):     # OK
       ...
   attr = None  # type: Inner[T] # Also OK

This is why your example with the nested, decorated class does not work.

Now let's answer the question of why the example works with an __init__ function that takes a TypeVar-annotated argument.

This is because mypy treats the __init__ method as a generic method with its own, independent TypeVar. For example, reveal_type(BST[int].Node.__init__) shows Revealed type is 'def [T, T] (self: main.BST.Node, val: T'-1, left: main.BST.Node, right: main.BST.Node)', i.e. T is not bound to int here.
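
A practical consequence can be sketched as follows. This is an illustration under assumptions: the Optional defaults are added so the call is short, and the statement about the type checker follows the reading given above rather than a verified mypy run, so results may differ between checkers and versions.

```python
from typing import Generic, Optional, TypeVar

T = TypeVar('T')

class BST(Generic[T]):
    class Node:
        def __init__(
            self,
            val: T,  # per the explanation above, scoped to this method only
            left: Optional['BST.Node'] = None,
            right: Optional['BST.Node'] = None,
        ) -> None:
            self.val = val
            self.left = left
            self.right = right

# Runs fine at runtime. If T is method-scoped as described, a checker would
# not be expected to tie `val` to the `int` in BST[int] either.
node = BST[int].Node("not an int")
same_class = BST[int].Node is BST.Node
```

So the __init__ version passing mypy does not mean the node's value type is linked to the tree's type parameter; it only means T was quietly rebound per call.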

Upvotes: 5
