Reputation: 95
I am currently working on adding type hints to a project and can't figure out how to get this right. I have a list of lists, with the nested list containing two elements of type int and float. The first element of the nested list is always an int and the second is always a float.
my_list = [[1000, 5.5], [1432, 2.2], [1234, 0.3]]
I would like to type annotate it so that unpacking the inner list in for loops or list comprehensions keeps the type information. I could change the inner lists to tuples and would get what I'm looking for:
def some_function(list_arg: list[tuple[int, float]]): pass
However, I need the inner lists to be mutable. Is there a nice way to do this for lists? I know that abstract classes like Sequence and Collection do not support multiple types.
Upvotes: 8
Views: 1681
Reputation: 2504
One small addition to TallChuck's response:
from recordclass import dataobject

class Sample(dataobject):
    atomCount: int
    atomicMass: float
The class Sample doesn't support a namedtuple-like API, but it does support a dataclasses-like API.
Upvotes: 0
Reputation: 1972
I think the question highlights a fundamental difference between statically typed Python and dynamically typed Python. For someone who is used to dynamically typed Python (or Perl or JavaScript or any number of other scripting languages), it's perfectly normal to have diverse data types in a list. It's convenient, flexible, and doesn't require you to define custom data types. However, when you introduce static typing, you step into a tighter box that requires more rigorous design.
As several others have already pointed out, type annotations for lists require all elements of the list to be the same type, and don't allow you to specify a length. Rather than viewing this as a shortcoming of the type system, you should consider that the flaw is in your own design. What you are really looking for is a class with two data members. The first data member is named 0, and has type int, and the second is named 1, and has type float. As your friend, I would recommend that you define a proper class, with meaningful names for these data members. As I'm not sure what your data type represents, I'll make up names, for illustration.
class Sample:
    def __init__(self, atomCount: int, atomicMass: float):
        self.atomCount = atomCount
        self.atomicMass = atomicMass
This not only solves the typing problem, but also gives a major boost to readability. Your code would now look more like this:
my_list = [Sample(1000, 5.5), Sample(1432, 2.2), Sample(1234, 0.3)]
def some_function(list_arg: list[Sample]): pass
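As a minimal sketch of how this plays out in loops and comprehensions (the helper names total_mass and scale_masses are made up for illustration), attribute access keeps the per-field types and the records stay mutable:

```python
class Sample:
    def __init__(self, atomCount: int, atomicMass: float):
        self.atomCount = atomCount
        self.atomicMass = atomicMass


def total_mass(samples: list[Sample]) -> float:
    # A type checker sees s.atomCount as int and s.atomicMass as float.
    return sum(s.atomCount * s.atomicMass for s in samples)


def scale_masses(samples: list[Sample], factor: float) -> None:
    # Unlike tuples, the records can be modified in place.
    for s in samples:
        s.atomicMass *= factor


my_list = [Sample(1000, 5.5), Sample(1432, 2.2), Sample(1234, 0.3)]
scale_masses(my_list, 2.0)
counts = [s.atomCount for s in my_list]
```

The cost is that positional unpacking (count, mass = item) no longer works unless you add it yourself, so loops read the fields by name instead.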
I do think it's worth highlighting Stef's comment, which points to this question. The answers given highlight two useful features related to this.
First, as of Python 3.7, you can mark a class as a data class, which will automatically generate methods like __init__(). The Sample class would look like this, using the @dataclass decorator:
from dataclasses import dataclass

@dataclass
class Sample:
    atomCount: int
    atomicMass: float
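A short usage sketch, assuming you still want tuple-style unpacking now and then: data class instances are mutable by default, and dataclasses.astuple gives you a tuple copy of the fields. As far as I know the result of astuple is typed loosely, so attribute access is what actually preserves the field types for a checker.

```python
from dataclasses import dataclass, astuple


@dataclass
class Sample:
    atomCount: int
    atomicMass: float


sample = Sample(1000, 5.5)
sample.atomicMass = 6.0  # fields are mutable by default

# The generated __eq__ compares instances field by field.
same = sample == Sample(1000, 6.0)

# Positional unpacking goes through a tuple copy of the fields.
count, mass = astuple(sample)
```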
Another answer to that question mentions a PyPI package called recordclass, which it says is basically a mutable namedtuple. The typed version is called RecordClass:
from recordclass import RecordClass

class Sample(RecordClass):
    atomCount: int
    atomicMass: float
Upvotes: 5
Reputation: 3805
The mutability of the data structure is not compatible with a static length and an invariant order of the types contained. It is not possible to statically analyze the sequence unpacking if you can sort, append, prepend or insert records into it.
Imagine the following snippet:
import random

def some_function(list_arg: list[int, float]):  # not a valid annotation :)
    myint, myfloat = list_arg  # ok?
    list_arg.sort()
    myint, myfloat = list_arg  # ??????
    if random.random() < .5:
        list_arg.insert(1, 'yet another type!')
    myint, myfloat = list_arg  # 50% chance of an actual runtime error
    # fat chance for any static analysis!
If mutability of the sequence is imperative, write a union or another richer type hint for the potential types of the contained objects:
def some_function(list_arg: list[list[A|B]]): pass
or use a supertype, since int is duck-type compatible with float (https://mypy.readthedocs.io/en/latest/duck_type_compatibility.html#duck-type-compatibility):
def some_function(list_arg: list[list[float]]): pass
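To illustrate the trade-off of the supertype approach (a sketch only; bump_counts is a hypothetical helper): annotating the inner lists as list[float] type-checks, but a checker then treats both elements as float, so the int has to be recovered explicitly.

```python
def bump_counts(list_arg: list[list[float]]) -> list[int]:
    counts = []
    for pair in list_arg:
        count, value = pair          # both are seen as float by a checker
        pair[0] = int(count) + 1     # inner lists stay mutable
        counts.append(int(pair[0]))  # convert back to int explicitly
    return counts


my_list: list[list[float]] = [[1000, 5.5], [1432, 2.2], [1234, 0.3]]
result = bump_counts(my_list)
```

Nothing here stops someone from storing 1.5 in the first slot or appending a third element, which is exactly the information the annotation cannot carry.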
If your data structure will not actually be mutated, then choosing a list instead of a tuple was the first mistake.
Upvotes: 2