toriningen

Reputation: 7462

Python type annotation for sequences of strings, but not for strings?

Is there a Python type hint that matches lists, tuples and possibly other sequential types, but does not match strings?

The issue is that strings are themselves sequences of strings of length 1 (i.e. individual characters), so they technically match Sequence[str], but passing a string to a function that expects a list of strings is an error in nearly 100% of cases.
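
To make the pitfall concrete, here is a minimal sketch. The helper bracket_args is hypothetical and invented only for this illustration. A type checker accepts both calls because str is itself a Sequence[str], yet the second call silently iterates over characters.

```python
from collections.abc import Sequence


def bracket_args(argv: Sequence[str]) -> str:
    """Hypothetical helper: wrap every argument in brackets."""
    return " ".join(f"[{arg}]" for arg in argv)


# Intended use: a list (or tuple) of strings.
as_list = bracket_args(["ls", "-l"])

# Accidental use: a bare string also satisfies Sequence[str],
# so it is split into one-character "arguments".
as_string = bracket_args("ls")
```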

Is there a way to exclude strings from a type annotation, something like the non-existent And[Sequence[str], Not[str]]?

As for the purpose, I would like to annotate this function:

PathType = Union[str, os.PathLike]
def escape_cmdline(argv: Union[List[PathType], Tuple[PathType, ...]]) -> str: ...

But the existing signature looks bloated to me, and it does not cover custom types that are list- or tuple-compatible. Is there a better way?

Upvotes: 49

Views: 6912

Answers (7)

InSync

Reputation: 10472

The useful_types package has a SequenceNotStr type that does exactly this:

(playgrounds: Mypy, Pyright)

from useful_types import SequenceNotStr

def first(v: SequenceNotStr[str]) -> str:
    return next(iter(v))
first(['foo'])  # fine
first('foo')    # error: Expected `SequenceNotStr[str]`, got `str`

This type is a Protocol. Here is its full definition, which can be copy-and-pasted directly if you don't want to introduce an extra dependency:

(Tweaking this so that it works with 3.11 and lower is left as an exercise for the reader.)

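from collections.abc import Iterator, Sequence
from typing import Any, Protocol, SupportsIndex, overload

# Source from https://github.com/python/typing/issues/256#issuecomment-1442633430
class SequenceNotStr[T](Protocol):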
    @overload
    def __getitem__(self, index: SupportsIndex, /) -> T: ...
    @overload
    def __getitem__(self, index: slice, /) -> Sequence[T]: ...
    def __contains__(self, value: object, /) -> bool: ...
    def __len__(self) -> int: ...
    def __iter__(self) -> Iterator[T]: ...
    def index(self, value: Any, start: int = 0, stop: int = ..., /) -> int: ...
    def count(self, value: Any, /) -> int: ...
    def __reversed__(self) -> Iterator[T]: ...

It works by relying on the fact that str.__contains__() does not accept object:

# (at both type checking time and runtime)
object() in ['foo']  # fine
object() in 'foo'    # error

class SequenceNotStr[T](Protocol):
    ...
    def __contains__(self, value: object, /) -> bool: ...
    #                             ^^^^^^

# https://github.com/python/typeshed/blob/7178fa3356/stdlib/typing.pyi#L613
class Sequence(Reversible[_T_co], Collection[_T_co]):
    ...
    def __contains__(self, value: object) -> bool: ...
    #                             ^^^^^^

# https://github.com/python/typeshed/blob/7178fa3356/stdlib/builtins.pyi#L710-L711
class str(Sequence[str]):
    ...
    # Incompatible with Sequence.__contains__
    def __contains__(self, key: str, /) -> bool: ...
    #                           ^^^
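
For Python 3.11 and lower, one possible tweak is to replace the PEP 695 class syntax with an explicit covariant TypeVar. This is only a sketch under that assumption: the name _T_co is a convention chosen here, and the method signatures are copied from the definition above.

```python
from collections.abc import Iterator, Sequence
from typing import Any, Protocol, SupportsIndex, TypeVar, overload

# Covariant type variable standing in for the [T] of the 3.12+ syntax.
_T_co = TypeVar("_T_co", covariant=True)


class SequenceNotStr(Protocol[_T_co]):
    @overload
    def __getitem__(self, index: SupportsIndex, /) -> _T_co: ...
    @overload
    def __getitem__(self, index: slice, /) -> Sequence[_T_co]: ...
    # str.__contains__ does not accept object, so str fails to match here.
    def __contains__(self, value: object, /) -> bool: ...
    def __len__(self) -> int: ...
    def __iter__(self) -> Iterator[_T_co]: ...
    def index(self, value: Any, start: int = 0, stop: int = ..., /) -> int: ...
    def count(self, value: Any, /) -> int: ...
    def __reversed__(self) -> Iterator[_T_co]: ...


def first(v: SequenceNotStr[str]) -> str:
    return next(iter(v))


# The protocol is only enforced by the type checker; nothing is checked at runtime.
first_of_list = first(["foo"])
first_of_tuple = first(("a", "b"))
```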

Upvotes: 0

Deep Drop

Reputation: 143

There is a way to annotate a function so that it takes Sequence[str] or Iterable[str] but not str, while the code after a wrong call (one with a str argument) is still checked. Pyright handles this correctly (it complains about the wrong calls); I'm not sure whether other type checkers do:


from typing import overload, Sequence
from warnings import deprecated   # Python 3.13+
# from typing_extensions import deprecated  # Python 3.12 and earlier


@overload
@deprecated('v must not be a string')
def first(v: str) -> ...: ...


@overload
def first(v: Sequence[str]) -> ...: ...


def first(v: Sequence[str]) -> ...:
    # actual code
    ...


first('STR')
# the code below is checked
{}.update().items()  # Cannot access attribute "items" for class "None"

Upvotes: 3

InSync

Reputation: 10472

Now that PEP 702 is accepted, there's another workaround: use @deprecated on the overloads you don't want your users to call. This decorator has yet to be supported by Mypy, however.

(playground: Pyright)

from collections.abc import Iterable
from typing import Never, overload

# 3.13+
from warnings import deprecated
# 3.12-
# from typing_extensions import deprecated

@overload
@deprecated('v must not be a string')
def first(v: str) -> Never: ...

@overload
def first(v: Iterable[str]) -> str: ...

def first(v: Iterable[str]) -> str:
    return next(iter(v))

While it is true that @deprecated has runtime effects, the overloads will just be overwritten by the implementation, giving you a type-checking-time-only warning.

first(['foo'])  # fine
first('foo')    # error: v must not be a string

The function "first" is deprecated: v must not be a string

Upvotes: 2

InSync

Reputation: 10472

Somewhat, in some circumstances and if you use Mypy

There is actually one way: Use an @overload with Never as the return type.

from typing import overload, Never
from collections.abc import Iterable  # Or Sequence

@overload
def first(v: str) -> Never: ...

@overload
def first(v: Iterable[str]) -> str: ...

def first(v: Iterable[str]) -> str:
    return next(iter(v))

This is not sufficient, however. Mypy will still consider the following as fine:

first(['foo'])  # Fine
first('foo')  # Anything after this line is simply ignored.

a = 'bar'
reveal_type(a)  # Silently emits nothing.

The --warn-unreachable flag allows you to configure this behaviour. With it, Mypy will raise an error saying the line is unreachable:

first(['foo'])  # Fine
first('foo')  # Still no error, however.

a = 'bar'  # error: Statement is unreachable

On the other hand, while raising no errors, Pylance, which uses Pyright under the hood, will fade unreachable code out:

Code is unreachable (Pylance)

It only does so for explicitly type-hinted code though:

def g() -> NoReturn: return first(v='foo')
g()
b = 'bar'

Upvotes: 2

ZF007

Reputation: 3731

I might not fully understand your question, but it looks to me like you are looking for the following shortcut:

for item in data:
    if not isinstance(item, str):
        ...  # your code for lists, tuples, etc.

Here str is the type to exclude, and item is a value from your raw data, provided through a list or tuple of items. If I misunderstood, adding more detail to your question might help. Or was the answer above enough to get you going? Then a little feedback would be nice ;-)
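
Applied to the function from the question, a runtime check along these lines could look as follows. This is only a sketch: it complements static checking rather than replacing it, since a type checker will still accept a bare string. Using shlex.quote and os.fspath for the escaping is an assumption made for the example, not something taken from the question.

```python
import os
import shlex
from collections.abc import Sequence
from typing import Union

PathType = Union[str, os.PathLike]


def escape_cmdline(argv: Sequence[PathType]) -> str:
    """Join arguments into one shell-escaped string, rejecting bare strings."""
    # A lone str (or bytes) is a sequence too, so rule it out explicitly.
    if isinstance(argv, (str, bytes)):
        raise TypeError("argv must be a sequence of arguments, not a single string")
    # os.fspath() converts os.PathLike arguments to their path representation;
    # this sketch assumes those paths are str-based, not bytes-based.
    return " ".join(shlex.quote(os.fspath(arg)) for arg in argv)


escaped = escape_cmdline(["ls", "my file"])

try:
    escape_cmdline("ls")
    rejected = False
except TypeError:
    rejected = True
```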

Upvotes: 2

MacFreek

Reputation: 3446

Apparently, this is not possible with type hints. According to Guido van Rossum, PEP 484 cannot distinguish between Sequence[str], Iterable[str] and str.

Source: https://github.com/python/mypy/issues/1965 and https://github.com/python/typing/issues/256

Upvotes: 18

Tsagana Nokhaeva

Reputation: 815

I couldn't find anything about type exclusion or type negation; it seems that neither is supported in the current version of Python 3. The only distinctive feature of strings that came to mind is that they are immutable. Maybe this will help:

from typing import Union
from collections.abc import MutableSequence


MySequenceType = Union[MutableSequence, tuple, set]
def foo(a: MySequenceType):
    pass

foo(["09485", "kfjg", "kfjg"]) # passed
foo(("09485", "kfjg", "kfjg")) # passed
foo({"09485", "kfjg", "kfjg"}) # passed
foo("qwerty") # not passed
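
The reasoning behind this alias can be checked at runtime with isinstance, as in the small sketch below. The checks only show which abstract base classes each type belongs to; how a particular type checker treats the bare, unparametrized alias above is not verified here.

```python
from collections.abc import MutableSequence, Sequence

# str is an immutable Sequence, so it is not a MutableSequence.
str_is_sequence = isinstance("qwerty", Sequence)
str_is_mutable = isinstance("qwerty", MutableSequence)

# list is mutable, so it matches MutableSequence.
list_is_mutable = isinstance(["09485", "kfjg"], MutableSequence)

# tuple is immutable as well, which is why the alias has to list it separately.
tuple_is_mutable = isinstance(("09485", "kfjg"), MutableSequence)
```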

Upvotes: 5
