baxx

Reputation: 4745

How to typehint numpy array of floats and/or integers

I'm not sure how to go about typehinting the following:

def prop(
    *,
    arr,  # numpy array of floats or/and ints
):
    return 100 * arr / arr.sum()

I've tried the following:


def prop(
    *,
    arr: npt.NDArray[np._IntType] | npt.NDArray[np._FloatType],  # numpy array of floats or/and ints
) -> npt.NDArray[np._IntType] | npt.NDArray[np._FloatType]:
    return 100 * arr / arr.sum()

And I am getting the following errors from running mypy <path>/check_mypy.py --strict:

check_mypy.py:12: error: Returning Any from function declared to return "Union[ndarray[Any, dtype[np._IntType]], ndarray[Any, dtype[np._FloatType]]]"  [no-any-return]
check_mypy.py:17: error: Need type annotation for "x"  [var-annotated]
check_mypy.py:18: error: Need type annotation for "x1"  [var-annotated]
check_mypy.py:22: error: Need type annotation for "y"  [var-annotated]
check_mypy.py:23: error: Need type annotation for "y1"  [var-annotated]
check_mypy.py:27: error: Need type annotation for "z"  [var-annotated]
check_mypy.py:28: error: Need type annotation for "z1"  [var-annotated]

Full example:

# check_mypy.py
from __future__ import annotations

import numpy as np
import numpy.typing as npt


def prop(
    *,
    arr: npt.NDArray[np._IntType] | npt.NDArray[np._FloatType],  # numpy array of floats or/and ints
) -> npt.NDArray[np._IntType] | npt.NDArray[np._FloatType]:
    return 100 * arr / arr.sum()


def main() -> int:
    # check 1 - ints
    x = np.array([1, 2, 3])
    x1 = prop(arr=x)
    print(x1)

    # check 2 - mixed
    y = np.array([1.4, 21, 3.2])
    y1 = prop(arr=y)
    print(y1)

    # check 3 - floats
    z = np.array([1.4, 2.1, 3.2])
    z1 = prop(arr=z)
    print(z1)

    return 0


if __name__ == "__main__":
    raise SystemExit(main())

edit 1

If you do x: npt.NDArray[np._IntType] = np.array([1, 2, 3]) does that work?

Following the above I get the following error output:


check_mypy.py:12: error: Returning Any from function declared to return "Union[ndarray[Any, dtype[np._IntType]], ndarray[Any, dtype[np._FloatType]]]"  [no-any-return]
check_mypy.py:17: error: Type variable "numpy._IntType" is unbound  [valid-type]
check_mypy.py:17: note: (Hint: Use "Generic[_IntType]" or "Protocol[_IntType]" base class to bind "_IntType" inside a class)
check_mypy.py:17: note: (Hint: Use "_IntType" in function signature to bind "_IntType" inside a function)
check_mypy.py:18: error: Need type annotation for "x1"  [var-annotated]
check_mypy.py:23: error: Need type annotation for "y"  [var-annotated]
check_mypy.py:24: error: Need type annotation for "y1"  [var-annotated]
check_mypy.py:28: error: Need type annotation for "z"  [var-annotated]
check_mypy.py:29: error: Need type annotation for "z1"  [var-annotated]

edit 2.

After updating the function to:

T = TypeVar('T', np._IntType, np._FloatType)


def prop(
    *,
    arr: npt.NDArray[T],  # numpy array of floats or/and ints
) -> npt.NDArray[T]:
    return 100 * arr / arr.sum()

I get the following error output:

check_mypy.py:15: error: Type variable "numpy._IntType" is unbound  [valid-type]
check_mypy.py:15: note: (Hint: Use "Generic[_IntType]" or "Protocol[_IntType]" base class to bind "_IntType" inside a class)
check_mypy.py:15: note: (Hint: Use "_IntType" in function signature to bind "_IntType" inside a function)
check_mypy.py:15: error: Type variable "numpy._FloatType" is unbound  [valid-type]
check_mypy.py:15: note: (Hint: Use "Generic[_FloatType]" or "Protocol[_FloatType]" base class to bind "_FloatType" inside a class)
check_mypy.py:15: note: (Hint: Use "_FloatType" in function signature to bind "_FloatType" inside a function)
check_mypy.py:22: error: Returning Any from function declared to return "ndarray[Any, dtype[np._IntType?]]"  [no-any-return]
check_mypy.py:22: error: Returning Any from function declared to return "ndarray[Any, dtype[np._FloatType?]]"  [no-any-return]
check_mypy.py:27: error: Type variable "numpy._IntType" is unbound  [valid-type]
check_mypy.py:27: note: (Hint: Use "Generic[_IntType]" or "Protocol[_IntType]" base class to bind "_IntType" inside a class)
check_mypy.py:27: note: (Hint: Use "_IntType" in function signature to bind "_IntType" inside a function)
check_mypy.py:33: error: Need type annotation for "y"  [var-annotated]
check_mypy.py:38: error: Need type annotation for "z"  [var-annotated]

edit 3

If you actually want to allow mixed arrays, then you can drop the type variable and use npt.NDArray[ np._IntType | np._FloatType ] in both cases

Updating the function to:

def prop(
    *,
    arr: npt.NDArray[np._IntType | np._FloatType],  # numpy array of floats or/and ints
) -> npt.NDArray[np._IntType | np._FloatType]:
    return 100 * arr / arr.sum()

Gives the following errors:

check_mypy.py:22: error: Returning Any from function declared to return "ndarray[Any, dtype[Union[np._IntType, np._FloatType]]]"  [no-any-return]
check_mypy.py:27: error: Type variable "numpy._IntType" is unbound  [valid-type]
check_mypy.py:27: note: (Hint: Use "Generic[_IntType]" or "Protocol[_IntType]" base class to bind "_IntType" inside a class)
check_mypy.py:27: note: (Hint: Use "_IntType" in function signature to bind "_IntType" inside a function)
check_mypy.py:30: error: Need type annotation for "x1"  [var-annotated]
check_mypy.py:35: error: Need type annotation for "y"  [var-annotated]
check_mypy.py:36: error: Need type annotation for "y1"  [var-annotated]
check_mypy.py:40: error: Need type annotation for "z"  [var-annotated]
check_mypy.py:41: error: Need type annotation for "z1"  [var-annotated]

Upvotes: 6

Views: 7753

Answers (1)

Simon Hawe

Reputation: 4539

I was really curious to see how that could go. The best I was able to get is the following:

# check_mypy.py
from __future__ import annotations

# Standard library
from typing import cast

# Third party
import numpy as np
import numpy.typing as npt


def prop(
    *,
    arr: npt.NDArray[np.float64] | npt.NDArray[np.int64],  # numpy array of floats or ints
) -> npt.NDArray[np.float64]:
    return cast(npt.NDArray[np.float64], 100 * arr / arr.sum())


def main() -> int:
    # check 1 - ints (annotated as an integer array to match the data)
    x: npt.NDArray[np.int64] = np.array([1, 2, 3])
    x1 = prop(arr=x)
    print(x1)

    # check 2 - mixed (numpy upcasts the values to a single float dtype)
    y: npt.NDArray[np.float64] = np.array([1.4, 21, 3.2])
    y1 = prop(arr=y)
    print(y1)

    # check 3 - floats (annotated as a float array to match the data)
    z: npt.NDArray[np.float64] = np.array([1.4, 2.1, 3.2])
    z1 = prop(arr=z)
    print(z1)

    return 0


if __name__ == "__main__":
    raise SystemExit(main())

I left out mixed types because they don't make sense. A NumPy array always has a single dtype, even if you pass values of different types when defining the array; they are converted to one common dtype.
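As a quick check of the single-dtype point, here is a minimal sketch that inspects the dtype of the arrays from the question (the variable names `mixed` and `ints` are just for illustration):

```python
import numpy as np

# Python ints and floats in one list are upcast to a common float dtype.
mixed = np.array([1.4, 21, 3.2])
# A list of Python ints gives the platform's default integer dtype.
ints = np.array([1, 2, 3])

print(mixed.dtype)  # float64
print(ints.dtype)   # an integer dtype; the exact width depends on platform and NumPy version
```

So the "mixed" case in the question is really just a float array by the time it reaches `prop`.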

Using np.float64/np.int64 was actually what made it work in the end; I'm not sure if that is acceptable for you. You might instead want to define a Union or a type variable covering all float types, but again I'm not sure. The important thing is that the type parameter is a subclass of np.generic.
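If the fixed 64-bit types are too narrow, one possible variation is to annotate with the abstract scalar classes `np.integer[Any]` and `np.floating[Any]`, which are both subclasses of `np.generic`. This is only a sketch: I have not checked it against every mypy and NumPy version, and the annotations are quoted so that they are not evaluated at runtime on older versions.

```python
from typing import Any, cast

import numpy as np
import numpy.typing as npt


def prop(
    *,
    # Quoted annotations: read by the type checker, not evaluated at runtime.
    arr: "npt.NDArray[np.integer[Any]] | npt.NDArray[np.floating[Any]]",
) -> "npt.NDArray[np.floating[Any]]":
    # True division of a real-valued array yields a floating-point array.
    return cast("npt.NDArray[np.floating[Any]]", 100 * arr / arr.sum())


small_ints = np.array([1, 2, 3], dtype=np.int32)
small_floats = np.array([1.0, 3.0], dtype=np.float32)

print(prop(arr=small_ints))
print(prop(arr=small_floats))
```

The trade-off is that the return type is now "some float array" rather than exactly `np.float64`, which matches what happens for narrower float inputs such as `np.float32`.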

Type annotating every array defined in main() was also needed, as mypy cannot determine the array type on its own.

Lastly, the return type is fixed to npt.NDArray[np.float64], because the result of the division you are computing there is always of that type for these inputs.
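The fixed return type can be sanity-checked at runtime. A small sketch, with the dtypes set explicitly so that the result does not depend on the platform default (the variable names are illustrative):

```python
import numpy as np

int_arr = np.array([1, 2, 3], dtype=np.int64)
float_arr = np.array([1.4, 2.1, 3.2], dtype=np.float64)

# Same expression as in prop(): true division promotes int64 to float64.
int_result = 100 * int_arr / int_arr.sum()
float_result = 100 * float_arr / float_arr.sum()

print(int_result.dtype, float_result.dtype)  # float64 float64
```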

Upvotes: 5
