HuLu ViCa
HuLu ViCa

Reputation: 5452

How to avoid breaking Liskov substitution principle when overriding method signature

I have an abstract class DataWriter that defines an abstract method write(). This class is supposed to be the base class for a dynamic set of concrete classes, each one of which is intended to implement its own version of the method write(). In order to define the datatype of parameter meta of the method write(), I created the type WriterMeta as follows:

WriterMeta = typing.Union[GSheetWritable, S3Writable, LocalWritable]

Each concrete class would be in charge of processing one of the different types that conform to the union, but linter mypy doesn't seem to grasp that, because when I define the signature of the method write() of the concrete class using one of the types of the union for parameter meta, it marks a Liskov substituion principle violation, which I believe doesn't exist because the concret class is a subset of the abstract class, which means that the parent class can substitute the child class with no problem.

This is my code:

class LocalWritable(typing.TypedDict):
    file_name: str


class GSheetWritable(typing.TypedDict):
    tab_name: str


class S3Writable(typing.TypedDict):
    data_name: str
    table_name: str


WriterMeta = typing.Union[GSheetWritable, S3Writable, LocalWritable]

class GSheetOutputWriter(DataWriter):
    def __init__(
        self, google_driver: GoogleApiDriver, folder: str, settings, timestamp, env
    ):
        self._connector = google_driver
        self.folder = folder
        self.settings = settings
        self.timestamp = timestamp
        self.env = env
        self.file_name = self.get_file_name()
        self._target = self.create_gsheet()
        self.new = True

    def get_file_name(self) -> str:
        file_name = (
            "boxes_shipping_costs_"
            + self.settings["volume_source"]
            + "_"
            + (
                self.timestamp
                if self.settings["volume_source"] == "adhoc"
                else self.settings["scm_week"]
            )
        )

        return file_name

    def create_gsheet(self):
        gsheet = self.connector.sheet_create(self.file_name, folder_id=self.folder)
        gsheet.worksheet("Sheet1").resize(rows=1, cols=1)

        return gsheet

    @property
    def connector(self) -> typing.Any:
        return self._connector

    @property
    def target(self) -> typing.Any:
        return self._target

    def write(self, data: pd.DataFrame, meta: GSheetWritable, versionize: bool):
        data = data.replace({np.nan: 0, np.Inf: "Inf"})

        print("Writing '{}' table to gsheet.".format(meta["tab_name"]))
        if self.new:
            tab = self.connector.get_worksheet(self.target.url, "Sheet1")
            self.connector.rename_worksheet(tab, meta["tab_name"])
            self.new = False
        else:
            tab = self.connector.add_worksheet(
                self.target, meta["tab_name"], rows=1, cols=1
            )

        time.sleep(random.randint(30, 60))
        self.connector.update_worksheet(
            tab, [data.columns.values.tolist()] + data.values.tolist()
        )

Is my understanding of Liskov sustitution principle right? How can I refactor this set of classes for mypy to accept them?

Upvotes: 3

Views: 831

Answers (1)

STerliakov
STerliakov

Reputation: 7913

Why this fails

When you declare abstract class method that operates on some type T (Union in your case), from mypy point of view it means the following: every subclass must implement this method that accepts type T or wider. Method, accepting only part of T, violates LSP: whenever you refer to abstract class, you suppose that everything of type T can be used in the method, while concrete implementation doesn't allow that.

Type safe way – generic solution

You can use generic classes for that.

import pandas as pd
from abc import ABC, abstractmethod
from typing import Generic, TypeVar, TypedDict

class LocalWritable(TypedDict):
    file_name: str

class GSheetWritable(TypedDict):
    tab_name: str

class S3Writable(TypedDict):
    data_name: str
    table_name: str

# This variable is compatible with any of classes or their subclasses,
# not with `Union` of them as opposed to TypeVar(bound=Union[...]) usage.
_OutputT = TypeVar('_OutputT', GSheetWritable, S3Writable, LocalWritable)

class DataWriter(ABC, Generic[_OutputT]):
    @abstractmethod
    def write(self, data: pd.DataFrame, meta: _OutputT, versionize: bool) -> None: ...

class GSheetDataWriter(DataWriter[GSheetWritable]):
    def write(self, data: pd.DataFrame, meta: GSheetWritable, versionize: bool) -> None:
        pass
        # Implementation goes here

Upvotes: 6

Related Questions