hacikho

Reputation: 147

How to write a pytest unit test for a function if it uses __init__ values?

I would like to write a unit test for the extract function using pytest, but it depends on values set in __init__. How can I write a unit test for extract? I also need to verify that extract calls _funct_2 or _funct_3 depending on the initialization values.

class Extraction:

    def __init__(self, spark: SparkSession, dbutils: DBUtils, params: dict):
        self._params = params
        self._spark = spark
        self._dbutils = dbutils
        self.logger = getLogger(Extraction.__name__)

    def extract(self) -> DataFrame:
        file_path = self._params["RawFilePath"]
        path_length = len(file_path)

        self.logger.info("determine extraction method based on file length: {}".format(path_length))
        if len(file_path) == 0:
            self.logger.info("funct_2/db extraction case")
            return self._funct_2()
        else:
            self.logger.info("file based")
            return self._funct_3()

Upvotes: 1

Views: 1255

Answers (2)

dpb

Reputation: 3882

I'm not sure, but you might be conflating two concepts.

If you want to write a unit test for some Extraction functionality that relies on a fully initialized Extraction object, you can simply construct the object inside a test function in a test module.

Toy example, test_extraction.py:

def test_toy_extract():
    extract_obj = Extraction(...init params...)
    assert extract_obj.extract() == "Whatever"

Then it is just a matter of running pytest on that module.

On the other hand, if you need setup (or teardown) behavior for a whole test module, class, or function, pytest supports that through xunit-style setup: https://docs.pytest.org/en/stable/xunit_setup.html#class-level-setup-teardown
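As a rough sketch of the xunit-style approach, the class below uses setup_method to rebuild shared mocks before each test. The Extraction class here is a simplified stand-in for the one in the question (no Spark, no logger), and the "db"/"file" return values and the /tmp/raw.csv path are placeholders invented for illustration.

```python
from unittest.mock import MagicMock


class Extraction:
    """Simplified stand-in for the question's class, so the sketch runs without Spark."""

    def __init__(self, spark, dbutils, params):
        self._params = params
        self._spark = spark
        self._dbutils = dbutils

    def extract(self):
        if len(self._params["RawFilePath"]) == 0:
            return self._funct_2()
        return self._funct_3()

    def _funct_2(self):
        return "db"  # placeholder result, not from the original post

    def _funct_3(self):
        return "file"  # placeholder result, not from the original post


class TestExtraction:
    def setup_method(self, method):
        # pytest calls this before every test method in the class
        self.spark = MagicMock(name="spark")
        self.dbutils = MagicMock(name="dbutils")

    def test_empty_path_uses_db_branch(self):
        obj = Extraction(self.spark, self.dbutils, {"RawFilePath": ""})
        assert obj.extract() == "db"

    def test_non_empty_path_uses_file_branch(self):
        obj = Extraction(self.spark, self.dbutils, {"RawFilePath": "/tmp/raw.csv"})
        assert obj.extract() == "file"
```

Each test still builds its own Extraction with the params it needs; only the shared collaborators live in setup_method.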

Upvotes: 1

Eric Truett

Reputation: 3010

You can use pytest fixtures to create one object for your tests, and then pass that object to your tests of the object methods.

import pytest

@pytest.fixture
def edb():
    # supply your own spark, dbutils, and params here;
    # params["RawFilePath"] should be empty (database case)
    return Extraction(spark, dbutils, params)

@pytest.fixture
def efile():
    # supply your own spark, dbutils, and params here;
    # params["RawFilePath"] should point to a file (file case)
    return Extraction(spark, dbutils, params)

def test_extract_funct2(edb):
    # test the _funct_2 use case
    ...

def test_extract_funct3(efile):
    # test the _funct_3 use case
    ...
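One possible way to fill in test bodies like these is to patch the two private methods with unittest.mock and assert which one was called. This is only a sketch: the Extraction class is a simplified stand-in for the question's class, make_extraction is a hypothetical helper playing the role of the fixtures, and the "db-df"/"file-df" return values and the /tmp/raw.csv path are made up.

```python
from unittest.mock import MagicMock, patch


class Extraction:
    """Simplified stand-in for the question's class, so the sketch runs without Spark."""

    def __init__(self, spark, dbutils, params):
        self._spark = spark
        self._dbutils = dbutils
        self._params = params

    def extract(self):
        if len(self._params["RawFilePath"]) == 0:
            return self._funct_2()
        return self._funct_3()

    def _funct_2(self):
        return "db"  # placeholder body

    def _funct_3(self):
        return "file"  # placeholder body


def make_extraction(raw_file_path):
    # Hypothetical helper; a pytest fixture could return the same object.
    return Extraction(MagicMock(), MagicMock(), {"RawFilePath": raw_file_path})


def test_extract_funct2():
    edb = make_extraction("")
    with patch.object(Extraction, "_funct_2", return_value="db-df") as funct_2, \
            patch.object(Extraction, "_funct_3") as funct_3:
        result = edb.extract()
    assert result == "db-df"
    funct_2.assert_called_once_with()
    funct_3.assert_not_called()


def test_extract_funct3():
    efile = make_extraction("/tmp/raw.csv")
    with patch.object(Extraction, "_funct_2") as funct_2, \
            patch.object(Extraction, "_funct_3", return_value="file-df") as funct_3:
        result = efile.extract()
    assert result == "file-df"
    funct_3.assert_called_once_with()
    funct_2.assert_not_called()
```

Passing MagicMock objects for spark and dbutils keeps the test independent of a running Spark session, since extract itself never touches them.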

I think the method would be easier to test if you made file_path a parameter of extract. Then you could use the same object and just change the file_path argument. The way it is set up now, you could set _params["RawFilePath"] directly, but that is pretty awkward.

def extract(self, file_path: str = None) -> DataFrame:
    if not file_path:
        file_path = self._params.get("RawFilePath")
    # rest of method
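To illustrate how an optional file_path argument lets one object cover both branches, here is a minimal sketch. The class is again a simplified stand-in for the question's Extraction, the branch on "not file_path" is an assumption about what the rest of the method would do, and the return values and path are placeholders.

```python
from typing import Optional
from unittest.mock import MagicMock


class Extraction:
    """Simplified stand-in; the real class would return Spark DataFrames."""

    def __init__(self, spark, dbutils, params):
        self._spark = spark
        self._dbutils = dbutils
        self._params = params

    def extract(self, file_path: Optional[str] = None):
        # Fall back to the configured path when no argument is given.
        if not file_path:
            file_path = self._params.get("RawFilePath")
        if not file_path:
            return self._funct_2()
        return self._funct_3()

    def _funct_2(self):
        return "db"  # placeholder result

    def _funct_3(self):
        return "file"  # placeholder result


def test_same_object_covers_both_branches():
    obj = Extraction(MagicMock(), MagicMock(), {})
    assert obj.extract() == "db"
    assert obj.extract("/tmp/raw.csv") == "file"
```

Note that an empty string passed as file_path also falls back to the params value, so the database branch is only reachable when the params have no usable RawFilePath either.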

Also, the detection for file_path seems likely to throw an error: self._params["RawFilePath"] raises a KeyError if the key is missing, and there is no reason to include a blank value for RawFilePath if you aren't using it. You can use get instead and test whether the value is truthy rather than checking its length. This is simpler and more pythonic, and it also handles a None value, for which len would fail.

# a missing or empty RawFilePath means the db case, as in the original method
if not self._params.get("RawFilePath"):
    self.logger.info("funct_2/db extraction case")
    return self._funct_2()
self.logger.info("file based")
return self._funct_3()
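A quick sketch of why a truthiness check on get is more forgiving than indexing and taking the length. The helper name is_file_based and the sample path are hypothetical and only serve the illustration.

```python
def is_file_based(params: dict) -> bool:
    # Hypothetical helper: a missing key, None, and "" are all falsy.
    return bool(params.get("RawFilePath"))


assert is_file_based({"RawFilePath": "/tmp/raw.csv"})
assert not is_file_based({"RawFilePath": ""})
assert not is_file_based({"RawFilePath": None})
assert not is_file_based({})

# Indexing the key directly fails when it is absent.
try:
    len({}["RawFilePath"])
except KeyError:
    missing_key_raises = True
else:
    missing_key_raises = False
```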

Upvotes: 0
