ccsucic

Reputation: 129

How to run a pytest test function on all data files in a folder

I have a set of functions that I am attempting to write pytest unit tests for. The function I am trying to write a test for looks like so:

def IndexLevel(df: pd.DataFrame, index_row: str, start_col: str, end_col: str) -> pd.DataFrame:

    # Editing df
    return df_index

And the pytest function looks like so (df_format_index is a fixture to test df_index shape):

@pytest.mark.parametrize(
    "df, index_row, start_col, end_col, df_format_index",
    [(function("blah.txt"), "index level", "index level", "Equity 2", df_format_index)],
    indirect=["df_format_index"],
)
def test_IndexLevel(df: pd.DataFrame, index_row: str, start_col: str, end_col: str, df_format_index: pd.DataFrame):
    print("-----test-IndexLevel------")
    assert IndexLevel(df, index_row, start_col, end_col).shape == df_format_index.shape

These functions work if I hard code the filename, but I would like to thoroughly test them by running the test on all data files in a folder. I have tried to use the following function, but it did not work:

def pytest_generate_tests(metafunc):
    filelist = glob.glob("Data/*.txt")
    metafunc.parametrize("filename", filelist)

How can I run the test on all files in the data folder without editing the original function?

Upvotes: 1

Views: 1764

Answers (1)

ccsucic

Reputation: 129

Here's what I ended up doing. Since pytest.mark.parametrize takes a list of tuples as its argument values, I wrote a function that builds that list, one tuple per data file. Note that IndexLevel now takes the filename and loads the DataFrame itself. Hope it helps somebody!

def IndexLevel(filename: str, index_row: str, start_col: str, end_col: str) -> pd.DataFrame:
    # Editing a df from the file
    return df_index

def file_loop_index() -> list:
    # Build one parameter tuple per data file in the Data folder
    filenames = []
    files = glob.glob("Data/*.txt")
    for file in files:
        filenames.append((file, "index level", "s&p500", "equity 2", df_format_index))
    return filenames

# Testing df_index shape against the dummy dataframe
@pytest.mark.parametrize("filename, index_row, start_col, end_col, df_format_index", file_loop_index(), indirect=["df_format_index"])
def test_IndexLevel(
    filename: str,
    index_row: str,
    start_col: str,
    end_col: str,
    df_format_index: pd.DataFrame,
):
    print("-----test-IndexLevel------")
    assert (IndexLevel(filename, index_row, start_col, end_col).shape == df_format_index.shape)
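
For anyone who would rather use the pytest_generate_tests hook from the question: one likely reason it failed is that the test function there took `df`, not `filename`, and pytest raises an error when a hook parametrizes an argument the test does not declare. Below is a minimal sketch of a guarded version, assuming the files live under `Data/` as in the question. `DATA_PATTERN` and `test_data_file_not_empty` are hypothetical names used only for illustration.

```python
import glob
import os

# Hypothetical constant; mirrors the "Data/*.txt" pattern from the question.
DATA_PATTERN = os.path.join("Data", "*.txt")


def pytest_generate_tests(metafunc):
    """pytest collection hook; put it in conftest.py or in the test module."""
    # The hook is called for every collected test function, so only
    # parametrize the ones that actually declare a ``filename`` argument.
    if "filename" in metafunc.fixturenames:
        files = sorted(glob.glob(DATA_PATTERN))
        # ids labels each generated test with the file's base name.
        metafunc.parametrize(
            "filename", files, ids=[os.path.basename(f) for f in files]
        )


def test_data_file_not_empty(filename):
    # Stand-in check (hypothetical); swap in the real IndexLevel shape assertion.
    assert os.path.getsize(filename) > 0
```

With this in place, any test that declares a `filename` argument runs once per matching file, and tests without that argument are left alone. The glob pattern is relative, so it resolves against the directory pytest is started from.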

Upvotes: 1
