Scott

Reputation: 330

How to mock and test python open and pandas to_pickle

I am trying to test a function that takes a pandas DataFrame row, uses it to download a CSV file over FTP, reads that file back in, renames its columns, and saves the result as a pickle.

I want to test the following:

  1. builtins.open is called once with (path_to_raw, 'wb')
  2. to_pickle is called once with (LOCAL_PKL.format(row.name))

Patching builtins.open on its own does not seem to work: to_pickle also calls it indirectly, so builtins.open ends up being called twice and the "called once" assertion fails.
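The double-call effect can be reproduced without pandas. The sketch below is only an analogy: `Exporter`, `save_copy`, and `run` are hypothetical names, with `save_copy` standing in for to_pickle (assumed, as reported above, to open a file internally). It compares the call count when only builtins.open is patched with the count when the writer is patched as well.

```python
from unittest.mock import mock_open, patch


class Exporter:
    """Hypothetical stand-in; save_copy plays the role of DataFrame.to_pickle."""

    def save_copy(self, path):
        # Opens a file internally, as to_pickle is reported to do.
        with open(path, 'wb') as f:
            f.write(b'data')

    def run(self, raw_path, out_path):
        # Direct open call, like the one in download_file.
        with open(raw_path, 'wb') as f:
            f.write(b'raw')
        self.save_copy(out_path)


# Only open is patched: the direct and the indirect call are both recorded.
with patch('builtins.open', new_callable=mock_open) as fake_open:
    Exporter().run('raw.csv', 'out.pkl')
calls_without_stub = fake_open.call_count

# Patching the writer too leaves only the direct open call.
with patch('builtins.open', new_callable=mock_open) as fake_open, \
        patch.object(Exporter, 'save_copy') as fake_save:
    Exporter().run('raw.csv', 'out.pkl')
calls_with_stub = fake_open.call_count

fake_open.assert_called_once_with('raw.csv', 'wb')
# A MagicMock set on the class is not bound like a method, so no self is recorded.
fake_save.assert_called_once_with('out.pkl')
```

Because the patched method is a plain MagicMock, the assertion on it only needs the path argument, not the instance.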

Function to Test:

def download_file(self, row):
    path_from = row['source']
    path_to_raw = LOCAL_RAW.format(row.name)

    self.connection = FTP(self.url)
    self.connection.login(self.username, self.password)
    with open(path_to_raw, 'wb') as f:
        self.connection.retrbinary('RETR ' + path_from, f.write)
    self.connection.quit()

    data = pd.read_csv(path_to_raw)
    data.columns = ['a','b','c']
    data.to_pickle(LOCAL_PKL.format(row.name))

Unit Tests:

import pandas as pd
import unittest.mock as mock
from unittest.mock import patch, mock_open, MagicMock, call
import maintain

@patch('builtins.open', create=True)
@patch('maintain.pd.read_csv')
def test_download_path(self, mock_open, mock_pd_read_csv):

    mock_pd_read_csv.return_value = pd.DataFrame()

    @mock.create_autospec
    def mock_pd_to_pickle(self, path):
        pass

    with patch.object(pd.DataFrame, 'to_pickle', mock_pd_to_pickle):

        real = maintain.DataFTP()
        real.connection = MagicMock(name='connection')

        row = pd.Series(data=['a','b'], index=['c','d'])
        row.name = 'anything'

        print(mock_open.assert_called_once_with(maintain.LOCAL_RAW.format(row.name), 'wb'))
        print(mock_pd_to_pickle.assert_called_once_with(maintain.LOCAL_PKL.format(row.name)))

So... this is clearly wrong, but I'm not sure why. This test produces this error:

AssertionError: Expected 'read_csv' to be called once. Called 0 times.
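The 'read_csv' in that message, raised by an assertion on a variable named mock_open, points at argument order: stacked patch decorators are applied bottom-up, so the decorator closest to the function supplies the first mock argument. A minimal stdlib-only sketch of that rule follows; `os.getcwd` is just a convenient second target chosen for illustration, and `which_is_which` is a hypothetical name.

```python
import os
from unittest.mock import patch


@patch('builtins.open')   # top decorator    -> LAST mock argument
@patch('os.getcwd')       # bottom decorator -> FIRST mock argument
def which_is_which(first, second):
    # While both patches are active, check what each argument replaced.
    return first is os.getcwd, second is open


order_ok = which_is_which()
```

Applied to the test above, the parameter named mock_open actually receives the read_csv mock, and since download_file is never called in that test, the mock reports zero calls.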

Does anyone have any suggestions or know how to solve this? Thank you!

Upvotes: 4

Views: 6655

Answers (1)

Scott

Reputation: 330

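I finally got it working with the test below. Compared with my first attempt, it actually calls download_file before asserting, it lists the mock arguments in the order the decorators apply them (bottom decorator first), and it patches to_pickle with a plain patch.object mock, so to_pickle never reaches builtins.open: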

@patch('builtins.open', new_callable=mock_open)
@patch('maintain.pd.read_csv', return_value=pd.DataFrame())
@patch.object(pd.DataFrame, 'to_pickle')
def test_download_path(self, mock_to_pickle, mock_read_csv, mock_open):

    real = maintain.EODDataFTP()
    real.connection = mock.Mock(name='connection')

    row = pd.Series(data=['','nyse'], index=['source','exchange'])
    row.name = 'anything'

    real.download_file(row)

    mock_open.assert_called_once_with(maintain.LOCAL_RAW.format(row.name), 'wb')
    mock_read_csv.assert_called_once()
    mock_to_pickle.assert_called_once_with(maintain.LOCAL_PKL.format(row.name))
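If it is useful to go one step further, the same mock_open setup can also confirm that the FTP callback is the write method of the opened file. This is a hedged, stdlib-only sketch rather than part of the test above: `Downloader` and `fetch` are hypothetical stand-ins for the FTP class and download_file, with the pandas steps left out.

```python
from unittest.mock import Mock, mock_open, patch


class Downloader:
    """Hypothetical stand-in for the FTP class (no pandas, no network)."""

    def __init__(self, connection):
        self.connection = connection

    def fetch(self, remote_path, local_path):
        with open(local_path, 'wb') as f:
            self.connection.retrbinary('RETR ' + remote_path, f.write)


connection = Mock(name='connection')
with patch('builtins.open', new_callable=mock_open) as fake_open:
    Downloader(connection).fetch('remote.csv', 'local.csv')

# mock_open returns the same handle for every call and for the with-block.
handle = fake_open.return_value
fake_open.assert_called_once_with('local.csv', 'wb')
connection.retrbinary.assert_called_once_with('RETR remote.csv', handle.write)
retr_args = connection.retrbinary.call_args.args
```

This works because a mock's child attributes are created once and cached, so `handle.write` after the call is the same object that was passed to retrbinary inside the function.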

Upvotes: 6
