bmalbusca

Reputation: 367

Using pandas.read_csv() is conflicting with csv.reader() - ValueError: I/O operation on closed file

I'm parsing a CSV file that is sent via POST FormData() and then converting it to JSON. The problem appears when I use a package to validate the CSV before passing it to pandas. The validator function does its job, but the subsequent read with pandas then fails with ValueError: I/O operation on closed file


if request.method == 'POST':
        content = request.form
        data_header = json.loads(content.get('JSON'))
        filename = data_header['data'][0]['name']
        
        # Here! starts the problem
        # validator = validCSV(validator={'header': ["id","type","name","subtype","tag","block","latitude","longitude","height","max_alt","min_alt","power","tia","fwl"]})
        # print(validator.verify_header(request.files[filename]))
        # then pseudo-code: if returned false, will abort(404)
        
        try:
            df = pd.read_csv(request.files[filename], dtype='object')
            dictObj = df.to_dict(orient='records')

If we follow the issue into the package, this is what we see:

def verify_header(self, inputfile):
        with TextIOWrapper(inputfile, encoding="utf-8") as wrapper:
            header = next(csv.reader(wrapper))

It seems that once the file has been opened and closed by TextIOWrapper, pandas can no longer read it with read_csv(). Making a copy of the file seems wasteful just to read a header, and I like the idea of using csv.reader() because other examples showed it to be more efficient than pandas at reading a CSV file.
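The behaviour described above can be reproduced without Flask or pandas. The following is a minimal sketch in which a BytesIO object stands in for the uploaded file (an assumption; the real object comes from request.files). It shows that leaving the with block closes the wrapped stream as well as the wrapper:

```python
import csv
from io import BytesIO, TextIOWrapper

# Stand-in for the uploaded file (assumption: the real one comes from request.files)
upload = BytesIO(b"id,type,name\n1,a,foo\n")

# Same pattern as verify_header(): wrap the binary stream and read one row
with TextIOWrapper(upload, encoding="utf-8") as wrapper:
    header = next(csv.reader(wrapper))

# Exiting the with block closed the wrapper and the stream underneath it
stream_closed = upload.closed

try:
    upload.read()
    error_message = None
except ValueError as exc:
    # Any later reader of this stream hits the same error
    error_message = str(exc)
```

Here stream_closed ends up True and the later read() raises ValueError, which matches the error reported when pandas tries to read the same object.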

What can be done to prevent the I/O error after another package has opened the file? Or is there a simple and efficient way to validate the CSV without using heavyweight pandas?

Upvotes: 1

Views: 250

Answers (1)

bmalbusca

Reputation: 367

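The solution was to seek() back to the beginning of the file after reading the first line. Reading that line directly from inputfile, instead of wrapping it in TextIOWrapper inside a with block, also means the file is never closed. The reading process is almost the same as what pandas does. The only apparent advantage is that it does not depend on importing or installing pandas.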

# Parse only the first line, then rewind so later readers start from the top
wrapper = StringIO(inputfile.readline().decode('utf-8'))
header = next(csv.reader(wrapper, delimiter=','))
inputfile.seek(0, 0)
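As a self-contained sketch of the snippet above, the function below reads the header and rewinds the file. The names read_header and read_header_detached are hypothetical, and the code assumes a seekable binary file object. The second function is an alternative that was not part of the original fix: it keeps TextIOWrapper but calls detach() so the underlying stream is released without being closed.

```python
import csv
from io import BytesIO, StringIO, TextIOWrapper


def read_header(fileobj, encoding="utf-8"):
    """Return the first CSV row of a binary file object, then rewind it."""
    first_line = fileobj.readline().decode(encoding)
    header = next(csv.reader(StringIO(first_line)), [])
    fileobj.seek(0)  # let the next reader start from the beginning
    return header


def read_header_detached(fileobj, encoding="utf-8"):
    """Same result via TextIOWrapper, without closing fileobj (alternative sketch)."""
    wrapper = TextIOWrapper(fileobj, encoding=encoding, newline="")
    try:
        header = next(csv.reader(wrapper), [])
    finally:
        wrapper.detach()  # separate the wrapper from fileobj instead of closing it
        fileobj.seek(0)   # the wrapper reads ahead, so rewind explicitly
    return header


# Usage with a stand-in for the uploaded file (assumption)
upload = BytesIO(b"id,type,name\n1,a,foo\n")
header = read_header(upload)
still_open = not upload.closed
```

The readline() version assumes the header fits on one physical line. If a quoted header field could contain a newline, the detach() variant is likely the safer choice, since csv.reader then sees the stream directly.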

Upvotes: 2
