Adam Jacobs

Reputation: 423

How to process uploaded csv file from web form with Python 3?

I am trying to write some Python 3 code to process a CSV file uploaded via a web form (using WSGI). I have managed to get the file to upload, but I am struggling to use Python's csv tools to process it. The problem seems to be related to bytes vs strings.

Here is what I have tried:

import cgi, csv
form = cgi.FieldStorage(fp=environ['wsgi.input'],environ=environ)
upload = form['upload']
file = upload.file
data = csv.DictReader(file)
for line in data:
    #Do stuff here to process csv file

It gets as far as "for line in data", and then I get the following error message:

_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

So the problem is that my file is binary, but csv wants a string file, right? Any idea how I can fix this?

One possible workaround that occurs to me would be simply to read the lines of the file without using the csv module and process the data manually, which would work, but seems a bit faffy. It would be nice to use the functionality of Python's csv module if possible.

The web form from which the file is uploaded has the attribute

enctype="multipart/form-data"

which I gather is required for uploading files.

Upvotes: 3

Views: 2780

Answers (2)

Darwin

Reputation: 338

Using Flask, I did it this way. Maybe it will be useful to someone.

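import csv
from flask import request

# Inside a Flask view function that handles the upload:
file = request.files['file_uploaded']
str_file_value = file.read().decode('utf-8')  # assumes the upload is UTF-8
file_t = str_file_value.splitlines()
csv_reader = csv.reader(file_t, delimiter=',')
for row in csv_reader:
    pass  # Do stuff here to process csv file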

P.S. Credit to @tsroten's answer.

Upvotes: 1

tsroten

Reputation: 2764

In Python 3, the cgi documentation says: "You can then read the data at leisure from the file attribute (the read() and readline() methods will return bytes)." But csv.DictReader expects the iterator to return strings, not bytes, which is why you get that error.
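The mismatch can be reproduced without a web server. This is a minimal sketch that uses io.BytesIO as a stand-in for upload.file; the sample data and the names fake_upload, rows and error_message are made up for illustration.

```python
import csv
import io

# Stand-in for upload.file: a binary file object whose lines are bytes.
fake_upload = io.BytesIO(b"name,age\nAlice,30\nBob,25\n")

try:
    # The error is raised on the first iteration, not when the reader is created.
    rows = list(csv.DictReader(fake_upload))
    error_message = None
except csv.Error as exc:
    # Iterating a binary file yields bytes, which the csv module rejects.
    rows = []
    error_message = str(exc)

print(error_message)
```

The exact wording of the message differs slightly between Python versions, but it should start with "iterator should return strings".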

I would try this:

import cgi, csv
form = cgi.FieldStorage(fp=environ['wsgi.input'], environ=environ)
upload = form['upload']
str_file_value = upload.value.decode('utf-8')  # put correct encoding here
file = str_file_value.splitlines()
data = csv.DictReader(file)
for line in data:
    pass  # Do stuff here to process csv file

splitlines() is called because csv.DictReader expects "any object which supports the iterator protocol and returns a string each time its __next__() method is called — file objects and list objects are both suitable". So, we can use the list that splitlines() creates.
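Here is the list-based approach in isolation, as a small sketch with a hard-coded bytes value standing in for upload.value (the sample data is invented). It also shows a variation that is not part of the code above: wrapping the binary file in io.TextIOWrapper with newline='', which is the mode the csv documentation recommends for files. The difference only matters if a quoted field contains a line break, since splitlines() strips it before csv sees it.

```python
import csv
import io

# Stand-in for upload.value; the second column of the first row
# contains a line break inside a quoted field.
raw = b'name,note\r\nAlice,"line one\nline two"\r\nBob,plain\r\n'

# Approach from the answer: decode, then hand csv a list of strings.
text = raw.decode("utf-8")
rows_split = list(csv.DictReader(text.splitlines()))

# Variation: wrap a binary file object (here BytesIO, standing in for
# upload.file) in a text-mode wrapper and let csv handle the newlines.
binary_file = io.BytesIO(raw)
text_file = io.TextIOWrapper(binary_file, encoding="utf-8", newline="")
rows_stream = list(csv.DictReader(text_file))

print(rows_split)
print(rows_stream)
```

Both versions yield two rows here. For simple files without embedded line breaks the two should behave the same, so the splitlines() version is fine in that case.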

Upvotes: 8
