Reputation: 423
I am trying to write some Python 3 code to process a CSV file uploaded via a web form (using WSGI). I have managed to get the file to upload, but I am struggling to use Python's csv tools to process it. The problem seems to be related to bytes vs. strings.
Here is what I have tried:
import cgi, csv
form = cgi.FieldStorage(fp=environ['wsgi.input'],environ=environ)
upload = form['upload']
file = upload.file
data = csv.DictReader(file)
for line in data:
    # Do stuff here to process csv file
    pass
It gets as far as "for line in data", and then I get the following error message:
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)
So the problem is that my file is binary, but csv wants a string file, right? Any idea how I can fix this?
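For reference, the same error can be reproduced without a web server, which suggests it really is about bytes vs. strings and not about the upload itself. This is only a minimal sketch: the `io.BytesIO` object is a stand-in I am assuming behaves like the uploaded binary file, and the sample data is made up.

```python
import csv
import io

# Stand-in for the uploaded file: a binary stream, like upload.file.
binary_file = io.BytesIO(b"name,age\nAlice,30\n")

try:
    next(csv.reader(binary_file))
    error_message = None
except csv.Error as exc:
    # The reader received bytes from the iterator and refused them.
    error_message = str(exc)

# The same data as a text stream is accepted without complaint.
text_file = io.StringIO("name,age\nAlice,30\n")
first_row = next(csv.reader(text_file))
```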
One possible workaround that occurs to me would be simply to read the lines of the file without using the csv module and process the data manually, which would work, but seems a bit faffy. It would be nice to use the functionality of Python's csv module if possible.
The web form from which the file is uploaded has the attribute enctype="multipart/form-data", which I gather is required for uploading files.
Upvotes: 3
Views: 2780
Reputation: 338
Using Flask, I did it this way. Maybe it will be useful for someone.
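import csv
from flask import request

# Inside a view function handling the upload:
file = request.files['file_uploaded']
str_file_value = file.read().decode('utf-8')  # decode the uploaded bytes to str
file_t = str_file_value.splitlines()
csv_reader = csv.reader(file_t, delimiter=',')
for row in csv_reader:
    # Do stuff here to process csv file
    pass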
PS: credit goes to @tsroten's answer.
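One caveat with splitlines(): a newline inside a quoted field does not survive the split. If the uploaded files might contain such fields, a possible variation is to wrap the decoded text in `io.StringIO` with `newline=""` instead. The sketch below uses made-up sample bytes in place of `file.read()`, so treat it as an illustration and not as tested Flask code.

```python
import csv
import io

# Made-up upload contents: the second column holds a quoted, two-line value.
raw = b'id,comment\r\n1,"first line\nsecond line"\r\n'
text = raw.decode("utf-8")

# Approach from the answer: split into lines first, then parse.
rows_from_splitlines = list(csv.reader(text.splitlines()))

# Variation: give csv a text stream so it sees the original line endings.
rows_from_stringio = list(csv.reader(io.StringIO(text, newline="")))
```

With the text stream, the embedded newline is kept as part of the field; with splitlines() the two results differ.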
Upvotes: 1
Reputation: 2764
In Python 3, the cgi documentation says: "You can then read the data at leisure from the file attribute (the read() and readline() methods will return bytes)". But csv.DictReader expects the iterator to return strings, not bytes.
I would try this:
import cgi, csv
form = cgi.FieldStorage(fp=environ['wsgi.input'],environ=environ)
upload = form['upload']
str_file_value = upload.value.decode('utf-8') # put correct encoding here
file = str_file_value.splitlines()
data = csv.DictReader(file)
for line in data:
    # Do stuff here to process csv file
    pass
splitlines() is called because csv.DictReader expects "any object which supports the iterator protocol and returns a string each time its __next__() method is called", and the documentation notes that file objects and list objects are both suitable. So, we can use the list that splitlines() creates.
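If you would rather not read the whole upload into memory, another option that should work is to wrap the binary file in `io.TextIOWrapper`, which decodes it lazily. The following is a minimal sketch, not something I have run against a real WSGI request: `read_uploaded_csv` is a hypothetical helper name, and the `io.BytesIO` object stands in for `upload.file`.

```python
import csv
import io


def read_uploaded_csv(binary_file, encoding="utf-8"):
    """Return the rows of a binary CSV stream as a list of dicts."""
    # newline="" lets the csv module handle line endings itself,
    # including newlines embedded in quoted fields.
    text_file = io.TextIOWrapper(binary_file, encoding=encoding, newline="")
    rows = list(csv.DictReader(text_file))
    # Detach so the wrapper does not close the underlying stream.
    text_file.detach()
    return rows


# Stand-in for upload.file (an assumption made for this example).
fake_upload = io.BytesIO(b'name,note\r\nAlice,"line one\nline two"\r\nBob,ok\r\n')
rows = read_uploaded_csv(fake_upload)
```

In the question's code this would be called as read_uploaded_csv(upload.file), with the encoding adjusted to match the uploaded file.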
Upvotes: 8