I am trying to rewrite code previously written for Python 2.7 into Python 3.4. I get the error zipfile.BadZipFile: File is not a zip file in the line zipfile = ZipFile(StringIO(zipdata)) in the code below.
import csv
try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO
import pandas as pd
import os
from zipfile import ZipFile
from pprint import pprint, pformat
import urllib.request
import urllib.parse
try:
    import urllib.request as urllib2
except ImportError:
    import urllib2
my_url = 'http://www.bankofcanada.ca/stats/results/csv'
data = urllib.parse.urlencode({"lookupPage": "lookup_yield_curve.php",
                               "startRange": "1986-01-01",
                               "searchRange": "all"})
# request = urllib2.Request(my_url, data)
# result = urllib2.urlopen(request)
binary_data = data.encode('utf-8')
req = urllib.request.Request(my_url, binary_data)
result = urllib.request.urlopen(req)
zipdata = result.read().decode("utf-8",errors="ignore")
zipfile = ZipFile(StringIO(zipdata))
df = pd.read_csv(zipfile.open(zipfile.namelist()[0]))
df = pd.melt(df, id_vars=['Date'])
df.rename(columns={'variable': 'Maturity'}, inplace=True)
Thank You
Reputation: 3507
You shouldn't decode the data you get back in the result. It is the raw bytes of the zip archive, not the encoding of a Unicode string. I think your confusion arises because in Python 2 str is a byte string, so a StringIO could hold binary data there; in Python 3 text and bytes are distinct types, and you need a BytesIO, not a StringIO.
So that part of your code should read:
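from io import BytesIO  # binary in-memory buffer (replaces StringIO here)
zipdata = result.read()  # keep the raw bytes; do not decode
zipfile = ZipFile(BytesIO(zipdata))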
df = pd.read_csv(zipfile.open(zipfile.namelist()[0]))
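The data you are getting back is not UTF-8 encoded text, so you can't decode it that way. You would have found that out more easily if you hadn't specified errors="ignore", which is seldom a good idea: it silently drops every byte that is not valid UTF-8, so the decode appears to succeed while the archive is corrupted.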
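To try this without hitting the network, here is a minimal sketch that builds a small zip archive in memory and reads it back through BytesIO. The helper names (make_zip_bytes, first_member_text), the member name yields.csv and the sample CSV row are made up for illustration; they are not part of the original code or of the Bank of Canada data.

```python
import io
import zipfile


def make_zip_bytes(name, text):
    """Build a one-member zip archive in memory and return its raw bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, text)
    return buf.getvalue()


def first_member_text(zip_bytes):
    """Open raw zip bytes via BytesIO and return the first member as text."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        first = zf.namelist()[0]
        # Only the extracted member is text, so this is where decoding belongs.
        return zf.read(first).decode("utf-8")


# Hypothetical sample data standing in for the downloaded archive.
sample_csv = "Date,ZC025YR\n1986-01-02,0.09\n"
raw = make_zip_bytes("yields.csv", sample_csv)

print(first_member_text(raw))

# Decoding the archive itself with errors="ignore" drops any byte that is
# not valid UTF-8, so the round trip generally does not give back the
# original bytes.
mangled = raw.decode("utf-8", errors="ignore").encode("utf-8")
print("bytes preserved:", mangled == raw)
```

The same pattern applies to the download: pass result.read() straight to BytesIO, and leave any text decoding to pd.read_csv on the extracted member.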