Reputation: 2256
The following code works in Python3 but fails in Python2
r = requests.get("http://api.bitcoincharts.com/v1/csv/coinbaseUSD.csv.gz", stream=True)
decompressed_file = gzip.GzipFile(fileobj=r.raw)
data = pd.read_csv(decompressed_file, sep=',')
data.columns = ["timestamp", "price" , "volume"] # set df col headers
return data
The error I get in Python2 is the following:
TypeError: 'int' object has no attribute '__getitem__'
The error is on the line where I set data equal to pd.read_csv(...)
Seems to be a pandas error to me
Stacktrace:
Traceback (most recent call last):
File "fetch.py", line 51, in <module>
print(f.get_historical())
File "fetch.py", line 36, in get_historical
data = pd.read_csv(f, sep=',')
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 709, in parser_f
return _read(filepath_or_buffer, kwds)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 449, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 818, in __init__
self._make_engine(self.engine)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 1049, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 1695, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 562, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 760, in pandas._libs.parsers.TextReader._get_header
File "pandas/_libs/parsers.pyx", line 965, in pandas._libs.parsers.TextReader._tokenize_rows
File "pandas/_libs/parsers.pyx", line 2197, in pandas._libs.parsers.raise_parser_error
io.UnsupportedOperation: seek
Upvotes: 1
Views: 1091
Reputation: 13274
The issue from the traceback you posted is related to the fact that the Response
object's raw attribute is a file-like object that does not support the .seek
method that typical file objects support. However, when ingesting the file object with pd.read_csv
, pandas (in python2) seems to be making use of the seek
method of the provided file object.
You can confirm that the returned response's raw data is not seekable by calling r.raw.seekable()
, which should normally return False
.
The way to circumvent this issue may be to wrap the returned data into an io.BytesIO
object as follows:
import gzip
import io
import pandas as pd
import requests
# file_url = "http://api.bitcoincharts.com/v1/csv/coinbaseUSD.csv.gz"
file_url = "http://api.bitcoincharts.com/v1/csv/aqoinEUR.csv.gz"
r = requests.get(file_url, stream=True)
dfile = gzip.GzipFile(fileobj=io.BytesIO(r.raw.read()))
data = pd.read_csv(dfile, sep=',')
print(data)
0 1 2
0 1314964052 2.60 0.4
1 1316277154 3.75 0.5
2 1316300526 4.00 4.0
3 1316300612 3.80 1.0
4 1316300622 3.75 1.5
As you can see, I used a smaller file from the directory of files available. You can switch this to your desired file.
In any case, io.BytesIO(r.raw.read())
should be seekable, and therefore should help avoid the io.UnsupportedOperation
exception you are encountering.
As for the TypeError
exception, it is inexistent in this snippet of code.
I hope this helps.
Upvotes: 3