Stackexchange API encoding

Question

I am writing the following handler for the Stack Exchange API:

    import urllib.request

    import tornado.web


    class StackOverflowHandler(tornado.web.RequestHandler):

        def get(self, look_up_pattern):
            url = "https://api.stackexchange.com/2.2/search?order=desc&sort=votes&intitle=%s&site=stackoverflow"
            # Fetch the search results and pass the raw body through unchanged
            with urllib.request.urlopen(url % look_up_pattern) as so_response:
                response = so_response.read()
            print(response)
            self.write(response)

    application = tornado.web.Application([
        (r"/search/(.*)", StackOverflowHandler),
    ])

As the response I get a stream of bytes:

    b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x04\x00\xb5\\x0b\x93\xa3F\x92\xfe+u\xe...

The question is: who encodes the response, and what is the correct encoding to decode it? I tried utf-8, utf-16, zlib.decompress, etc., but none of them helps.

Ethan Furman · Accepted Answer

The relevant portion of the answer linked to by Daniel Roseman is this:

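    import gzip
    from io import BytesIO

    # Python 3: response.read() returns bytes, so wrap it in BytesIO
    # (the linked Python 2 answer used StringIO here).
    if response.info().get('Content-Encoding') == 'gzip':
        buf = BytesIO(response.read())
        f = gzip.GzipFile(fileobj=buf)
        data = f.read()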

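In other words, the encoding is reported in the Content-Encoding header, available as response.info().get('Content-Encoding'). When it is 'gzip', the body has to be decompressed first; only then can the result be decoded as text.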
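
For the Python 3 code in the question, the same idea could look roughly like the sketch below. The helper name decode_body is hypothetical, the sample payload is made up so the snippet runs offline, and the UTF-8 decoding step assumes the API returns UTF-8 JSON. The leading bytes \x1f\x8b in the question's output are the gzip magic number, so the sketch also checks for them in case the header is missing.

```python
import gzip
import json


def decode_body(raw, content_encoding=None):
    """Return the response body as text, gunzipping it first if needed.

    Hypothetical helper, not part of tornado or urllib.
    """
    # Gzip streams start with the magic bytes 0x1f 0x8b, which matches the
    # b'\x1f\x8b...' prefix shown in the question.
    if content_encoding == "gzip" or raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    # Assumption: the decompressed payload is UTF-8 encoded JSON text.
    return raw.decode("utf-8")


# Offline stand-in for so_response.read(); the payload is invented.
sample = gzip.compress(json.dumps({"items": []}).encode("utf-8"))
print(json.loads(decode_body(sample, "gzip")))
```

In the handler this would be called as decode_body(so_response.read(), so_response.info().get('Content-Encoding')). A plain zlib.decompress(data) fails on this input because it expects a zlib header by default; passing 16 + zlib.MAX_WBITS as the wbits argument makes it accept the gzip format instead.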
