Ryan

Reputation: 1432

Converting from Python 2 to Python 3: TypeError: a bytes-like object is required

I was given the following Python 2.x code. To convert it to Python 3.x, I changed import urllib2 to from urllib.request import urlopen, removed the urllib2 reference, and ran the program. The document at the URL was retrieved, but the program failed at the line indicated below with this error:

TypeError: a bytes-like object is required, not 'str'

The document looks like this: b'9306112 9210128 9202065 \r\n9306114 9204065 9301122 \r\n9306115 \r\n9306116 \r\n9306117 \r\n9306118 \r\n9306119

I tried playing with the return value at that line and the one above (e.g., converting to bytes, splitting on different values), but nothing worked. Any thoughts as to what is happening?

import urllib2


CITATION_URL = "http://storage.googleapis.com/codeskulptor-alg/alg_phys-cite.txt"

def load_graph(graph_url):
    """
    Function that loads a graph given the URL
    for a text representation of the graph

    Returns a dictionary that models a graph
    """
    graph_file = urllib2.urlopen(graph_url)
    graph_text = graph_file.read()
    graph_lines = graph_text.split('\n')  # <--- The Problem
    graph_lines = graph_lines[ : -1]

    print "Loaded graph with", len(graph_lines), "nodes"

    answer_graph = {}
    for line in graph_lines:
        neighbors = line.split(' ')
        node = int(neighbors[0])
        answer_graph[node] = set([])
        for neighbor in neighbors[1 : -1]:
            answer_graph[node].add(int(neighbor))

    return answer_graph

citation_graph = load_graph(CITATION_URL)
print(citation_graph)

Upvotes: 0

Views: 661

Answers (2)

zwer

Reputation: 25799

You can only split like with like. If you want to split on \n while still keeping graph_text as bytes, make the separator a bytes sequence too:

graph_lines = graph_text.split(b'\n')

Otherwise, if you know the codec your graph_text data was encoded with, first decode it into a str with: graph_text.decode("<codec>") and then continue treating it as a str.
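To make the difference concrete, here is a minimal sketch using the first few lines of the document quoted in the question as a hard-coded bytes literal. The variable names (raw, byte_lines, text_lines, error_message) are illustrative, and the "ascii" codec is an assumption based on the sample containing only digits, spaces, and line breaks.

```python
# Sample payload copied from the start of the document shown in the question.
raw = b'9306112 9210128 9202065 \r\n9306114 9204065 9301122 \r\n9306115 \r\n'

# Option 1: keep the data as bytes and split with a bytes separator.
byte_lines = raw.split(b'\n')

# Mixing types reproduces the error from the question.
try:
    raw.split('\n')
except TypeError as exc:
    error_message = str(exc)

# Option 2: decode to str first (codec assumed to be ASCII-compatible),
# then split with an ordinary str separator.
text_lines = raw.decode('ascii').split('\n')
```

With the first option every later operation also has to use bytes (for example line.split(b' ')), so decoding once up front is usually less work when the rest of the code expects str.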

Upvotes: 1

wizzwizz4

Reputation: 6426

In order to treat a bytes object like a string, you need to decode it first. For example:

graph_text = graph_file.read().decode("utf-8")

if the encoding is UTF-8. This should allow you to treat this as a string instead of a sequence of bytes.
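Putting this together, a possible Python 3 port of the function from the question might look like the sketch below. It assumes the file is UTF-8 (or plain ASCII) text, and it moves the parsing into a separate helper, parse_graph, which is a hypothetical name introduced here so the parsing can be tried without a network connection.

```python
from urllib.request import urlopen

CITATION_URL = "http://storage.googleapis.com/codeskulptor-alg/alg_phys-cite.txt"


def parse_graph(graph_text):
    """Build the adjacency dictionary from already-decoded text (hypothetical helper)."""
    graph_lines = graph_text.split('\n')
    # As in the original code, drop the last piece (assumed to be the
    # empty string left after the final newline).
    graph_lines = graph_lines[:-1]

    answer_graph = {}
    for line in graph_lines:
        neighbors = line.split(' ')
        node = int(neighbors[0])
        # The final piece is the trailing remainder of the line (e.g. '\r'),
        # so it is skipped, exactly as neighbors[1 : -1] did in the original.
        answer_graph[node] = {int(neighbor) for neighbor in neighbors[1:-1]}
    return answer_graph


def load_graph(graph_url):
    """Download the text representation of a graph and return it as a dict."""
    with urlopen(graph_url) as graph_file:
        # read() returns bytes in Python 3; the UTF-8 codec is an assumption.
        graph_text = graph_file.read().decode("utf-8")
    answer_graph = parse_graph(graph_text)
    print("Loaded graph with", len(answer_graph), "nodes")
    return answer_graph
```

Calling citation_graph = load_graph(CITATION_URL) would then replace the last two lines of the original script. Note that print is a function in Python 3, so the print statement from the Python 2 version needs parentheses as well.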

Upvotes: 1
