Swooz

Reputation: 5

TypeError: a bytes-like object is required, not 'str' when using BytesIO

I'm getting "TypeError: a bytes-like object is required, not 'str'". I was originally using StringIO, which gave me "TypeError: initial_value must be str or None, not bytes", so I switched to BytesIO. I'm using Python 3.7.

    import os
    import pickle
    from io import BytesIO
    from urllib.request import urlopen
    from zipfile import ZipFile

    import tldextract

    # Location of Alexa 1M
    ALEXA_1M = 'http://s3.amazonaws.com/alexa-static/top-1m.csv.zip'

    # Our output file containing all the training data
    DATA_FILE = 'traindata.pkl'

    def get_alexa(num, address=ALEXA_1M, filename='top-1m.csv'):
        """Grabs Alexa 1M"""

        url = urlopen(address)
        zipfile = ZipFile(BytesIO(url.read()))
        return [tldextract.extract(x.split(',')[1]).domain for x in
                zipfile.read(filename).decode('utf-8').split()[:num]]

I get the same error in this function as well, at the line "return pickle.load(open(DATA_FILE))":

    def gen_data(force=False):
        """Grab all data for train/test and save

        force: If true overwrite, else skip if file
               already exists
        """
        if force or (not os.path.isfile(DATA_FILE)):
            domains, labels = gen_malicious(10000)

            # Get equal number of benign/malicious
            domains += get_alexa(len(domains))
            labels += ['benign']*len(domains)

            pickle.dump(zip(labels, domains), open(DATA_FILE, 'w').decode("utf-8"))

    def get_data(force=False):
        """Returns data and labels"""
        gen_data(force)

        return pickle.load(open(DATA_FILE))

Upvotes: 0

Views: 1190

Answers (1)

Zichzheng

Reputation: 1290

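The error means str and bytes are being mixed. In Python 3, ZipFile.read() returns bytes, so calling x.split(',') on that data with a str separator raises this TypeError. To solve this, I think you can decode the bytes with .decode('utf-8') before splitting: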

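    def get_alexa(num, address=ALEXA_1M, filename='top-1m.csv'):
        url = urlopen(address)
        zipfile = ZipFile(BytesIO(url.read()))
        # ZipFile.read() returns bytes; decode to str before splitting
        return [tldextract.extract(x.split(',')[1]).domain for x in
                zipfile.read(filename).decode('utf-8').split()[:num]]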
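
The same bytes/str distinction probably explains the pickle error in the question too: pickle reads and writes bytes, so the file likely needs to be opened in binary mode ('wb' / 'rb') rather than text mode. Below is a minimal sketch of both points. It is not the asker's exact code: the in-memory archive, the sample rows, and the temporary file path are made up for illustration so the example runs without network access, and wrapping zip() in list() is my own choice so that a plain list is stored.

```python
import io
import os
import pickle
import tempfile
import zipfile

# Build a small zip archive in memory (hypothetical sample data).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("top-1m.csv", "1,google.com\n2,youtube.com\n")

# ZipFile.read() returns bytes; decode before using str methods such as split(',').
with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as archive:
    raw = archive.read("top-1m.csv")
rows = raw.decode("utf-8").split()
hosts = [row.split(",")[1] for row in rows]

# pickle works with bytes, so open the file in binary mode for both dump and load.
labels = ["benign"] * len(hosts)
path = os.path.join(tempfile.mkdtemp(), "traindata.pkl")  # hypothetical location
with open(path, "wb") as f:
    pickle.dump(list(zip(labels, hosts)), f)
with open(path, "rb") as f:
    restored = pickle.load(f)
```

Using "with" blocks also closes the files promptly, which the one-line open() calls in the question do not guarantee.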

Upvotes: 1
