Neo630

Reputation: 191

UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 3131: ordinal not in range(128)

I am trying to open multiple text files from a folder and format them one by one.

My code:

import json
import yaml
import os
import string

list_num = 0

def load_knowledge():
    if os.path.exists("knowledge"):
        with open("knowledge") as f:
            knowledge = json.load(f)
    else:
        knowledge = {}
    return knowledge

def write_knowledge(knowledge):
    with open("knowledge", "w") as f:
        json.dump(knowledge, f, indent=2, sort_keys=True)

for item in os.listdir("/Users/'My username'/Desktop/'The directory'/yml"):
    with open("/Users/'My username'/Desktop/'The directory'/yml/" + item) as f:
        data = yaml.safe_load(f)


    dataDirectory = {}

    dataDict = {}
    dataDict.update(data)

    knowledge = load_knowledge()
    tag = dataDict['categories'][0]

    for i in range(len(dataDict['conversations'])-1):
        if dataDict['conversations'][list_num][0] == dataDict['conversations'][list_num+1][0]:
            dataDict['conversations'][list_num][1] = (str(dataDict['conversations'][list_num][1]) + ';' + str(dataDict['conversations'][list_num+1][1]))
            del dataDict['conversations'][list_num+1]
            list_num = list_num - 1
        list_num = list_num + 1

    for list in dataDict['conversations']:
        user_input = list[0].lower().strip().translate(str.maketrans('', '', string.punctuation))
        response = list[1]

        if tag in knowledge:
            knowledge[tag][user_input] = response.split(';')
            write_knowledge(knowledge)
        else:
            knowledge[tag] = {}
            knowledge[tag][user_input] = response.split(';')
            write_knowledge(knowledge)
    print("Import successful!")

For some reason, I get an error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 3131: ordinal not in range(128)

The directory contains file1.json, file2.json, file3.json, and so on, so I am getting an error that the file doesn't exist, even though it exists.

File Example:

categories:
- greetings
conversations:
- - Hello
  - Hi
- - Hi
  - Hello
- - Greetings!
  - Hello
- - Hello
  - Greetings!
- - Hi, How is it going?
  - Good
- - Hi, How is it going?
  - Fine
- - Hi, How is it going?
  - Okay
- - Hi, How is it going?
  - Great
- - Hi, How is it going?
  - Could be better.
- - Hi, How is it going?
  - Not so great.
- - How are you doing?
  - Good.
- - How are you doing?
  - Very well, thanks.
- - How are you doing?
  - Fine, and you?
- - Nice to meet you.
  - Thank you.
- - How do you do?
  - I'm doing well.
- - How do you do?
  - I'm doing well. How are you?
- - Hi, nice to meet you.
  - Thank you. You too.
- - It is a pleasure to meet you.
  - Thank you. You too.
- - Top of the morning to you!
  - Thank you kindly.
- - Top of the morning to you!
  - And the rest of the day to you.
- - What's up?
  - Not much.
- - What's up?
  - Not too much.
- - What's up?
  - Not much, how about you?
- - What's up?
  - Nothing much.
- - What's up?
  - The sky's up but I'm fine thanks. What about you?

When the file is loaded, it is converted into a dictionary:

{'categories': ['greetings'], 'conversations': [['Hello', 'Hi'], ['Hi', 'Hello'], ['Greetings!', 'Hello'], ['Hello', 'Greetings!'], ['Hi, How is it going?', 'Good'], ['Hi, How is it going?', 'Fine'], ['Hi, How is it going?', 'Okay'], ['Hi, How is it going?', 'Great'], ['Hi, How is it going?', 'Could be better.'], ['Hi, How is it going?', 'Not so great.'], ['How are you doing?', 'Good.'], ['How are you doing?', 'Very well, thanks.'], ['How are you doing?', 'Fine, and you?'], ['Nice to meet you.', 'Thank you.'], ['How do you do?', "I'm doing well."], ['How do you do?', "I'm doing well. How are you?"], ['Hi, nice to meet you.', 'Thank you. You too.'], ['It is a pleasure to meet you.', 'Thank you. You too.'], ['Top of the morning to you!', 'Thank you kindly.'], ['Top of the morning to you!', 'And the rest of the day to you.'], ["What's up?", 'Not much.'], ["What's up?", 'Not too much.'], ["What's up?", 'Not much, how about you?'], ["What's up?", 'Nothing much.'], ["What's up?", "The sky's up but I'm fine thanks. What about you?"]]}

Upvotes: 0

Views: 568

Answers (1)

lsterzinger

Reputation: 717

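I think it's because os.listdir returns only the names of the files, not the paths to them. You could either join each name onto the directory yourself

import os

for item in os.listdir(directory):
    # os.path.join adds the separator, so directory needs no trailing slash
    with open(os.path.join(directory, item)) as f:
        data = yaml.safe_load(f)

Or use glob, which needs a pattern rather than a bare directory and returns paths that already include the directory

import glob
import os

for item in glob.glob(os.path.join(directory, "*")):
    with open(item) as f:
        data = yaml.safe_load(f)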
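
Separately, the traceback in the question is a decoding error, not a missing-file error, so the path may not be the whole story. When open() is called without an encoding, Python uses the platform default, which appears to be ASCII in this case, and byte 0x80 is not valid ASCII. Below is a minimal sketch that passes the encoding explicitly. It assumes the files are UTF-8, which I can't confirm from the question; read_text_files and the greetings.yml demo file are made-up names for illustration only.

```python
import os
import tempfile


def read_text_files(directory, encoding="utf-8"):
    """Return {file name: text} for every regular file in `directory`.

    Hypothetical helper; the UTF-8 default is an assumption about the data.
    """
    contents = {}
    for name in sorted(os.listdir(directory)):
        # os.listdir yields bare names, so build the full path first.
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        # An explicit encoding avoids relying on the platform default.
        with open(path, encoding=encoding) as f:
            contents[name] = f.read()
    return contents


# Demo: write a file containing a non-ASCII apostrophe, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    sample = "- - What\u2019s up?\n  - Not much.\n"
    with open(os.path.join(tmp, "greetings.yml"), "w", encoding="utf-8") as f:
        f.write(sample)
    texts = read_text_files(tmp)
```

In the original loop the same idea would be open(path, encoding="utf-8") with the file handle passed to yaml.safe_load(f) instead of calling f.read(). If the files turn out to use a different encoding, that name would need to change accordingly.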

Upvotes: 2
