Reputation: 191
I am trying to open multiple text files from a folder and format them one by one.
My code:
import json
import yaml
import os
import string

list_num = 0

def load_knowledge():
    if os.path.exists("knowledge"):
        with open("knowledge") as f:
            knowledge = json.load(f)
    else:
        knowledge = {}
    return knowledge

def write_knowledge(knowledge):
    with open("knowledge", "w") as f:
        json.dump(knowledge, f, indent=2, sort_keys=True)

for item in os.listdir("/Users/'My username'/Desktop/'The directory'/yml"):
    with open("/Users/'My username'/Desktop/'The directory'/yml/" + item) as f:
        data = yaml.safe_load(f)
    dataDirectory = {}
    dataDict = {}
    dataDict.update(data)
    knowledge = load_knowledge()
    tag = dataDict['categories'][0]
    for i in range(len(dataDict['conversations'])-1):
        if dataDict['conversations'][list_num][0] == dataDict['conversations'][list_num+1][0]:
            dataDict['conversations'][list_num][1] = (str(dataDict['conversations'][list_num][1]) + ';' + str(dataDict['conversations'][list_num+1][1]))
            del dataDict['conversations'][list_num+1]
            list_num = list_num - 1
        list_num = list_num + 1
    for list in dataDict['conversations']:
        user_input = list[0].lower().strip().translate(str.maketrans('', '', string.punctuation))
        response = list[1]
        if tag in knowledge:
            knowledge[tag][user_input] = response.split(';')
            write_knowledge(knowledge)
        else:
            knowledge[tag] = {}
            knowledge[tag][user_input] = response.split(';')
            write_knowledge(knowledge)
print("Import successful!")
For some reason, I get an error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 3131: ordinal not in range(128)
The directory contains file1.json, file2.json, file3.json ... so I am getting an error that the file doesn't exist, even though it exists.
File Example:
categories:
- greetings
conversations:
- - Hello
  - Hi
- - Hi
  - Hello
- - Greetings!
  - Hello
- - Hello
  - Greetings!
- - Hi, How is it going?
  - Good
- - Hi, How is it going?
  - Fine
- - Hi, How is it going?
  - Okay
- - Hi, How is it going?
  - Great
- - Hi, How is it going?
  - Could be better.
- - Hi, How is it going?
  - Not so great.
- - How are you doing?
  - Good.
- - How are you doing?
  - Very well, thanks.
- - How are you doing?
  - Fine, and you?
- - Nice to meet you.
  - Thank you.
- - How do you do?
  - I'm doing well.
- - How do you do?
  - I'm doing well. How are you?
- - Hi, nice to meet you.
  - Thank you. You too.
- - It is a pleasure to meet you.
  - Thank you. You too.
- - Top of the morning to you!
  - Thank you kindly.
- - Top of the morning to you!
  - And the rest of the day to you.
- - What's up?
  - Not much.
- - What's up?
  - Not too much.
- - What's up?
  - Not much, how about you?
- - What's up?
  - Nothing much.
- - What's up?
  - The sky's up but I'm fine thanks. What about you?
When you import the file, it converts into a dictionary:
{'categories': ['greetings'], 'conversations': [['Hello', 'Hi'], ['Hi', 'Hello'], ['Greetings!', 'Hello'], ['Hello', 'Greetings!'], ['Hi, How is it going?', 'Good'], ['Hi, How is it going?', 'Fine'], ['Hi, How is it going?', 'Okay'], ['Hi, How is it going?', 'Great'], ['Hi, How is it going?', 'Could be better.'], ['Hi, How is it going?', 'Not so great.'], ['How are you doing?', 'Good.'], ['How are you doing?', 'Very well, thanks.'], ['How are you doing?', 'Fine, and you?'], ['Nice to meet you.', 'Thank you.'], ['How do you do?', "I'm doing well."], ['How do you do?', "I'm doing well. How are you?"], ['Hi, nice to meet you.', 'Thank you. You too.'], ['It is a pleasure to meet you.', 'Thank you. You too.'], ['Top of the morning to you!', 'Thank you kindly.'], ['Top of the morning to you!', 'And the rest of the day to you.'], ["What's up?", 'Not much.'], ["What's up?", 'Not too much.'], ["What's up?", 'Not much, how about you?'], ["What's up?", 'Nothing much.'], ["What's up?", "The sky's up but I'm fine thanks. What about you?"]]}
Upvotes: 0
Views: 568
Reputation: 717
I think it's because os.listdir returns the names of the files, not the paths to them. You could either do
for item in os.listdir(directory):
    # os.listdir returns bare names, so join each one with the directory
    with open(os.path.join(directory, item)) as f:
        data = yaml.safe_load(f)
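Or use glob, which returns paths that already include the directory:
import glob
import os

# glob needs a pattern; "*" matches every entry in the directory
for item in glob.glob(os.path.join(directory, "*")):
    with open(item) as f:
        data = yaml.safe_load(f)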
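Here is a minimal, self-contained sketch of the first approach that you can run without your data. The helper name read_text_files is hypothetical, and I am assuming your files are UTF-8. The 'ascii' codec in your traceback suggests open() may be falling back to an ASCII locale default, so passing an explicit encoding is worth trying. Skipping hidden files is also an assumption on my part (on macOS a folder can contain a binary .DS_Store).

```python
import os
import tempfile


def read_text_files(directory, encoding="utf-8"):
    """Return {file name: text} for every regular, non-hidden file.

    Hypothetical helper for illustration; the UTF-8 default is an assumption.
    """
    contents = {}
    for name in sorted(os.listdir(directory)):
        # os.listdir yields bare names, so build the full path explicitly.
        path = os.path.join(directory, name)
        # Skip sub-directories and hidden files such as .DS_Store.
        if not os.path.isfile(path) or name.startswith("."):
            continue
        # An explicit encoding avoids relying on the locale default.
        with open(path, encoding=encoding) as f:
            contents[name] = f.read()
    return contents


# Small demonstration using a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    sample = "categories:\n- greetings\nconversations:\n- - Hello\n  - I\u2019m fine\n"
    with open(os.path.join(tmp, "greetings.yml"), "w", encoding="utf-8") as f:
        f.write(sample)
    # A hidden binary file that the helper should ignore.
    with open(os.path.join(tmp, ".DS_Store"), "wb") as f:
        f.write(b"\x80\x00")
    demo = read_text_files(tmp)
    print(sorted(demo))
```

In your script you would replace f.read() with yaml.safe_load(f). If the files turn out not to be UTF-8, the encoding argument is the place to change it.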
Upvotes: 2