Reputation: 1191
I followed this example to train my ChatterBot with the Ubuntu corpus.
My code is as follows:
# import ChatBot
from chatterbot import ChatBot
# import Trainer
from chatterbot.trainers import UbuntuCorpusTrainer
# Declare a bot
bot = ChatBot('Zeus')
# Training
trainer3 = UbuntuCorpusTrainer(bot)
# Start by training our bot with the Ubuntu corpus data
trainer3.train()
# Get a response to an input statement
bot.get_response("Hello, how are you today?")
while True:
    # Input from user
    message = input('You: ')
    # if message is not "Bye"
    if message.strip() != 'Bye':
        reply = bot.get_response(message)
        print('Zeus:', reply)
    # if message is "Bye"
    if message.strip() == 'Bye':
        print('Zeus: Bye')
        break
The output shows that the bot does not get trained with the Ubuntu corpus:
/usr/local/bin/python3.7 /home/user/Documents/python-workspace/zeus_bot/UbuntuCorpus.py
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data] /home/user/nltk_data...
[nltk_data] Package averaged_perceptron_tagger is already up-to-
[nltk_data] date!
[nltk_data] Downloading package stopwords to /home/user/nltk_data...
[nltk_data] Package stopwords is already up-to-date!
Training took 0.11888790130615234 seconds.
/usr/local/lib/python3.7/site-packages/chatterbot/corpus.py:38: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
return yaml.load(data_file)
You: Hi
Zeus: Hello, how are you today?
You: very well
Zeus: Hi
You: what news?
Zeus: Hello, how are you today?
You: i am fine
Zeus: Hi
You: Bye
Zeus: Bye
Process finished with exit code 0
It does not show any loading progress for the training. When I chat with the bot, I only get back the statement I passed to get_response (or my own earlier messages) and nothing else.
Upvotes: 0
Views: 959
Reputation: 3467
There's no problem with the Ubuntu trainer itself. It works. The problem is that it tries to create a ChatterBot DB from 25,000+ TSV files at once! Based on my tests, I estimate this can take from 12 to 15 hours! Moreover, the trainer does not show any progress, or even any hint that one has to wait for such a long time. Only an idiot could distribute such a module.
Well, I have created a script that builds a ChatterBot DB from a selection of the TSV files included in the Ubuntu dialogs TGZ. This can be very fast, depending on how many TSV files you select for processing. Note also that the DBs can be very large: a DB carrying the whole package would be about 500 MB! (I have not tried to create such a monster, of course, but I guess the interaction with it would be quite fruitful! 🙂)
For anyone who is interested and can program in Python, the UbuntuCorpusTrainer class is found in 'trainers.py' of the 'chatterbot' package. You can adjust it to use a specific list of TSV files instead of creating a huge list of all the TSV files stored in the default directory 'dialogs'. (In my case, this is 'C:\Users{username}\ubuntu_data\ubuntu_dialogs\dialogs'.) Processing that full list, as I said, seems to have no end!
For anyone who is interested, I can provide the basic code.
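To give an idea of what training on a selection of TSV files can look like, here is a minimal sketch. It is not the script mentioned above and it does not modify UbuntuCorpusTrainer; instead it feeds a limited number of dialog files to ChatterBot's ListTrainer. The helper names (select_tsv_files, read_conversation, train_on_subset), the default limit of 100 files, and the progress interval are made up for illustration. It also assumes that the utterance text sits in the fourth tab-separated column of each dialog file and that the TSV files have already been extracted into the 'dialogs' directory.

```python
import csv
import glob
import os
from itertools import islice


def select_tsv_files(dialogs_dir, limit):
    """Return at most `limit` TSV paths found under dialogs_dir, sorted by path."""
    pattern = os.path.join(dialogs_dir, '**', '*.tsv')
    return list(islice(sorted(glob.glob(pattern, recursive=True)), limit))


def read_conversation(tsv_path, text_column=3):
    """Return the utterances of one dialog file as a list of strings.

    Assumption: the message text is in column index 3 (fourth column).
    """
    utterances = []
    with open(tsv_path, 'r', encoding='utf-8', newline='') as tsv:
        # QUOTE_NONE keeps quotation marks inside chat messages untouched
        for row in csv.reader(tsv, delimiter='\t', quoting=csv.QUOTE_NONE):
            if len(row) > text_column and row[text_column].strip():
                utterances.append(row[text_column].strip())
    return utterances


def train_on_subset(bot, dialogs_dir, limit=100, report_every=10):
    """Train `bot` on the first `limit` dialog files and print simple progress."""
    # Imported here so the helpers above can be used without ChatterBot installed
    from chatterbot.trainers import ListTrainer

    # Turn off the per-conversation progress bar; progress is printed below instead
    trainer = ListTrainer(bot, show_training_progress=False)
    files = select_tsv_files(dialogs_dir, limit)
    for index, path in enumerate(files, start=1):
        conversation = read_conversation(path)
        if len(conversation) > 1:  # a single line is not a dialog
            trainer.train(conversation)
        if index % report_every == 0 or index == len(files):
            print('Processed {} of {} files'.format(index, len(files)))
```

With the bot from the question, this would be called as train_on_subset(bot, '/path/to/ubuntu_dialogs/dialogs', limit=200), where the path is a placeholder for wherever the dialogs were extracted. Raising the limit trades training time and DB size for a bot that has seen more conversations.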
Upvotes: -2