Reputation: 3
Due to COVID-19, I don't have access to a physical NAO and need to work with simulations. The goal is to model dialogues of varying complexity that also involve gestures. Speech recognition is the most important feature here, but simulation of other features that add more realism (like voice) would be appreciated too.
I am working from a Mac (with Catalina).
What I've tried:
I'd appreciate any hint on whether Webots is even suitable for dialogues (it seems mostly focused on movement), or advice on other suitable simulators.
Upvotes: 0
Views: 1133
Reputation: 1808
The ALTextToSpeech and ALSpeechRecognition APIs don't work on the virtual robot, unfortunately. From the docs here:
"ACAPELA, microAITalk and Nuance engines are only available on the real robot. When using a virtual robot, said text can be visualized in Choregraphe Robot View and Dialog panel."
and here:
"[Speech Recognition] cannot be tested on a simulated robot - This module is only available on a real robot, you cannot test it on a simulated robot."
The text interaction can be used to test the flow of your dialogs, but it won't allow you to properly test the nuances of speech recognition.
Webots is no longer supported, and I've never had any luck getting it set up. The best currently available simulation environment for Pepper/NAO is the ROS Gazebo stack, but it isn't really designed for audio simulation either. It will let you simulate the robot making gestures and moving through the world, but you would have to write your own custom code (ROS nodes, in Python or C++) to process the audio, do speech recognition, and output speech (connected up to a mic and speakers you have, for example). A minimal sketch of that kind of node is below.
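To make that concrete, here is a rough sketch of such a bridge node in Python (rospy). The topic names /speech_text and /tts_input are my own assumptions for illustration; in practice you'd wire them to whatever external speech-to-text and text-to-speech nodes you write:

    #!/usr/bin/env python
    # Sketch of a ROS node bridging a (hypothetical) speech recognition
    # source to a (hypothetical) text-to-speech sink. Topic names are
    # assumptions for this example, not part of any NAO/Gazebo package.
    import rospy
    from std_msgs.msg import String

    def on_recognized(msg):
        # msg.data holds text produced by whatever external speech
        # recognizer you connect (e.g. a cloud STT service).
        rospy.loginfo("Heard: %s", msg.data)
        # Dialog logic would go here (QiChat, Dialogflow call, etc.)
        reply = "You said: " + msg.data
        tts_pub.publish(String(data=reply))

    if __name__ == "__main__":
        rospy.init_node("dialog_bridge")
        # Another node of yours would synthesize whatever is published
        # here and play it through your speakers.
        tts_pub = rospy.Publisher("/tts_input", String, queue_size=10)
        rospy.Subscriber("/speech_text", String, on_recognized)
        rospy.spin()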
If you plan to use a NAOqi QiChat chatbot, you could use the naoqi Python APIs to run it and connect external speech-to-text and text-to-speech services to it. If you want more complex speech interactions, though, I'd suggest a full-blown chatbot platform (Dialogflow, IBM Watson, etc.)
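For the QiChat route, something like the following sketch could work, assuming the Python NAOqi SDK (Python 2.7) and a robot or virtual robot reachable at ROBOT_IP; the IP, the topic file path, and the forceInput-based text injection are my assumptions for testing without real speech recognition:

    # -*- coding: utf-8 -*-
    # Sketch: drive a QiChat topic via the NAOqi Python SDK and inject
    # text with forceInput instead of relying on ALSpeechRecognition,
    # which is unavailable on a virtual robot. ROBOT_IP and the topic
    # file path are assumptions for this example.
    from naoqi import ALProxy

    ROBOT_IP = "127.0.0.1"  # e.g. the virtual robot in Choregraphe
    PORT = 9559

    dialog = ALProxy("ALDialog", ROBOT_IP, PORT)
    dialog.setLanguage("English")

    # greeting.top would contain a QiChat topic such as:
    #   topic: ~greeting()
    #   language: enu
    #   u:(hello) Hi there, human!
    topic_name = dialog.loadTopic("/home/nao/greeting.top")
    dialog.activateTopic(topic_name)
    dialog.subscribe("my_dialog_demo")

    try:
        # Feed the dialog engine text as if it had been heard; on a
        # virtual robot the reply appears in the Choregraphe Dialog panel.
        dialog.forceInput("hello")
    finally:
        dialog.unsubscribe("my_dialog_demo")
        dialog.deactivateTopic(topic_name)
        dialog.unloadTopic(topic_name)

You could then replace forceInput's hard-coded string with output from an external speech-to-text service to approximate the full pipeline.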
Upvotes: 0