Reputation: 126
I'm writing a Discord bot in VS2017 using the Discord.Net wrapper. I've gotten everything to work (parsing/sending text commands, joining voice channels) except the main goal: using a TTS audio output stream in a voice channel.
Basically, I'm using SpeechSynthesizer to create the MemoryStream and write that to the Discord bot. The problem is, there's no audio. At all. I've been following several other answers as well as the documentation on the Discord.Net site and can't seem to find a way to get this to work. Audio streaming via url/file is well documented but not this.
var ffmpeg = CreateProcess("");
var output = ffmpeg.StandardOutput.BaseStream;
IAudioClient client;
ConnectedChannels.TryGetValue(guild.Id, out client);
var discord = client.CreatePCMStream(AudioApplication.Mixed);
await output.CopyToAsync(discord);
await discord.FlushAsync();
Above is the sample I've been using, which streams audio sourced from a file via ffmpeg. I see that it's just copying one stream to another, so I've tried several variations of the following:
IAudioClient client;
ConnectedChannels.TryGetValue(guild.Id, out client);
var discord = client.CreatePCMStream(AudioApplication.Mixed);
var synth = new SpeechSynthesizer();
var stream = new MemoryStream();
var synthFormat = new SpeechAudioFormatInfo(
    EncodingFormat.Pcm,
    8000,
    16,
    1,
    16000,
    2,
    null);
synth.SetOutputToAudioStream(stream, synthFormat);
synth.Speak("this is a test");
await stream.CopyToAsync(discord);
await discord.FlushAsync();
I've tried changing around the SpeechAudioFormatInfo properties, changing the output on the SpeechSynthesizer, completely removing the async calls, pretty much everything that I could think of with no result.
I realize that I could just output sound to a dummy audio device and have another account/bot pick up on that, but that was not the goal of this exercise. I also realize that I could write the output to a file and stream that, but it would increase the processing time. These TTS instructions are small, never over 5 words, and need to be quick and to the point since they're supposed to be "callouts".
Lastly, I couldn't exactly find a way to make this work with ffmpeg either. Everything I've read seems to indicate the need for a physical source, not just a memory stream.
So, I'm at wit's end. Any assistance would be appreciated.
Upvotes: 3
Views: 2109
Reputation: 302
Discord.Net is a bit picky with audio streams. You need a single PCM stream per audio connection, or playback will behave unpredictably. You can create your PCM stream once when connecting to voice and then call SendAsync multiple times to send audio through it.
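A minimal sketch of the one-stream-per-connection idea, in case it helps. `VoiceStreamCache` and `GetOrCreate` are hypothetical names of mine, not part of Discord.Net, and a plain `Stream` stands in for the stream returned by `client.CreatePCMStream(AudioApplication.Mixed)` from the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;

// Hypothetical helper (not a Discord.Net type): keeps one output stream per guild.
public sealed class VoiceStreamCache
{
    // Lazy<T> ensures the factory runs at most once per guild, even under races.
    private readonly ConcurrentDictionary<ulong, Lazy<Stream>> _streams =
        new ConcurrentDictionary<ulong, Lazy<Stream>>();

    // Returns the existing stream for the guild, or creates it on first use.
    public Stream GetOrCreate(ulong guildId, Func<Stream> factory)
    {
        return _streams.GetOrAdd(guildId, _ => new Lazy<Stream>(factory)).Value;
    }

    // Call when leaving the voice channel so the next join starts fresh.
    public void Remove(ulong guildId)
    {
        if (_streams.TryRemove(guildId, out var lazy) && lazy.IsValueCreated)
        {
            lazy.Value.Dispose();
        }
    }
}
```

In the bot, the factory would presumably be something like `() => client.CreatePCMStream(AudioApplication.Mixed)`, so every later send for that guild reuses the same stream.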
If I remember correctly, you should be able to save the TTS output as a media file (MP3 or AAC) and then play that file like this:
public async Task SendAsync(float volume, string path, AudioOutStream stream)
{
    // Note: volume is accepted but not applied in this snippet.
    _currentProcess = CreateStream(path);
    var output = _currentProcess.StandardOutput.BaseStream;
    const int blockSize = 2880;
    var buffer = new byte[blockSize];
    int byteCount;
    // Read until end of stream instead of checking HasExited: ffmpeg can exit
    // while decoded audio is still buffered in the pipe, which matters for short clips.
    while ((byteCount = await output.ReadAsync(buffer, 0, blockSize)) > 0)
    {
        await stream.WriteAsync(buffer, 0, byteCount);
    }
    await stream.FlushAsync();
}
And call ffmpeg like this:
private static Process CreateStream(string path)
{
    var ffmpeg = new ProcessStartInfo
    {
        FileName = "ffmpeg",
        // Decode the input to raw 16-bit little-endian PCM, stereo, 48 kHz, on stdout.
        Arguments = $"-hide_banner -loglevel panic -i \"{path}\" -ac 2 -f s16le -ar 48000 pipe:1",
        UseShellExecute = false,
        RedirectStandardOutput = true
    };
    return Process.Start(ffmpeg);
}
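If you would rather stay in memory, two things in the question's attempt look worth checking. This is a sketch, not a confirmed fix. First, after `synth.Speak(...)` the MemoryStream's position sits at the end of the written data, so copying from it without rewinding transfers nothing. Second, the ffmpeg command above emits 48 kHz, 16-bit, stereo PCM, while the question's SpeechAudioFormatInfo asks for 8 kHz mono. The example below uses only System.IO so it runs without System.Speech or Discord.Net; `PcmSketch` and its methods are hypothetical names for illustration.

```csharp
using System;
using System.IO;

// Hypothetical illustration class; nothing here comes from Discord.Net or System.Speech.
public static class PcmSketch
{
    // Bytes per sample frame (all channels), i.e. the "block align" value.
    public static int BlockAlign(int bitsPerSample, int channels)
    {
        return bitsPerSample / 8 * channels;
    }

    // Average bytes per second for uncompressed PCM.
    public static int BytesPerSecond(int sampleRate, int bitsPerSample, int channels)
    {
        return sampleRate * BlockAlign(bitsPerSample, channels);
    }

    // Copies from the source's current position and reports how many bytes moved.
    public static long CopyFrom(MemoryStream source, bool rewind)
    {
        if (rewind)
        {
            source.Position = 0; // start reading from the beginning again
        }
        using (var sink = new MemoryStream())
        {
            source.CopyTo(sink);
            return sink.Length;
        }
    }

    public static void Main()
    {
        var source = new MemoryStream();
        source.Write(new byte[3840], 0, 3840);        // writer leaves Position at the end

        Console.WriteLine(CopyFrom(source, false));   // 0: nothing left to read
        Console.WriteLine(CopyFrom(source, true));    // 3840: rewound first

        // Values matching the ffmpeg output format (48 kHz, 16-bit, stereo).
        Console.WriteLine(BlockAlign(16, 2));             // 4
        Console.WriteLine(BytesPerSecond(48000, 16, 2));  // 192000
    }
}
```

Applied to the question's code, that would mean passing 48000, 16, 2, 192000, 4 to SpeechAudioFormatInfo in place of 8000, 16, 1, 16000, 2, and setting `stream.Position = 0` before `stream.CopyToAsync(discord)`. I have not verified that the installed voices support that format, so treat it as something to try.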
Upvotes: 2