Reputation: 191
I'm trying to tap into my Mac's main audio output and then run speech-to-text on it.
The part I'm currently stuck on is capturing my Mac's main sound output. For example, if I go to YouTube
and play a video, the sound comes out of my laptop speakers. I'm trying to tap into that output.
I'm using AVAudioEngine.mainMixerNode. However, I don't get any output. I've also tried AVAudioEngine.outputNode.
Here's what I've tried:
engine = AVAudioEngine()
let _ = engine.mainMixerNode
engine.prepare()
do {
    try engine.start()
} catch {
    print("Start Error")
    return
}
do {
    let settings = engine.mainMixerNode.outputFormat(forBus: 0).settings
    print("FileType: \(settings[AVAudioFileTypeKey])") // Returns nil
    self.file = try! AVAudioFile(forWriting: url, settings: settings)
    engine.mainMixerNode.installTap(onBus: 0, bufferSize: 1024, format: nil) { // I've tried with a format and without
        (buffer: AVAudioPCMBuffer?, time: AVAudioTime!) -> Void in
        do {
            // Buffer must be empty because the CAF audio length is correct, but there's no sound. It's flat.
            try self.file.write(from: buffer!)
        } catch _ {
            print("Problem Writing Buffer")
        }
    }
}
The above code has Privacy - Microphone Usage Description set (even though I'm not looking to tap the mic).
The first problem here is that I'm not sure what file type I should use for my output. I've tried CAF, but opening it in GarageBand produces no sound (although there is length).
I've tried renaming the file to m4a and wav, but Apple's Music just doesn't want to play it.
What am I doing wrong? Any help is greatly appreciated.
Upvotes: 2
Views: 1645
Reputation: 52645
Unfortunately, it is not a trivial task to get access to the system sound output like this (which is probably a good thing, from a security standpoint).
The solutions generally involve setting up a 'fake' (virtual) audio device that audio is played through instead of the default output device you've chosen. This is the approach used by, for example, BlackHole (https://github.com/ExistentialAudio/BlackHole) and the older Soundflower (https://github.com/mattingalls/Soundflower). Both of those links point to source code that you can dig through to see how they accomplish this. Again, that goes into more depth than an SO answer warrants.
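Once such a loopback device exists, the capture side can stay in AVAudioEngine. Here is a minimal sketch, assuming that BlackHole (or a similar driver) is installed, that system output is routed to it, and that it is selected as the default input device in System Settings. The LoopbackRecorder class and its isRecording property are hypothetical names for illustration, not part of any library. Because this reads from an input device, the Privacy - Microphone Usage Description entry is still needed. The container type is inferred from the URL's extension, so renaming a .caf file to .m4a or .wav afterwards does not convert it.

```swift
import AVFoundation

// Assumption: the default *input* device is a loopback driver (e.g. BlackHole)
// that receives the system output. Nothing in this class selects the device.
final class LoopbackRecorder {
    private let engine = AVAudioEngine()
    private var file: AVAudioFile?
    private(set) var isRecording = false

    /// Starts writing whatever arrives on the default input device to `url`.
    /// The file type is inferred from the extension (for example ".caf").
    func start(writingTo url: URL) throws {
        let input = engine.inputNode
        let format = input.outputFormat(forBus: 0)

        // Create the file with the tap's own settings so no conversion is needed.
        file = try AVAudioFile(forWriting: url, settings: format.settings)

        input.installTap(onBus: 0, bufferSize: 4096, format: format) { [weak self] buffer, _ in
            do {
                try self?.file?.write(from: buffer)
            } catch {
                print("Problem writing buffer: \(error)")
            }
        }

        engine.prepare()
        try engine.start()
        isRecording = true
    }

    /// Stops capture; releasing the AVAudioFile closes it on disk.
    func stop() {
        engine.inputNode.removeTap(onBus: 0)
        engine.stop()
        file = nil
        isRecording = false
    }
}
```

Usage would look something like `try recorder.start(writingTo: someURL)` followed later by `recorder.stop()`, where `someURL` ends in `.caf`. If the default input is the built-in microphone instead of the loopback device, this records the mic, not the system output.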
In terms of the mainMixerNode that you're exploring, although the name makes it sound like a system-wide device, it really just controls the mixer for your app's sound. So, although you could tap it, it would only give you access to what your own app is playing. That's why your recorded file is flat: unless your app is playing audio through the engine, nothing gets recorded.
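To see this for yourself, here is a rough sketch in which the app plays a tone through its own AVAudioPlayerNode while the main mixer is tapped. The helper names (makeToneBuffer, peak, startMixerTapDemo) are made up for this example, and the 44.1 kHz stereo format is just an assumption. With the player running, the tap should report a non-zero level; remove the player and the tap goes back to silence, no matter what other apps are playing.

```swift
import AVFoundation

/// Builds a sine tone in the given float format (hypothetical demo helper).
func makeToneBuffer(format: AVAudioFormat, frequency: Float = 440,
                    amplitude: Float = 0.5,
                    frames: AVAudioFrameCount = 4410) -> AVAudioPCMBuffer {
    let buffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: frames)!
    buffer.frameLength = frames
    let sampleRate = Float(format.sampleRate)
    for channel in 0..<Int(format.channelCount) {
        let samples = buffer.floatChannelData![channel]
        for i in 0..<Int(frames) {
            samples[i] = amplitude * sinf(2 * Float.pi * frequency * Float(i) / sampleRate)
        }
    }
    return buffer
}

/// Largest absolute sample value in the first channel of a float buffer.
func peak(of buffer: AVAudioPCMBuffer) -> Float {
    guard let data = buffer.floatChannelData else { return 0 }
    var result: Float = 0
    for i in 0..<Int(buffer.frameLength) {
        result = max(result, abs(data[0][i]))
    }
    return result
}

/// Taps the main mixer while this app itself plays a looping tone.
/// Returns the engine and player so the caller can keep them alive.
func startMixerTapDemo() throws -> (AVAudioEngine, AVAudioPlayerNode) {
    let engine = AVAudioEngine()
    let player = AVAudioPlayerNode()
    // Assumed format: standard (deinterleaved Float32) stereo at 44.1 kHz.
    let format = AVAudioFormat(standardFormatWithSampleRate: 44_100, channels: 2)!

    engine.attach(player)
    engine.connect(player, to: engine.mainMixerNode, format: format)

    engine.mainMixerNode.installTap(onBus: 0, bufferSize: 4096, format: nil) { buffer, _ in
        // Any signal seen here comes from `player`; other apps never reach this mixer.
        print("mixer peak: \(peak(of: buffer))")
    }

    engine.prepare()
    try engine.start()
    player.scheduleBuffer(makeToneBuffer(format: format), at: nil,
                          options: .loops, completionHandler: nil)
    player.play()
    return (engine, player)
}
```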
Upvotes: 1