yesbutmaybeno

Reputation: 1148

iOS Core Audio - MP3 to WAV works only with 1 channel, how do I get stereo?

I'm currently taking in an MP3 file and writing out a WAV file. My code has worked fine for a while, but I now want to change it so that the exported WAV is a 2-channel stereo file.

The problem lies somewhere in the AudioStreamBasicDescription that describes the desired output format.

The code below is what worked fine before (mono):

AudioStreamBasicDescription outputFormat = new AudioStreamBasicDescription();
outputFormat.setFormat(AudioFormat.LinearPCM);
outputFormat.setFormatFlags(AudioFormatFlags.Canonical);
outputFormat.setBitsPerChannel(16);
outputFormat.setChannelsPerFrame(1);
outputFormat.setFramesPerPacket(1);
outputFormat.setBytesPerFrame(2);
outputFormat.setBytesPerPacket(2);
outputFormat.setSampleRate(pitch);

Changing it to setChannelsPerFrame(2) didn't work. I'm not sure what else needs changing.

The error is:

Launcher[318:12909] 224: SetDataFormat failed
Launcher[318:12909] 367: EXCEPTION (1718449215): "create audio file"

org.robovm.apple.corefoundation.OSStatusException: 1718449215
at org.robovm.apple.corefoundation.OSStatusException.throwIfNecessary(OSStatusException.java:53)
at org.robovm.apple.audiotoolbox.ExtAudioFile.create(ExtAudioFile.java:80)
at package.Launcher.mp3ToPCM(Launcher.java:1108)
...

Where the line in question is

outputFileExtAudio = ExtAudioFile.create(outputFileURL, AudioFileType.WAVE, outputFormat, null, AudioFileFlags.EraseFile);

But the problem must stem from the AudioStreamBasicDescription for outputFormat, since switching it to 2 channels is the only change I made, and that is when it stopped working.

(This is Java code, using RoboVM to compile to native iOS code.)

Upvotes: 1

Views: 246

Answers (1)

sbooth

Reputation: 16976

You also need to update the sizes (bytes per frame and bytes per packet) to match the new channel count.
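
The status code in your exception points in the same direction. Core Audio often packs four ASCII characters into an OSStatus, and reading 1718449215 that way gives "fmt?", the code used for an unsupported data format. A minimal sketch of the decoding in plain Java (OSStatusDecoder and fourCC are names made up for this example, not part of RoboVM):

```java
public class OSStatusDecoder {
    // Interprets a 32-bit status as four ASCII characters, most significant byte first.
    // Falls back to the decimal string if any byte is not printable ASCII.
    public static String fourCC(int status) {
        char[] chars = new char[4];
        for (int i = 0; i < 4; i++) {
            int b = (status >>> (24 - 8 * i)) & 0xFF;
            if (b < 0x20 || b > 0x7E) {
                return Integer.toString(status);
            }
            chars[i] = (char) b;
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        // The status from the exception in the question.
        System.out.println(fourCC(1718449215)); // prints "fmt?"
    }
}
```

That fits the "SetDataFormat failed" log line: the file is rejected because the described format is not self-consistent.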

In Core Audio a sample is one single value and a frame is one sample across all channels. For PCM audio, a single frame is also a single packet.

For 16-bit mono audio, a frame and sample are synonymous and take up 2 bytes. For 16-bit stereo audio, a frame consists of two samples (left and right), with each sample taking up 2 bytes and each frame taking up 4 bytes.
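
To make the frame layout concrete, here is a small sketch in plain Java that interleaves two 16-bit channels by hand. The class and method names are invented for illustration, and little-endian order is assumed because that is what WAV files use:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class InterleaveDemo {
    static final int BYTES_PER_SAMPLE = 2; // one 16-bit sample

    // Packs the two channels as L0 R0 L1 R1 ..., so each stereo frame is 4 bytes.
    public static byte[] interleave(short[] left, short[] right) {
        if (left.length != right.length) {
            throw new IllegalArgumentException("channel lengths differ");
        }
        int channels = 2;
        int bytesPerFrame = channels * BYTES_PER_SAMPLE;
        ByteBuffer buf = ByteBuffer.allocate(left.length * bytesPerFrame)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (int frame = 0; frame < left.length; frame++) {
            buf.putShort(left[frame]);  // left sample of this frame
            buf.putShort(right[frame]); // right sample of this frame
        }
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] pcm = interleave(new short[] {1, 2, 3}, new short[] {-1, -2, -3});
        System.out.println(pcm.length); // 3 frames * 4 bytes = 12
    }
}
```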

The values in an AudioStreamBasicDescription vary slightly depending on whether the format being described is interleaved or not.

You can generally think of non-interleaved PCM AudioStreamBasicDescriptions like this:

asbd.mBytesPerFrame     = asbd.mBitsPerChannel / 8;

and interleaved like this:

asbd.mBytesPerFrame     = (asbd.mBitsPerChannel / 8) * asbd.mChannelsPerFrame;

with both having

asbd.mFramesPerPacket   = 1;
asbd.mBytesPerPacket    = asbd.mBytesPerFrame;
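
The same relationships can be written as a small Java sketch. This is plain Java with hypothetical names (PcmSizes, bytesPerFrame, bytesPerPacket), and it assumes packed samples, i.e. no padding bits:

```java
public class PcmSizes {
    // Bytes in one frame of packed linear PCM.
    // Interleaved: all channels share one buffer, so a frame spans every channel.
    // Non-interleaved: each channel has its own buffer, so a frame is one sample.
    public static int bytesPerFrame(int bitsPerChannel, int channelsPerFrame, boolean interleaved) {
        int bytesPerSample = bitsPerChannel / 8;
        return interleaved ? bytesPerSample * channelsPerFrame : bytesPerSample;
    }

    // For PCM there is one frame per packet, so the packet size equals the frame size.
    public static int bytesPerPacket(int bitsPerChannel, int channelsPerFrame, boolean interleaved) {
        return bytesPerFrame(bitsPerChannel, channelsPerFrame, interleaved);
    }

    public static void main(String[] args) {
        System.out.println(bytesPerFrame(16, 1, true));  // mono: 2
        System.out.println(bytesPerFrame(16, 2, true));  // interleaved stereo: 4
        System.out.println(bytesPerFrame(16, 2, false)); // non-interleaved stereo: 2
    }
}
```

Computing the sizes from the channel count, rather than hard-coding 2 or 4, avoids the mismatch that shows up when only the channel count is changed.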

AudioFormatFlags.Canonical is deprecated, but I assume that here it equates to interleaved, packed, native-endian signed integers.

So for your case, interleaved 16-bit stereo is:

AudioStreamBasicDescription outputFormat = new AudioStreamBasicDescription();
outputFormat.setFormat(AudioFormat.LinearPCM);
outputFormat.setFormatFlags(AudioFormatFlags.Canonical);

outputFormat.setSampleRate(pitch);
outputFormat.setChannelsPerFrame(2);
outputFormat.setBitsPerChannel(16);

outputFormat.setBytesPerFrame(4);
outputFormat.setFramesPerPacket(1);
outputFormat.setBytesPerPacket(4);

Here are two helper functions (in C++) showing the relationships:

static AudioFormatFlags CalculateLPCMFlags(UInt32 validBitsPerChannel, UInt32 totalBitsPerChannel, bool isFloat, bool isBigEndian, bool isNonInterleaved)
{
    return (isFloat ? kAudioFormatFlagIsFloat : kAudioFormatFlagIsSignedInteger)
        | (isBigEndian ? ((UInt32)kAudioFormatFlagIsBigEndian) : 0)
        | ((validBitsPerChannel == totalBitsPerChannel) ? kAudioFormatFlagIsPacked : kAudioFormatFlagIsAlignedHigh)
        | (isNonInterleaved ? ((UInt32)kAudioFormatFlagIsNonInterleaved) : 0);
}

static void FillOutASBDForLPCM(AudioStreamBasicDescription *asbd, Float64 sampleRate, UInt32 channelsPerFrame, UInt32 validBitsPerChannel, UInt32 totalBitsPerChannel, bool isFloat, bool isBigEndian, bool isNonInterleaved)
{
    asbd->mFormatID = kAudioFormatLinearPCM;
    asbd->mFormatFlags = CalculateLPCMFlags(validBitsPerChannel, totalBitsPerChannel, isFloat, isBigEndian, isNonInterleaved);

    asbd->mSampleRate = sampleRate;
    asbd->mChannelsPerFrame = channelsPerFrame;
    asbd->mBitsPerChannel = validBitsPerChannel;

    asbd->mBytesPerPacket = (isNonInterleaved ? 1 : channelsPerFrame) * (totalBitsPerChannel / 8);
    asbd->mFramesPerPacket = 1;
    asbd->mBytesPerFrame = (isNonInterleaved ? 1 : channelsPerFrame) * (totalBitsPerChannel / 8);
}
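
Since your code is Java, here is a rough port of the flag calculation. It is a sketch in plain Java, not RoboVM API: the class name LpcmFlags and the constant names are made up, and the bit values are copied from the kAudioFormatFlag constants in CoreAudioTypes.h. Whether RoboVM's AudioFormatFlags exposes equivalent constants is something you would need to check.

```java
public class LpcmFlags {
    // Bit values mirrored from the kAudioFormatFlag... constants in CoreAudioTypes.h.
    static final int IS_FLOAT           = 1 << 0;
    static final int IS_BIG_ENDIAN      = 1 << 1;
    static final int IS_SIGNED_INTEGER  = 1 << 2;
    static final int IS_PACKED          = 1 << 3;
    static final int IS_ALIGNED_HIGH    = 1 << 4;
    static final int IS_NON_INTERLEAVED = 1 << 5;

    // Same logic as the C++ CalculateLPCMFlags helper.
    public static int calculate(int validBitsPerChannel, int totalBitsPerChannel,
                                boolean isFloat, boolean isBigEndian, boolean isNonInterleaved) {
        int flags = isFloat ? IS_FLOAT : IS_SIGNED_INTEGER;
        if (isBigEndian) {
            flags |= IS_BIG_ENDIAN;
        }
        // Samples that fill their container are packed; otherwise they sit in the high bits.
        flags |= (validBitsPerChannel == totalBitsPerChannel) ? IS_PACKED : IS_ALIGNED_HIGH;
        if (isNonInterleaved) {
            flags |= IS_NON_INTERLEAVED;
        }
        return flags;
    }

    public static void main(String[] args) {
        // Interleaved, packed, little-endian, signed 16-bit: SignedInteger | Packed.
        System.out.println(calculate(16, 16, false, false, false)); // prints 12
    }
}
```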

Upvotes: 2
