siods333333

Reputation: 27

How to scale and mux audio?

The first problem is with audio resampling. I'm trying to modify doc/examples/transcode_aac.c so that it also resamples from 41100 to 48000; the example contains a warning that it can't do that.

Using doc/examples/resampling_audio.c as a reference, I saw that before calling swr_convert, I need to compute the number of audio samples at the output with code like this:

    int dst_nb_samples = av_rescale_rnd( input_frame->nb_samples + swr_get_delay(resampler_context, 41100),
                                         48000, 41100, AV_ROUND_UP);

The problem is, when I just set int dst_nb_samples = input_frame->nb_samples (which is 1024), it encodes and plays normally, but when I use the av_rescale_rnd calculation (which results in 1196), the audio is slowed down and distorted, as if there are skips in it.

The second problem is with trying to mux WebM with Opus audio.

When I set AVStream->time_base to 1/48000 and increase AVFrame->pts by 960 per frame, the player shows the resulting file as much longer than it is: 17 seconds of audio shows as 16m11s, but it plays normally.

When I increase pts by 20, the duration displays normally, but encoding prints a lot of "[libopus @ 00ffa660] Queue input is backward in time" messages. The same happens when I increase pts by 30.

Should I try a time_base of 1/1000? WebM always has timecodes in milliseconds, and Opus has a packet size of 20 ms (960 samples at 48000 Hz).

In the file linked below, search for pts += 20; to find the relevant code.

Here is the whole file; all modifications I made are marked with //MINE: http://www.mediafire.com/file/jlgo7x4hiz7bw64/transcode_aac.c

Here is the file I tested it on: http://www.mediafire.com/file/zdy0zarlqw3qn6s/480P_600K_71149981_soundonly.mkv

Upvotes: 0

Views: 1699

Answers (2)

Xakiru

Reputation: 2866

The easiest way to achieve that is to use swr_convert_frame, which takes an input frame and resamples it into a separate output frame. You can read more about it here: https://ffmpeg.org/doxygen/3.2/swresample_8h_source.html

Upvotes: 3

the kamilz

Reputation: 1988

dst_nb_samples can be calculated like this:

    dst_nb_samples = 48000.0 / audio_stream->codec->sample_rate * inputAudioFrame->nb_samples;

Yours is probably correct too. I didn't check it in detail, but this is the formula I have used before, and the number you gave checks out against it. So the real problem is probably somewhere else. Try supplying 960 samples at a time, in sync with the video frames; to do this you need to store the audio samples in an additional linear buffer. See if that fixes the problem.
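To illustrate the buffering idea, here is a minimal sketch in plain C with no FFmpeg dependency. The names rescale_round_up, SampleFifo, fifo_write and fifo_read are hypothetical helpers I made up for this example, and it assumes mono 16-bit samples. In real code, transcode_aac.c already uses AVAudioFifo for the same purpose, so you would likely write the resampler output there and read encoder-sized frames back out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* ceil(a * b / c): the same arithmetic as av_rescale_rnd(a, b, c, AV_ROUND_UP)
 * for small non-negative inputs (no overflow handling in this sketch). */
static int64_t rescale_round_up(int64_t a, int64_t b, int64_t c)
{
    assert(a >= 0 && b > 0 && c > 0);
    return (a * b + c - 1) / c;
}

/* Hypothetical linear buffer; assumes mono int16 samples. */
#define FIFO_CAPACITY 8192

typedef struct SampleFifo {
    int16_t data[FIFO_CAPACITY];
    int     count;               /* number of samples currently stored */
} SampleFifo;

/* Append n resampled samples. Returns 0 on success, -1 if they do not fit. */
static int fifo_write(SampleFifo *f, const int16_t *src, int n)
{
    if (n < 0 || f->count + n > FIFO_CAPACITY)
        return -1;
    memcpy(f->data + f->count, src, (size_t)n * sizeof(*src));
    f->count += n;
    return 0;
}

/* Pop exactly n samples (the encoder frame size, e.g. 960).
 * Returns 1 if a full frame was copied to dst, 0 if not enough data yet. */
static int fifo_read(SampleFifo *f, int16_t *dst, int n)
{
    if (n <= 0 || f->count < n)
        return 0;
    memcpy(dst, f->data, (size_t)n * sizeof(*dst));
    /* Shift the remaining samples to the front of the buffer. */
    memmove(f->data, f->data + n, (size_t)(f->count - n) * sizeof(*dst));
    f->count -= n;
    return 1;
}
```

The intended loop is: resample one decoded frame, fifo_write whatever the resampler returned, then call fifo_read in a loop and encode each 960-sample frame until it returns 0. The leftover samples stay in the buffer for the next input frame.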

And/or, secondly: in my experience the audio pts increases by the number of samples per frame (e.g. 960 for 50 fps video at 48000 Hz, since 48000/50 = 960), not by milliseconds. If you supply 1196 samples, use pts += 1196 (if you are not using the additional buffer I mentioned above). This is different from video frame pts.

You are definitely on the right track. I'll examine the source code if I have time. Anyway, hope that helps.
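A small sketch of that bookkeeping, again in plain C without FFmpeg. PtsCounter, pts_next and pts_to_ms are hypothetical names for this example, and it assumes the time base is 1/sample_rate. In FFmpeg itself the conversion between time bases would normally be done with av_rescale_q, and packet timestamps are typically converted to the stream time base with av_packet_rescale_ts before muxing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: tracks the next pts in a 1/sample_rate time base. */
typedef struct PtsCounter {
    int     sample_rate;   /* e.g. 48000 */
    int64_t next_pts;      /* counted in samples, not milliseconds */
} PtsCounter;

/* Return the pts for a frame of nb_samples and advance the counter
 * by exactly that many samples. */
static int64_t pts_next(PtsCounter *c, int nb_samples)
{
    assert(nb_samples > 0);
    int64_t pts = c->next_pts;
    c->next_pts += nb_samples;
    return pts;
}

/* Convert a pts in the 1/sample_rate time base to milliseconds
 * (truncating), which is what a 1/1000 time base would hold. */
static int64_t pts_to_ms(const PtsCounter *c, int64_t pts)
{
    return pts * 1000 / c->sample_rate;
}
```

With sample_rate 48000, each 960-sample frame advances the pts by 960, which corresponds to 20 ms. Incrementing by 20 while the time base is 1/48000 would mean each frame claims to last 20 samples instead of 960.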

Upvotes: 0
