Reputation: 5132
Play an audio file (WAV or, preferably, MP3) while changing its speed smoothly and continuously. The pitch should change along with the speed (playback rate). My application updates a variable holding the desired speed several times per second, where 1.0 is normal speed. The required range is about 0.2 to 3.0, with a resolution of 0.01.
The audio is most likely music. Expected format: mono, 16-bit, 11.025 kHz. There are no specific latency constraints: anything below 500 ms is acceptable.
QMediaPlayer in QtMultimedia has a playbackRate property that should do exactly this. Unfortunately, I have never been able to make QtMultimedia work on my systems.
It is also OK to use an external player and send it data through pipes or any other IPC mechanism.
How would you achieve this?
Upvotes: 0
Views: 3989
Reputation: 7910
I don't know how much of this translates to C++. The work I did on this problem uses Java. Still, something of the algorithm should be of help.
Example data (made up):
sample value
0 0.0
1 0.3
2 0.5
3 0.6
4 0.2
5 -0.1
6 -0.4
With normal speed, we send the output line a series of values where the sample number increments by 1 per output frame.
If we were going slower, say half speed, we should output twice as many values before reaching the same point in the media data. In other words, we need to include, in our output, values that are at the non-existent, intermediate sample frame locations 0.5, 1.5, 2.5, ...
To do this, it turns out that linear interpolation works quite well for audio. It is possible to use a more sophisticated curve fitting algorithm but the increase in fidelity is not considered to be worth the trouble.
So, we end up with a stream as follows (for half speed):
sample value
0 0.0
0.5 0.15
1 0.3
1.5 0.4
2 0.5
2.5 0.55
3 0.6
etc.
If you want to play back 3/4 speed, then the positions and values used in the output would be this:
sample value
0 0.0
0.75 0.225
1.5 0.4
2.25 0.525
3 0.6
3.75 0.3
etc.
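The tables above can be generated with a few lines of code. The following is only a minimal sketch in Java, not code taken from any library: the class name InterpolationTable and the methods valueAt and render are made up for illustration, and it assumes non-negative positions and simply clamps to the last sample when the position runs past the end of the data.

```java
final class InterpolationTable {
    /** Linearly interpolated value at a fractional (non-negative) sample position. */
    static double valueAt(double[] data, double position) {
        int i = (int) position;          // integer part: index of the left neighbor
        double frac = position - i;      // fractional part: weight of the right neighbor
        if (i + 1 >= data.length) {
            return data[data.length - 1]; // assumption: clamp at the end of the data
        }
        return data[i] * (1.0 - frac) + data[i + 1] * frac;
    }

    /** Values for the first `frames` output frames at a constant speed. */
    static double[] render(double[] data, double speed, int frames) {
        double[] out = new double[frames];
        for (int n = 0; n < frames; n++) {
            out[n] = valueAt(data, n * speed); // output frame n reads position n * speed
        }
        return out;
    }
}
```

Calling InterpolationTable.render(data, 0.5, 7) on the example data walks positions 0, 0.5, 1, 1.5, ... and calling it with 0.75 walks positions 0, 0.75, 1.5, 2.25, ...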
I code this via a "cursor" that is incremented each sample frame, where the increment amount determines the "speed" of the playback. The cursor points into an array, like an integer index would, but instead, is a float (or double). If there is a fractional part to the cursor's value, the fraction is used to interpolate between sample values pointed to by the integer part and the integer part plus one.
For example, if the cursor was 6.25, and the value of soundData[6] was A and the value of soundData[6+1] was B, the sound value would be:
audioValue = A * 0.75 + B * 0.25
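As a rough illustration of the cursor idea, here is a small stateful sketch. It is not the AudioCue code: the class PlaybackCursor and its methods are hypothetical names, the sound is assumed to be already decoded into a float array, and playback simply stops one sample before the end so that index + 1 is always valid.

```java
final class PlaybackCursor {
    private final float[] soundData;   // decoded mono samples (assumed format)
    private double cursor = 0.0;       // fractional read position
    private double increment = 1.0;    // 1.0 = normal speed

    PlaybackCursor(float[] soundData) {
        this.soundData = soundData;
    }

    /** Sets how far the cursor advances per output frame (the playback speed). */
    void setIncrement(double increment) {
        this.increment = increment;
    }

    /** True while both neighbors needed for interpolation still exist. */
    boolean hasNext() {
        return cursor < soundData.length - 1;
    }

    /** Returns the interpolated value at the cursor, then advances the cursor. */
    float nextFrame() {
        int index = (int) cursor;
        double frac = cursor - index;
        float a = soundData[index];
        float b = soundData[index + 1];
        cursor += increment;
        // Weight A by (1 - frac) and B by frac, e.g. 0.75 and 0.25 at cursor 6.25.
        return (float) (a * (1.0 - frac) + b * frac);
    }
}
```

An output loop would call nextFrame() once per output frame while hasNext() is true, and the application would call setIncrement whenever the desired speed changes.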
The degree of precision with which you can define your speed increment is quite high. I think Java's floats are considered sufficient for this purpose.
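One caveat worth checking for long files: a float has a 24-bit significand, so the fractional resolution of a float cursor shrinks as the cursor grows, and from 8,388,608 (2^23) upward it cannot hold a fractional part at all. That is about 190 seconds of audio at 44100 fps. The sketch below (hypothetical class and method names, not from AudioCue) just tests whether adding an increment still moves the cursor; a double cursor avoids the issue for any practical file length.

```java
final class CursorPrecision {
    /** True if adding inc to a float cursor leaves the cursor unchanged. */
    static boolean floatCursorStalls(float cursor, float inc) {
        return cursor + inc == cursor;
    }

    /** Same check with a double cursor. */
    static boolean doubleCursorStalls(double cursor, double inc) {
        return cursor + inc == cursor;
    }

    /** Spacing between adjacent float values at the given cursor position. */
    static float floatResolutionAt(float cursor) {
        return Math.ulp(cursor);
    }
}
```

For example, floatCursorStalls(8_388_608f, 0.2f) is true, while doubleCursorStalls(8_388_608.0, 0.2) is false.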
As for keeping a dynamically changing speed increment smooth, I am spreading out the changes to new speeds over a series of 4096 steps (roughly 1/10th of a second, at 44100 fps). Change requests are often asynchronous, e.g., from a GUI, and are spread out over time in a somewhat unpredictable way. The smoothing algorithm should be able to recalculate and update itself with each new speed request.
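A simple way to get this behavior is a linear ramp that is restarted from the current speed whenever a new request arrives. The sketch below is an assumption about how such smoothing could look, not a copy of the setSpeed code in AudioCue: the class SmoothedSpeed and its methods are hypothetical, and synchronized is used only to keep the example short (a real audio thread might prefer a lock-free handoff).

```java
final class SmoothedSpeed {
    static final int RAMP_STEPS = 4096; // about 1/10 s at 44100 fps

    private double current = 1.0;  // speed used for the current frame
    private double target = 1.0;   // most recently requested speed
    private double delta = 0.0;    // per-frame change while ramping
    private int stepsLeft = 0;     // frames remaining in the ramp

    /** Called from the GUI or control thread at arbitrary times. */
    synchronized void request(double newSpeed) {
        target = newSpeed;
        stepsLeft = RAMP_STEPS;
        // Restart the ramp from wherever the speed is right now.
        delta = (target - current) / RAMP_STEPS;
    }

    /** Called once per output frame; returns the cursor increment to use. */
    synchronized double next() {
        if (stepsLeft > 0) {
            current += delta;
            if (--stepsLeft == 0) {
                current = target; // land exactly on the target, no rounding drift
            }
        }
        return current;
    }
}
```

The value returned by next() would be added to the playback cursor for each frame, so a request that arrives in the middle of a ramp bends the speed toward the new target instead of jumping.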
Following is a link that demonstrates both strategies, where a sound's playback speed is altered in real time via a slider control.
This is a runnable copy of the jar file that also contains the source code, and it executes via Java 8. You can also rename the file to SlidersTest.zip and then drill in to view the source code in context.
The source files can also be reached directly from the following two sections of a page I posted for this code, which I recently wrote and made open source: see AudioCue.java and SlidersTest.java.
AudioCue.java is a long file. The relevant parts are in the inner class at the end of the file, class AudioCuePlayer. For the smoothing algorithm, check the setter method setSpeed, which is about three-quarters of the way down. Sorry, I don't have line numbers.
Upvotes: 2