Forivin

Reputation: 15508

ffmpeg - How to convert massive amounts of files in parallel?

I need to convert about 1.5 TiB of audio files that are in either FLAC or WAV format. They need to be converted to MP3 files at a bitrate of 320k, keeping the important metadata, the cover art, etc.

This alone is easy:

ffmpeg -i "$flacFile" -ab 320k -map_metadata 0 -id3v2_version 3 -vsync 2 "$mp3File" < /dev/null

The problem is making it faster. The command above only uses 12.5% of the CPU, and I'd much rather use something like 80%. So I played around with the -threads flag, but it makes it neither faster nor slower:

ffmpeg -i "$flacFile" -ab 320k -map_metadata 0 -id3v2_version 3 -vsync 2 -threads 4 "$mp3File" < /dev/null

It still only uses about 13% of my CPU, so I think it only uses one thread. My CPU has 8 physical cores, by the way (plus hyperthreading).

So my idea now is to somehow have multiple instances of ffmpeg running at the same time, but I have no clue how to do that properly.

This is my current script to take all flac/wav files from one directory (recursively) and convert them to mp3 files in a new directory with the exact same structure:

#!/bin/bash

SOURCE_DIR="/home/fedora/audiodata_flac"
TARGET_DIR="/home/fedora/audiodata_mp3"

echo "FLAC/WAV files will be read from '$SOURCE_DIR' and MP3 files will be written to '$TARGET_DIR'!"
read -p "Are you sure? (y/N)" -n 1 -r
echo    # (optional) move to a new line
if [[ $REPLY =~ ^[Yy]$ ]] ; then # Continue if user enters "y"

    # Find all flac/wav files in the given SOURCE_DIR and iterate over them:
    find "${SOURCE_DIR}" -type f \( -iname "*.flac" -or -iname "*.wav" \) -print0 | while IFS= read -r -d '' flacFile; do
        if [[ "$(basename "${flacFile}")" != ._* ]] ; then # Skip files starting with "._"
            tmpVar="${flacFile%.*}.mp3"
            mp3File="${tmpVar/$SOURCE_DIR/$TARGET_DIR}"
            mp3FilePath=$(dirname "${mp3File}")
            mkdir -p "${mp3FilePath}"
            if [ ! -f "$mp3File" ]; then # If the mp3 file doesn't exist already
                echo "Input: $flacFile"
                echo "Output: $mp3File"
                ffmpeg -i "$flacFile" -ab 320k -map_metadata 0 -id3v2_version 3 -vsync 2 "$mp3File" < /dev/null
            fi
        fi
    done
fi

I guess I could append an & to the ffmpeg command, but that would cause thousands of ffmpeg instances to run at the same time, which is too much.

Upvotes: 1

Views: 2153

Answers (1)

Ole Tange

Reputation: 33685

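Something like this, using GNU Parallel. It runs a limited number of jobs at a time (roughly one per CPU core by default, adjustable with -j) instead of starting all of them at once: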

#!/bin/bash

SOURCE_DIR="/home/fedora/audiodata_flac"
TARGET_DIR="/home/fedora/audiodata_mp3"
export SOURCE_DIR
export TARGET_DIR

doone() {
    flacFile="$1"
    if [[ "$(basename "${flacFile}")" != ._* ]] ; then # Skip files starting with "._"
        tmpVar="${flacFile%.*}.mp3"
        mp3File="${tmpVar/$SOURCE_DIR/$TARGET_DIR}"
        mp3FilePath=$(dirname "${mp3File}")
        mkdir -p "${mp3FilePath}"
        if [ ! -f "$mp3File" ]; then # If the mp3 file doesn't exist already
            echo "Input: $flacFile"
            echo "Output: $mp3File"
            ffmpeg -i "$flacFile" -ab 320k -map_metadata 0 -id3v2_version 3 -vsync 2 "$mp3File" < /dev/null
        fi
    fi
}

export -f doone

# Find all flac/wav files in the given SOURCE_DIR and iterate over them:
find "${SOURCE_DIR}" -type f \( -iname "*.flac" -or -iname "*.wav" \) -print0 |
  parallel -0 doone
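
If GNU Parallel is not installed, the same idea can probably be approximated with xargs -P, which also caps the number of simultaneous jobs. The following is only a sketch under a few assumptions: the function names target_for, convert_one and convert_all and the JOBS variable are made up for illustration, the default of 8 jobs simply matches the 8 physical cores mentioned in the question, and -r is a GNU xargs option. The ffmpeg command line is copied unchanged from the question.

```shell
#!/bin/bash
# Sketch: bounded parallel conversion with xargs -P instead of GNU Parallel.
# SOURCE_DIR, TARGET_DIR and JOBS can be overridden from the environment.
SOURCE_DIR="${SOURCE_DIR:-/home/fedora/audiodata_flac}"
TARGET_DIR="${TARGET_DIR:-/home/fedora/audiodata_mp3}"
JOBS="${JOBS:-8}"   # assumption: one job per physical core
export SOURCE_DIR TARGET_DIR

# Map a source path to the matching .mp3 path under TARGET_DIR.
target_for() {
    local rel="${1#"${SOURCE_DIR%/}"/}"
    printf '%s/%s.mp3\n' "${TARGET_DIR%/}" "${rel%.*}"
}

# Convert a single file, skipping it if the mp3 already exists.
convert_one() {
    local src="$1" dst
    dst="$(target_for "$src")"
    [[ -f "$dst" ]] && return 0
    mkdir -p "$(dirname "$dst")"
    ffmpeg -i "$src" -ab 320k -map_metadata 0 -id3v2_version 3 -vsync 2 "$dst" < /dev/null
}
export -f target_for convert_one

# Find all flac/wav files (skipping "._*" files) and run at most JOBS
# conversions at the same time.
convert_all() {
    find "$SOURCE_DIR" -type f \( -iname '*.flac' -o -iname '*.wav' \) ! -name '._*' -print0 |
        xargs -0 -r -n 1 -P "$JOBS" bash -c 'convert_one "$1"' _
}

# Only start converting if the source directory actually exists.
if [[ -d "$SOURCE_DIR" ]]; then
    convert_all
fi
```

Running it as JOBS=12 ./convert.sh (the script name is arbitrary) would allow 12 conversions at once. Since each ffmpeg process here appears to use a single thread, the number of jobs is the main knob for CPU utilization. With 1.5 TiB of input, the disk may become the bottleneck before the CPU does.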

Upvotes: 2
