John Allard

Reputation: 3914

Installing "ffmpeg" package from setup.py in Apache Beam pipeline running on Google Cloud Dataflow

I'm trying to run an Apache Beam pipeline on Google Cloud Dataflow that uses FFmpeg for transcoding. As I understand it, since ffmpeg is not a Python package (it can't be installed through pip), I need to install it from setup.py using the following lines:

# The output of custom commands (including failures) will be logged in the
# worker-startup log.
CUSTOM_COMMANDS = [
    ['apt-get', 'update'],
    ['apt-get', 'install', '-y', 'ffmpeg']]

Unfortunately, this is not working. My pipeline stalls, and when I examine the logs I see this:

RuntimeError: Command ['apt-get', 'install', '-y', 'ffmpeg'] failed: exit code: 100

apt-get appears to be unable to find the 'ffmpeg' package. I don't understand why, since ffmpeg is a standard package that should be available through apt-get.

Upvotes: 2

Views: 614

Answers (2)

John Allard

Reputation: 3914

I had forgotten to properly run apt-get update before attempting the install. Make sure to run that before trying to install any packages.
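
For illustration, here is a minimal sketch of running the commands in order and stopping at the first failure, so that a failed 'apt-get update' is reported directly instead of surfacing later as a failed install. The helper name run_custom_commands and its runner parameter are hypothetical (they are not part of Beam); the error message is modelled on the one in the question.

```python
import subprocess

# Order matters: refresh the package index before installing anything.
CUSTOM_COMMANDS = [
    ['apt-get', 'update'],
    ['apt-get', 'install', '-y', 'ffmpeg'],
]


def run_custom_commands(commands, runner=subprocess.run):
    """Run each command in order and raise on the first non-zero exit code.

    `runner` is a hypothetical hook so the helper can be exercised
    without actually invoking apt-get; it defaults to subprocess.run.
    """
    for command in commands:
        print('Running command: %s' % command)
        result = runner(
            command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        output = (result.stdout or b'').decode('utf-8', errors='replace')
        print('Command output: %s' % output)
        if result.returncode != 0:
            # Stop here so later commands do not run against a stale index.
            raise RuntimeError(
                'Command %s failed: exit code: %s'
                % (command, result.returncode))
```

Calling run_custom_commands(CUSTOM_COMMANDS) from the custom command step in setup.py would then fail at 'apt-get update' if that is the step that actually broke.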

Upvotes: 1

Andrew Pilloud

Reputation: 478

I was able to install ffmpeg on Dataflow with Beam versions from 2.3.0 through 2.12.0. I saw this error on 2.2.0 and older. I also saw it when 'apt-get update' was not run first, so make sure that step didn't fail.

Upvotes: 0
