EmmyStrand

Reputation: 43

Is it possible to use FFmpeg to cut "random" sections from a folder of videos and concat them into 1 video?

I realize this sounds like an easy question, and one that has been answered before. However, I cannot seem to find a script which can read a folder of videos with varying lengths, copy a random segment from each video, and concat them into a single video.

An example:

I have a folder with 150 videos named Fashion-Setlist-01.mp4, Fashion-Setlist-02.mp4, etc. Each is over 1 hour long. I would like to pull a random 10-second section from each video and then join the sections in random order into a single video. This may seem easy with only a few videos, but the plan is to read from potentially hundreds of videos. It should also be possible to pull multiple sections from each video. I suppose we could run the script twice for more segments if the video needed to be longer.

Upvotes: 3

Views: 3913

Answers (2)

hb_

Reputation: 352

moviepy is the most appropriate tool for this (it uses ffmpeg as a backend). Concatenating videos is trivial in moviepy:

import moviepy.editor
import os
import random
import fnmatch 

directory = '/directory/to/videos/'
xdim = 854
ydim = 480
ext = "*mp4"
length = 10

outputs=[]

# compile list of videos
inputs = [os.path.join(directory,f) for f in os.listdir(directory) if os.path.isfile(os.path.join(directory, f)) and fnmatch.fnmatch(f, ext)]

for i in inputs:

    # import to moviepy
    clip = moviepy.editor.VideoFileClip(i).resize( (xdim, ydim) ) 

    # select a random time point
    start = round(random.uniform(0, max(0, clip.duration - length)), 2)

    # cut a subclip
    out_clip = clip.subclip(start,start+length)

    outputs.append(out_clip)

# combine clips from different videos, in random order
random.shuffle(outputs)
collage = moviepy.editor.concatenate_videoclips(outputs)

collage.write_videofile('out.mp4')
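
The question also asks about pulling several sections from one video. A minimal sketch of one way to choose the start times, assuming segments from the same video should not overlap; `pick_starts` is a hypothetical helper name, not part of moviepy:

```python
import random

def pick_starts(duration, length, count, rng=random):
    """Return `count` start times (seconds) for non-overlapping segments.

    Hypothetical helper: `duration` and `length` are in seconds.
    """
    slack = duration - count * length
    if slack < 0:
        raise ValueError("video too short for the requested segments")
    # Draw random offsets into the spare time, sort them, then push
    # each successive segment one segment-length further along.
    offsets = sorted(rng.uniform(0, slack) for _ in range(count))
    return [off + i * length for i, off in enumerate(offsets)]
```

Inside the loop above, each returned value would take the place of `start`, giving one `clip.subclip(start, start + length)` per segment.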

Upvotes: 5

Kingsley

Reputation: 14906

EDIT: Cutting video with ffmpeg needs to be done on a key-frame. I extensively edited this code to first find the key-frames, then cut around this. It works for me.

So to do this in bash, assuming there exists some program randtime.py which outputs a random starting time in 'H:MM:SS' format, and some other program which finds the video keyframe near a given time, here's a quick hack-version:

#! /usr/bin/env bash

CUTLIST=cut_list.txt
RANDTIME=~/Bin/randtime.py
KEYFRAMER=~/Bin/find_next_key_frame.py

count=0
echo "" > "$CUTLIST"
for file in *.mp4
do
    count=$(( $count + 1 ));
    outfile="cut_$count.mp4"
    start_time=`python "$RANDTIME"`
    # Find the next keyframe, at or after the random time
    start_keyframe_time=`"$KEYFRAMER" "$file" "$start_time"`
    if [ $? -eq 0 ]; then
        echo "Nearest keyframe to \"$start_time\" is \"$start_keyframe_time\""
        echo "ffmpeg -loglevel quiet -y -i \"$file\" -ss $start_keyframe_time -t 00:00:10 \"$outfile\""
        ffmpeg -loglevel quiet -y -i "$file" -ss $start_keyframe_time -t 00:00:10 "$outfile"
        if [ $? -ne 0 ]; then
            echo "ffmpeg returned an error on [$file], aborting"
        #   exit 1
        fi
        echo "file '$outfile'" >> "$CUTLIST"
    else
        echo "ffprobe found no suitable key-frame near \"$start_time\""
    fi
done

echo "Concatenating ... "
cat "$CUTLIST"

ffmpeg -f concat -i "$CUTLIST" -c copy all_cuts.mp4

if [ -f "$CUTLIST" ]; then
    rm "$CUTLIST"
fi
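
For anyone who would rather drive the same two steps from Python, here is a rough sketch that builds the ffmpeg argument list and the concat list text. The function names are made up for illustration, and it assumes the cut file names contain no single quotes:

```python
def cut_command(src, start, out, length="00:00:10"):
    """Argument list mirroring the ffmpeg call in the bash loop."""
    return ["ffmpeg", "-loglevel", "quiet", "-y",
            "-i", src, "-ss", start, "-t", length, out]

def concat_list(outfiles):
    """Text for the concat list file: one "file '...'" line per cut."""
    return "".join("file '%s'\n" % name for name in outfiles)
```

Each list could be passed to `subprocess.run(..., check=True)`, which avoids the shell quoting that the bash version has to handle by hand.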

And the random time, in python:

#! /usr/bin/env python3

import random

#TODO: ensure we're at least 8 seconds before 1 hour

hrs = 0 # random.randint(0,1)
mns = random.randint(0,59)
scs = random.randint(0,59)

print("%d:%02d:%02d" % (hrs,mns,scs))
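
The TODO above could be handled by taking the video's duration into account. A possible sketch, assuming the duration in seconds is already known (for example, read from ffprobe beforehand); `random_start` is a hypothetical name:

```python
import random

def random_start(duration_s, clip_s=10, rng=random):
    """Pick a start time so that clip_s seconds still fit before the end."""
    latest = max(0, int(duration_s) - clip_s)
    start = rng.randint(0, latest)
    hrs, rem = divmod(start, 3600)
    mns, scs = divmod(rem, 60)
    return "%d:%02d:%02d" % (hrs, mns, scs)
```

Unlike the fixed 0-59 minute range, this also works for videos shorter or longer than one hour.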

And, again in python, find the keyframe exactly on, or just after, the given time:

#! /usr/bin/env python3

import sys
import subprocess
import os
import os.path

FFPROBE='/usr/bin/ffprobe'

EXE=sys.argv[0].replace('\\','/').split('/')[-1]

if (not os.path.isfile(FFPROBE)):
    sys.stderr.write("%s: The \"ffprobe\" part of FFMPEG seems to be missing\n" % (EXE))
    sys.exit(1)

if (len(sys.argv) != 3 or sys.argv[1] in ('--help', '-h', '-?', '/?')):
    sys.stderr.write("%s: Give video filename and time as arguments\n" % (EXE))
    sys.stderr.write("%s:     e.g.: video_file.mp4 0:25:14 \n" % (EXE))
    sys.stderr.write("%s: Outputs the next keyframe at or after the given time\n" % (EXE))
    sys.exit(1)

VIDEO_FILE = sys.argv[1]
FRAME_TIME = sys.argv[2].strip()

if (not os.path.isfile(VIDEO_FILE)):
    sys.stderr.write("%s: The video file \"%s\" seems to be missing\n" % (EXE, VIDEO_FILE))
    sys.exit(1)


### Launch FFMPEG's ffprobe to identify the frames
command = "\"%s\" -show_frames -pretty \"%s\"" % (FFPROBE, VIDEO_FILE)
frame_list = subprocess.getoutput(command)

### The list of frames is a huge bunch of lines like:
###    [FRAME]
###    media_type=video
###    key_frame=0
###    best_effort_timestamp=153088
###    best_effort_timestamp_time=0:00:09.966667
###    pkt_duration_time=0:00:00.033333
###    height=360
###    ...
###    [/FRAME]

### Parse the stats about each frame, keeping only the Video Keyframes
key_frames = []
for frame in frame_list.split("[FRAME]"):
    # split the frame lines up into separate "x=y" pairs
    frame_dict = {}
    frame_vars = frame.replace('\r', '\n').replace("\n\n", '\n').split('\n')
    for frame_pair in frame_vars:
        if (frame_pair.find('=') != -1):
            try:
                var,value = frame_pair.split('=', 1)
                frame_dict[var.strip()] = value.strip()
            except:
                sys.stderr.write("%s: Warning: Unable to parse [%s]\n" % (EXE, frame_pair))
    # Do we want to keep this frame?
    # we want video frames, that are key frames
    # (checked once per frame, after all of its "x=y" pairs are parsed)
    if (frame_dict.get("media_type") == "video" and frame_dict.get("key_frame") == "1"):
        key_frames.append(frame_dict)

### Throw away any duplicates, and sort
key_frame_list = set()
for frame_dict in key_frames:        
    #print(str(frame_dict))
    if ("best_effort_timestamp_time" in frame_dict):
        key_frame_list.add(frame_dict["best_effort_timestamp_time"])
key_frame_list = list(key_frame_list)
key_frame_list.sort()
if key_frame_list:
    sys.stderr.write("Keyframes found: %u, from %s -> %s\n" % (len(key_frame_list), key_frame_list[0], key_frame_list[-1]))

### Find the same, or next-larger keyframe
found = False
for frame_time in key_frame_list:
    #sys.stderr.write("COMPARE %s > %s\n" % (frame_time , FRAME_TIME))
    if (frame_time > FRAME_TIME):
        print(frame_time)
        found = True
        break  # THERE CAN BE ONLY ONE!

### Failed?  Print something possibly useful
if (found == False):
    sys.stderr.write("%s: Warning: No keyframe found\n" % (EXE))
    print("0:00:00")
    sys.exit(-1)
else:
    sys.exit(0)  # All's well
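
One caveat: the script compares timestamps as strings, which only holds while every timestamp has the same number of hour digits (as text, "10:00:00" sorts before "9:00:00"). A sketch of a numeric comparison instead, assuming the 'H:MM:SS.ffffff' format shown in the ffprobe output above; both function names are hypothetical:

```python
import bisect

def to_seconds(stamp):
    """Convert 'H:MM:SS' or 'H:MM:SS.ffffff' to float seconds."""
    hrs, mns, scs = stamp.split(":")
    return int(hrs) * 3600 + int(mns) * 60 + float(scs)

def next_keyframe(keyframes, wanted):
    """Return the first keyframe stamp at or after `wanted`, else None."""
    ordered = sorted(keyframes, key=to_seconds)
    times = [to_seconds(k) for k in ordered]
    idx = bisect.bisect_left(times, to_seconds(wanted))
    return ordered[idx] if idx < len(ordered) else None
```

Here `keyframes` would be the de-duplicated timestamps and `wanted` the time passed on the command line.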

Upvotes: 0
