hardikudeshi

Reputation: 1491

Get total length of videos in a particular directory in python

I have downloaded a bunch of videos from coursera.org and have them stored in one particular folder. There are many individual videos in a particular folder (Coursera breaks a lecture into multiple short videos). I would like to have a python script which gives the combined length of all the videos in a particular directory. The video files are .mp4 format.

Upvotes: 5

Views: 18480

Answers (6)

Shimon

Reputation: 172

Here's my take. I did this on Windows. I took Federico's answer and changed the Python program a little so that it traverses a tree of folders with video files. As in Federico's answer, you need to install MediaInfo and run pip install pymediainfo. Then save this program as summarize.py:

import os
import sys
from pymediainfo import MediaInfo


number_of_video_files = 0


def get_alternate_len(media_info):
    myJson = media_info.to_data()
    myArray = myJson['tracks']
    for track in myArray:
        if track['track_type'] == 'General' or track['track_type'] == 'Video':
            if 'duration' in track:
                return int(track['duration'] / 1000)
    return 0


def get_track_len(file_path):
    global number_of_video_files
    media_info = MediaInfo.parse(file_path)
    for track in media_info.tracks:
        if track.track_type == "Video":
            number_of_video_files += 1
            if type(track.duration) == int:
                len_in_sec = int(track.duration / 1000)
            elif type(track.duration) == str:
                len_in_sec = int(float(track.duration) / 1000)
            else:
                len_in_sec = get_alternate_len(media_info)
                if len_in_sec == 0:
                    print("File path = " + file_path + ", problem in type of track.duration")
            return len_in_sec

    return 0


sum_in_secs = 0.0
os.chdir(sys.argv[1])
for root, dirs, files in os.walk("."):
    for name in files:
        sum_in_secs += get_track_len(os.path.join(root, name))

hours = int(sum_in_secs / 3600)
remain = sum_in_secs - hours * 3600
minutes = int(remain / 60)
seconds = remain - minutes * 60
print("Directory: " + sys.argv[1])
print("Total number of video files is " + str(number_of_video_files))
print("Length: %d:%02d:%02d" % (hours, minutes, seconds))

Run it: python summarize.py <DirPath>

Have fun. I found I have about 1800 hours of videos waiting for me to have some free time. Yeah, sure.
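
The hours/minutes/seconds arithmetic at the end of the script can also be written with divmod. This is only a minimal sketch; format_duration is a hypothetical helper name, not part of the script above, and it drops fractions of a second just like the %d formatting does.

```python
def format_duration(total_seconds):
    """Format a number of seconds as H:MM:SS (fractions of a second are dropped)."""
    minutes, seconds = divmod(int(total_seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return "%d:%02d:%02d" % (hours, minutes, seconds)


print(format_duration(3725.9))  # prints 1:02:05
```

With this helper, the last print in summarize.py could become print("Length: " + format_duration(sum_in_secs)).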

Upvotes: 0

user3064538

Reputation:

First, install the ffprobe command (it's part of FFmpeg) with

sudo apt install ffmpeg

then use subprocess.run() to run this bash command:

ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 -- <filename>

(which I got from http://trac.ffmpeg.org/wiki/FFprobeTips#Formatcontainerduration), like this:

from pathlib import Path
import subprocess

def video_length_seconds(filename):
    result = subprocess.run(
        [
            "ffprobe",
            "-v",
            "error",
            "-show_entries",
            "format=duration",
            "-of",
            "default=noprint_wrappers=1:nokey=1",
            "--",
            filename,
        ],
        capture_output=True,
        text=True,
    )
    try:
        return float(result.stdout)
    except ValueError:
        raise ValueError(result.stderr.rstrip("\n"))

# a single video
video_length_seconds('your_video.webm')

# all mp4 files in the current directory (seconds)
print(sum(video_length_seconds(f) for f in Path(".").glob("*.mp4")))

# all mp4 files in the current directory and all its subdirectories
# `rglob` instead of `glob`
print(sum(video_length_seconds(f) for f in Path(".").rglob("*.mp4")))

# all files in the current directory
print(sum(video_length_seconds(f) for f in Path(".").iterdir() if f.is_file()))

This code requires Python 3.7+ because that's when text= and capture_output= were added to subprocess.run. If you're using an older Python version, check the edit history of this answer.
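
Note that video_length_seconds raises ValueError for a file ffprobe cannot read, so the sum() calls above stop at the first such file. If you would rather skip those files, one possible variant (a sketch, not part of the original answer) asks ffprobe for JSON output with -of json and returns None when no duration is present. The names parse_ffprobe_duration and probe_duration are made up for this example.

```python
import json
import subprocess


def parse_ffprobe_duration(json_text):
    """Return the container duration in seconds from ffprobe's JSON output, or None."""
    try:
        return float(json.loads(json_text)["format"]["duration"])
    except (KeyError, TypeError, ValueError):
        # Empty output, malformed JSON, or no "duration" entry.
        return None


def probe_duration(filename):
    """Run ffprobe on one file; None means no duration could be read."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "json", "--", str(filename)],
        capture_output=True,
        text=True,
    )
    return parse_ffprobe_duration(result.stdout)
```

The total is then sum(d for d in map(probe_duration, Path(".").glob("*.mp4")) if d is not None), with Path imported from pathlib as above.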

Upvotes: 8

Federico

Reputation: 135

Nowadays pymediainfo is available, so Janus Troelsen's answer can be simplified.
You need to install MediaInfo and run pip install pymediainfo. The following code then prints the total length, in milliseconds, of all video files in a directory:

import os
from pymediainfo import MediaInfo

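def get_track_len(file_path):
    # Duration of the first video track in milliseconds, or 0 if there is none.
    media_info = MediaInfo.parse(file_path)
    for track in media_info.tracks:
        if track.track_type == "Video" and track.duration is not None:
            return int(float(track.duration))
    return 0

directory = 'directory with video files'
# os.listdir returns bare file names, so join each one with the directory.
paths = (os.path.join(directory, f) for f in os.listdir(directory))
print(sum(get_track_len(p) for p in paths if os.path.isfile(p)))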
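
If the directory also holds subtitles, slides or other files, it may be worth handing only the video files to get_track_len. A minimal sketch, assuming the videos can be recognised by their extension (.mp4 in the question); list_video_paths is a hypothetical helper, not part of pymediainfo.

```python
import os


def list_video_paths(directory, extensions=(".mp4",)):
    """Return full paths of regular files in `directory` with a matching extension."""
    paths = []
    for name in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, name)
        # Skip subdirectories and compare extensions case-insensitively.
        if os.path.isfile(full_path) and name.lower().endswith(extensions):
            paths.append(full_path)
    return paths
```

The total then becomes sum(get_track_len(p) for p in list_video_paths('directory with video files')).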

Upvotes: 1

Emmett Butler

Reputation: 6207

This link shows how to get the length of a video file: https://stackoverflow.com/a/3844467/735204

import subprocess

def getLength(filename):
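  # ffprobe prints its report on stderr, so merge stderr into stdout;
  # universal_newlines=True makes the lines str rather than bytes on Python 3.
  result = subprocess.Popen(["ffprobe", filename],
    stdout = subprocess.PIPE, stderr = subprocess.STDOUT,
    universal_newlines = True)
  return [x for x in result.stdout.readlines() if "Duration" in x]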

If you're using that function, you can then wrap it up with something like

import os

for f in os.listdir('.'):
    print("%s: %s" % (f, getLength(f)))
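
getLength returns the raw "Duration" lines of ffprobe's report, so to add the lengths up they still have to be converted to numbers. A rough sketch, assuming the line has the usual "Duration: HH:MM:SS.ff, start: ..." layout; duration_line_to_seconds and DURATION_RE are names invented for this example.

```python
import re

# Matches the "Duration: HH:MM:SS.ff" field in a line of ffprobe's report.
DURATION_RE = re.compile(r"Duration:\s*(\d+):(\d{2}):(\d{2}(?:\.\d+)?)")


def duration_line_to_seconds(line):
    """Return the duration in seconds found in `line`, or None if there is none."""
    match = DURATION_RE.search(line)
    if match is None:
        # For example "Duration: N/A" or a line without a duration field.
        return None
    hours, minutes, seconds = match.groups()
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)
```

Summing duration_line_to_seconds(line) over the lines returned by getLength for each file, and skipping None results, gives the combined length in seconds.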

Upvotes: 0

Janus Troelsen

Reputation: 21270

  1. Download MediaInfo and install it (don't install the bundled adware)
  2. Go to the MediaInfo source downloads and in the "Source code, All included" row, choose the link next to "libmediainfo"
  3. Find MediaInfoDLL3.py in the downloaded archive and extract it anywhere. Example location: libmediainfo_0.7.62_AllInclusive.7z\MediaInfoLib\Source\MediaInfoDLL\MediaInfoDLL3.py
  4. Now make a script for testing (sources below) in the same directory.
  5. Execute the script.

MediaInfo works on POSIX too. The only difference is that a shared object (.so) is loaded instead of a DLL.

Test script (Python 3!)

import os

os.chdir(os.environ["PROGRAMFILES"] + "\\mediainfo")
from MediaInfoDLL3 import MediaInfo, Stream

MI = MediaInfo()

def get_lengths_in_milliseconds_of_directory(prefix):
  for f in os.listdir(prefix):
    MI.Open(prefix + f)
    duration_string = MI.Get(Stream.Video, 0, "Duration")

    try:
      duration = int(duration_string)
      yield duration
      print("{} is {} milliseconds long".format(f, duration))
    except ValueError:
      print("{} ain't no media file!".format(f))

    MI.Close()

print(sum(get_lengths_in_milliseconds_of_directory(os.environ["windir"] + "\\Performance\\WinSAT\\"
)), "milliseconds of content in total")

Upvotes: 3

arnm

Reputation: 1765

In addition to Janus Troelsen's answer above, I would like to point out a small problem I encountered when implementing it. I followed his instructions step by step but got different results on Windows (7) and Linux (Ubuntu). His instructions worked perfectly under Linux, but I had to do a small hack to get them to work on Windows. I am using a 32-bit Python 2.7.2 interpreter on Windows, so I used MediaInfoDLL.py. That was not enough to get it to work, though; I was receiving this error:

"WindowsError: [Error 193] %1 is not a valid Win32 application".

This meant that I was somehow using a resource that was not 32-bit, and it had to be the DLL that MediaInfoDLL.py was loading. If you look at the MediaInfo installation directory you will see 3 DLLs; MediaInfo.dll is 64-bit while MediaInfo_i386.dll is 32-bit. MediaInfo_i386.dll is the one I had to use because of my Python setup. I went to MediaInfoDLL.py (which I had already included in my project) and changed this line:

MediaInfoDLL_Handler = windll.MediaInfo

to

MediaInfoDLL_Handler = WinDLL(r"C:\Program Files (x86)\MediaInfo\MediaInfo_i386.dll")

I didn't have to change anything for it to work on Linux.
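
To check which of the two DLLs matches your interpreter before editing MediaInfoDLL.py, you can ask Python for its pointer size. This is only a sketch: the DLL file names are the ones mentioned above, whether your MediaInfo installation ships both is an assumption, and interpreter_bits and mediainfo_dll_name are hypothetical helper names.

```python
import struct


def interpreter_bits():
    """Return 32 or 64 depending on the pointer size of the running Python."""
    return struct.calcsize("P") * 8


def mediainfo_dll_name(bits):
    """Pick the DLL file name described above for the given interpreter bitness."""
    return "MediaInfo_i386.dll" if bits == 32 else "MediaInfo.dll"


print(interpreter_bits(), mediainfo_dll_name(interpreter_bits()))
```

On a 32-bit interpreter like mine this selects MediaInfo_i386.dll, which is the file I passed to WinDLL.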

Upvotes: 2
