TestinginProd

Reputation: 1156

Python method or class to compare two video files?

I'm trying to write a Python program that compares files and reports the duplicates. Does anyone know of any good functions or methods for this? I'm a bit lost.

Upvotes: 1

Views: 5154

Answers (3)

Blender

Reputation: 298246

If you're just looking for exact duplicates, hash both files and see if the digests match (SHA-512 here, though MD5 would also do for spotting accidental duplicates):

import hashlib

def file_digest(path):
    # Open in binary mode: text mode can fail on or alter raw video data
    with open(path, 'rb') as f:
        return hashlib.sha512(f.read()).hexdigest()

if file_digest('file1.avi') == file_digest('file2.avi'):
    print('They are the same')
else:
    print('They are different')
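
Video files can be large, so reading each one fully into memory may not be practical. A minimal sketch of the same idea that hashes in fixed-size chunks instead; the helper names `hash_file` and `same_content` and the 1 MiB chunk size are my own choices for illustration:

```python
import hashlib

def hash_file(path, chunk_size=1024 * 1024):
    """Return the SHA-512 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha512()
    with open(path, 'rb') as f:
        # iter() with a sentinel keeps calling f.read() until it returns b''
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

def same_content(path_a, path_b):
    """True if both files produce the same digest."""
    return hash_file(path_a) == hash_file(path_b)
```

Calling `same_content('file1.avi', 'file2.avi')` should give the same answer as the comparison above, while only holding one chunk in memory at a time.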

If you need more than exact duplicates, I'd try OpenCV's Python bindings and check whether the videos match up frame by frame.

Upvotes: 3

Hugh Bothwell

Reputation: 56654

I would use os.walk to go through the file tree.

For each file, I would store the absolute path plus filename, indexed by file size and signature (first 16 bytes? Hash of the first 512 bytes? Hash of the full file?).

When finished, you end up with a dict of file sizes; for each size, a dict of file signatures; for each signature, a list of all files sharing that signature. If your file signature is not based on the full file, or has significant chance of collisions, you can then do a more in-depth comparison of just those colliding files.
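
A rough sketch of that index, assuming the signature is a SHA-256 hash of the first 512 bytes (one of the options floated above). The names `signature` and `find_candidates` are hypothetical, and the groups it returns are only candidates, since files can share a prefix and still differ later on:

```python
import hashlib
import os
from collections import defaultdict

def signature(path, num_bytes=512):
    """Hash of the first num_bytes bytes of the file (a partial signature)."""
    with open(path, 'rb') as f:
        return hashlib.sha256(f.read(num_bytes)).hexdigest()

def find_candidates(root):
    """Return lists of paths that share both file size and signature."""
    # index[size][signature] -> list of absolute paths
    index = defaultdict(lambda: defaultdict(list))
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.abspath(os.path.join(dirpath, name))
            try:
                size = os.path.getsize(path)
                index[size][signature(path)].append(path)
            except OSError:
                continue  # unreadable or vanished file: skip it
    return [paths
            for by_signature in index.values()
            for paths in by_signature.values()
            if len(paths) > 1]
```

Each returned group would then get the more in-depth comparison, for example a full-file hash.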

Upvotes: 1

MattoTodd

Reputation: 15209

I would start by comparing filenames and file sizes. If you find a match, you could then loop through the bytes of both files to compare them, although this is probably fairly expensive.

I do not know of a library that can do this in Python.
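
For the byte-by-byte step, the standard library's `filecmp` module may be enough. A minimal sketch, where the wrapper name `files_identical` is my own and the size check is just a cheap way to skip the full read when the files obviously differ:

```python
import filecmp
import os

def files_identical(path_a, path_b):
    """Compare two files by size first, then by full content."""
    # Different sizes means different content, so skip reading the files
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    # shallow=False makes filecmp compare the actual bytes, not just os.stat() data
    return filecmp.cmp(path_a, path_b, shallow=False)
```

This still reads both files in full when the sizes match, so it is best kept for pairs that already look like duplicates.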

Upvotes: 0
