Reputation: 5413
I have too many duplicate songs on my hard disk and I want to remove them. I want to write a Python program that checks the md5sum of every file and removes files with the same md5sum, keeping at least one copy. The problem is that the music files have spaces in their names, so the path is not evaluated correctly. Below is my code to check the md5sum:
__author__ = 'akthakur'
import os
import subprocess

def listFiles(location):
    for name in os.listdir(location):
        path = os.path.join(location, name)
        if os.path.isdir(path):
            listFiles(path)
        else:
            cmd= 'md5sum ' + path
            print(cmd)
            fp = os.popen(cmd)
            print(fp.read())
            fp.close()

listFiles("E:\offwork")
I just want to pass the path of my E: drive to the listFiles function and have it print the md5sum of every file in every directory, recursively.
Spaces in file names are causing serious trouble. Is there any way of dealing with them?
Upvotes: 0
Views: 227
Reputation: 1473
You call md5sum as an external program. That means a file path needs to be wrapped in quotes if it contains a space. Please change
cmd= 'md5sum ' + path
to
cmd = 'md5sum "' + path + '"'
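If calling an external program is not a requirement, the quoting problem can be avoided altogether by hashing in Python. The following is only a minimal sketch under that assumption, not the approach from the question: the names md5_of_file and find_duplicates and the chunk_size parameter are made up for illustration, and it only reports duplicates, it does not delete anything.

```python
import hashlib
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Map each digest seen more than once under root to the sorted list of its paths."""
    by_digest = defaultdict(list)
    # os.walk recurses into subdirectories; paths are never passed through a shell,
    # so spaces in names need no quoting.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_digest[md5_of_file(path)].append(path)
    return {d: sorted(paths) for d, paths in by_digest.items() if len(paths) > 1}
```

Calling find_duplicates(r"E:\offwork") would return a dictionary in which every value is a list of files with identical content; keeping the first entry of each list and reviewing the rest by hand before deleting is the cautious way to use it.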
Upvotes: 1
Reputation: 1408
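You can pass the arguments as a list to subprocess.Popen (os.popen only accepts a command string). Keep the path as a single list element; splitting it on spaces would turn one file name into several arguments.
https://docs.python.org/2/library/subprocess.html
args = ['md5sum', path]
fp = subprocess.Popen(args, stdout=subprocess.PIPE)
print(fp.communicate()[0])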
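To illustrate why an argument list avoids the quoting problem, here is a small sketch. The helper names argv_seen_by_child and md5sum_output are hypothetical, and md5sum_output assumes an md5sum executable is available on PATH, which may not be the case on every Windows machine.

```python
import json
import subprocess
import sys


def argv_seen_by_child(path):
    """Start a child Python process and return the arguments it actually received."""
    child_code = "import sys, json; print(json.dumps(sys.argv[1:]))"
    # Each list element is handed to the child as one argument; no shell parses it.
    output = subprocess.check_output([sys.executable, "-c", child_code, path])
    return json.loads(output.decode())


def md5sum_output(path):
    """Return the raw text printed by md5sum for one file.

    Assumption: an md5sum executable can be found on PATH.
    """
    return subprocess.check_output(["md5sum", path]).decode()
```

argv_seen_by_child(r"E:\offwork\my song.mp3") returns a list with exactly one element, the full path, which shows that the space did not split the argument.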
Upvotes: 0