Reputation: 1867
I wrote a Python script using glob (https://docs.python.org/2/library/glob.html). It pairs each JPG image with its matching XML annotation and moves both to a different folder. For example, given 1.jpg, 2.jpg, 3.jpg, 1.xml and 3.xml, it moves (1.jpg, 1.xml) and (3.jpg, 3.xml) to a new folder. 2.jpg is not moved because there is no XML file matching this image.
import os
import glob
import os.path
import shutil
path = os.getcwd()
j=0
os.chdir("path\\to\\folder\\")
os.mkdir("image_with_xml") # create a new folder
newpath = "path\\to\\folder\\"+"image_with_xml"
while j < len(glob.glob(path+"\\*"))-1:
    a=glob.glob(path+"\\*")[j]
    b=glob.glob(path+"\\*")[j+1]
    print(a)
    a1 = os.path.splitext(a)[0]
    b1 = os.path.splitext(b)[0]
    if a1==b1:
        j=j+2
        shutil.move(a,newpath) # move image to new path.
        shutil.move(b,newpath) # move image to new path.
    else:
        j=j+1
The above code moves some of the images to the new folder, but not all of them. To move the remaining images I have to change the folder name inside the script and run it again, and the remaining images are then moved there. For example, say I have 100 JPGs with 100 matching XMLs: the first time I run the script only 62 are moved to the new folder, and the second time I run it with a different folder name the remaining 38 are moved to that second folder. How do I modify the script so that all images with a matching XML are moved to one folder in a single run?
Upvotes: 1
Views: 2067
Reputation: 5177
This one should do the job. I created two lists, one of the XMLs and one of the JPGs. Then I check whether a filename exists in both lists. If yes: move!
For readability, I added a new function to create the lists.
import os
import glob
import shutil
def remove_ext(list_of_pathnames):
    """
    removes the extension from each filename
    """
    return [os.path.splitext(filename)[0] for filename in list_of_pathnames]
path = os.getcwd() # folder the script is started from; the files are searched here
os.chdir("path\\to\\folder\\")
os.mkdir("image_with_xml") # create a new folder
newpath = os.path.join("path\\to\\folder\\","image_with_xml") # made it os independent...
# both lists are built once, before anything is moved
list_of_jpgs = glob.glob(os.path.join(path, "*.jpg"))
list_of_xmls = glob.glob(os.path.join(path, "*.xml"))
print(list_of_jpgs, "\n\n", list_of_xmls) #remove
jpgs_without_extension = remove_ext(list_of_jpgs)
xmls_without_extension = remove_ext(list_of_xmls)
print(jpgs_without_extension, "\n\n", xmls_without_extension) #remove
for filename in jpgs_without_extension:
    if filename in xmls_without_extension:
        print("moving", filename) #remove
        shutil.move(filename + '.jpg', newpath) # move image to new path.
        shutil.move(filename + '.xml', newpath) # move annotation to new path.
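One reason the loop in the question leaves files behind is that it calls glob again on every iteration while also advancing j: after a pair is moved, the remaining entries shift down in the fresh listing, so the index jumps past files that were never examined. Taking a single snapshot of the directory before moving anything avoids this. The same idea can be sketched with pathlib as below; the function name move_matched_pairs and its two parameters are made up for this illustration, and it assumes the extensions are lowercase .jpg and .xml (pattern matching is case-sensitive on some platforms).

```python
import shutil
from pathlib import Path


def move_matched_pairs(src_dir, dest_dir):
    """Move each .jpg in src_dir that has a same-named .xml, plus that .xml, to dest_dir.

    Hypothetical helper for illustration; returns the sorted list of moved base names.
    """
    src = Path(src_dir)
    dest = Path(dest_dir)
    # Create the target folder; do not fail if it is already there.
    dest.mkdir(parents=True, exist_ok=True)

    # Snapshot the directory once, keyed by file name without extension.
    jpgs = {p.stem: p for p in src.glob("*.jpg")}
    xmls = {p.stem: p for p in src.glob("*.xml")}

    moved = []
    # Only base names present in both snapshots form a pair.
    for stem in sorted(jpgs.keys() & xmls.keys()):
        shutil.move(str(jpgs[stem]), str(dest / jpgs[stem].name))
        shutil.move(str(xmls[stem]), str(dest / xmls[stem].name))
        moved.append(stem)
    return moved


# Example call (placeholder paths, as in the question):
# move_matched_pairs("path\\to\\folder", "path\\to\\folder\\image_with_xml")
```

Because the target folder is created with exist_ok=True, the script can be rerun without changing the folder name. Images without an annotation, such as 2.jpg in the question's example, stay where they are.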
Upvotes: 1