Reputation: 87
I have a list of directory paths that I am trying to reduce to the smallest list possible. For example:
bionic/main/installer-amd64/20101020ubuntu543/images/hd-media
bionic/main/installer-amd64/20101020ubuntu543/images/cdrom/xen
bionic/main/installer-amd64/20101020ubuntu543/images/cdrom
bionic/main/installer-amd64/20101020ubuntu543/images
bionic/main/installer-amd64/20101020ubuntu543
I can get rid of bionic/main/installer-amd64/20101020ubuntu543
because it can collapse into bionic/main/installer-amd64/20101020ubuntu543/images
which I can also get rid of because it can collapse into bionic/main/installer-amd64/20101020ubuntu543/images/cdrom
which then collapses into bionic/main/installer-amd64/20101020ubuntu543/images/cdrom/xen
I know there is a way to do this, but I can't figure it out for the life of me.
Upvotes: 2
Views: 78
Reputation: 2569
There is a simpler solution: If you sort the list, it is enough to compare every dir with the next one in the list:
dirs = sorted(dirs)
result = []
for i, d in enumerate(dirs):
if i == len(dirs) - 1 or not dirs[i+1].startswith(d): # unique
result.append(d)
for d in result:
print(d)
or, as a list comprehension:
result = [
d for i, d in enumerate(sorted(dirs))
if i == len(dirs) - 1 or not dirs[i+1].startswith(d)
]
Upvotes: 1
Reputation: 82929
If I understand your question correctly, you could put the directories into a tree structure and then get the leaf nodes from that tree, which are the directories that are not a parent to another directory in the list.
dirs = ["bionic/main/installer-amd64/20101020ubuntu543/images/hd-media",
"bionic/main/installer-amd64/20101020ubuntu543/images/cdrom/xen",
"bionic/main/installer-amd64/20101020ubuntu543/images/cdrom",
"bionic/main/installer-amd64/20101020ubuntu543/images",
"bionic/main/installer-amd64/20101020ubuntu543"]
tree = {}
for d in dirs:
t = tree
for x in d.split("/"):
t = t.setdefault(x, {})
# {'bionic': {'main': {'installer-amd64': {'20101020ubuntu543': {'images': {'hd-media': {}, 'cdrom': {'xen': {}}}}}}}}
def leafs(tree):
for x in tree:
if tree[x]:
for l in leafs(tree[x]):
yield x + "/" + l
else:
yield x
for l in leafs(tree):
print(l)
# bionic/main/installer-amd64/20101020ubuntu543/images/hd-media
# bionic/main/installer-amd64/20101020ubuntu543/images/cdrom/xen
Upvotes: 1