Reputation: 127
How can I add unique file paths to the groups_of_files list while avoiding the duplicates caused by the nested loops?
for file in files_names:
    for name_group, formats in groups_of_format.items():
        if file.split('.')[-1].upper() in groups_of_format.values():
            groups_of_files[groups_of_format.keys()].append(file)
Upvotes: 1
Views: 65
Reputation: 27120
Build a dictionary keyed on the filename extensions, with a set as each value.
Then build the required dictionary by converting the sets to lists, as follows:
import os
temp = dict()
files_names = ['a.txt', 'b.txt', 'b.txt', 'c.py', 'e.txt', 'f.py']
for file in files_names:
    _, ext = os.path.splitext(file)
    temp.setdefault(ext.upper()[1:], set()).add(file)
groups_of_files = {k: list(v) for k, v in temp.items()}
print(groups_of_files)
Output (the order within each list may vary, because sets are unordered):
{'TXT': ['e.txt', 'b.txt', 'a.txt'], 'PY': ['c.py', 'f.py']}
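If the original order of the files matters, one possible variation on the set-based approach above is to use a plain dict as an insertion-ordered set. This is only a sketch reusing the same sample list; it relies on dicts preserving insertion order, which Python guarantees from version 3.7 onward.

```python
import os

files_names = ['a.txt', 'b.txt', 'b.txt', 'c.py', 'e.txt', 'f.py']

# Each inner dict acts as an insertion-ordered set: keys are unique and
# keep the order in which they were first added (Python 3.7+).
temp = {}
for file in files_names:
    _, ext = os.path.splitext(file)
    temp.setdefault(ext.upper()[1:], {})[file] = None

# Convert the inner dicts to lists of their keys.
groups_of_files = {k: list(v) for k, v in temp.items()}
print(groups_of_files)
# {'TXT': ['a.txt', 'b.txt', 'e.txt'], 'PY': ['c.py', 'f.py']}
```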
Upvotes: 1
Reputation: 2003
Use sets instead of lists. A set keeps its elements unique by hashing them.
Something like:
from collections import defaultdict

groups_of_files = defaultdict(set)
for file in files_names:
    for name_group, formats in groups_of_format.items():
        # Compare against this group's formats and index by its name
        if file.split('.')[-1].upper() in formats:
            groups_of_files[name_group].add(file)
I assumed that groups_of_files is a dictionary. In the code example, when a key is missing from the dictionary, no exception is raised; instead, the entry is created with an empty set as its value, to which you can add your file. If file is of a custom type, make sure to define the __hash__ and __eq__ methods.
If you need a list in the end, you can convert a set to a list by passing the set to list().
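As a rough illustration of the custom-type case, here is a minimal sketch. The FileEntry class is hypothetical and does not appear in the question; it only shows how __eq__ and __hash__ together let a set treat two objects with the same path as duplicates.

```python
class FileEntry:
    """Hypothetical wrapper around a file path (not from the question)."""

    def __init__(self, path):
        self.path = path

    def __eq__(self, other):
        if not isinstance(other, FileEntry):
            return NotImplemented
        return self.path == other.path

    def __hash__(self):
        # Must agree with __eq__: equal objects need equal hashes.
        return hash(self.path)


# The two 'a.txt' entries compare equal, so the set keeps only one.
unique = {FileEntry('a.txt'), FileEntry('a.txt'), FileEntry('b.txt')}
unique_paths = sorted(entry.path for entry in unique)
print(unique_paths)  # ['a.txt', 'b.txt']

# Converting the set back to a list when a list is needed.
as_list = list(unique)
```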
Upvotes: 3
Reputation: 239
You can use a set to keep track of the files that have already been added to the lists in groups_of_files.
added_files = set()
for file in files_names:
    for name_group, formats in groups_of_format.items():
        if file.split('.')[-1].upper() in formats and file not in added_files:
            groups_of_files[name_group].append(file)
            added_files.add(file)
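To try this end to end, here is a self-contained sketch. The contents of groups_of_format and files_names are made-up sample data (the question does not show them), and groups_of_files is assumed to be a defaultdict(list) so that missing groups are created on first use.

```python
from collections import defaultdict

# Hypothetical sample data; the question does not show these values.
groups_of_format = {'text': ['TXT', 'MD'], 'code': ['PY']}
files_names = ['a.txt', 'b.txt', 'b.txt', 'c.py', 'notes.md', 'c.py']

groups_of_files = defaultdict(list)
added_files = set()
for file in files_names:
    extension = file.split('.')[-1].upper()
    for name_group, formats in groups_of_format.items():
        # Skip files that were already placed in a group.
        if extension in formats and file not in added_files:
            groups_of_files[name_group].append(file)
            added_files.add(file)

print(dict(groups_of_files))
# {'text': ['a.txt', 'b.txt', 'notes.md'], 'code': ['c.py']}
```

Note that with this approach a file whose extension appears in several groups is only added to the first matching group, and the lists keep the order in which files were first seen.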
Upvotes: 2