Pain

Reputation: 127

How to avoid adding duplicate elements to a list?

How can I add each file path to groups_of_files only once, avoiding the duplicates caused by my nested loop?

for file in files_names:
    for name_group, formats in groups_of_format.items():
        if file.split('.')[-1].upper() in groups_of_format.values():
            groups_of_files[groups_of_format.keys()].append(file)

Upvotes: 1

Views: 65

Answers (3)

Adon Bilivit

Reputation: 27120

Build a dictionary keyed on the filename extension, where each value is a set of file names.

Subsequently, build the required dictionary by converting the sets to lists as follows:

import os

temp = dict()

files_names = ['a.txt', 'b.txt', 'b.txt', 'c.py', 'e.txt', 'f.py']

for file in files_names:
    _, ext = os.path.splitext(file)
    temp.setdefault(ext.upper()[1:], set()).add(file)

groups_of_files = {k: list(v) for k, v in temp.items()}

print(groups_of_files)

Output (the order within each list may differ, because sets are unordered):

{'TXT': ['e.txt', 'b.txt', 'a.txt'], 'PY': ['c.py', 'f.py']}
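If first-seen order matters, one possible variation (not part of the approach above, just a sketch) is to swap the set for a plain dict used as an ordered set. This relies on dicts preserving insertion order, which is guaranteed from Python 3.7 onward. The sample data is the same as above.

```python
import os

files_names = ['a.txt', 'b.txt', 'b.txt', 'c.py', 'e.txt', 'f.py']

# Each inner dict acts as an insertion-ordered set:
# keys are unique and keep the order in which they were first seen.
temp = {}
for file in files_names:
    _, ext = os.path.splitext(file)
    temp.setdefault(ext.upper()[1:], {})[file] = None

groups_of_files = {k: list(v) for k, v in temp.items()}

print(groups_of_files)
# {'TXT': ['a.txt', 'b.txt', 'e.txt'], 'PY': ['c.py', 'f.py']}
```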

Upvotes: 1

Riccardo Petraglia

Reputation: 2003

Use sets instead of lists. Elements in a set are kept unique by means of their hash.

Something like:

from collections import defaultdict

groups_of_files = defaultdict(set)
for file in files_names:
    for name_group, formats in groups_of_format.items():
        # Check the extension against this group's formats and use the group name as the key.
        if file.split('.')[-1].upper() in formats:
            groups_of_files[name_group].add(file)

I assumed that groups_of_files is a dictionary. With defaultdict(set), accessing a missing key does not raise a KeyError; instead the key is created with an empty set as its value, to which you can add your file. If file is of a custom type, make sure it defines the __hash__ and __eq__ methods.

If you need a list in the end, you can convert a set to a list by passing the set to list().
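As a small sketch of that conversion step: list() returns the elements in arbitrary order, so sorted() may be preferable when a reproducible order is wanted. The group names and file names below are made up for illustration.

```python
# Hypothetical result of the grouping step: group name -> set of file names.
groups_of_files = {
    'documents': {'b.txt', 'a.txt', 'notes.md'},
    'code': {'c.py'},
}

# sorted() accepts any iterable and always returns a new list.
as_lists = {name: sorted(files) for name, files in groups_of_files.items()}

print(as_lists)
# {'documents': ['a.txt', 'b.txt', 'notes.md'], 'code': ['c.py']}
```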

Upvotes: 3

Razvan I.

Reputation: 239

You can use a set to keep track of the files that have already been added to the groups_of_files list.

added_files = set()
for file in files_names:
    for name_group, formats in groups_of_format.items():
        if file.split('.')[-1].upper() in formats and file not in added_files:
            groups_of_files[name_group].append(file)
            added_files.add(file)
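For reference, here is the same loop as a self-contained sketch. The contents of groups_of_format and files_names are assumptions, since the question does not show them, and groups_of_files is initialised with one empty list per group so that append() never hits a missing key.

```python
# Hypothetical sample data; the question does not show the real values.
groups_of_format = {'documents': ['TXT', 'MD'], 'code': ['PY']}
files_names = ['a.txt', 'b.txt', 'b.txt', 'c.py', 'a.txt']

# One empty list per group, created up front.
groups_of_files = {name_group: [] for name_group in groups_of_format}

added_files = set()
for file in files_names:
    for name_group, formats in groups_of_format.items():
        if file.split('.')[-1].upper() in formats and file not in added_files:
            groups_of_files[name_group].append(file)
            added_files.add(file)

print(groups_of_files)
# {'documents': ['a.txt', 'b.txt'], 'code': ['c.py']}
```

Because added_files is shared across all groups, a file whose extension is listed in more than one group ends up only in the first matching group.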

Upvotes: 2
