Mints

Reputation: 113

Python: How to segregate files by name?

I'm trying to group files by name so that I can perform an operation on each group. For example, if I have the following files:

name_a_1
name_a_2
name_a_3
name_b_4
name_b_5
name_b_6

I would like to work with group a first. When that operation is done, I want to run the same operation on group b, and so on. Any suggestions on how this can be approached?

Upvotes: 1

Views: 184

Answers (3)

user1277476

Reputation: 2909

A solution using defaultdict:

#!/usr/local/cpython-3.9/bin/python3

"""Split strings and aggregate."""

import collections

list_ = [
    'name_b_5',
    'name_a_1',
    'name_b_4',
    'name_b_6',
    'name_a_2',
    'name_a_3',
]

dict_ = collections.defaultdict(list)


def get_2nd(string):
    """Get the second _ delimited field."""
    return string.split('_')[1]


for string in list_:
    dict_[get_2nd(string)].append(string)

for key, value in sorted(dict_.items()):
    print(key, ' '.join(sorted(value)))

Upvotes: 0

user3064538


Define a function that extracts the group from a filename, then pass it as the key to both sorted and itertools.groupby. The sort is needed because groupby only groups consecutive items that share a key:

import itertools

filenames = ["name_a_1", "name_a_2", "name_a_3", "name_b_4", "name_b_5", "name_b_6"]

def get_group(filename):
    return filename.split("_")[1]

for group_name, group in itertools.groupby(sorted(filenames, key=get_group), get_group):
    for filename in group:
        print(group_name, filename)  # "a name_a_1" and so on
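
To see why the sort matters, here is a minimal sketch that runs groupby on a shuffled list with and without sorting. The shuffled list is made-up illustrative data, and get_group is the same helper as above.

```python
import itertools


def get_group(filename):
    """Return the second underscore-delimited field, e.g. 'a' for 'name_a_1'."""
    return filename.split("_")[1]


# Deliberately shuffled input (illustrative data, not from a real directory).
shuffled = ["name_b_5", "name_a_1", "name_b_4", "name_a_2"]

# Without sorting, groupby starts a new group every time the key changes,
# so 'a' and 'b' each show up more than once.
unsorted_groups = [
    (key, list(items)) for key, items in itertools.groupby(shuffled, get_group)
]

# Sorting by the same key first places equal keys next to each other,
# so each group is yielded exactly once.
sorted_groups = [
    (key, list(items))
    for key, items in itertools.groupby(sorted(shuffled, key=get_group), get_group)
]

print(unsorted_groups)
print(sorted_groups)
```

Note that sorting by get_group only orders by group; within a group the files keep their original relative order because Python's sort is stable.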

Upvotes: 0

Andrej Kesely

Reputation: 195418

You can group the files into a temporary dictionary and then run an operation on each group. For example:

filenames = [
    'name_a_1',
    'name_a_2',
    'name_a_3',
    'name_b_4',
    'name_b_5',
    'name_b_6'
]

# group the filenames
groups = {}
for f in filenames:
    g = f.split('_')[1]
    groups.setdefault(g, []).append(f)

# groups is now:
# {'a': ['name_a_1', 'name_a_2', 'name_a_3'],
#  'b': ['name_b_4', 'name_b_5', 'name_b_6']}

for grp, items in groups.items():
    # your operation on the files from group `grp`
    for f in items:
        work(f)  # work() is a placeholder for your own per-file function
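
If the names come from a real directory rather than a hard-coded list, the same idea might look roughly like the sketch below. It assumes every file of interest is named prefix_group_number; group_files and process_group are hypothetical names, and the temporary directory only exists so the example can run on its own.

```python
import tempfile
from collections import defaultdict
from pathlib import Path


def group_files(directory):
    """Map each group name to the sorted list of file paths in that group."""
    groups = defaultdict(list)
    for path in sorted(Path(directory).iterdir()):
        parts = path.name.split("_")
        # Skip subdirectories and names that do not match prefix_group_number.
        if path.is_file() and len(parts) >= 3:
            groups[parts[1]].append(path)
    return dict(groups)


def process_group(name, paths):
    """Hypothetical per-group operation: here it just totals the file sizes."""
    return name, sum(p.stat().st_size for p in paths)


# Build a throwaway directory with a few one-byte files for the demo.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["name_a_1", "name_a_2", "name_b_4"]:
        Path(tmp, name).write_text("x")
    # Handle group 'a' completely, then group 'b', and so on.
    results = [
        process_group(name, paths)
        for name, paths in sorted(group_files(tmp).items())
    ]

print(results)
```

Replace process_group with whatever you actually need to do. Iterating over the sorted dictionary items guarantees that one group is finished before the next one starts.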

Upvotes: 2
