user20139344

Reputation: 3

Remove Duplicates On a Single Line From a Text File

I have a text file in which some lines contain repeated words. I need to count the colors that appear on each line, ignoring duplicates within a line (for example, Red Red Blue Green should be treated as Red Blue Green). I believe this can be done with sets, but I'm new to Python and am having trouble figuring out how.

Here is the code I have so far:

"""Counts color names in the slot machine file without duplicates."""


def main():
    data_directory_name = 'data'
    infile_name = input('Please enter the input filename: ')
    infile_path_and_name = f'{data_directory_name}/{infile_name}'
    infile = open(infile_path_and_name, 'r')
    color_count = {}

    for line in infile:
        slot_values = line.split()
        for slot_value in slot_values:
            color_count[slot_value] = color_count.get(slot_value, 0) + 1

    infile.close()

    these_keys = list(color_count.keys())
    these_keys.sort()

    print()
    print(f'{"COLOR":<10}{"COUNT":>7}')
    for this_key in these_keys:
        print(f'{this_key:<10}{color_count.get(this_key):>7,}')


main()

Upvotes: 0

Views: 33

Answers (2)

Mark Reed

Reputation: 95242

Just use set:

slot_values = set(line.split())

This assumes you want to remove duplicates within each line before counting. If you need the result to be a list, wrap the set in list(...).
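As a quick illustration, here is a minimal sketch using the sample line from the question. The variable names are made up for this example. One thing to watch for: a set has no guaranteed order, so if order matters you could sort the result, or use dict.fromkeys to keep first-seen order.

```python
line = "Red Red Blue Green"

# A set keeps one copy of each word; its iteration order is arbitrary.
unique = set(line.split())

# Two ways to get a list back (names here are illustrative only):
as_sorted_list = sorted(unique)                       # alphabetical order
as_ordered_list = list(dict.fromkeys(line.split()))   # first-seen order

print(as_sorted_list)
print(as_ordered_list)
```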

Also, for things like opening and reading a file where you have to remember to close it at the end, it's better to use a context manager:

with open(infile_path_and_name, 'r') as infile:
    # ... do stuff with infile here ...

# ... when you leave the with block (dedent), the file is closed for you
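Putting the two suggestions together, the counting loop from the question might look roughly like the sketch below. The function name count_colors is hypothetical, and io.StringIO stands in for the real file so the example runs without one; with an actual file you would pass the object returned by open(...) instead.

```python
import io


def count_colors(lines):
    """Count, for each color, how many lines it appears on."""
    color_count = {}
    for line in lines:
        # set() drops repeated words within this one line
        for slot_value in set(line.split()):
            color_count[slot_value] = color_count.get(slot_value, 0) + 1
    return color_count


# Assumption: in-memory sample data in place of the question's input file
sample = io.StringIO("Red Red Blue Green\nRed Blue Green\nRed Blue Green Green\n")
with sample as infile:
    counts = count_colors(infile)

for color in sorted(counts):
    print(f'{color:<10}{counts[color]:>7,}')
```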

Upvotes: 0

BeRT2me

Reputation: 13242

Given: text.txt

Red Red Blue Green
Red Blue Green
Red Blue Green Green

Doing:

from collections import Counter

color_count = Counter()
with open('text.txt') as file:
    for line in file:
        color_count.update(set(line.split()))

print(color_count)

Output:

Counter({'Blue': 3, 'Green': 3, 'Red': 3})
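The order of tied entries in the printed Counter can differ between runs, because the iteration order of a set of strings is not fixed. If you want a stable report in the same layout as the question's code, one option is to sort the items before printing. This is only a sketch: the lines list stands in for the file contents, and rows is a name invented for the example.

```python
from collections import Counter

# Assumption: sample lines in place of reading text.txt
lines = ["Red Red Blue Green", "Red Blue Green", "Red Blue Green Green"]

color_count = Counter()
for line in lines:
    # Count each color at most once per line
    color_count.update(set(line.split()))

# Sort by color name so the report order is deterministic
rows = [f'{color:<10}{count:>7,}' for color, count in sorted(color_count.items())]

print(f'{"COLOR":<10}{"COUNT":>7}')
print("\n".join(rows))
```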

Upvotes: 1
