Ame

Reputation: 93

Python: find duplicates in a list of lists to edit them

In my Python project I have a list like this:

l1 = [['Bacon', 1995,'394'], ['Bacon', 1995, '39-46'], ['Bacon', 1998,'7'], ['Egg', 1998, '122-165'], ['Chicken', 2009, '10'], ['Shrimp', 1994, '67']]

My goal is to write the list to a Word document. My problem is that when both the name and the year are duplicated in the list (like "Bacon" and 1995), I'd like to add a letter to the year to make the entries distinct. So the result should be:

l1 = [['Bacon', '1995a', '394'], ['Bacon', '1995b', '39-46'], ['Bacon', 1998, '7'], ['Egg', 1998, '122-165'], ['Chicken', 2009, '10'], ['Shrimp', 1994, '67']]

I've tried this:

n = ["a", "b", "c", "d", "e", "f", "g", "h"] # [...]

count = 0
for info in l1:
    if info[0] and info[1] in l1:
        print(f"{info[1]}{a[count]}")
#         para.add_run(info[0])
#         para.add_run(f"{info[1]}{n[count]}")
        count += 1     
    else:
        print(info[1])
#         para.add_run(info[0])
#         para.add_run(info[1])

I also tried to create a new list, like this:

l2 = []
count = 0
for info in l1:
    if info[0] and info[1] in l2:
        l2.append(info[0], f"{info[1]}{a[count]}", info[2])
        count += 1     
    else:
        l2.append((info[0],info[1], info[2]))

But neither is working.

Upvotes: 3

Views: 80

Answers (2)

hiro protagonist

Reputation: 46869

An approach using collections.defaultdict: collect everything first in a nested dict, then re-create the list:

from collections import defaultdict
from string import ascii_letters

l1 = [['Bacon', 1995, '394'], ['Bacon', 1995, '39-46'],
      ['Bacon', 1998, '7'], ['Egg', 1998, '122-165'],
      ['Chicken', 2009, '10'], ['Shrimp', 1994, '67']]

index = defaultdict(lambda: defaultdict(list))

# collect
for food, year, page in l1:
    index[food][year].append(page)

# re-create list:
res = []
for food, rest in index.items():
    for year, pages in rest.items():
        if len(pages) == 1:
            res.append([food, year, pages[0]])
        else:
            for letter, page in zip(ascii_letters, pages):
                res.append([food, f"{year}{letter}", page])
print(res)

After the collect phase, index looks something like this (I replaced the defaultdicts with plain dicts for clarity):

{'Bacon': {1995: ['394', '39-46'], 1998: ['7']},
 'Chicken': {2009: ['10']},
 'Egg': {1998: ['122-165']},
 'Shrimp': {1994: ['67']}}

It outputs:

[['Bacon', '1995a', '394'], ['Bacon', '1995b', '39-46'],
 ['Bacon', 1998, '7'], ['Egg', 1998, '122-165'],
 ['Chicken', 2009, '10'], ['Shrimp', 1994, '67']]

You might want to convert the years without a letter to strings as well, so that the types in your list are consistent.
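
A minimal sketch of that suggestion, assuming every year should end up as a string. The function name label_years is hypothetical (it does not appear in the code above), and ascii_lowercase is swapped in for ascii_letters so the suffixes stay lowercase:

```python
from collections import defaultdict
from string import ascii_lowercase


def label_years(rows):
    """Return new rows where every year is a string.

    Years whose (name, year) pair occurs more than once get a letter suffix.
    label_years is a hypothetical helper name used only for this sketch.
    """
    # collect: name -> year -> list of page strings, in first-seen order
    index = defaultdict(lambda: defaultdict(list))
    for name, year, pages in rows:
        index[name][year].append(pages)

    # re-create the list, grouped by name and then by year
    result = []
    for name, years in index.items():
        for year, page_list in years.items():
            if len(page_list) == 1:
                # unique pair: no suffix, but still convert the year to str
                result.append([name, str(year), page_list[0]])
            else:
                # note: zip stops after 26 letters, so this sketch assumes
                # at most 26 entries per (name, year) pair
                for letter, pages in zip(ascii_lowercase, page_list):
                    result.append([name, f"{year}{letter}", pages])
    return result


l1 = [['Bacon', 1995, '394'], ['Bacon', 1995, '39-46'],
      ['Bacon', 1998, '7'], ['Egg', 1998, '122-165'],
      ['Chicken', 2009, '10'], ['Shrimp', 1994, '67']]

res = label_years(l1)
print(res)
```

Because the rows are regrouped by name and year, entries that are not adjacent in the input end up next to each other in the result. For the sample list the order is unchanged.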

Upvotes: 2

Amir

Reputation: 2031

You first need to go over the list and record which (name, year) pairs occur more than once, then increment a counter each time you encounter one of those pairs:

from collections import defaultdict

l1 = [['Bacon', 1995,'394'], ['Bacon', 1995, '39-46'], ['Bacon', 1998,'7'], ['Egg', 1998, '122-165'], ['Chicken', 2009, '10'], ['Shrimp', 1994, '67']]
# count how many times each (name, year) pair occurs
count_duplicates = defaultdict(int)
for x in l1:
    count_duplicates[(x[0], x[1])] += 1
# save only duplicates and set counter to 0
count_duplicates = {k: 0 for k, v in count_duplicates.items() if v > 1}
for x in l1:
    key = (x[0], x[1])
    if key in count_duplicates:
        print(x[0], str(x[1]) + 'abcdefgh'[count_duplicates[key]], x[2])
        count_duplicates[key] += 1
    else:
        print(x[0], x[1], x[2])
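
If you want a new list (like the l2 in the question) rather than printed lines, the same two-pass idea could be sketched as follows. The function name suffix_duplicates is hypothetical, and collections.Counter is used here in place of defaultdict(int); unique years are left as integers, as in the question's expected result:

```python
from collections import Counter
from string import ascii_lowercase


def suffix_duplicates(rows):
    """Return a new list; duplicated (name, year) pairs get 'a', 'b', ... appended.

    suffix_duplicates is a hypothetical helper name used only for this sketch.
    """
    # first pass: how often does each (name, year) pair occur?
    totals = Counter((name, year) for name, year, _ in rows)

    # second pass: keep the input order and hand out letters per pair
    seen = Counter()
    result = []
    for name, year, pages in rows:
        key = (name, year)
        if totals[key] > 1:
            # raises IndexError beyond 26 duplicates of the same pair
            letter = ascii_lowercase[seen[key]]
            seen[key] += 1
            result.append([name, f"{year}{letter}", pages])
        else:
            result.append([name, year, pages])
    return result


l1 = [['Bacon', 1995, '394'], ['Bacon', 1995, '39-46'],
      ['Bacon', 1998, '7'], ['Egg', 1998, '122-165'],
      ['Chicken', 2009, '10'], ['Shrimp', 1994, '67']]

l2 = suffix_duplicates(l1)
print(l2)
```

This keeps the rows in their original order even when the duplicates are not next to each other, and l1 itself is not modified.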

Upvotes: 1
