nic

Reputation: 115

Join lists without redundant elements in a list [python]

I'm having trouble removing duplicate topic words from a List[Tuple[Union[bytes, str], Union[dict, dict]]].

Here is a sample of the list:

analyzed_comments = [('setup.py', {'Topic_0': ['version', 'get'], 'Topic_1': ['version', 'get']}),
    ('translation.py', {'Topic_0': ['multiline', 'pattern', 'skip'], 'Topic_1': ['multiline', 'concat', 'text']})]

I would like a resulting list that stores each file name together with the words of all its topics, without duplicates, something like:

comment_topics = [('setup.py', ['version', 'get']), 
                    ('translation.py', ['multiline', 'pattern', 'skip', 'concat', 'text'])]

This is what I wrote, but it doesn't seem to do this job well:

comment_topics = list()
temp_comments = list()
for file, comment in analyzed_comments:
    for topic in comment:
        elem = body[topic]
        temp_comments = list(set(elem + temp_comments))

    tupla = (file, temp_comments)
    comment_topics.append(tupla)
print(comment_topics)

Do you have any ideas?

Upvotes: 2

Views: 110

Answers (1)

JANO

Reputation: 3066

My idea would be to iterate over the files and simply combine all topics. In the end we create a set from the topics to remove all duplicates:

bodies = [('setup.py', {'Topic_0': ['version', 'get'], 'Topic_1': ['version', 'get']}), 
    ('translation.py', {'Topic_0': ['multiline', 'pattern', 'skip'], 'Topic_1': ['multiline', 'concat', 'text']})]


result = {}
for x in bodies:
    values = []
    for v in x[1].values():
        values.extend(v)
    result[x[0]] = list(set(values))
    
result

Output (a set has no defined order, so the order of the words may differ between runs):

{'setup.py': ['version', 'get'],
 'translation.py': ['multiline', 'pattern', 'text', 'skip', 'concat']}
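For comparison, the loop from the question can be made to work with two small changes. This is only a sketch based on the code as posted: `body` is never defined there, so I am assuming `comment[topic]` was intended, and `temp_comments` has to be reset for every file, otherwise the words of earlier files leak into later ones.

```python
analyzed_comments = [
    ('setup.py', {'Topic_0': ['version', 'get'], 'Topic_1': ['version', 'get']}),
    ('translation.py', {'Topic_0': ['multiline', 'pattern', 'skip'],
                        'Topic_1': ['multiline', 'concat', 'text']}),
]

comment_topics = []
for file, comment in analyzed_comments:
    temp_comments = []  # reset for every file so words do not accumulate across files
    for topic in comment:
        elem = comment[topic]  # assumption: the question's `body` was meant to be `comment`
        temp_comments = list(set(elem + temp_comments))  # merge and drop duplicates
    comment_topics.append((file, temp_comments))

print(comment_topics)
```

As with the dictionary version, the words inside each list come out of a set, so their order is not guaranteed.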

You can also do it in one line:

{k: list(set(sum(v.values(),[]))) for k, v in bodies}

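or with itertools, which should be faster than sum() in most cases because sum() builds a new list at every step while chain.from_iterable() walks each list only once: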

import itertools
result = {k: list(set(itertools.chain.from_iterable(v.values()))) for k, v in bodies}
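The question asks for a list of tuples with the words in their original order, which a set cannot guarantee. One possible variant, assuming Python 3.7+ (where dicts keep insertion order), is to deduplicate with `dict.fromkeys` instead of `set`. The names `name` and `topics` are just illustrative.

```python
from itertools import chain

bodies = [
    ('setup.py', {'Topic_0': ['version', 'get'], 'Topic_1': ['version', 'get']}),
    ('translation.py', {'Topic_0': ['multiline', 'pattern', 'skip'],
                        'Topic_1': ['multiline', 'concat', 'text']}),
]

# dict.fromkeys keeps only the first occurrence of each word and preserves
# the order in which the words were first seen.
comment_topics = [
    (name, list(dict.fromkeys(chain.from_iterable(topics.values()))))
    for name, topics in bodies
]

print(comment_topics)
# [('setup.py', ['version', 'get']),
#  ('translation.py', ['multiline', 'pattern', 'skip', 'concat', 'text'])]
```

This gives exactly the structure shown in the question, at the cost of building a temporary dict per file.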

Upvotes: 2
