yayu

Reputation: 8088

Creating list of lists without duplication

I have a data structure with a lot of duplication. To build a filtered list of all the unique types, I do

type_ids = []
for record in file:
    type_id = record['type_id']
    if type_id not in type_ids:
        type_ids.append(type_id)

and I get something like type_ids = ['A', 'B', 'G']. Now I want a descriptive name for each type along with its id, in a structure like types = [ ['A','Alpha'], ['B','Beta'], ['G','Gamma'] ]. I tried

types = []
for record in file:
    type_id = record['type_id']
    type_name = record['type_name']
    if [type_id,type_name] not in types:
        types.append([type_id,type_name])

I get a list but with a lot of duplication and not all types represented. What is wrong in this code?

Upvotes: 0

Views: 77

Answers (2)

jayelm

Reputation: 7657

In your original code, the condition in your if statement is always true, so the append runs for every record, which is probably what causes the repetition. type_ids is a list of strings, but your if statement tests whether a list is a member of it, and type_ids contains no lists of the form [type_id, type_name]. I'm not sure whether you meant to test membership in the existing type_ids or in the types list you're building.

Rather, you want something like this:

types = []
for record in file:
    type_id = record['type_id']  # Assuming these two lines get the data correctly
    type_name = record['type_name']
    if [type_id, type_name] not in types:  # e.g. ['A', 'Alpha'] not in [['B', 'Beta']]
        # OR test type_id against a list of IDs seen so far, appending to it here too
        types.append([type_id, type_name])  # append takes a single argument

However, I'd recommend storing this information in a dictionary, which is designed for associated key-value pairs:

types = {}
for record in file:
    type_id = record['type_id']
    type_name = record['type_name']
    if type_id not in types:  # checks the dictionary's keys
        types[type_id] = type_name
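
If you still need the list-of-lists shape afterwards, here is a rough sketch of the dictionary approach end to end. The records list is made-up sample data standing in for your file, and I'm assuming each record is a dict with 'type_id' and 'type_name' keys. The resulting order matches first appearance only on Python 3.7+, where dicts keep insertion order.

```python
# Hypothetical sample data standing in for `file`
records = [
    {'type_id': 'A', 'type_name': 'Alpha'},
    {'type_id': 'B', 'type_name': 'Beta'},
    {'type_id': 'A', 'type_name': 'Alpha'},
    {'type_id': 'G', 'type_name': 'Gamma'},
    {'type_id': 'B', 'type_name': 'Beta'},
]

types = {}
for record in records:
    # setdefault only stores the name the first time an ID is seen
    types.setdefault(record['type_id'], record['type_name'])

# Convert back to a list of [id, name] pairs if that shape is needed
type_list = [[type_id, type_name] for type_id, type_name in types.items()]
print(type_list)  # [['A', 'Alpha'], ['B', 'Beta'], ['G', 'Gamma']]
```

Looking up a key in a dictionary is also much cheaper than scanning a list with "in", which matters if the file is large.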

Upvotes: 1

kwarrick

Reputation: 6190

types = set((r['type_id'], r['type_name']) for r in file)

Python has a built-in set type, an unordered collection of unique elements. You can create a set of unique (type_id, type_name) tuples with this one line.
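
A small sketch of how that behaves, using made-up sample records in place of file (assumed to be dicts with 'type_id' and 'type_name' keys). Tuples are used rather than lists because set elements must be hashable, and lists are not. Since a set has no defined order, the example sorts it before converting back to a list of lists.

```python
# Hypothetical sample data standing in for `file`
records = [
    {'type_id': 'B', 'type_name': 'Beta'},
    {'type_id': 'A', 'type_name': 'Alpha'},
    {'type_id': 'B', 'type_name': 'Beta'},
    {'type_id': 'G', 'type_name': 'Gamma'},
    {'type_id': 'A', 'type_name': 'Alpha'},
]

# Duplicate (type_id, type_name) tuples collapse into a single element
types = set((r['type_id'], r['type_name']) for r in records)

# Sort for a stable order, then turn each tuple into a list
type_list = [list(pair) for pair in sorted(types)]
print(type_list)  # [['A', 'Alpha'], ['B', 'Beta'], ['G', 'Gamma']]
```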

Upvotes: 1
