Jim T

Reputation: 65

Unique lists within a list of lists when those lists contain a list as one of their elements

If I have:

l = [['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3], ['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3],
     ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3], ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3]]

What would be the best way to get:

k = [['98765', 'Einstein, A', 'SFEN', 'SSW 540', 3], ['98764', 'Feynman, R', 'SFEN', 'SSW 564', 3]]

If I try:

uniqinstruct = set(map(tuple, l))

I get TypeError: unhashable type: 'list'. I don't want to remove all layers of nesting, because that would just combine everything into one list:

output = []

def reemovNestings(l):
    for i in l:
        if type(i) == list:
            reemovNestings(i)
        else:
            output.append(i)

reemovNestings(l)
print(sorted(set(output), key=output.index))

Output:

['98765', 'Einstein, A', 'SFEN', 'SSW 540', 3, '98764', 'Feynman, R', 'SSW 564']

If two instructors have the same count (i.e. 3 in this case), only one 3 remains because a set drops duplicates, so I can't split the flat list back into groups of x elements. What would be a good way to preserve that last value?

Upvotes: 0

Views: 97

Answers (4)

jizhihaoSAMA

Reputation: 12672

Use itertools.groupby to group the duplicate rows, and flatten each group with a list comprehension. To preserve the order of the elements, you could use dict.fromkeys().

If you don't mind a rather long list comprehension:

from itertools import groupby

l = [['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3], ['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3],
     ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3], ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3]]

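# For each group of equal neighbouring rows, flatten one level and drop repeated values.
s = [list(dict.fromkeys(e for row in group for item in row
                        for e in (item if isinstance(item, list) else [item])))
     for _, group in groupby(l)]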
print(s)

Result:

[['98765', 'Einstein, A', 'SFEN', 'SSW 540', 3], ['98764', 'Feynman, R', 'SFEN', 'SSW 564', 3]]
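
One caveat worth noting: itertools.groupby only merges runs of equal neighbouring items, so the comprehension above relies on the duplicate rows being adjacent, as they are in the question's data. Here is a minimal sketch of that behaviour, using a hypothetical reordering of the same rows (the names rows, runs and sorted_runs are illustrative, not part of the answer):

```python
from itertools import groupby

# Hypothetical reordering of the question's data: the duplicate is not adjacent.
rows = [
    ['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3],
    ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3],
    ['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3],
]

# groupby starts a new group whenever the value changes, so nothing is merged here.
runs = [key for key, _ in groupby(rows)]
print(len(runs))  # 3

# Sorting first makes equal rows adjacent, at the cost of the original order.
sorted_runs = [key for key, _ in groupby(sorted(rows))]
print(len(sorted_runs))  # 2
```

Sorting like this only works when the rows are mutually comparable, which holds here because each position always holds the same type.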

Upvotes: 1

dmitryro

Reputation: 3506

Here's another possible solution.

First, flatten the original list:

def flatten(s):
    if s == []:
        return s
    if isinstance(s[0], list):
        return flatten(s[0]) + flatten(s[1:])
    return s[:1] + flatten(s[1:])

Your original input:

l = [['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3], 
     ['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3],
     ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3], 
     ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3]]

Now flatten it and iterate over the result in chunks:

final = [] # the resulting list

items = flatten(l) # first flatten the input

# Take every 5 elements (each record flattens to 5 values)
# and add them to the final list if they are not there yet.
for i in range(0, len(items), 5):
    chunk = items[i:i+5]
    if chunk not in final:
        final.append(chunk)

# Prints [['98765', 'Einstein, A', 'SFEN', 'SSW 540', 3],
#         ['98764', 'Feynman, R', 'SFEN', 'SSW 564', 3]]
print(final)
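
The recursive flatten above calls itself once per element (through flatten(s[1:])), so a long input can hit Python's recursion limit. As a rough alternative, here is a generator-based sketch that only recurses once per nesting level. The names iter_flat and unique_chunks are hypothetical, and the sketch assumes every record flattens to the same number of values:

```python
def iter_flat(items):
    """Yield the leaves of arbitrarily nested lists, left to right."""
    for item in items:
        if isinstance(item, list):
            yield from iter_flat(item)  # recursion depth follows nesting depth only
        else:
            yield item


def unique_chunks(nested, size):
    """Flatten `nested`, cut it into `size`-long chunks, keep first occurrences."""
    flat = list(iter_flat(nested))
    result = []
    for start in range(0, len(flat), size):
        chunk = flat[start:start + size]
        if chunk not in result:
            result.append(chunk)
    return result


data = [['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3],
        ['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3],
        ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3],
        ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3]]

print(unique_chunks(data, 5))
```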

Upvotes: 0

Muslimbek Abduganiev

Reputation: 941

Given that you know which layer you want to unwrap, you could just iterate through that layer. In your particular example, it's the second layer:

res = []
for inner_list in l:
    inner = []
    for el in inner_list:
        if isinstance(el, list):
            inner.extend(el)  # unwrap the nested list into the row
        else:
            inner.append(el)
    if inner not in res:  # keep only the first occurrence of each row
        res.append(inner)

Note that list.extend adds each element of the nested list to inner individually, whereas append would add the nested list as a single element.

The check "if inner not in res" keeps only unique rows in the top layer. Thanks to @dmitryro for the tip.
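
The membership test against res scans the whole result list for every row, which is fine for a handful of rows. For larger inputs, one possible variant tracks the rows already seen in a set of tuples. This is only a sketch: the function name unique_flat_rows is hypothetical, and it assumes every flattened value is hashable (strings and ints are):

```python
def unique_flat_rows(rows):
    """Unwrap one level of nesting per row and keep the first copy of each row."""
    seen = set()
    result = []
    for row in rows:
        flat = []
        for el in row:
            if isinstance(el, list):
                flat.extend(el)
            else:
                flat.append(el)
        key = tuple(flat)  # a tuple of hashable values can be stored in a set
        if key not in seen:
            seen.add(key)
            result.append(flat)
    return result


records = [['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3],
           ['98765', ['Einstein, A', 'SFEN'], 'SSW 540', 3],
           ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3],
           ['98764', ['Feynman, R', 'SFEN'], 'SSW 564', 3]]

print(unique_flat_rows(records))
```

Unlike a groupby-based approach, this does not need the duplicates to be adjacent, and it preserves the order of first appearance.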

Upvotes: 2

Canasta

Reputation: 228

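Use numpy.unique on the list of flattened rows (called flattened below, one flat list per record):

import numpy as np

np.unique(flattened, axis=0)

Note that np.unique returns the unique rows in sorted order rather than in their original order, and that NumPy converts the mixed str/int values to a common string type.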

Upvotes: 0
