Maria Georgali

Reputation: 659

Cannot remove duplicates from a list using Python

I have a CSV file that I want to edit, so I read the file and copy its contents into a list. The list contains duplicates, so I do:

csv_in = list(set(csv_in))

But I get:

TypeError: unhashable type: 'list'

import csv

with open(source_initial2, 'r', encoding='ISO-8859-1') as file_in, \
        open(source_initial3, 'w', encoding='ISO-8859-1', newline='') as file_out:
    csv_in = csv.reader(file_in, delimiter=',')
    csv_out = csv.writer(file_out, delimiter=';')
    csv_in = list(set(csv_in))  # this line raises the error

    for row in csv_in:
        for i in range(len(row)):
            if "/" in row[i]:
                row[i] = row[i].replace('/', '')
            if "\"" in row[i]:
                row[i] = row[i].replace('\"', '')
            if "Yes" in row[i]:
                row[i] = row[i].replace('Yes', '1')
            if "No" in row[i]:
                row[i] = row[i].replace('No', '0')
            if myrowlen > 5:  # myrowlen: definition not shown
                break
        print(row)
        csv_out.writerow(row)

The list is something like

[['DCA.P/C.05820', '5707119001793', 'P/C STEELSERIES SUR... QcK MINI', '5,4', 'Yes'],['DCA.P/C.05820', '5707119001793', 'P/C STEELSERIES SUR... QcK MINI', '5,4', 'Yes'].....['DCA.P/C.05820', '5707119001793', 'P/C STEELSERIES SUR... QcK MINI', '5,4', 'Yes']]

Why do I get this error, and how can I solve it? Thank you.

Upvotes: 0

Views: 78

Answers (2)

RomanPerekhrest

Reputation: 92874

csv.reader returns each row read from the CSV file as a list of strings.

A set requires its items to be hashable. Immutable built-in types such as str and tuple (with hashable elements) qualify; list is mutable and therefore not hashable.

test_reader = [[0,1,2], [3,4,5]]
print(set(test_reader))  # raises TypeError: unhashable type: 'list'

# after casting to tuple type
test_reader = [(0,1,2), (3,4,5)]
print(set(test_reader))   # {(0, 1, 2), (3, 4, 5)}
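Applied to a csv.reader, the same conversion might look like the sketch below. It assumes an in-memory io.StringIO in place of the question's file, and the sample rows are made up for illustration.

```python
import csv
import io

# Hypothetical sample data standing in for the question's CSV file.
sample = "a,1,Yes\nb,2,No\na,1,Yes\n"

reader = csv.reader(io.StringIO(sample), delimiter=',')

# Each row from csv.reader is a list; tuple(row) makes it hashable,
# so the set can drop the repeated row.
unique_rows = set(tuple(row) for row in reader)

print(len(unique_rows))  # 2
```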

Upvotes: 1

matevzpoljanc

Reputation: 211

The problem is that csv_in yields lists, and list is not a hashable data type. To get around this, convert each row to a tuple first:

csv_in = list(set(tuple(row) for row in csv_in))

or, if you need a list of lists:

csv_in = [list(row) for row in set(tuple(row) for row in csv_in)]
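Note that a set does not keep the original row order, which may matter if the file has a header row. One possible order-preserving variant, sketched here with a hypothetical helper name dedupe_rows and made-up sample rows, relies on dict keys being unique and keeping insertion order (guaranteed since Python 3.7):

```python
def dedupe_rows(rows):
    """Return rows without duplicates, keeping the first occurrence of each."""
    # dict.fromkeys keeps one key per distinct tuple, in insertion order.
    unique = dict.fromkeys(tuple(row) for row in rows)
    # Convert back to lists so the rows can still be edited in place.
    return [list(row) for row in unique]


# Made-up sample rows for illustration.
rows = [['a', '1'], ['b', '2'], ['a', '1']]
print(dedupe_rows(rows))  # [['a', '1'], ['b', '2']]
```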

Upvotes: 3
