goldisfine

Reputation: 4850

Error: list has no add() attribute?

I'm trying to delete duplicate entries from data that look like this:

name    phone   email   website
Diane Grant Albrecht M.S.           
Lannister G. Cersei M.A.T., CEP 111-222-3333    [email protected]  www.got.com
Argle D. Bargle Ed.M.           
Sam D. Man Ed.M.    000-000-1111    [email protected]   www.daManWithThePlan.com
Sam D. Man Ed.M.            
Sam D. Man Ed.M.    111-222-333     [email protected]   www.daManWithThePlan.com
D G Bamf M.S.           
Amy Tramy Lamy Ph.D.            

So that it looks like this:

name    phone   email   website
Diane Grant Albrecht M.S.           
Lannister G. Cersei M.A.T., CEP 111-222-3333    [email protected]  www.got.com
Argle D. Bargle Ed.M.           
Sam D. Man Ed.M.    000-000-1111, 111-222-333   [email protected]   www.daManWithThePlan.com
D G Bamf M.S.           
Amy Tramy Lamy Ph.D.    

Here's my code:

from collections import defaultdict
import csv
import re

input = open('ieca_first_col_fake_text.txt', 'rU')

# default to empty set for phone, email, website, area, degrees
extracted_data = defaultdict(lambda: [set(), set(), set()])

for row in input:
    for index, value in enumerate(row):    
        name = row[0]
        data = extracted_data[name].add(row)

for row in data: print row

I get this error:

AttributeError: 'list' object has no attribute 'add'
logout

UPDATE:

from collections import defaultdict
import csv
import re

input = open('ieca_first_col_fake_text.txt', 'rU')
input_r = csv.reader(input, delimiter = '\t')

# default to empty set for phone, email, website, area, degrees
extracted_data = defaultdict(lambda: [set(), set(), set()])

data = []

# Index on the name and then for that name add the rest of the information. 
for row in input_r:

    data_set = extracted_data[row[0]]
    for index, value in enumerate(row[1:]):
        data_set[index].add(value)

print data_set

output:

[set(['']), set(['']), set([''])]
logout

Upvotes: 0

Views: 2256

Answers (1)

Martijn Pieters

Reputation: 1121854

The values in extracted_data are lists of 3 sets each, so extracted_data[name] returns a list, and a list has no .add() method:

extracted_data = defaultdict(lambda: [set(), set(), set()])

You need to read the previous answer more closely and pick the right set to call .add() on.
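As a minimal sketch of the difference (written for Python 3, whereas the question uses Python 2; the choice of index 0 for phone numbers is an assumption based on the column order in the question), calling .add() on the list fails, while calling it on one of the sets inside the list works:

```python
from collections import defaultdict

extracted_data = defaultdict(lambda: [set(), set(), set()])
record = extracted_data["Sam D. Man Ed.M."]  # a list holding three sets

try:
    record.add("000-000-1111")  # lists have no .add(), so this raises
except AttributeError as exc:
    error_message = str(exc)

record[0].add("000-000-1111")  # assumed layout: index 0 is the phone set
record[0].add("111-222-333")
record[0].add("000-000-1111")  # duplicate value, ignored by the set
```
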

The previous answer loops over 4 elements in your input line, uses the first element to find the list of sets, and adds each of the other 3 elements to those sets:

for index, value in enumerate(split(entry)):
    if index == 0:
        data_set = extracted_data[value]  # the first field is the name
    elif value:
        data_set[index - 1].add(value)

Personally, I'd use:

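entry = entry.rstrip('\n').split('\t')  # split on tabs; the names themselves contain spaces
for value, dset in zip(entry[1:], extracted_data[entry[0]]):
    if value:  # skip empty fields, as the loop above does
        dset.add(value)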

to achieve the same thing.
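Putting it together, here is a rough end-to-end sketch, not a definitive implementation. It is written for Python 3 (the question uses Python 2), and several things are assumptions for illustration: the merge_rows helper, the in-memory SAMPLE text standing in for the real file, and the placeholder e-mail address (the real ones are redacted in the question). It assumes tab-separated input with a header row.

```python
from collections import defaultdict
import csv
import io

# Hypothetical sample standing in for the tab-separated input file.
# The e-mail address is a placeholder, not data from the question.
SAMPLE = (
    "name\tphone\temail\twebsite\n"
    "Sam D. Man Ed.M.\t000-000-1111\tsam@example.com\twww.daManWithThePlan.com\n"
    "Sam D. Man Ed.M.\t\t\t\n"
    "Sam D. Man Ed.M.\t111-222-333\tsam@example.com\twww.daManWithThePlan.com\n"
    "D G Bamf M.S.\t\t\t\n"
)


def merge_rows(lines):
    """Collect the distinct phone, email, and website values seen per name."""
    merged = defaultdict(lambda: [set(), set(), set()])
    reader = csv.reader(lines, delimiter="\t")
    next(reader)  # skip the header row
    for row in reader:
        if not row:
            continue  # ignore blank lines
        # Looking up merged[row[0]] creates the entry even if every field is empty.
        for value, dset in zip(row[1:], merged[row[0]]):
            if value:
                dset.add(value)
    return merged


merged = merge_rows(io.StringIO(SAMPLE))
for name, fields in merged.items():
    # Join the values of each set with ", " and the columns with tabs.
    print("\t".join([name] + [", ".join(sorted(f)) for f in fields]))
```

Printing inside a loop over the whole dictionary, rather than printing data_set once after the loop, matters here: after the loop, data_set only refers to the sets for the last name read.
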

Upvotes: 3
