andilabs

Reputation: 23301

csv.DictReader: how to force a quoted field ("...") to be read as a list instead of a string

The persons.csv file looks like this:

Firstname,Surname,Birth Year,Hobby
John,Smith,1990,"tenis,piano"
Andrew,Josh,1988,"surfing,art"

I would like the Hobby field to be represented in the program as a list, not as a string. How can I make DictReader do that?

The Python code I use looks as follows:

import csv

class Person(object):
    extPerson = []
    counter = 0

    def __init__(self, **args):
        for k, v in args.items():
            setattr(self, k, v)
        Person.counter += 1
        Person.extPerson.append(self)

    def __str__(self):
        s = ""
        for k, v in self.__dict__.items():
            s += k + ": " + v + ", "
        return s


csvdr = csv.DictReader(open('persons.csv'))

for p in csvdr:
    print p
    Person(**p)

for p in Person.extPerson:
    print p
    print p.Hobby

The output looks as follows:

{'Birth Year': '1990', 'Hobby': 'tenis,piano', 'Surname': 'Smith', 'Firstname': 'John'}
{'Birth Year': '1988', 'Hobby': 'surfing,art', 'Surname': 'Josh', 'Firstname': 'Andrew'}
Birth Year: 1990, Hobby: tenis,piano, Surname: Smith, Firstname: John, 
tenis,piano
Birth Year: 1988, Hobby: surfing,art, Surname: Josh, Firstname: Andrew, 
surfing,art

I would like the hobbies to be packed into a list in the constructor:

(...)
Birth Year: 1990, Hobby: ['tenis','piano'], Surname: Smith, Firstname: John, 
['tenis', 'piano']
Birth Year: 1988, Hobby: ['surfing','art'], Surname: Josh, Firstname: Andrew, 
['surfing', 'art']

Upvotes: 0

Views: 638

Answers (3)

andilabs

Reputation: 23301

I solved it as follows. Marius, your answer pointed me in the right direction.

for p in csvdr:
    # p["Hobby"] = p["Hobby"].split(',') does the same in one line. The
    # "TypeError: cannot concatenate 'str' and 'list' objects" comes from
    # __str__, which concatenates each value to a string, not from the split.

    l = p["Hobby"].split(',')  # this is a list
    p["Hobby"] = l  # the key now maps to the list
    print p
    Person(**p)

We can check the result:

for p in Person.extPerson:
    print p
    print p.Hobby
    print type(p.Hobby)

results in:

{'Birth Year': '1990', 'Hobby': ['tenis', 'piano'], 'Surname': 'Smith', 'Firstname': 'John'}
{'Birth Year': '1988', 'Hobby': ['surfing', 'art'], 'Surname': 'Josh', 'Firstname': 'Andrew'}
Birth Year: 1990, Hobby: ['tenis', 'piano']Surname: Smith, Firstname: John, 
['tenis', 'piano']
<type 'list'>
Birth Year: 1988, Hobby: ['surfing', 'art']Surname: Josh, Firstname: Andrew, 
['surfing', 'art']
<type 'list'>
[Finished in 0.1s]

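By the way, __str__ needed a modification that checks the type and handles the list case:

def __str__(self):
    s=""
    for k,v in self.__dict__.items():
        if type(v) is not list:
            s+=k+": "+v+", "
        else:
            # No ", " separator is appended here, which is why the Hobby
            # and Surname entries run together in the output above.
            s+=k+": "+str(v)
    return s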

I am new to Python, so any suggestions for better coding practice would be appreciated.

Upvotes: 0

Marius

Reputation: 60080

As you're reading the rows in, you need to split() the hobby field:

one_row = {'Birth Year': '1990', 'Hobby': 'tenis,piano', 'Surname': 'Smith', 'Firstname': 'John'}
one_row['Hobby'] = one_row['Hobby'].split(',')
one_row
Out[7]: 
{'Birth Year': '1990',
 'Firstname': 'John',
 'Hobby': ['tenis', 'piano'],
 'Surname': 'Smith'}

In your current code, this would go here:

for p  in csvdr:
    p['Hobby'] = p['Hobby'].split(',')
    print p
    Person(**p)

Your current __str__ method won't work with lists, but only a small change is needed to fix that: convert every value with str(), which turns the lists into strings and leaves the string values unchanged:

def __str__(self):
    s=""
    for k,v in self.__dict__.items():
        s += k + ": " + str(v) + ", "
    return s
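
For readers on Python 3, here is a minimal end-to-end sketch of the same two steps (split the field while reading, then format every value with str). It assumes the sample data from the question is fed in through io.StringIO rather than a file; the simplified Person class and the load_people helper are illustrative names, not the asker's exact code.

```python
import csv
import io

# Sample data copied from the question; a real script would open persons.csv.
SAMPLE = '''Firstname,Surname,Birth Year,Hobby
John,Smith,1990,"tenis,piano"
Andrew,Josh,1988,"surfing,art"
'''


class Person:
    """Simplified stand-in for the asker's class (no class-level registry)."""

    def __init__(self, **fields):
        for key, value in fields.items():
            setattr(self, key, value)

    def __str__(self):
        # format() calls str() on each value, so lists and strings both work.
        return ", ".join("{}: {}".format(k, v) for k, v in self.__dict__.items())


def load_people(fileobj, list_field="Hobby"):
    """Hypothetical helper: build Person objects, splitting one column."""
    people = []
    for row in csv.DictReader(fileobj):
        row[list_field] = row[list_field].split(",")
        people.append(Person(**row))
    return people


people = load_people(io.StringIO(SAMPLE))
print(people[0].Hobby)  # ['tenis', 'piano']
print(people[1])
```

Building the string with join also avoids the trailing ", " that the loop-based __str__ leaves at the end of each line.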

Upvotes: 1

Joran Beasley

Reputation: 113988

class MyDictReader(csv.DictReader):
    def next(self):
        if self.line_num == 0:
            # Used only for its side effect.
            self.fieldnames
        row = self.reader.next()
        self.line_num = self.reader.line_num

        # unlike the basic reader, we prefer not to return blanks,
        # because we will typically wind up with a dict full of None
        # values
        while row == []:
            row = self.reader.next()
        # split any field that contains a comma into a list;
        # fields without a comma stay plain strings
        row = map(lambda x:x.split(",") if "," in x else x,row)
        d = dict(zip(self.fieldnames, row))
        lf = len(self.fieldnames)
        lr = len(row)
        if lf < lr:
            d[self.restkey] = row[lf:]
        elif lf > lr:
            for key in self.fieldnames[lr:]:
                d[key] = self.restval
        return d
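
The class above targets Python 2: it overrides next() and relies on map() returning a list. As a rough Python 3 sketch of the same idea, assuming you would rather name the columns to split than split every value that happens to contain a comma, a subclass can delegate to the parent __next__ and post-process the row. ListFieldDictReader, list_fields and sep are hypothetical names introduced here for illustration.

```python
import csv
import io


class ListFieldDictReader(csv.DictReader):
    """DictReader that turns selected columns into lists (illustrative only)."""

    def __init__(self, f, list_fields=(), sep=",", **kwargs):
        super().__init__(f, **kwargs)
        self.list_fields = tuple(list_fields)
        self.sep = sep

    def __next__(self):
        # Let the standard reader build the row, then split the chosen columns.
        row = super().__next__()
        for name in self.list_fields:
            value = row.get(name)
            if isinstance(value, str):  # skip missing values (restval may be None)
                row[name] = value.split(self.sep)
        return row


# Usage with in-memory data; the second row has a single hobby.
data = io.StringIO(
    'Firstname,Surname,Birth Year,Hobby\n'
    'John,Smith,1990,"tenis,piano"\n'
    'Andrew,Josh,1988,chess\n'
)
rows = list(ListFieldDictReader(data, list_fields=["Hobby"]))
print(rows[0]["Hobby"])  # ['tenis', 'piano']
print(rows[1]["Hobby"])  # ['chess']
```

With the "contains a comma" test, a row with only one hobby would keep a plain string, so callers would have to handle two types. Splitting by column name always yields a list for that column.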

Upvotes: 0
