user611105

Python CSV DictReader ignore columns?

If I'm using csv.DictReader to read in a CSV, how would I go about having it ignore certain columns in the CSV?

For example,

"id","name","address","number","created"
"123456","someName","someAddress","someNumber","2003-5-0294"

I want to get just the id and name from the reader, discarding and ignoring the rest. I tried using fieldnames, but it still reads the extra columns in and sets them as None. I noticed that csv.DictWriter has an 'ignore' option (extrasaction='ignore'), but it seems DictReader does not. I was hoping there was a more elegant way to do this than reading the file, writing only the columns I want to another CSV, and then reading that CSV with DictReader for further processing.

Thanks guys!

Upvotes: 4

Views: 8917

Answers (4)

Raymond Hettinger

Reputation: 226544

The other posted solutions build new smaller dicts from the larger fully populated dicts returned by DictReader.

Something like this will be necessary because the DictReader API was intentionally designed not to skip fields. Here is an excerpt from the source:

    # unlike the basic reader, we prefer not to return blanks,
    # because we will typically wind up with a dict full of None
    # values
    while row == []:
        row = self.reader.next()
    d = dict(zip(self.fieldnames, row))

You can see that every fieldname gets assigned to the dictionary without filtering.

FWIW, it is not hard to make your own variant of DictReader with the desired behavior. Model it after the existing csv module source.
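
As a rough illustration of such a variant, here is a minimal Python 3 sketch that subclasses csv.DictReader instead of copying its source. The class name FilteredDictReader and its keep parameter are made up for this example and are not part of the csv module. It assumes every name in keep appears in the header.

```python
import csv
import io


class FilteredDictReader(csv.DictReader):
    """Hypothetical DictReader variant that keeps only selected columns."""

    def __init__(self, f, keep, *args, **kwargs):
        super().__init__(f, *args, **kwargs)
        self._keep = tuple(keep)  # column names to retain (assumed present)

    def __next__(self):
        # Let DictReader build the full row, then drop the unwanted keys.
        row = super().__next__()
        return {k: row[k] for k in self._keep}


# The sample data from the question, held in memory for the demo.
sample = io.StringIO(
    '"id","name","address","number","created"\n'
    '"123456","someName","someAddress","someNumber","2003-5-0294"\n'
)

rows = list(FilteredDictReader(sample, keep=("id", "name")))
print(rows)
```

Note that the full row is still parsed internally; only the dict handed back to the caller is smaller.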

Upvotes: 5

Austin Marshall

Reputation: 3107

from operator import itemgetter

cols = ('name', 'id')  # tuple of keys you want to keep
valuesfor = itemgetter(*cols)

# dictreader_input is your csv.DictReader (or any iterable of dicts)
for d in dictreader_input:
    print(dict(zip(cols, valuesfor(d))))  # dict from zipping cols and values
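
One caveat with the itemgetter approach: when given a single key, itemgetter returns the bare value rather than a 1-tuple, so zipping it against cols would pair the key with the first character of the string. A small sketch of a guard follows. The helper name pick is hypothetical, and it assumes cols contains at least one key.

```python
from operator import itemgetter


def pick(d, cols):
    """Hypothetical helper: return a dict with only the keys in cols."""
    values = itemgetter(*cols)(d)
    if len(cols) == 1:
        # itemgetter with one key returns the value itself, not a tuple.
        values = (values,)
    return dict(zip(cols, values))


row = {'id': '1', 'name': 'Bob', 'other_stuff': 'xy'}
one = pick(row, ('name',))        # single column
two = pick(row, ('name', 'id'))   # several columns
print(one, two)
```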

Upvotes: 2

Steven Rumbalski

Reputation: 45552

This simple generator will do it.

def dict_filter(it, *keys):
    for d in it:
        yield dict((k, d[k]) for k in keys)

Use it like this:

dreader = [{'id':1, 'name':'Bob', 'other_stuff':'xy'},
           {'id':2, 'name':'Jen', 'other_stuff':'xx'}]

for d in dict_filter(dreader, 'id', 'name'):
    print(d)

gives:

{'id': 1, 'name': 'Bob'}
{'id': 2, 'name': 'Jen'}

Upvotes: 5

retracile

Reputation: 12339

Read in each row, then create a list of dicts with just the keys you want.

[{'id':r['id'], 'name':r['name']} for r in mydictreader]
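
For a large file, the same idea can be written as a generator expression so the filtered rows are produced one at a time instead of being collected into a list. A minimal Python 3 sketch using the sample data from the question; the names sample, wanted and slim_rows are only for illustration:

```python
import csv
import io

# The sample data from the question, held in memory for the demo.
sample = io.StringIO(
    '"id","name","address","number","created"\n'
    '"123456","someName","someAddress","someNumber","2003-5-0294"\n'
)

wanted = ("id", "name")
reader = csv.DictReader(sample)

# Lazily yields one trimmed dict per CSV row.
slim_rows = ({k: r[k] for k in wanted} for r in reader)

first = next(slim_rows)
print(first)
```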

Upvotes: 6
