How to correctly handle csv.reader headers

Question

When reading CSV files, the first row (or sometimes the first several rows) is often a header that we don't want to include in our data.

If I don't need the data from the headers, I just call next before creating the reader (if the header spans more than one row, I can call next multiple times):

import csv

with open('myfile.csv', newline='') as f:   # Python 3; on Python 2 open with 'rb'
    next(f)                                 # skip first row
    reader = csv.reader(f)
    for row in reader:
        print(row)                          # process my data

Sometimes, however, I don't want to include the headers in my data but still need their values. In that case I convert the csv.reader into a list and handle the headers separately.

import csv

with open('myfile.csv', newline='') as f:   # Python 3; on Python 2 open with 'rb'
    reader = list(csv.reader(f))            # reads every row into memory

    my_header = reader.pop(0)               # remove header

    for row in reader:
        print(row)                          # process my data

This works, but I'm not sure it's the "best practice" way of using csv.reader, or whether there are other approaches worth exploring.

bruno desthuilliers · Accepted Answer

It's indeed not the best practice - it reads the whole file into memory for no good reason. The funny part is that there's almost nothing to change in your first snippet to get the headers...

next(iterator) doesn't just skip an element, it returns it and advances the iterator:

>>> it = iter(["hello", "world"])
>>> next(it)
'hello'
>>> next(it)
'world'
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

So all you have to do is

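import csv

with open('myfile.csv', newline='') as f:   # Python 3; on Python 2 open with 'rb'
    reader = csv.reader(f)
    headers = next(reader)                  # first row, as a list of field names
    for row in reader:
        print(row)                          # process my data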
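
Once the headers are captured this way, they can be paired with each data row while still streaming the file. The following is only a sketch under a few assumptions: `read_with_headers` is a hypothetical helper name, the sample data is made up, and `io.StringIO` stands in for a real file. Passing `None` as the default to `next` avoids a `StopIteration` on an empty file.

```python
import csv
import io


def read_with_headers(f):
    """Yield each data row as a dict keyed by the header row (hypothetical helper)."""
    reader = csv.reader(f)
    headers = next(reader, None)   # None if the file has no rows at all
    if headers is None:
        return
    for row in reader:
        # Pair each field with its column name; rows are produced one at a time.
        yield dict(zip(headers, row))


# Made-up sample data standing in for 'myfile.csv'.
sample = io.StringIO("name,age\nalice,30\nbob,25\n")
records = list(read_with_headers(sample))
empty = list(read_with_headers(io.StringIO("")))
```

If this is all you need, the standard library's `csv.DictReader` does much the same thing out of the box.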

FWIW, the way you skip "the first row" in your first snippet is brittle - you're actually skipping the first line, which is not necessarily the first row (some CSV formats have newlines embedded in rows), so for the "no header" version you actually want:

import csv

with open('myfile.csv', newline='') as f:   # Python 3; on Python 2 open with 'rb'
    reader = csv.reader(f)
    next(reader)                            # skip first row
    for row in reader:
        print(row)                          # process my data
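
To make the line-versus-row difference concrete, here is a small sketch using made-up data in which the header row contains a quoted field with an embedded newline. `io.StringIO` stands in for a real file, and the variable names are purely illustrative.

```python
import csv
import io

# Made-up data: the header's second field is quoted and spans two lines.
data = 'id,"multi\nline header"\n1,foo\n2,bar\n'

# Skipping a *line* of the file leaves the second half of the header behind,
# so the reader yields a stray fragment before the real data rows.
f = io.StringIO(data)
next(f)
rows_after_line_skip = list(csv.reader(f))

# Skipping a *row* of the reader consumes the whole quoted header.
f = io.StringIO(data)
reader = csv.reader(f)
next(reader)
rows_after_row_skip = list(reader)

print(len(rows_after_line_skip))   # one extra row: the leftover header fragment
print(rows_after_row_skip)         # only the two data rows
```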
